From Speech Physiology to Linguistic Phonetics
Alain Marchal
First published in France in 2007 by Hermes Science/Lavoisier entitled: La production de la parole © LAVOISIER, 2007 First published in Great Britain and the United States in 2009 by ISTE Ltd and John Wiley & Sons, Inc. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Ltd 27-37 St George’s Road London SW19 4EU UK
John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd, 2009 The rights of Alain Marchal to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. Library of Congress Cataloging-in-Publication Data Marchal, Alain. [Production de la parole. English] From speech physiology to linguistic phonetics / Alain Marchal. p. cm. Includes bibliographical references and index. ISBN 978-1-84821-113-1 1. Speech. 2. Phonetics. I. Title. P95.M3213 2009 612.7'8--dc22 2009017089 British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library ISBN: 978-1-84821-113-1 Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne.
Table of Contents
Preface
Chapter 1. Respiration and Pulmonary Initiation
1.1. The rib cage
1.2. Lungs
1.3. Normal respiration
1.3.1. Inhalation
1.3.2. Exhalation
1.4. Respiration muscles
1.4.1. Inhalation muscles
1.4.2. Exhalation muscles
1.5. Pulmonary capacity and pulmonary volume
1.6. Respiration in phonation
1.6.1. The respiratory cycle
1.6.2. Control of exhalation
1.6.3. Subglottal pressure
1.6.4. Subglottal pressure and stress
Chapter 2. Phonation and the Larynx
2.1. The larynx
2.1.1. External configuration of the larynx
2.1.2. Internal configuration
2.2. The laryngeal cartilages
2.2.1. The cricoid cartilage
2.2.2. The thyroid cartilage
2.2.3. The arytenoid cartilages
2.2.4. The epiglottic cartilage
2.3. Joints and ligaments
2.3.1. Intrinsic joints and ligaments
2.3.2. The membranes and the extrinsic ligaments
2.4. The larynx muscles
2.4.1. The intrinsic muscles
2.4.2. The extrinsic muscles
2.5. Innervation of the larynx
2.6. The mucous membrane of the larynx
2.7. Phonation
2.7.1. Opening and closing of the glottis
2.7.2. Vocal fold vibration
2.7.3. Voice registers
2.7.4. Head voice?
2.7.5. Efficiency of the vocal generator
2.7.6. The evaluation of phonation: voice quality
2.8. The linguistic functions of laryngeal activity
2.8.1. Glottal states and phonation types
2.8.2. Tone and intonation
2.8.3. Glottal articulation
2.9. Phonetic features
Chapter 3. Articulation: Pharynx and Mouth
3.1. The oral cavity
3.1.1. The tongue
3.1.2. Tongue control
3.2. The pharynx
3.2.1. The rhino-pharynx
3.2.2. The hypopharynx and the oropharynx
3.2.3. The role of the pharynx in speech
Chapter 4. Articulation: The Labio-Mandibular System
4.1. The lips: anatomical and functional description
4.1.1. Lip closure
4.1.2. Lip protrusion
4.1.3. Lip rounding
4.1.4. Raising the upper lip
4.1.5. Lowering the lower lip
4.1.6. Lip spreading
4.1.7. Lowering the corners of the mouth
4.1.8. Raising the corners of the mouth
4.2. The jaw
4.2.1. Muscles of the lower jaw
4.2.2. The suprahyoid muscles
4.3. Linguistic functions of lip movement
4.3.1. Vowels
4.3.2. Consonants
4.4. Motor coordination between the lips and the lower jaw
Chapter 5. Elements of Articulatory Typology
5.1. Aerodynamic mechanisms
5.1.1. Pulmonary initiation
5.1.2. The larynx
5.1.3. The supralaryngeal articulators
5.2. Phonatory modes
5.2.1. Voicing or modal voice
5.2.2. Voicelessness
5.2.3. Breathy mode
5.2.4. The murmur
5.2.5. Laryngealization or “creaky” mode
5.2.6. Whisper mode
5.2.7. Glottal closure
5.3. Articulation
5.3.1. The dimensions of the articulatory description of speech sounds
Chapter 6. The Articulatory Description of Vowels and Consonants
6.1. Vowels
6.1.1. Mode
6.1.2. Articulatory region/zone
6.1.3. Vocalic aperture
6.1.4. The vowel space: cardinal vowels
6.1.5. The temporal dimension
6.1.6. Dynamic aspects
6.1.7. Secondary articulation
6.1.8. Tension
6.2. Consonants
6.2.1. Articulation mode
6.2.2. Description of consonantal articulations
Chapter 7. Coarticulation and Co-production
7.1. Translation models
7.1.1. From plan to execution
7.1.2. Feature spreading
7.1.3. Limits to translation theories
7.2. Action models
7.2.1. Control of coordinated movement
7.2.2. Degrees of freedom
7.2.3. Coordinative structures
7.3. Towards a direct theory of speech production
7.3.1. The coordinative structures of speech
7.3.2. The supervisory system
7.4. The nature of coarticulation phenomena
7.4.1. The allophone as phonetic entity
7.4.2. The allophone as phonological entity
7.4.3. Coarticulation: a redundant concept?
7.4.4. Criteria for evaluating a model of coarticulation
7.5. Interpretation of coarticulation phenomena
7.6. Conclusion
Bibliography
Index
Preface
Scientific disciplines are generally defined by reference to the methods that they use. Phonetics, by contrast, is defined by its object: the scientific study of speech. It calls on the methods of physiology, for speech is the product of mechanisms which are basically there to ensure survival of the human being; on the methods of physics, since the means by which speech is transmitted is acoustic in nature; on methods of psychology, as the acoustic speech-stream is received and processed by the auditory and neural systems; and on methods of linguistics, because the vocal message is made up of signs which belong to the codes of language. Given this, phonetics finds itself at the intersection between human and social sciences, health sciences and the sciences of information technology and communication.
Spoken communication has its roots in what we are accustomed to call the audio-phonatory loop. This arises in the speaker who intends to impart a message; it is followed by the selection and organization of linguistic signs, the construction of a motor plan, and the execution of motor commands resulting in a series of transformations of the geometry of the vocal tract and the transmission of an intelligible acoustic signal, from which the listener retrieves the meaning of the message by means of hearing, stimulation of the auditory and peripheral nerves, perception and linguistic analysis.
Speech constitutes a favored means of human communication by virtue of its “apparent” ease and because of the speed of information transmission that it enables. Thus, an average output of 20 phonemes per second allows humans to produce no fewer than 150 words per minute for communication purposes. In addition to its semantic function, speech also conveys information about the speaker himself: his geographic origin, his social orientation, his emotions and his attitudes.
Speech is unique to humans. Nevertheless, from an anatomical point of view, there are no organs dedicated solely to this function. The organs employed in the act of speech are borrowed from the respiratory, laryngeal and digestive systems. They primarily serve other functions such as the exchange of gases for respiration, the protection of airways and lungs, mastication and swallowing. The articulatory processes consist of the manipulation of respiratory and laryngeal structures and of the vocal tract in order to create speech sounds and to modulate, amplify and filter them. Speech production involves the coordinated contraction of more than 200 muscles, including those of the lips, the jaw, the tongue, the velum, the pharynx and the larynx, as well as those concerned with respiration.
The activity of the muscles involved in speech is initiated and controlled by more than 1,400 nerve impulses per second, originating in the motor areas of the cerebral cortex. These travel along the motor pathways, including those descending (upper motor neurons) from the central nervous system to the lower tract served by the peripheral nerves (including certain cranial and spinal nerves). The number of degrees of freedom to be controlled is very large. Phonetic sciences are concerned with discovering what types of control are in place to ensure the production of intelligible meaningful speech.
Particular attention is thus paid to the study of the biological bases of language. This preoccupation is not new and can be clearly discerned in the works of grammarians from the 5th century onwards. For example, in the 8 books of the Aṣṭādhyāyī, the Hindu grammarian Pāṇini proposed a phonetic and phonological classification of Sanskrit based on articulation. This treatise was intended to foster respect for the pronunciation of the language of the gods.
In the Edda, Snorri Sturluson placed the principles of opposition and commutation on an articulatory basis that phonology would rediscover more than a millennium later. In the Grammatica linguae Anglicanae (1653), J. Wallis described the production of isolated sounds for deaf-mutes.
Today, technological developments in medical imaging, progress in the observation of organs and muscles facilitated by new tools, and an interest in articulation stimulated by vocal technologies such as speech synthesis and automatic speech recognition have all provided a new impetus to phonetic research and produced significant advances in the understanding of the mechanisms involved in speech production. The main question to be answered concerns the relationship between the physiological aspects of the vocal apparatus and their role in achieving phonetic goals. Humans make use of only some of the phonatory potential of the larynx and the configuration possibilities of the supraglottal cavities. Which principles inform
the selection of a finite ensemble of phonetic segments out of the whole extent of anthropophonic capabilities? How did phonological systems evolve over time? What were the influences that shaped the modifications?
In introducing this topic, we recall that the phonetic domain is vast and interdisciplinary in nature. In this work we deal principally with the aspects that connect physiology and speech production. We indicate the methods and techniques used to observe the activity of organs in speech production. We also present the current state of knowledge in linguistic usage based on the possibilities offered by articulation and the phonatory apparatus.
Speech is the result of a neuromotor activity. It is initiated by a current of air generated by the lungs and transformed at the level of the larynx by the action of the vocal folds, and directed towards the nasal or oral cavities by the velum or soft palate. Finally, the air current is very precisely shaped at different places in the mouth by the tongue until it emerges from the vocal tract through the double shutter known as the lips (see Figure 1).
Figure 1. Basic diagram of the speech production process. Power supply: respiratory muscles and lungs (pulmonic initiator). Modulator: vocal folds in the larynx (phonatory mechanism). Filter: velum, tongue and lips in the vocal tract (articulatory system), yielding speech sounds
A natural plan thus suggests itself for this book. We will follow the phonatory air-current from the lungs to the lips and address in turn the issues regarding respiration, phonation and articulation. We will indicate how the muscles and organs thus mobilized contribute to the distinction between phonemes and ensure the stability of phonological contrasts.
The temporal dimension of speech production is of vital importance. Given that the articulators move relatively slowly and that the speed of their movement varies greatly from one articulator to another, it follows that a rapid succession of segments can only occur if articulatory movements are synchronized. It is therefore necessary to describe the role of motor coordination in realizing phonetic targets. Following the principles set by action theory, several theories, such as articulatory phonology and optimality theory, have tried to give a more adequate account of the dynamic aspects of linguistic systems by taking into account the articulatory and perceptual constraints that govern speech production. This volume belongs to the same epistemological tradition and aims to provide a foundation for uniting phonetic and phonological descriptions on biological and articulatory bases.
Chapter 1
Respiration and Pulmonary Initiation
To understand the process of speech production, it is necessary to have a good knowledge of the mechanisms that come into play during normal breathing and to know how breathing is affected by phonation. Until the 17th century, knowledge of respiration was limited to the belief that the prime purpose of breathing was to cool the blood. Not until the 20th century did any work on ventilatory mechanics develop. In 1925, F. Rohrer published his treatise on respiratory movements, which forms the basis of respiratory physiology. W. Fenn extended this work in the 1940s. Finally, we are indebted to Ladefoged et al. (1957) for the first phonetic studies examining the relationship between respiration and phonation. This was the first clear account of the way in which respiration is modified to accommodate speech production.
The vital function of respiration is to ensure the exchange of gases between air and blood. The respiratory cycle comprises two phases: inhalation and exhalation. Inhalation allows a certain quantity of air to be stored in the lungs, bringing oxygen to the organism. The function of exhalation is to empty the lungs and expel gaseous waste from the body, in particular the carbon dioxide accumulated by the blood. The majority of speech sounds are produced during exhalation. Sounds may occasionally be partly realized on an ingressive air-stream; in very rare cases, they are made solely in this way.
The respiratory system comprises the lungs, the tracheo-bronchial tract, the larynx, the upper airways (pharynx, nose, mouth) and the rib cage (see Figure 1.1).
Figure 1.1. The respiratory system
1.1. The rib cage
The structural supports for respiration comprise: 1) the bony thorax; 2) the visceral thorax; 3) the respiration muscles. The rib cage and its muscles behave like a pump which breathes air in and out of the respiratory system via the upper and lower airways.
The rib cage is made up of 12 spinal vertebrae, 12 pairs of ribs, and the sternum or breastbone. It is bounded at the top by the neck and at the bottom by the diaphragm (see Figure 1.2). The ribs constitute a barrel-shaped protective shield around the thorax, the rib cage. At the back, the head of each rib is joined to the spinal column by sliding joints. At the front, the first seven ribs are attached directly to the sternum by means of costal cartilage. The next three are attached to the lower extremity of the sternum by the cartilage of the seventh rib. The last two ribs (the “floating” ribs) have no anterior attachment; their costal cartilage is embedded in muscle fibers.
Figure 1.2. The thoracic cavity
Because of their shape and mode of attachment, front and back, raising of the ribs will cause an increase in thoracic volume in two directions: transverse and lateral. The simultaneous forward and upward movement of the sternum results in an increase of the anteroposterior diameter. The vertical dimension can be altered by movements of the diaphragm which lower the abdominal internal organs.
1.2. Lungs
The lungs, situated in the rib cage, have the shape of air-filled pyramids. There are two of them, one on the left, the other on the right, separated by the mediastinum. To look at, they resemble two wet sponges, with tree-like branches and twigs. They are divided into two bronchial tubes which then subdivide into bronchioles and alveoli. The two lungs are enveloped in a serous membrane: the pleura. This has two layers: the internal layer or visceral pleura which covers and clings to the lungs, and the external layer or parietal pleura which is fixed to the internal wall of the rib cage
and the upper edge of the diaphragm. The serous liquid which is secreted by the pleura allows the layers to slide over one another. The pleura ensures the functional coupling between the chest wall and the lungs. Pulmonary tissue is extremely elastic and can therefore follow the movements of the rib cage.
Elasticity is generally defined as the property which certain materials have of resuming their shape when the force which has deformed them has ceased to act. The point beyond which the material fails to resume its natural shape is called its limit of elasticity. If the intensity and the duration of the deforming force remain below this limit, the amount of deformation follows Hooke’s law: it is proportional to the force and stays the same, except for the sign, when the sign of the force changes.
The lungs and thorax have elastic properties. These organs undergo elongation due to an external force (muscular activity) in the inhalation phase. When this force stops, they resume their natural position and shape; the effect of this is to compress the lungs, thus reducing the pulmonary volume and forcing out the air previously inhaled. This property of elasticity plays a big role in normal respiration.
Under normal circumstances, the lungs exert a continuous effect of aspiration inside the thorax. In living subjects, lung elasticity forces can be estimated by measuring the pleural pressure. It appears that the pressure is negative during inhalation, i.e. lower than atmospheric pressure. This is because of the elastic traction of the lung on the visceral sheet of the pleura; the amount of pressure varies according to the stages of the respiratory cycle. This intra-pleural pressure is exerted over the organs contained in the thorax, in particular the heart, the thoracic duct and the esophagus, where it can be measured more easily. The pressure of air in the lungs depends on the force exerted on the thoracic walls by the molecules of air inside them.
When the dimensions of a container are enlarged, its volume increases, the molecules of air become more spaced out, and air pressure falls. Conversely, when the dimensions are reduced, the volume decreases, the air molecules become compressed and the pressure increases (Boyle’s law). In order for air to move into or out of the lungs, a difference in air pressure must be created between the air contained in them and the atmosphere outside. An increase in pulmonary volume produces a lowering of pressure which results in the drawing in of air from outside. Conversely, a decrease in pulmonary volume produces an increase in pressure which pushes the air out.
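The pressure change described by Boyle's law can be illustrated with a minimal sketch. It assumes a closed container at constant temperature (as if the airway were momentarily blocked, so that no air flows), and the resting volume of 3 liters and the 10 ml volume changes are illustrative values chosen here, not figures from the text. The function name `boyle_pressure` is likewise hypothetical.

```python
def boyle_pressure(p1: float, v1: float, v2: float) -> float:
    """New absolute pressure after a volume change at constant temperature.

    Boyle's law: p1 * v1 == p2 * v2 for a fixed quantity of gas.
    """
    if v1 <= 0 or v2 <= 0:
        raise ValueError("volumes must be positive")
    return p1 * v1 / v2


# Assumption: one standard atmosphere is roughly 1033 cm H2O.
ATMOSPHERIC_CM_H2O = 1033.0

# Assumption: illustrative lung volume at rest, in liters (not from the text).
resting_volume = 3.0

# Enlarging the container lowers the pressure; shrinking it raises the pressure.
expanded = boyle_pressure(ATMOSPHERIC_CM_H2O, resting_volume, resting_volume + 0.01)
compressed = boyle_pressure(ATMOSPHERIC_CM_H2O, resting_volume, resting_volume - 0.01)

print(f"after expansion:   {expanded - ATMOSPHERIC_CM_H2O:+.2f} cm H2O vs. atmosphere")
print(f"after compression: {compressed - ATMOSPHERIC_CM_H2O:+.2f} cm H2O vs. atmosphere")
```

In real breathing the airway is open, so the pressure difference drives airflow and is equalized almost at once; the sketch only shows the direction and order of magnitude of the gradient that a small volume change would create.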
1.3. Normal respiration
1.3.1. Inhalation
Inhalation is the result of raising and widening the rib cage, effected by the contraction of the external intercostal muscles and the flattening of the diaphragm, which presses down on the abdominal viscera. This action, because of the functional coupling between the lungs and the thorax, lowers the intra-pleural pressure. When the force is great enough to overcome the elastic resistance of the pulmonary tissue, the lungs fill by aspiration (siphoning in air) and the pulmonary volume increases. The amount of air inhaled is in the region of half a liter.
1.3.2. Exhalation
When the inhalation impulse ceases, the lungs deflate and return to the rest position. This constitutes a return to equilibrium. Normal exhalation is an entirely passive process caused by the elastic recoil of the pulmonary tissue and the ribs, by their weight and by the pressure exerted by the abdominal organs. The combination of these forces constitutes what is called the pressure of relaxation. The pressure of relaxation is the pressure of the air that could be measured in the inflated lungs if the intercostal muscles were relaxed and if the air were prevented from escaping from the lungs.
Exhalation in resting respiration is an involuntary activity. The ratio between inhalation and exhalation is 1:1. The typical rate of normal respiration is 12 to 18 cycles per minute. In the adult, mean values are 0.3-0.5 liters per second (l/s) for rate of flow, 500 cm³ for volume, and 1-3 cm H2O for pressure. These values increase with work. Thus, with forced inhalation and severe muscular effort during exhalation, the rate of flow can increase to more than 50 l/s and intra-pulmonary pressure can go up to 100 cm H2O.
1.4. Respiration muscles
The three dimensions of the rib cage (vertical, transverse and anteroposterior; see Figure 1.3) increase during inhalation and decrease during exhalation.
Two groups of muscles are involved in the different stages of respiration: the muscles of inhalation and the muscles of exhalation (see Tables 1.1 and 1.2).
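As a side calculation on the resting values quoted in section 1.3.2 (a volume of about 500 cm³ per breath and 12 to 18 cycles per minute), the following sketch derives the length of one respiratory cycle and the volume of air moved per minute. The second quantity is usually called minute ventilation; that term and the function names are introduced here for illustration and do not come from the text.

```python
def cycle_duration_s(cycles_per_minute: float) -> float:
    """Duration of one inhalation-exhalation cycle, in seconds."""
    return 60.0 / cycles_per_minute


def minute_ventilation_l(volume_per_breath_l: float, cycles_per_minute: float) -> float:
    """Volume of air moved in one minute of resting respiration, in liters."""
    return volume_per_breath_l * cycles_per_minute


VOLUME_PER_BREATH_L = 0.5  # 500 cm^3, the resting value given in the text

for rate in (12, 18):  # the resting range given in the text
    print(
        f"{rate} cycles/min: "
        f"{cycle_duration_s(rate):.1f} s per cycle, "
        f"{minute_ventilation_l(VOLUME_PER_BREATH_L, rate):.1f} l per minute"
    )
```

With the 1:1 ratio between inhalation and exhalation stated in the text, each phase lasts half of the cycle duration computed here.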
Figure 1.3. The vertical, transverse and antero-posterior dimensions of the thoracic cavity increase during inhalation and decrease during exhalation
1.4.1. Inhalation muscles
1.4.1.1. The diaphragm
The diaphragm is the chief inhalation muscle. This muscle is domed at the top and its fibers are attached to the base of the sternum, the lumbar vertebrae and to the inner surfaces of the cartilages of the lower ribs. It separates the thoracic cavity from the abdominal cavity. When the diaphragm contracts, the effect is to flatten the dome and push the abdominal organs down; this enlarges the thoracic cavity in the vertical dimension. The diaphragm can also help to raise the lower ribs to some extent.
1.4.1.2. The external intercostals
The external intercostals run between the ribs and connect the lower edge of each rib vertically and horizontally with the upper edge of the rib immediately below. Their function is to strengthen the thoracic walls so that they do not bulge through the ribs. Their action aims at overcoming the forces of relaxation. Because of their origin and insertion, their contraction makes the ribs rotate outwards and upwards, increasing the anteroposterior dimension (Dickson and Dickson, 1995).
In normal respiration, the expansion of the lungs is caused by the contraction of the diaphragm and the external intercostals. In forced respiration, a certain number of supplementary muscles come into play and increase the movement to raise the
clavicles, alter the curvature of the ribs and increase still further their elevation. The major and minor pectoral muscles and the scalene muscles are those principally involved (see Figure 1.4).
Figure 1.4. Action of the principal muscles of inhalation (from Hardcastle, 1976)
1.4.2. Exhalation muscles
In normal respiration, exhalation is an entirely passive phenomenon. When the inhalatory effort ceases, the combination of the forces of relaxation is all that is needed to make the lungs deflate and return to their rest position. To prolong the exhalation phase in forced respiration, supplementary pressure must be exerted on the rib cage. This action results from the working of three groups of muscles (see Figure 1.5):
– the thoracic muscles with the internal intercostals, the subcostals and the transverse thoracic;
– the abdominal muscles, i.e. the transverse abdominal, the internal oblique, the external oblique and the rectus abdominis;
– the dorsal muscles with the great dorsal and the iliocostal.
The internal intercostals are the most important of the exhalation muscles. These muscles lie under the external intercostals and their fibers are at right angles to those
of the external intercostals. They are situated along a line from lower back to upper front. When they contract, this orientation sets in motion a lowering of the ribs. The contraction of the internal and external obliques, together with the rectus abdominis and the transverse thoracic, helps to reinforce this action and produces compression in the abdomen which makes the diaphragm rise, thus reducing the vertical dimension of the rib cage.
Figure 1.5. Action of the principal muscles of exhalation (from Hardcastle, 1976)
             INSPIRATORY                                    EXPIRATORY
Principal    Diaphragm                                      Internal intercostals
             External intercostals
Accessory    Interchondral part of internal intercostals    Transversus abdominis
             Scalenes                                       External obliques
             Pectoralis major                               Internal obliques
             Pectoralis minor                               Rectus abdominis
             Sternocleidomastoid
Table 1.1. Principal and accessory muscles of respiration
Upper border of rib below
Second or third rib below
Lower border of each rib Inferiorly and medially
Downwards
Obliquely upwards and forwards
Posterior part of rib
Subcostal groove of one rib
Pull down the rib cage in forced exhalation during phonation Control subglottal pressure In synergy with external intercostals, strengthen the intercostal spaces
ACTION Separates the thoracic cavity from the abdominal cavities Draws the central tendon down and forward. Increases the vertical dimension of the thoracic cavity Strengthen the thoracic wall Lift the ribs. Increase width of thoracic cavity. Assist in checking recoil of lungs during expiration Can elevate the ribs
Table 1.2. Action of the principal muscles of respiration
Top of rib below
INSERTION Central tendon Aponeurosis of the diaphragm
ORIGIN COURSE Lower tip of the sternum Upwards and medially First 3, 4 lumbar vertebrae Ribs 7-12
1.5. Pulmonary capacity and pulmonary volume

Pulmonary volume corresponds to the quantity of air that the lungs can contain, whereas pulmonary capacity refers to their functional limits. Ventilation amplitude, on the other hand, refers to the oxygen requirements of the organism.

The total pulmonary volume is obtained after a forced inhalation. It corresponds to the total lung capacity. After a forced exhalation, a certain amount of air always remains in the lungs: this is the residual volume. The difference between the maximal volume and the residual volume corresponds to the vital capacity. The vital capacity is important for determining how long phonation can be maintained, whether for singing or speaking. The volume of air inhaled and then exhaled in one cycle of normal respiration is the tidal volume. The expiratory reserve is the additional volume that can still be exhaled after a normal exhalation, i.e. the difference between the lung volume at the end of a normal exhalation and the residual volume (see Table 1.3).

Volume or capacity      Liters    % of total lung capacity    % of vital capacity
Inspiratory reserve     2         40                          50
Tidal volume            0.5       10                          12.5
Expiratory reserve      1.5       30                          37.5
Residual volume         1         20                          -
Vital capacity          4         80                          100
Total lung capacity     5         100                         -
Table 1.3. Lung volume as a percentage of the total lung capacity and of vital capacity
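The arithmetic behind Table 1.3 can be checked with a short sketch. The volumes in liters are the ones given in the table; the dictionary layout and the function names are illustrative assumptions, not part of the original text.

```python
# Lung volumes in liters, as listed in Table 1.3.
# The dictionary layout and function names are illustrative assumptions.
VOLUMES_L = {
    "inspiratory_reserve": 2.0,
    "tidal_volume": 0.5,
    "expiratory_reserve": 1.5,
    "residual_volume": 1.0,
}


def total_lung_capacity(volumes):
    """Total lung capacity: the sum of all four volumes."""
    return sum(volumes.values())


def vital_capacity(volumes):
    """Vital capacity: total lung capacity minus the residual volume."""
    return total_lung_capacity(volumes) - volumes["residual_volume"]


def percent_of(part, whole):
    """Express one volume as a percentage of another."""
    return 100.0 * part / whole


if __name__ == "__main__":
    tlc = total_lung_capacity(VOLUMES_L)
    vc = vital_capacity(VOLUMES_L)
    print(f"Total lung capacity: {tlc} l, vital capacity: {vc} l")
    for name, litres in VOLUMES_L.items():
        print(f"{name}: {percent_of(litres, tlc):.1f}% of total lung capacity")
```

With these figures the vital capacity comes out at 4 liters out of a total lung capacity of 5 liters, which matches the table.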
1.6. Respiration in phonation

The sounds of speech are produced by the rigorous and precise use of air supplied by the lungs. Respiration that is called “resting” or normal is an automatic aerodynamic phenomenon; respiration in the act of speech is very finely controlled. A sufficient quantity of air must be inhaled to allow for breathing and simultaneously producing a complete utterance without the need for taking a breath at an inappropriate moment. Exhalation must provide an output of air sufficient to maintain stable subglottal pressure for the whole duration of the utterance. The act of speaking can be thought of as a resistance to the flow of exhaled air (Slifka, 2003). Respiration must thus be modified to increase the volume of available air by an increase of inhalation and by control of exhalation to prolong and regulate the output of air.

1.6.1. The respiratory cycle

The respiratory cycle is profoundly altered by speech production (see Figure 1.6). The ratio between inhalation and exhalation, which is 1:1 at rest, shifts to 1:4 and can rise as high as 1:10 during speech production. During speech, inhalation is much faster, to avoid lengthy interruptions. To achieve this, respiration occurs largely via the mouth rather than the nose, and there is more use of the diaphragm and internal intercostals. The pulmonary volume required to initiate speech is approximately double that of resting respiration and half that of vital capacity, i.e. about a liter.

Exhalation in speech is no longer automatic but controlled. Its duration lengthens from 2-3 seconds to 15-20 seconds, varying according to the length of the utterance. The pulmonary volume used is not very different from the functional residual capacity.
Figure 1.6. Inhalation and exhalation during quiet respiration, speech respiration and forced respiration
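The inhalation-to-exhalation ratios quoted in section 1.6.1 can be turned into durations with a small sketch. The ratios (1:1, 1:4, 1:10) and the exhalation durations come from the text; the function names are hypothetical, and the assumption that the ratio applies to durations within a single cycle is an interpretation made for this illustration.

```python
def inhalation_duration(exhalation_s, exhalation_per_inhalation):
    """Inhalation time implied by a 1:N inhalation-to-exhalation ratio.

    exhalation_per_inhalation is the N in "1:N" (1 at rest, 4 to 10 in speech).
    """
    if exhalation_per_inhalation <= 0:
        raise ValueError("ratio must be positive")
    return exhalation_s / exhalation_per_inhalation


def inhalation_share(exhalation_per_inhalation):
    """Fraction of the whole respiratory cycle spent inhaling."""
    if exhalation_per_inhalation <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / (1.0 + exhalation_per_inhalation)


if __name__ == "__main__":
    for n in (1.0, 4.0, 10.0):
        print(f"1:{n:g} -> inhalation takes {inhalation_share(n):.0%} of the cycle")
    # A 20-second speech exhalation at a 1:10 ratio
    print(inhalation_duration(20.0, 10.0), "seconds of inhalation")
```

Under these assumptions, inhalation drops from half of the cycle at rest to a small fraction of it in speech, which is the point made in the text about avoiding lengthy interruptions.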
1.6.2. Control of exhalation

According to Ladefoged (1967), exhalation control is essentially provided in the following way: at the start of exhalation, the inhalation muscles (the external intercostals) continue to be active to slow the lowering of the rib cage which, because of its weight and the forces of elasticity in the lungs, would otherwise tend to return too quickly to its resting state. The activity of the external intercostals decreases progressively according to pulmonary volume and ceases when the quantity of air necessary for phonation cannot be provided. At this point the exhalation muscles come into play. Increasingly strong contractions of the internal intercostals compress the rib cage and force out the air remaining in the lungs. Towards the end of exhalation, their action is reinforced by the oblique muscles and the diaphragm.

This very linear idea of the organization of muscular activity has been revised by Hoshiko (1960), Adam and Munro (1973) and Marchal (1988) in the light of the general theory of coordinated movement. It is therefore in the synergy of muscular activity during speech that Hoshiko (1962, p. 118) sees the essential difference between resting respiration and phonatory respiration: “The electromyograms secured from the intercostal muscles suggest that the function of these muscles does not have a strictly one to one relation with the kinds of movements exhibited during vegetative activity. For speech activity, the intercostal muscles appear to act synergistically.” Adam and Munro (1973) reach the same conclusion, as does Marchal (1988, p. 9): “One must envisage the existence of a control process which harmonises the different elements of muscular activity, the facilitating or inhibiting muscular functions, or in other words all kinetic impulses.”

This idea, which is similar to the theory of action in control of respiratory movements, could succeed in restoring credit to certain observations by Stetson (1951), who examined the relationship between articulation and phonation, using fairly primitive instrumental techniques.

1.6.2.1. Muscular activity and syllables

Stetson’s (1951) work described the existence of alternating actions in the internal and external intercostals in delimiting syllables. According to this author, the syllable would be initiated by a contraction of the internal intercostals. This would push out some air, which would be interrupted by a contraction of the external intercostals. These opposing actions by the external and internal intercostals
would be used to create a series of thoracic thrusts, thanks to this alternating mechanism: ballistic pulses corresponding to syllables. In summary, Stetson saw a physiological correlate of the accentuation of syllables in the activity of the abdominal muscles reinforcing the internal intercostals, whereas Fonagy (1958, p. 53) saw the physiological manifestation of accent in the increase of electromyographic activity in the internal intercostals. Ladefoged (1962) disagreed with this theory of the syllable, which was not supported by experimentally robust data. Lebrun (1966) was of the opinion that muscular activity has been more inferred from observation of the ribcage muscles than directly measured. Indeed, it remains difficult to establish an unequivocal relationship between syllables and muscular activity. Stetson moreover conceded this, observing that this relationship was not systematic, and that there were cases where there were differences between activity peaks and numbers of syllables. Beyond this finding, the most interesting part is the hypothesis put forward by Stetson to explain this fact and the role he attributed to consonantal closure (p. 58): “The rapid rise of pressure of the syllable is generated by the intercostal muscles of the chest, on the other hand syllables like ‘pay, day, die’ are released by a consonant […] the intercostals act as before but the consonant constriction occurs at the same time, so that the air-pressure develops behind the consonant closure.” The hypothesis of an aerodynamic influence by consonantal closure appears again in this passage: “in very rare cases, it may be that the chest movement is a continuous, slow “controlled” movement of expiration, and that the syllable is due to the holistic stroke of the consonant.” This hypothesis was partly taken up by Pike (1955) and by Catford (1977, p. 
91): “So far we have spoken as if only the pulmonic initiator were involved in these activities, this of course is not necessarily so […] in actual practice, so far as we know ‘stress’, ‘feet’, and ‘syllables’ are normally functions of pulmonic initiation […] however, it is perfectly possible to produce sequences of ‘feet’ and ‘syllables’ purely by glottal pressure.” Marchal (1988, p. 6) looks at the asynchronous peaks of activity in the diaphragm and the internal intercostals, which he interprets as a response to the need to modify the supply of air according to the impedance of the larynx and the vocal tract. This implies that the diaphragm has a role in the exhalation involved in both speech and singing (Sundberg et al., 1999; Lindblom and Sundberg, 2005) up to the end of the exhalation phase. These findings support Zinkin (1958), for whom the control of the phonatory air supply and of subglottal air pressure is due to the controlled behavior of the diaphragm. It appears that the curve of the diaphragm does not return to rest in a linear way during phonatory exhalation. The speed of the
rise of the diaphragm varies according to the phonetic structure of the utterance, thus helping to enable instant modulation of the intensity of each phoneme. These results also partially explain a certain number of contradictions that arose in Stetson’s work; in particular, why variations in intra-oral pressure do not result in changes in subglottal pressure. Knowing that intensity and f0 are a function of transglottal pressure and laryngeal tension, any model that does not allow for control by the inspiratory muscles cannot explain the independence of f0 and intensity at times when intra-oral air pressure varies as a function of changes in impedance caused by consonantal closures, any more than it can explain the absence of continuous variations of intensity.

As for the question of a connection between peaks of internal intercostal activity and the presence of syllables, we have only been able to make such a connection for slow read speech (as in lists of words and nonsense words) and in syllables accentuated for phrasal emphasis. It therefore appears that this question deserves more attention and that the summary outlined here is inadequate. In fact, the data often suggest that vowels are marked by a high point in the diaphragm and consonants more by diaphragm depression and increased activity in the external intercostals. At a normal rate and for open syllables, an almost syllabic division between the secondary patterns of EMG activity can be seen. Where there are closed syllables or combinations of consonants followed by liquids, a peak following the consonant can be seen, as if there were a [ə] that is however not visible on the acoustic trace. In rapid output, a group of consonants no longer corresponds to a single depression. Should we therefore see an exceptional structure in the vowel-consonant combination (Lenneberg, 1967)? The question is open.
These results clearly show that the activity of the respiratory muscles is not as serialized in time as the scheme of activity proposed by Ladefoged would suggest. Their activity is synergistic. Although they reach opposite conclusions, both Ladefoged and Stetson subscribe to the same theoretical model. They see control of respiration in phonation as an extension of the properties of the intercostal muscles in resting respiration, with the internal intercostals (exhalation muscles) creating an airflow that is controlled (Ladefoged) or stopped (Stetson) by the external intercostals (inhalation muscles). This idea has now given way to a theory which examines movement as a function of the nature of neuromuscular coordination and of its purpose, while at the same time taking account of sensory-motor and cognitive factors. The need to coordinate muscular activity is directly connected to the heterogeneity of neuromotor command and influences. To achieve precise movement, taking into account multiple constraints, we must envisage a control mechanism that harmonizes all the different
elements of muscular activity, muscular functions that are both facilitating and inhibitory, i.e. a mechanism that controls all kinetic influences. The functional grouping of muscles controlled by such a system is known as a coordinative structure. Theories of action propose that the activity of opposing muscles that are part of a coordinative structure is complementary, and that their fine adjustment happens automatically as part of peripheral conditions. In the case of phonation, it appears that the respiratory coordinative structure initiated at the start of an utterance (with the baseline corresponding to breath groups) acts locally (peaks in the diaphragm and the external intercostals) on the resistances brought to bear on the airflow at the levels of the glottis and the oral cavity (the voiced-voiceless and vowel-consonant distinctions).

1.6.3. Subglottal pressure

Variations in subglottal pressure play a central role in speech production. This pressure has to be sufficiently strong to overcome the resistance to airflow presented by the glottis and upper airways. It must also be controlled to ensure both the stability of phonation and a response to the global demands posed by the evolution of prosodic parameters, principally of intensity and f0. Several methods, both direct and indirect, have been used to measure subglottal pressure.

1.6.3.1. Measurement of subglottal pressure

1.6.3.1.1. Direct methods

The catheter

Van den Berg (1956) was the first to invent a direct way of measuring subglottal pressure during speech production. He used an open catheter made of polyethylene which was introduced via the nose into the pharynx, then sucked into the glottis with a very strong inbreath. The vocal cord region was slightly anaesthetized by the catheter. Pressure was registered by an optical manometer. This technique is often difficult for the speaker to tolerate (nausea can result) and there is a serious risk of disrupting phonation. This technique is therefore seldom used for phonetic studies.

The intratracheal needle

Direct recording of subglottal pressure can also be obtained by using a very fine intratracheal needle. This is inserted into the trachea at a point two rings below the cricoid cartilage. This method has been used in a large number of studies (Lieberman, 1968; Strik and Boves, 1992, 1995). Its main advantage is that it
provides an immediate direct pressure reading. It is however invasive and, because of the risk of infection, recordings require an appropriate medical infrastructure, which makes it cumbersome to use. In practice, it proves hard to convince professional speakers, and even more so singers, that the procedure is harmless.

1.6.3.1.2. Indirect methods

Measurement of esophageal pressure

Esophageal pressure features as a good estimate of subglottal pressure in some studies on isolated vowels and short read phrases (Van den Berg, 1956; Strenger, 1960). These authors used a small rubber balloon, about 10 cm long and 1 cm in diameter with about 1 ml of air in it, inserted via the nose into the esophagus by means of a fine catheter for a length of 34 cm from the nostrils. The balloon thus reached the lower third of the esophagus, slightly above the point where the trachea forks. The balloon pressed against the sensitive membrane that is the posterior wall of the trachea. The increase of pressure in the balloon was seen as directly relating to subglottal pressure. This method was in fact subject to an important error: it did not take account of the effect of the forces of relaxation and elasticity which affect the balance of air pressure in the respiratory organs. Several studies therefore found a difference between esophageal pressure and directly measured subglottal pressure at the end of the expiratory phase.

Research into pulmonary physiology shows that intrapleural pressure equates to intrathoracic pressure. Moreover, it has been established that esophageal pressure is a good indication of intrapleural pressure. This amounts to saying that esophageal pressure equates to subglottal pressure plus the pressure resulting from the forces of elasticity in the lungs. When measuring esophageal pressure, it is therefore appropriate to keep adjusting the values by referring to pulmonary volume.
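The correction described above can be sketched as follows. The sketch takes the relation stated in the text at face value (esophageal pressure = subglottal pressure + a volume-dependent elastic term) and simply rearranges it. The function names are hypothetical, the elastic term is treated as a signed quantity supplied by the caller from a separate calibration, and the numbers in the usage example are placeholders rather than physiological data.

```python
def subglottal_from_esophageal(esophageal_cm_h2o, lung_volume_l, elastic_term):
    """Estimate subglottal pressure from esophageal pressure.

    Rearranges the relation given in the text:
        esophageal = subglottal + elastic_term(volume)
    elastic_term is a caller-supplied function (hypothetical calibration)
    returning the signed elastic contribution in cm H2O at a given lung volume.
    """
    return esophageal_cm_h2o - elastic_term(lung_volume_l)


def correct_series(esophageal_series, volume_series, elastic_term):
    """Apply the correction sample by sample to paired recordings."""
    if len(esophageal_series) != len(volume_series):
        raise ValueError("pressure and volume series must have the same length")
    return [
        subglottal_from_esophageal(p, v, elastic_term)
        for p, v in zip(esophageal_series, volume_series)
    ]


if __name__ == "__main__":
    # Placeholder calibration: a constant elastic term, for illustration only.
    def constant_term(volume_l):
        return 2.0

    print(correct_series([10.0, 9.0], [3.0, 2.5], constant_term))
```

Because the elastic term changes with lung volume, the correction has to be recomputed for every sample, which is why a continuous record of pulmonary volume is needed alongside the esophageal pressure signal.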
Only the use of a body plethysmograph allows continuous feedback as to pulmonary volume without interfering with speech. This indirect method of measuring subglottal pressure has the advantage of being not very invasive, but it requires a large array of equipment available only in a hospital setting. This feature surely explains the small number of studies which have used it (Marchal, 1976; Binazzi et al., 2006).

Measurement of intra-oral air-pressure

Because of the difficulties posed by the direct methods and the esophageal method of measuring subglottal pressure, several studies have relied on intra-oral pressure. It is indeed the case that when the vocal tract is completely closed, pressure is equalized in the whole of the vocal tract below the place of closure. This
is what happens in the case of a voiceless plosive consonant: in these circumstances, intrapulmonary pressure is the same as intra-oral pressure and equates to subglottal pressure (Kitajima and Fujita, 1990; Hertegard et al., 1995; Giovanni et al., 2000). The measure of intra-oral pressure is thus necessarily of limited practicality and can rarely be used to study variations of subglottal pressure in continuous speech.

1.6.3.2. Values of subglottal pressure

In resting respiration, the values of subglottal pressure during exhalation vary from 1 to 3 cm of water. They can rise to 100 cm during violent exhalatory efforts, such as coughing. Phonation initiation requires pressure above 2 cm of water and the usual values in normal speech are in the region of 2-15 cm of water. Similarly, pressure varies according to linguistic needs. Several studies have examined the relationship between subglottal pressure on the one hand and, on the other hand, intensity, f0 and a range of variations occasioned by the prosodic organization of the utterance.

1.6.3.3. Subglottal pressure and intensity

The classic experiments by Muller (1837) using excised larynxes constituted the first work to show the effects of an increase in subglottal pressure on intensity. Piquet and Ducroix (1956) made one of the first very fast color films on the movement of the vocal folds during the course of a laryngectomy. During this operation, they also diverted air outside the larynx using a cannula introduced into the opening that had been made to carry out the operation. As a result of their experiment, for which they have been heavily criticized since, they affirmed that the vocal folds could vibrate in the absence of any current of air. As far as they were concerned, the vocal folds were responsible for variations in intensity.
Van den Berg (1956) measured the relationship between the level of sound, subglottal pressure and the average output of air for the vowel /a/ with different fundamental tones, and with chest voice, head voice and falsetto voice. His results allowed him to calculate the power and efficiency of the glottal voice generator. He confirmed that the behavior of the glottis as a generator of sound is quadratic rather than linear for the vowel /a/. The studies of Ladefoged and McKinney (1963), Isshiki (1964) and Strik and Boves (1992) show that there is a very strong relationship between subglottal pressure and intensity. Intensity is nearly proportional to the square of the pressure across the whole range of voice registers:

$$\mathrm{INT} \propto \mathrm{SGP}^{3.3 \pm 0.7}$$

They also show, as does Titze (1989), that even if it is the most important, it is not the only factor to influence vocal intensity. Laryngeal adjustment and the
impedance of the vocal tract also play a part. This observation is supported by Marchal and Carton (1980) and by Lecuit and Demolin (1998), who find distinct regression curves according to the vowels and four levels of f0. Ladefoged and McKinney (1963) also find a relationship between sound pressure, perception of intensity and subglottal pressure. Their conclusions are based on hearing tests and a proportional linear relationship between perceived intensity and subglottal pressure. This result suggests that the subjects who did these tests were particularly aware of the physiological effort.

1.6.3.4. Subglottal pressure and fundamental frequency

That there should be a strong positive relationship between subglottal pressure and f0 is highly likely, since the latter is largely conditioned by transglottal pressure, i.e. the difference between pressure above and pressure below the vocal folds. On average, it seems there is an increase of 5 Hz per cm H2O. Even so, laryngeal tension and voice register play a very important part. Fundamental frequency variations therefore seem significantly less important for high chest voice (1-3 Hz per cm H2O) and low chest voice (2-6 Hz per cm H2O) than for falsetto voices (5-10 Hz per cm H2O) (Titze, 1989). Fundamental frequency variation does not depend exclusively on subglottal pressure (Plant and Younger, 2000). A rise in frequency can also result from increased laryngeal tension; when subglottal pressure falls towards the end of an utterance, f0 can rise, as is particularly apparent in interrogative utterances with rising intonation. Strik and Boves (1992) model the relationship between subglottal pressure and laryngeal adjustments in the control of f0.

1.6.3.5. Subglottal pressure and the spectrum

Papers by Schutte (1992), Sundberg et al. (1999) and Sjölander and Sundberg (2004) examine the relations between subglottal pressure, the quality of the glottal source and the spectrum.
In particular, they measured F1 energy in singers and concluded that there was a linear relationship: when subglottal pressure doubled, it produced a rise of 12 dB.
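Statements of the form "a doubling of subglottal pressure raises the level by so many decibels" can be related to a power law between intensity and pressure. The sketch below assumes a relation of the form intensity ∝ pressure^n and converts between the exponent n and the level change in dB; the function names are hypothetical, and the power-law form is an assumption used only for illustration.

```python
import math


def level_change_db(pressure_ratio, exponent):
    """Level change in dB when intensity is proportional to pressure**exponent."""
    if pressure_ratio <= 0:
        raise ValueError("pressure ratio must be positive")
    return 10.0 * exponent * math.log10(pressure_ratio)


def implied_exponent(db_change, pressure_ratio=2.0):
    """Power-law exponent implied by a given dB change for a pressure ratio."""
    if pressure_ratio <= 0 or pressure_ratio == 1.0:
        raise ValueError("pressure ratio must be positive and different from 1")
    return db_change / (10.0 * math.log10(pressure_ratio))


if __name__ == "__main__":
    # Square-law relation: doubling the pressure
    print(f"{level_change_db(2.0, 2.0):.2f} dB for a square law")
    # Exponent implied by a 12 dB rise per doubling of pressure
    print(f"exponent of about {implied_exponent(12.0):.2f} for 12 dB per doubling")
```

Under this assumption, a square law corresponds to roughly 6 dB per doubling of pressure, whereas 12 dB per doubling would correspond to an exponent close to 4.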
1.6.4. Subglottal pressure and stress

The difficulty of identifying stable acoustic correlates for the exponency of intonational accent has led to questions regarding the possibility of defining it at a substantive level from the physiological point of view. The notion of exhalatory effort has often been advanced to explain why one segment in the speech-stream should have dynamic value. Studies have focused on the activity of the respiration muscles on the one hand and on the links between variations in subglottal pressure and the perception of accent on the other hand: lexical accent, an accent known as “emphasis”, and phrasal accent. Research has chiefly focused on French and English.

The example given by Lieberman (1965) of the difference in realization between “light housekeeper” and “lighthouse keeper”, in which accent shifts from “light” to “house”, is a good illustration of the view that accented syllables are marked by a peak in respiratory effort reflected by a peak in subglottal pressure (see Figure 1.7). This link between a rapid increase in subglottal pressure and syllable accentuation is also found in French for syllables marked for stylistic effect (Benguerel, 1973; Marchal, 1976). It would be tempting to see in this link a confirmation of the motor theory of perception according to which the listener is aware of the physiological effort of speech production. However, we think that variation in subglottal pressure is probably an indicator, but not the only one. Moreover, these same studies find that in French there is an absence of such a link for phrasal accents, which are never associated with any significant variation in subglottal pressure.
Figure 1.7. Stress and subglottal pressure (from Lieberman, 1967)
Chapter 2
Phonation and the Larynx
2.1. The larynx

The larynx is an extreme modification of the top section of the airways, occupying the middle and forepart of the neck, in front of the last four cervical vertebrae. It is situated under the hyoid bone and the tongue and follows their movements. It sits at the top of the trachea and forms the lower part of the anterior wall of the pharynx. Through its position and configuration, the larynx constitutes a crossroads of airways (trachea, larynx and nasal cavities) and digestive passages (oral cavity, pharynx and esophagus).

The larynx is primarily a sphincter which protects the airways by closing the trachea, thus preventing water and food from reaching the lungs. The larynx is also a valve which controls the output and pressure of air in the lower respiratory passages. It is this faculty of varying the opening of the airways and of controlling the acoustic energy that is exploited for phonation. The larynx is composed of a cartilaginous skeleton, ligaments, muscles and a mucous membrane.
2.1.1. External configuration of the larynx

2.1.1.1. Anterior side

At the top, the anterior part of the epiglottis overhangs the upper edge of the thyroid cartilage; further down can be found the front face of the thyroid cartilage with the insertion points for the sternothyroid and thyrohyoid muscles. The cricothyroid space is taken up with a membrane and the cricothyroid muscles. At the bottom is the cricoid arch (see Figures 2.1 and 2.2).
Figure 2.1. Anterior view of the larynx
Figure 2.2. Lateral view of the larynx
2.1.1.2. Posterior side

The upper orifice of the larynx, oval in shape, faces back and upwards, and is bounded anteriorly and superiorly by the epiglottis, at the back by the processes of the corniculate and arytenoid cartilages and laterally by the aryepiglottic folds (see Figure 2.3). At the bottom of the larynx, the passage narrows to form the glottis. On each side of it, there are two recesses: the Santorini and Morgagni tubercles. At the base, there is a recess formed from the posterior side of the arytenoid cartilages, the posterior plate of the cricoid cartilage, the crico-arytenoid and interarytenoid muscles.
Figure 2.3. Posterior view of the larynx
2.1.2. Internal configuration

The boundaries formed by the vestibular and vocal folds divide the cavity of the larynx into three parts (see Figure 2.4).

2.1.2.1. The upper part

Also known as the vestibule, the upper part of the larynx is shaped like an upturned barrel. The epiglottis and the epiglottal ligaments form the front wall. The
vestibule is bounded laterally by the internal part of the aryteno-epiglottal folds at the top and by the upper inner part of the vestibular folds at the bottom, and posteriorly by the mucosa overlying the arytenoid cartilages and the arytenoid muscles.

2.1.2.2. The middle part

2.1.2.2.1. The glottal spaces

The middle part is the space bordered by the free edges of the vocal folds; it includes the glottis and two lateral processes, the ventricles of Morgagni. The glottis corresponds to the space delineated by the edges of the lower vocal folds and the posterior part of the vocal processes of the arytenoid cartilages. Functionally, the glottis can be divided into two parts:

– The membranous or vocal glottis, which corresponds to the front two thirds of the space bounded by the free edges of the vocal folds. It contains the lower thyroarytenoid muscle, the vocalis muscle (the internal layer of the thyroarytenoid muscle) and an associated ligament, the vocal ligament.
– The cartilaginous or respiratory glottis, bounded by the vocal processes and the medial surfaces of the arytenoid cartilages.

Between the vocal folds and the vestibular folds are the ventricles of Morgagni. These constitute a fairly deep cavity. The size and shape of the glottal opening (also called rima glottidis) vary according to the state of the various different muscles concerned with moving the laryngeal cartilages and especially with the abduction, adduction and tension of the vocal folds.

The membranous glottis is the site of the principal changes effected by movements of the vocal folds. The dimensions of the cartilaginous glottis vary little, except in violent constrictions and compressions of the sphincter, as happen during coughing, for example. When the glottis is maximally open, its volume is about half the volume of the trachea. The glottis thus offers permanent resistance to airflow.
Figure 2.4. Coronal section through the larynx
2.1.2.2.2. The vocal folds

The vocal folds are twin infoldings of mucous membranes and muscular fibers stretched horizontally across the larynx. They are located below the epiglottis. They are attached at the back to the vocal processes of the arytenoid cartilages and at the front to the thyroid cartilage. The length of the vocal folds is 18-24 mm in a man and 14-19 mm in a woman.

The vocal folds have a complex structure (Hirano, 1981). They do not simply consist of a muscle and a ligament, but are rather a set of layers of muscles, ligaments and membranes, each with different mechanical and vibratory properties (see Figure 2.5). Although the mucous membrane constitutes only a small part of the vocal folds, a number of ultrafast films have shown that it plays a critical role. The mucous surface is more responsive than the internal muscular body and is responsible for the essential characteristics of the vibratory movement. The malleability of the epithelium and of the connective tissue that supports it (lamina propria) is responsible for the complex vibratory movements that happen when phonation occurs.
Figure 2.5. Layered structure of the vocal fold
The lamina propria is connective tissue that joins the epithelium to the muscles immediately beneath it. Totalling a thickness of 1.2 mm, it consists of three layers, upper, middle and lower:

– The upper layer is a loose collection of elastic and collagenous fibers and is more mobile than the lower layers. It comprises the Reinke space, which allows the mucous membrane to vibrate independently of muscular movement.
– The collagenous fibers of the middle layer constitute the essential part of the conus elasticus, the main fibrous membrane of the lower glottis.
– The lowest layer of the lamina propria constitutes the vocal ligament.

The thyroarytenoid muscle forms the muscular part of the vocal fold. It runs from the front to the back. It is attached to the thyroid cartilage and ends on the vocal process of the arytenoids. The most central fibers constitute the vocalis muscle.

2.1.2.3. The lower part

The laryngeal cavity widens towards the bottom, i.e. the location of the cricothyroid membrane and of the cricoid cartilage.
2.2. The laryngeal cartilages
The larynx comprises a skeletal structure with five principal cartilages (see Figure 2.6). Fibrous connective tissue covers the walls of the laryngeal cavity. The cartilages are connected by joints and ligaments, and operated by a set of muscles.
Figure 2.6. The larynx seen from the back and from the right side
2.2.1. The cricoid cartilage
The trachea channels air to and from the lungs. It consists of a pile of cartilaginous rings. The last ring differs from the others: it is heavier and more mobile and it is the first element of the larynx: the cricoid cartilage (from the Greek cricos, a ring). The cricoid cartilage sits at the base of the larynx. It has the shape of a signet ring with the signet projecting backwards. Two parts can be distinguished: the rear part or plate and the anterior part or cricoid arch, which slopes down from back to front. The cricoid cartilage supports the thyroid cartilage and the arytenoids (see Figure 2.7). Its upper edge presents four articulatory surfaces: two at the side for the thyroid
and two at the back for the arytenoids. The upper edge of the cricoid arch provides an anchor point for each side of the lateral cricoarytenoid muscle. The rear side of the cricoid plate anchors the posterior cricoarytenoid muscle.
2.2.2. The thyroid cartilage
The thyroid cartilage (from the Greek thyreos or shield) is shaped like a ship’s prow. It is made up of two quadrilateral blades joined dihedrally at their front edges and open at the back. The protrusion formed where the blades join is better known as the Adam’s apple. Each blade has two horns extending from its back edge. The upper horns, or thyroid horns, are longer and form anchoring points for the thyrohyoid ligament. The posterior edge of each blade articulates with the cricoid cartilage inferiorly at a joint called the cricothyroid joint.
Figure 2.7. The thyroid and the cricoid cartilages
2.2.3. The arytenoid cartilages
The arytenoid cartilages (see Figure 2.8) are two symmetric cartilaginous pieces. They are shaped like a triangular pyramid. The arytenoids rest on the upper lateral edge of the cricoid plate.
The lower part of the arytenoid is concave for articulation with the corresponding convex surface of the cricoid cartilage. Its lateral angle is the location of the muscular process. A pyramid shape projects from the base of the front part of the cartilage: this is the vocal process, to which the bottom part of the lower vocal fold is attached.
Figure 2.8. The arytenoid cartilages
2.2.4. The epiglottic cartilage
This is a small, slim, supple blade of elastic cartilage, oval in shape and enlarged at the top. The epiglottic cartilage sits in the upper front part of the larynx, behind the thyroid cartilage, to which it is attached by a ligament. It is joined to the tongue by three ligaments. During swallowing, the epiglottis folds back to cover the entrance to the larynx, preventing food and drink from entering the windpipe.
2.3. Joints and ligaments
The different cartilages are linked by joints and ligaments.
2.3.1. Intrinsic joints and ligaments
Cricothyroid joints
Cricothyroid joints connect the inferior horns of the thyroid cartilage to the articulatory facets of the cricoid ring. They allow for rotation of the thyroid cartilage around a horizontal axis which runs between the inferior horns of the thyroid cartilage. Contraction of the cricothyroid muscle rotates the thyroid cartilage forward, thus bringing the vocal folds into tension (see Figure 2.9).
Figure 2.9. Motion at the cricothyroid joint: the cricoid rotates, the arch moves up, the lamina moves back
The cricothyroid membrane
The cricothyroid membrane runs from the bottom of the arytenoid cartilage to the upper edge of the cricoid ring.
The cricoarytenoid joints
The cricoarytenoid joints connect the base of the arytenoid cartilage with the upper edge of the cricoid plate. These play a pre-eminent role in controlling the degree of opening of the glottis. The saddle shape of the articulatory surfaces affords two kinds of motion: a sliding movement and a rotating movement. The posterior and lateral cricoarytenoid ligaments constrain the sliding motion that the arytenoid cartilages can make. Contraction of the posterior cricoarytenoid muscle exerts a backward and downward force on the muscular process of the arytenoid cartilage. This rotates the arytenoids posteriorly over the top of the cricoarytenoid joint. The two arytenoid cartilages rotate outwards and away from one another. This abducts the vocal folds. By drawing the apex of each arytenoid cartilage towards the muscular process of the other arytenoid cartilage in a rotating movement, contraction of the interarytenoid muscles adducts the vocal folds (see Figure 2.10).
Figure 2.10. Gliding motion of the arytenoids
The thyroepiglottic ligament
The thyroepiglottic ligament joins the lower part of the epiglottis to the angle formed by the two laminae of the thyroid cartilage, below the superior thyroid notch.
The thyroarytenoid ligaments (or elastic membrane of the larynx)
The thyroarytenoid ligaments are two bands enclosed within the vocal folds. They join the vocal process of the arytenoid cartilage to the reflex angle of the thyroid cartilage.
2.3.2. The membranes and the extrinsic ligaments
The thyrohyoid membrane and the thyrohyoid ligaments
The thyrohyoid membrane and ligaments join the upper edge of the thyroid cartilage to the inner edge of the greater horns of the hyoid bone.
The hyoepiglottic membrane
The hyoepiglottic membrane is an elastic band which extends from the anterior side of the epiglottis to the rear upper edge of the hyoid bone.
The cricotracheal membrane
The cricotracheal membrane joins the lower edge of the cricoid cartilage to the first ring of the trachea.
2.4. The larynx muscles
There are two groups of muscles: muscles arising outside the larynx which connect it to the neighboring organs (extrinsic muscles); and muscles which are fully contained within the larynx (intrinsic muscles).
2.4.1. The intrinsic muscles
The intrinsic muscles move the laryngeal cartilages and change the glottal configuration (see Figure 2.11 and Table 2.1a, b). They help to determine the shape, position, length, tension and stiffness of the vocal folds. Three groups of intrinsic muscles are distinguished according to their principal function:
– the tensor muscles of the vocal folds: the cricothyroid muscle, the vocalis;
– the glottal dilator muscle: the posterior cricoarytenoid;
– the glottal constricting muscles: these include the lateral cricoarytenoid muscle, the lower thyroarytenoid, the upper thyroarytenoid and the interarytenoid.
All these muscles are paired, apart from the interarytenoid.
Figure 2.11. The intrinsic muscles of the larynx
2.4.1.1. The cricothyroid muscle (CT)
This muscle is paired and symmetric. It originates on the lower edge and outer surface of the cricoid arch. It is divided into two muscular groups: the pars recta and the pars obliqua. The pars recta inserts into the inner part of the inferior margin of the thyroid cartilage. The pars obliqua inserts into the anterior margin of the inferior horn of the thyroid cartilage. The CT is the most important intrinsic muscle and it is directly responsible for the operation of the cricothyroid articulation. The upper pars recta makes the cricoid cartilage swing forward towards the front of the thyroid cartilage. The pars obliqua is responsible for a forward translation movement¹. The cricothyroid muscle produces an increase in the antero-posterior (lateral) diameter of the larynx and consequently a lengthening of the vocal folds. The ligaments and muscles of the vocal folds are thus tensed, hence the name of “vocal fold tensor” given to this muscle.
Of all the intrinsic muscles, the CT is the muscle which is most directly involved with fundamental frequency (f0) regulation and variation. Continuous contraction of the CT produces an increase in f0; conversely, relaxation of the CT lowers it (Gay et al., 1972; Shipp et al., 1979; Honda, 1988). f0 is only independent of the CT in one circumstance: when f0 is already low and continues to decrease by about 15 Hz/s (Collier, 1975). In chest voice, the CT helps to control intensity (Gay et al., 1972). An increase in CT activity co-occurs with the production of accented vowels (Hirose and Gay, 1972).
2.4.1.2. The posterior cricoarytenoid muscle (PCA)
Paired and symmetric, this muscle is the biggest and most powerful of the larynx muscles. It extends from the posterior surface of the cricoid plate to the muscular process of each arytenoid. Contraction of the posterior cricoarytenoid produces the translation movement which parts the vocal processes. As a result, the lower vocal folds separate.
The PCA is the sole abductor muscle of the vocal folds (see Figure 2.12).
1 Swing and rotation can be as much as five degrees and the translation movement can reach 1 mm (Takano et al., 2003).
Figure 2.12. Action of the posterior cricoarytenoid muscle: vocal fold abduction
Hirose and Gay (1972) and Larson et al. (1987) have established that the presence or absence of voicing depends on PCA activity in separating the vocal folds from one another. This activity is slight during phonation. In voiceless consonants, PCA activity increases (Zeroual et al., 2006). This phenomenon is even more marked when the consonant is in final position. In a word-initial vowel, PCA activity decreases. The amount of activity is the same if the word begins with a voiceless consonant. Poletto et al. (2004) found that the adductor and abductor muscles act synergistically to enable fine motor control of the tension in the vocal folds: this was observed for example during production of the syllable [hi], in which the PCA, the cricothyroid and the thyroarytenoid muscles were simultaneously active, although antagonistic.
2.4.1.3. The lateral cricoarytenoid muscle (LCA)
Paired and symmetric, this muscle originates at the upper edge of the cricoid arch and inserts on the muscular process in the lateral part of the arytenoid cartilage. The lateral cricoarytenoid muscle is the smallest of the intrinsic muscles. Its contraction makes the arytenoids pivot on themselves. As a result, the vocal processes close and the length of the vibrating part of the vocal folds is reduced (Honda, 1983). This muscle is called the glottal constrictor (see Figure 2.13).
Figure 2.13. Action of the lateral cricoarytenoid muscle: vocal fold adduction
The LCA is instrumental in vowels and not in consonants. Its activity begins before the start of voicing (Hirose and Gay, 1972). In chest register, an increase in LCA activity is partly responsible for intensity control (Gay et al., 1972).
2.4.1.4. The thyroarytenoid muscle (TA)
Paired and symmetric, this muscle is slender at the top and thick at the bottom. It originates on the inner surface of the thyroid cartilage, at the reflex angle. The thyroarytenoid is a very fast muscle. It opposes the cricothyroid and its main function is to draw forward the arytenoid cartilages, thus shortening the vocal folds and decreasing their tension. Two layers have been defined in the TA:
– an external layer, divided into several bundles; one of them is attached to the epiglottic cartilage to form the thyroepiglottic muscle;
– an internal layer, the true vocal fold muscle, attached at the back to the vocal process.
The external layer of the thyroarytenoid muscles draws the epiglottis backwards and restricts the upper opening of the laryngeal inlet. The internal layer or vocalis muscle (VOC) is responsible for the consistency, stiffness and tension of the vocal folds. An increase in TA activity correlates with an increase in f0 (Gay et al., 1972). Conversely, a decrease in TA activity corresponds to a decrease in f0 (Arnold, 1961).
Titze et al. (1989) show that when f0 and the intensity level are high, TA activity can produce a decrease in f0. They relate this apparently paradoxical finding to the results found during co-activation of the TA and the CT, where shortening and lengthening produce contradictory effects for the tension of the vocal folds. Similarly, Van Riper and Irwin (1958) observed a decrease in f0 when the thyroarytenoid was active. This could explain the findings of Martin et al. (1990) that, during singing, symmetric and increasing activity in the PCA, the TA and the CT correlated with an increase in intensity in three registers from 130 to 390 Hz. The vocalis muscle is active during vowels and inactive during consonants (Hirose and Gay, 1972). This is consistent with the fact that the vocalis muscle is active before the start of the voiced [i] in Japanese, whereas for the non-voiced [i], there is no evidence of activity.
2.4.1.5. The superior thyroarytenoid muscle (TAS)
This paired muscle stretches from the upper part of the reflex angle of the thyroid cartilage to the muscular process of the arytenoid. It constricts the glottis. Little is known about its role in phonation.
2.4.1.6. The interarytenoid muscle (INT)
The interarytenoid muscle is the only unpaired and oblique muscle of the larynx. This muscle consists of a transverse layer and an oblique layer. The transverse arytenoid muscle runs horizontally across the posterior face of the cartilages. On contraction, it draws the arytenoids together by pulling them up on the shoulders of the cricoid cartilage. The oblique arytenoid muscle overlays the transverse muscle in the shape of an X. The contraction of this muscle brings the apexes of the arytenoid cartilages together. It adducts the vocal folds. During forced contraction, it can bring the false folds together for “ventricular voice”. Some fibers in the oblique set are attached to the epiglottic cartilage: these are the aryepiglottic muscles.
Their contraction draws the epiglottis back and down and helps to close off the vestibule of the larynx. Lee et al. (2001) find an increase in INT activity during voiced consonants. This is particularly clear when the vowel is preceded by a voiceless consonant, when the arytenoids are separated. For a voiced consonant, the INT is less active than for a vowel. This could be a manifestation of the fact that the glottis is slightly less closed for a voiced consonant than for a vowel.
Table 2.1. The intrinsic laryngeal muscles (a and b)

(a)
| MUSCLES | ORIGIN | COURSE | INSERTION | ACTION |
|---|---|---|---|---|
| Aryepiglottic | Apex of each arytenoid | Upwards and forwards | Side of epiglottis, at its apex | Pulls back the epiglottis: sphincter action. Lower pharyngeal articulation |
| Thyroepiglottic | Inner surface of the thyroid cartilage close to its angle | Superiorly and posteriorly | Aryepiglottic fold | Depresses the epiglottis. Closes the laryngeal inlet. Role in swallowing |
| Posterior cricoarytenoid | Depression on the posterior surface of the cricoid cartilage | Superiorly and laterally | Posterior surface of the muscular process of each arytenoid | Rotates the arytenoids on the cricoarytenoid joint. Pulls them inferiorly and medially. Abducts the vocal folds. Widens the glottis. Devoicing |
| Lateral cricoarytenoid | Upper border of the cricoid cartilage | Superiorly and posteriorly | Muscular process of the arytenoids | Rotates the arytenoids inwards and downwards. Draws the vocal processes together. Voicing and f0 raising |
| Interarytenoids: transverse | Muscular process of opposite arytenoid | Horizontal | Posterior surface and lateral border of each arytenoid | Pulls the arytenoids together. Voicing, f0 raising |
| Interarytenoids: oblique | Lower posterior surface of each arytenoid | Superiorly and obliquely | Apex and lateral side of opposite arytenoid | Adducts the vocal folds. Can bring the vestibular folds closer together, voice quality |

(b)
| MUSCLES | ORIGIN | COURSE | INSERTION | ACTION |
|---|---|---|---|---|
| Cricothyroid: (upper) pars recta | Lower border and outer surface of the arch of the cricoid cartilage | Vertically upwards | Inner part of the inferior margin of the thyroid cartilage | Rocking motion: anterior / posterior. Increases the length of the vocal folds. Increases the tension = f0 raising |
| Cricothyroid: (lower) pars obliqua | Lower border and outer surface of the arch of the cricoid cartilage | Posteriorly and superiorly | Anterior margin of the inferior horn of the thyroid cartilage | Sliding motion: toward / away from the midline |
| External thyroarytenoid | Inner surface of the thyroid cartilage at the angle | Posteriorly | Front side of the arytenoid cartilage | Pulls the arytenoid cartilages forward. Loosens the vocal ligament. Lowers pitch. Increases loudness. Sets voice quality |
| Vocalis (deep part of the thyroarytenoid) | Posterior and inferior half of the angle of the thyroid | Posteriorly | Vocal processes of the arytenoids, near the vocal ligament | Tenses the vocal fold. Voicing, f0 control |
2.4.2. The extrinsic muscles
In addition to their function of fixing the larynx to neighboring organs, the extrinsic muscles are responsible for the vertical movements of the larynx (see Figure 2.14 and Tables 2.3 and 2.4). Both the raising and the lowering of the larynx have indirect consequences for the volume and the pressure of air in the supraglottal cavities; they also induce changes in the degree of tension of the vocal folds.
Figure 2.14. The extrinsic muscles of the larynx
2.4.2.1. The raising muscles or elevators
2.4.2.1.1. Direct action
The stylopharyngeus muscle
This muscle originates at the base of the styloid process and is attached by:
– one bundle of fibers in the pharynx;
– one bundle of fibers at the epiglottis;
– one bundle of fibers on the upper horn of the thyroid cartilage;
– one bundle of fibers on the upper edge of the cricoid cartilage.
The stylopharyngeus raises the pharynx and the larynx.
The pharyngo-staphyline muscle
The pharyngo-staphyline is a muscle of the velum. One branch of it is intertwined with the stylopharyngeus and runs as far as the rear side part of the upper edge of the thyroid cartilage. This muscle restricts the pharyngo-nasal isthmus; it lowers the soft palate and it also raises the pharynx and the larynx simultaneously.
2.4.2.1.2. Indirect action
A great number of muscles originate on the hyoid bone, giving it a complex role in the interplay between the base of the tongue, the pharynx and the larynx. Other hyoid bone functions include raising and lowering the larynx, lowering the jaw as well as pulling the root of the tongue slightly backwards and down (MacNeilage and Scholes, 1964).
The hyoid bone
The hyoid bone is horseshoe-shaped and is a rather special structure: it is in fact the only bone in the human body which is not joined to any other bone. It is in some way “suspended” above the larynx, at the height of the fourth cervical vertebra. It is anchored by ligaments to the styloid processes of the temporal bone. There are three parts to the hyoid bone: the middle part or body, and two horns: the greater horn and the lesser horn (see Figure 2.15).
Vertical ridge Transverse ridge Figure 2.15. Anterior view of the hyoid bone
The body of the hyoid bone consists of a quadrilateral blade of bone. It has two sides:
– the front, from which the geniohyoid, genioglossus, mylohyoid, digastric and stylohyoid muscles originate;
– the back, from which the thyrohyoid, sternocleidohyoid and omohyoid muscles originate.
The greater horn is an extension of the body of the hyoid bone. It forms an anchorage point for the hyoglossus and the thyrohyoid muscles. The lesser horn is a small ovoid bone working in conjunction with the body and the greater horn. Originating from them are the upper lingual and lower lingual muscles and the middle pharyngeal constrictor.
As mentioned above, and because of its multiple attachments, the hyoid bone tends to move up and down. Bothorel (1980) established the incidence of vertical movements by the hyoid bone during speech and song from an acoustic and radiocinematographic study. He found that the hyoid bone is systematically higher for voiceless consonants than for voiced consonants, and that this holds for both plosives and fricatives. In vowels, he noticed a correlation between elevation of the hyoid bone and rises in f0.
Table 2.2. Interactions between the hyoid bone, the root of the tongue, the mandible and the larynx
The suprahyoid muscles
This group of muscles is attached to the bottom of the hyoid bone and to other structures above it. When the upper attachments are fixed, the contraction of the suprahyoid muscles draws the hyoid bone upwards and makes the larynx rise:
– the genioglossus muscle is attached to the tongue, the base of the skull and the hyoid bone. It pulls the tongue forward and raises the hyoid bone;
– the hyoglossus muscle is attached to the greater horn of the hyoid bone and to the sides of the tongue. It lowers the tongue while drawing the hyoid bone upwards. The action of these two muscles has been invoked to explain the intrinsic f0 differences between high and low vowels. Thus, for the [i] vowel, the f0 is slightly higher than the f0 of all other vowels in the same syllable and in the same phonetic environment. The interaction between the tongue and larynx is the basis for the “tongue-pull theory”, or the theory of lingual attraction, accounting for the association between tongue movement and variation in vocal fold tension;
– the geniohyoid and mylohyoid muscles are attached to the lower mandible. If the jaw is fixed, their contraction lifts the hyoid bone;
– the stylohyoid muscle, attached to the base of the skull, raises the hyoid bone;
– the lower pharyngeal constrictor is attached by fibers to the external side of the thyroid cartilage, by another branch to the fibrous arcade which connects the thyroid cartilage to the lower edge of the cricoid cartilage, and finally by a cricoid branch to the lower edge of the cricoid cartilage itself. The contraction of this muscle produces a front-to-back and transverse constriction in the pharynx. It also raises the larynx. The activity of these muscles results in a rise in f0 (Erikson et al., 1977; Honda, 1983; Honda et al., 1999). The digastricus (anterior and posterior belly), which is attached to the base of the skull and the mandible, raises the hyoid bone and lowers the mandible. The contraction of the digastricus enlarges the pharyngeal cavity. This enables it to maintain a sufficient difference in transglottal pressure and thus helps to sustain vocal fold vibration for a longer period of time; for example, it contributes to longer voicing during the held phase of voiced plosives.
Table 2.3. The extrinsic laryngeal muscles: the elevators
2.4.2.2. The lowering muscles or depressors
2.4.2.2.1. Direct action
The sternothyroid
This muscle runs from the sternum to the thyroid cartilage. In contracting, it fixes the attachment point of the thyrohyoid and lowers the larynx. Lowering of f0 is observed (Sawashima et al., 1973; Sawashima and Hirose, 1980).
The thyrohyoid
The thyrohyoid continues the course of the sternothyroid towards the hyoid bone. After lowering the larynx, it draws the hyoid bone down. There is no systematic effect on the f0 level (Kakita and Hiki, 1974), apart from prolonging voicing. Masaki et al. (1999), using MRI, noticed that the larynx is lower for /d/ than for /t/.
2.4.2.2.2. Indirect action
The larynx can also be lowered by a contraction of the sternohyoid muscle (SH). Simada et al. (1991) note that SH contraction can be associated with jaw opening, and with the lowering and retraction of the tongue. SH activity begins immediately after the onset of voicing (Hirose and Fujimura, 1970). For Simada et al. (1970), Hirose et al. (1970), Simada et al. (1991) and Erikson et al. (1983), SH activation corresponds to a lowering of f0. Accordingly, Atkinson (1978) sees SH activity during the held phase of voiceless consonants. Collier (1975) observes that changes in SH activity co-occur with the start of a fall in f0 in the transition from high to low f0. Halle (1994) points out that SH has a role in readjusting f0 to a medium low level at the start of the Chinese tone 2 (Medium-Rising). It is often active during the lowering of f0 in tone 4 (High-Low).
Table 2.4. The extrinsic laryngeal muscles: the depressors

| MUSCLES | ORIGIN | COURSE | INSERTION | ACTION |
|---|---|---|---|---|
| Omohyoid | Anterior: intermediate tendon. Posterior: upper border of the scapula | Anteriorly and upwards | Anterior: lower border of greater horn of the hyoid bone. Posterior: intermediate tendon | Lowers the hyoid bone and the larynx |
| Sternohyoid | Posterior surface of the manubrium and medial end of the clavicle | Vertically | Lower body of the hyoid bone | Draws the hyoid bone down. Pulls the larynx forward and downward. Lowering f0. Ingressive airstream |
| Sternothyroid | Posterior surface of the manubrium and first costal cartilage | Superiorly and slightly laterally | Oblique line on the thyroid cartilage | Lowers the larynx. May cause rotation of the cricothyroid joint. Decrease in f0 |
| Thyrohyoid | Oblique line of thyroid cartilage | Vertically | Lower border of the greater horn of the hyoid bone | Tilts the hyoid bone backwards. Depresses the hyoid bone. If hyoid bone fixed, elevates the thyroid cartilage, raising f0 |
2.5. Innervation of the larynx
Two principal nerves can be defined, each divided into several branches. The upper laryngeal nerve comes from the plexiform pneumogastric ganglion. This nerve is generally recognized as the motor nerve of the cricothyroid muscle. The lower laryngeal nerve or recurrent nerve innervates all the other intrinsic muscles of the larynx.
2.6. The mucous membrane of the larynx
The cartilages, ligaments and muscles of the larynx are covered with a mucous membrane which gives the larynx its characteristic appearance. This mucous membrane runs from the larynx to the trachea and continues through the bronchi into the lungs. Where it folds over on itself, it forms the chief part of the ventricular fold. This membrane adheres strongly to the epiglottis and to the vocal folds. Everywhere else, it is generally lax and easily stretched. The vocal mucous membrane has an important role in voice production.
2.7. Phonation
The first known experimental studies were conducted by Ferrein (1721)². To this researcher we owe the first formulation of the myoelastic theory of phonation. His work was supplemented by the numerous detailed quantitative studies of Müller (1837), another researcher to whom we owe a large part of our knowledge of the larynx. Müller worked from the excised larynxes of cadavers; by blowing and playing with weights and counterweights, he studied the effects of changes in tension and airflow on the quality of the sound. In his work, Müller noticed that sound is produced at the level of the vocal folds. He remarked that speech can be considered as a modification of a laryngeal source by the upper cavities. Müller established that laryngeal frequency is a function of the length and tension of the vocal folds and of subglottal pressure. If tension is kept constant, an increase in subglottal pressure leads to an increase in frequency. What is particularly impressive is that Müller never directly observed the vocal folds in motion.
To be able to observe them, laryngeal mirrors had to be invented, such as
2 It is from Ferrein that we have inherited the controversial term “vocal cords”, suggesting fine narrow bands like violin strings, when actually they are fairly thick musculomembranous bands that are horizontally oriented.
those designed by Garcia (1855). Experiments as numerous and rewarding as those of Müller would not be repeated until Van den Berg (1955). The very nature of the object of study and its inaccessibility have made research into the vocal folds exceedingly difficult. Our current knowledge has not been acquired by a single method: it is a synthesis of data gathered by several techniques: stroboscopy (Schutte et al., 1998; Lee et al., 2001), photography (Yanagisawa and Yanagisawa, 1991), cinematography, radiology (Kusuyama et al., 2001), fiber-optic endoscopy (Rasp et al., 2006), electromyography (Faaborg-Anderson, 1957), electro-glottography (Fourcin, 1974; Baken, 1992; Abberton and Fourcin, 1997) and more recently MRI (Honda et al., 1995, 2004; Narayanan et al., 2004; Kröger et al., 2005). The generation of the glottal wave is the result of a complex combination of muscular, elastic and aerodynamic forces.
2.7.1. Opening and closing of the glottis
The opening and closing of the glottis essentially depends on the movements of the cricoid, thyroid and arytenoid cartilages. The cricothyroid muscle draws up the arch of the cricoid cartilage and tilts back the upper edge of its lamina. The shape of the articulatory surface of the cricoid arch allows rotatory and gliding movements. The rotatory movement is one in which the cricoid cartilage rotates upon the inferior horn of the thyroid cartilage around an axis passing through both joints. The distance between the vocal processes and the angle of the thyroid is thus increased and the folds are consequently elongated. The articulation between the arytenoid cartilages and the cricoid allows two types of movement: one is a rotation of the arytenoid on a vertical axis, whereby the vocal process is moved laterally or medially (as a result the rima glottidis increases or diminishes); the other is a gliding movement, which allows the arytenoid cartilages to come closer to or to recede from each other.
The movement of the arytenoid cartilages can be directly controlled by the cricoarytenoid muscles; these ensure an inner rotation around the vertical axis of the arytenoid cartilages. The posterior cricoarytenoid rotates the arytenoid cartilages outward around a vertical axis passing through the cricoarytenoid joints, so that their vocal processes and the vocal folds attached to them become widely separated: the glottis opens. The lateral cricoarytenoids rotate the arytenoid cartilages inward, in order to approximate their vocal processes: the glottis closes.
The interarytenoid muscles crossing transversally between the arytenoid cartilages approximate them and thus close the glottis, especially at its back part.
Table 2.5. Muscular activity and main laryngeal gestures
2.7.2. Vocal fold vibration
During normal respiration, the vocal folds are separated and allow a relatively free passage of air. To achieve phonation, an appropriate balance between transglottal pressure, vocal fold thickness, longitudinal tension, degree of abduction, glottal configuration and tissue damping must be achieved.
2.7.2.1. Vibration of the vocal folds according to the myoelastic theory
As a first step, the vocal folds are brought together by contraction of the lateral cricoarytenoid and interarytenoid muscles. When the vocal folds are adducted, air coming up from the lungs meets an obstacle to its normal flow. It creates beneath the vocal folds a pressure directed at right angles to them: subglottal pressure. This steadily increases until it equals the opposing force of the vocal folds. When subglottal pressure overcomes the resistance, the vocal folds are blown apart by the puff of air, at which point subglottal pressure falls. Having effected the opening, the force ceases to exist and the vocal folds adduct again, because of their weight, their tension and their elasticity. This closing is greatly reinforced by the retro-aspiration phenomenon (following Bernoulli’s law), produced by the passage of air between the vocal folds.
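As a rough numerical illustration of the suction attributed to Bernoulli's law, the following Python sketch combines the steady, frictionless Bernoulli relation with conservation of flow through a constriction. It is a simplified sketch and not a model of the glottis: the function name, the air density of 1.2 kg/m³, the constant flow and the example areas are all assumptions chosen for illustration, and viscous losses are ignored.

```python
RHO_AIR = 1.2            # kg/m^3, assumed density of air (illustrative value)
PA_PER_CMH2O = 98.0665   # pascals in one centimetre of water


def pressure_drop_cmh2o(flow_cm3_s, area_wide_cm2, area_narrow_cm2):
    """Pressure drop (cm H2O) from a wide to a narrow section of a duct.

    Uses continuity (velocity = flow / area) and the ideal Bernoulli
    relation p + 0.5 * rho * v**2 = constant. Hypothetical helper.
    """
    flow = flow_cm3_s * 1e-6            # cm^3/s -> m^3/s
    area_wide = area_wide_cm2 * 1e-4    # cm^2 -> m^2
    area_narrow = area_narrow_cm2 * 1e-4
    v_wide = flow / area_wide           # air speed below the constriction
    v_narrow = flow / area_narrow       # air speed inside the constriction
    drop_pa = 0.5 * RHO_AIR * (v_narrow ** 2 - v_wide ** 2)
    return drop_pa / PA_PER_CMH2O


if __name__ == "__main__":
    # Assumed example: 200 cm^3/s through a 2 cm^2 duct that narrows.
    for narrow in (0.2, 0.1, 0.05):
        drop = pressure_drop_cmh2o(200.0, 2.0, narrow)
        print(f"narrow area {narrow} cm^2 -> drop {drop:.2f} cm H2O")
```

With these assumed values, halving the narrow area from 0.1 to 0.05 cm² raises the computed drop from about 2.4 to about 9.8 cm H2O, since the air speed doubles and the drop grows with its square. In a real glottis the flow does not stay constant as the folds close, so the figures only indicate the direction of the effect.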
The Bernoulli effect, which literally sucks the folds together medially, is one of the manifestations of the principle of conservation of energy. In passing through the glottis, the speed of airflow increases and the air pressure between the vocal folds decreases. The latter can fall below atmospheric pressure. This negative pressure sucks the two folds towards the center. The more the glottal space narrows, the more the speed increases and the more marked this effect is. At the same time, there is a limit caused by friction which produces a resistance. Once the vocal folds have returned to the center, they again exercise a force opposing the passage of air. Subglottal pressure rises again. The cycle previously described repeats itself. These rhythmical openings and closures are responsible for the generation of glottal waves. Laryngeal frequency equals the number of opening and closing cycles per second.
To generate voice, a subglottal pressure of 2-3 cm H2O and a transglottal output of air of 50 cm3 are required. In normal speech, these values vary in the range of 10-15 cm H2O and 100-300 cm3. Fundamental frequency essentially depends on subglottal pressure or more exactly on the difference of pressure below and above the glottis; on the output of air going through the glottis; and on the tension in the vocal folds. Tension in the vocal folds is the product of several parameters: the length of the vocal folds, the length of the vibrating mass, the consistency of the membranous layer and the tension of the vocalis muscle itself. These parameters are principally controlled by the intrinsic muscles of the larynx and to some extent by the vertical movements of the larynx and the position of the hyoid bone and the tongue root.
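The relation between glottal narrowing, air speed and pressure described above can be illustrated numerically. The following is a minimal sketch only: it assumes steady, frictionless flow through a single opening, an air density of 1.2 kg/m3 and hypothetical glottal areas, none of which are given in the text. The function names are likewise illustrative.

```python
# Illustrative sketch of the Bernoulli relation at the glottis.
# Assumptions (not from the text): steady frictionless flow, air density
# of 1.2 kg/m^3, and hypothetical glottal areas chosen for the example.
AIR_DENSITY = 1.2          # kg/m^3 (assumed)
PA_PER_CM_H2O = 98.0665    # pascals in 1 cm H2O


def glottal_velocity(flow_cm3_s: float, area_cm2: float) -> float:
    """Mean air speed in m/s through an opening: v = U / A."""
    flow_m3_s = flow_cm3_s * 1e-6
    area_m2 = area_cm2 * 1e-4
    return flow_m3_s / area_m2


def bernoulli_pressure_drop(flow_cm3_s: float, area_cm2: float) -> float:
    """Pressure drop in cm H2O relative to slow-moving air: 0.5 * rho * v^2."""
    v = glottal_velocity(flow_cm3_s, area_cm2)
    return 0.5 * AIR_DENSITY * v ** 2 / PA_PER_CM_H2O


if __name__ == "__main__":
    # Same airflow through a progressively narrower (hypothetical) glottis.
    for area in (0.2, 0.1, 0.05):
        v = glottal_velocity(200.0, area)
        dp = bernoulli_pressure_drop(200.0, area)
        print(f"area {area:.2f} cm2: speed {v:.1f} m/s, drop {dp:.2f} cm H2O")
```

For a fixed airflow, halving the opening doubles the speed and quadruples the pressure drop, which is the sense in which the effect becomes more marked as the glottal space narrows. Friction, which the text notes sets a limit to the effect, is deliberately left out of this sketch.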
The body-cover theory has been proposed to take account of the movements produced at the level of the mucous membrane of the vocal fold and of their interactions with the vocal ligament and the bottom layer of the vocal fold (Titze, 1994).
2.7.2.2. Elements of the body-cover theory
Current understanding of the complexities in both tissue composition and vibratory mucosal waveforms has generated more advanced theories about vocal fold vibration. Hirano's body-cover theory (1981) is the first recognition of the important role of the passive (non-muscular) superficial layers (epithelium and lamina propria) to vocal fold vibration. Each layer contributes a graduated change in mass and compliance for vibration. Very high-speed films of vocal fold movement show that during glottal closure, complete adduction is reached sooner in the upper part of the vocal fold. Conversely, when the upper parts of the vocal folds separate, the lower lips begin to close again (see Figure 2.16). The difference is due to the
difference in compliance of the different tissues. The epithelial layer is most elastic, whereas the muscle tissue is stiff. The lamina propria serves as a coupling between the superficial mucosa (compliant, fluid oscillation) and the deep muscle tissue, providing the underlying stability of vocal fold mass and tonus. The vibration of the vocal folds permanently changes the relationship between aerodynamic forces. The lack of synchrony between opening and closing over the whole length of the vocal folds means that Bernoulli’s law has a different effect on the upper and lower sections of the vocal fold. This effect is accompanied by a difference in speed of movement in the two sections of the vocal folds. There is thus a vertical phase difference.
Figure 2.16. Schematic representation of the vibration of the vocal folds seen from the front and from above (from Hirano, 1981)
Because of the relative independence of the mucous membrane and the vocalis muscle, and because of the greater flexibility of the former, the passage of air will lift it. It will wave as it glides over the ligament below it. This movement is very complex because of the visco-elastic properties of the mucous membrane. This undulatory movement introduces a horizontal phase difference. The horizontal and vertical elements give rise to an elliptical vibratory movement. If we examine the movements of particular points of the vocal fold, we can see the great variety of their trajectories. These movements cannot however be considered to be totally random: the regularity they obey is long term (see Figure 2.17). Over a long period, they tend to a periodicity governed by an “attractor”. Such a system is known as chaotic. Non-linear dynamics brings out the underlying regularity. The “chaos” is characterized by the degree of deviations or bifurcations of trajectory according to an n-dimensional deterministic system (Titze, 1993). Lyapunov exponents, expressing the fractal dimension of a signal, allow a chaotic system to be quantified. Giovanni et al. (1999) examine their usefulness in characterizing glottal vibration and distinguishing normal voice from pathological voice.
Figure 2.17. Trajectories in a phase plane of a point on the vocal fold from the initiation of phonation and after a few cycles (from Titze, 1993)
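The idea of a trajectory in a phase plane settling onto an attractor a few cycles after onset can be made concrete with a toy system. The sketch below uses the van der Pol oscillator, a textbook self-sustained oscillator, purely as a stand-in: it is not a model of the vocal folds and is not taken from Titze (1993). Its attractor is a simple limit cycle, so it illustrates convergence to a regular orbit, not chaotic behaviour. The parameter values and function names are assumptions made for the example.

```python
# Toy phase-plane trajectory: the van der Pol oscillator
#   x'' - mu * (1 - x^2) * x' + x = 0
# used only as an illustrative self-sustained oscillator (an assumption;
# it is not a vocal fold model and it has a limit cycle, not chaos).

def van_der_pol_step(x, v, mu, dt):
    """Advance position x and velocity v by one Runge-Kutta (RK4) step."""
    def f(x, v):
        return v, mu * (1.0 - x * x) * v - x

    k1x, k1v = f(x, v)
    k2x, k2v = f(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = f(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = f(x + dt * k3x, v + dt * k3v)
    x_next = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    v_next = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x_next, v_next


def phase_trajectory(x0=0.01, v0=0.0, mu=1.0, dt=0.01, steps=6000):
    """Return the list of (position, velocity) points of the trajectory."""
    points = [(x0, v0)]
    x, v = x0, v0
    for _ in range(steps):
        x, v = van_der_pol_step(x, v, mu, dt)
        points.append((x, v))
    return points


trajectory = phase_trajectory()
# Amplitude just after onset versus amplitude once the orbit has settled.
early_amplitude = max(abs(x) for x, _ in trajectory[:500])
late_amplitude = max(abs(x) for x, _ in trajectory[-1000:])

if __name__ == "__main__":
    print(f"early amplitude: {early_amplitude:.3f}")
    print(f"late amplitude:  {late_amplitude:.3f}")
```

Plotting velocity against position for these points gives a spiral that grows from the starting point and then winds onto a closed curve, which is the general shape of a trajectory captured by an attractor.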
2.7.3. Voice registers
The length, tension and thickness of the vocal folds can be adjusted to control laryngeal frequency, intensity and vocal quality (see Table 2.6). Most often, two voice registers are distinguished: chest voice, corresponding to the frequencies of the normal range of the speaking voice, and falsetto for high frequencies (Roubaud et al., 1987, 1997; Sundberg, 1995).
Transglottal pressure, rate of flow
  Expiratory muscles; interarytenoid
  Increase in subglottal pressure and transglottal airflow
  Fo raising; intensity raising
Medial compression
  Lateral cricoarytenoid; transverse and oblique interarytenoids
  Compression by arytenoid cartilages
  Fo raising; change of register
Height of the larynx
  Extrinsic laryngeal muscles: elevators and depressors
  Up = vocal fold tension increase, intra-oral air pressure up
  Down = vocal fold tension decrease, intra-oral air pressure down
  Fo changes quite dependent upon airflow and intrinsic adjustment; micromelodic component
Table 2.6. Control parameters of the states of the glottis
2.7.3.1. Chest voice
2.7.3.1.1. Vocal folds and vibratory movement
Chest voice is characterized by the great amplitude of the vibratory movements of the vocal folds. The distance between the folds can reach about 3 mm. The vocal folds are thick and have the appearance of lips. The vocal folds vibrate over their whole length, from the inner surface of the thyroid cartilage to the arytenoid vocal process. This produces a great vibrating mass with weak effective tension. The duration of closure increases when the frequency is lowered. The closure phase lasts longer than the opening phase.
2.7.3.1.2. Fundamental frequency variation
The passive longitudinal tension of the vocal ligaments is weak. The subglottal pressure is low. An increase in the active tension of the vocalis muscle causes the frequency to rise as long as the passive pressure does not exceed a relatively low threshold. If the subglottal pressure is too weak, vibration ceases; if it is strong enough, it produces a change of register and goes into falsetto voice. Loss of energy tends to deaden vocal fold vibrations; this is largely due to the adduction of the vocal folds itself, to the resistance of muscular tissue and to aerodynamic resistance.
2.7.3.2. Falsetto
Falsetto register includes high frequencies. In terms of laryngeal functioning, there is a great difference between chest register and falsetto register. In falsetto register, the main role is played by laryngeal adjustment realized by the activity of the muscles of the larynx. This allows the effectively vibrating mass to be adjusted by increasing the static properties of the tissues and by diminishing the importance of the aerodynamic forces. The muscles of the larynx have the effect of lengthening the vocal folds. It is also important not to underestimate Bernoulli’s effect, for the length of the constricted passage with friction is greater.
2.7.3.2.1. The vocal folds and their vibration
The vocal folds are elongated and very slender in falsetto register. Passive longitudinal tension is very great. Subglottal pressure is raised and can reach 50 cm H2O. The amplitude of the vibrations of the vocal folds is reduced to about 1 mm. If the vestibular bands are sufficiently close, they may vibrate. The period of closure is very short and the vocal folds never adduct completely. The glottis is permanently open. The consumption of air is great. When the vibration of a vocal fold is deadened, a phase difference may occur between the vibrations of the two vocal folds.
When the difference is great, the vocal folds may even vibrate at different frequencies. This possibility, diplophonia, is sometimes exploited by singers: it is a notable characteristic of Eskimo singing. An increase in the passive longitudinal tension of the vocal ligament produces a rise in frequency. The contraction of the vocalis muscle lowers the frequency and results in chest voice if the passive tension drops below a certain threshold.
2.7.4. Head voice?
There is a zone of frequencies where chest voice and falsetto voice overlap. This observation gives rise to a controversy as to whether there is a need for a third
register: head voice. The question is how it can be characterized. Chest voice and falsetto voice can be defined objectively. They correspond to two very specific modes of functioning of the vocal generator:
– in chest voice, when subglottal pressure is weak and unchanged, an increase in the active tension (vocalis muscle) raises the f0, whereas the opposite result is obtained in falsetto voice when the strong subglottal pressure remains unchanged;
– this differential behavior of the larynx is also reflected in the control of intensity variation. Isshiki (1964, 1965) observed that for medium and low laryngeal frequencies (chest voice), increases in subglottal pressure and in intensity were strongly correlated. When air output is constant, an increase in subglottal pressure produces an increase in aerodynamic forces. The opening of the larynx diminishes. Intensity and f0 both increase;
– in falsetto voice, the open quotient may diminish as a result of the increase in subglottal pressure; but it may also be adjusted by the muscles of the larynx. Van den Berg (1958) and Mead and Bunn (1974) found that intensity may vary and increase even if subglottal pressure remains constant or decreases. Fink (1962, 1975) found stronger activity in the lower thyroarytenoid corresponding to an increase in intensity.
Our research on subglottal pressure and intensity (Marchal and Carton, 1980) also demonstrates the effect on intensity of laryngeal adjustment. These variations are particularly noticeable in the transition from chest voice to falsetto. In this respect, it seems more difficult to account for the behavior of the vocal folds in the hypothetical head voice, because they seem to function sometimes as for chest voice and sometimes as for falsetto voice. The need to have a head voice category in order to classify the frequencies and functioning of the larynx does not seem justified from a functional point of view.
2.7.5. Efficiency of the vocal generator
The efficiency of a system is defined as the ratio of its energy output to its energy input (Titze, 1989). Energy input to the vocal generator is roughly equivalent to subglottal pressure. The energy output mainly consists of the acoustic energy radiating from the lips, fairly well represented by the intensity of the vowels. It should however be noted that the vocal tract absorbs an appreciable quantity of energy. Moreover, measuring subglottal pressure is currently not the easiest of tasks. Isshiki (1965) suggests that the efficiency of the vocal generator is best measured by the ratio of effective glottal flow to mean output of air.
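Read as output over input, the efficiency ratio can be sketched in a few lines. The example assumes that the input is expressed as aerodynamic power, i.e. subglottal pressure multiplied by mean airflow; this goes one step beyond the text, which speaks only of subglottal pressure. The acoustic power used below is a purely hypothetical figure, and the function names are illustrative.

```python
# Sketch of an output/input efficiency ratio for the vocal generator.
# Assumption (not from the text): input power is taken as
# subglottal pressure x mean airflow. The acoustic power is hypothetical.
PA_PER_CM_H2O = 98.0665  # pascals in 1 cm H2O


def aerodynamic_power_watts(pressure_cm_h2o: float, flow_cm3_s: float) -> float:
    """Input power in watts: pressure (Pa) times volume flow (m^3/s)."""
    return pressure_cm_h2o * PA_PER_CM_H2O * flow_cm3_s * 1e-6


def vocal_efficiency(acoustic_power_watts: float,
                     pressure_cm_h2o: float,
                     flow_cm3_s: float) -> float:
    """Ratio of radiated acoustic power to aerodynamic input power."""
    return acoustic_power_watts / aerodynamic_power_watts(pressure_cm_h2o, flow_cm3_s)


if __name__ == "__main__":
    # 10 cm H2O and 200 cm3/s lie within the ranges quoted for normal speech;
    # 1e-4 W of radiated power is an invented value for illustration.
    print(aerodynamic_power_watts(10.0, 200.0))
    print(vocal_efficiency(1e-4, 10.0, 200.0))
```

The point of the sketch is only that the ratio is dimensionless and depends on how input and output are measured, which is why the choice of measure discussed above matters.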
2.7.6. The evaluation of phonation: voice quality
Voice quality depends mostly on two types of adjustment: the state of the glottis and the supra-glottal configuration of the vocal tract. Fine variations in the laryngeal source can provide information on the physical state of the speaker (degree of fatigue, sex, age, vocal condition; Gauffin and Hammarberg, 1991) as well as on the speaker’s mood and attitude (Ni Chasaide and Gobl, 2007). There are two perceptual indices (GRBAS and RBH) used to evaluate phonation in addition to the acoustic indices “jitter” and “shimmer”:
– the perceptual judgement GRBAS gives a rating of 0-3 on five parameters: G for the degree of hoarseness, R for roughness, B for breathiness, A for asthenicity and S for strain. In Europe, a scale of three parameters is more frequently used: RBH for roughness, breathiness and hoarseness;
– “jitter” measures the stability of the vibratory period by examining its variability from cycle to cycle (Schoentgen, 2001). This measure shows deviations from a norm and is often used to describe various voice pathologies, such as noisy or raucous voice;
– “shimmer” measures the variability of amplitude over cycles and gives an indication of glottal flow.
2.8. The linguistic functions of laryngeal activity
The larynx is the first place where the airflow from the lungs can undergo important modifications. Airflow passage can be relatively free, completely blocked or restricted in varying degrees (see Figure 2.18). We have seen how fine and how complex laryngeal adjustment can be. Every language makes great use of the contrasts deriving from modifications to airflow in the larynx. Catford (1977) describes 10 glottal states that can be linguistically significant. For Ladefoged (1971), seven states of the vocal folds and larynx are enough to account for the linguistic contrasts found in the languages of the world.
2.8.1. Glottal states and phonation types
Following Ladefoged (1971), we identify seven phonation types:
– modal voice;
– voicelessness;
– breathy voice;
– murmur;
– laryngealization;
– glottal occlusion;
– whisper.
2.8.1.1. Modal voice
The vocal folds close along their whole length. Adductive tension and medial compression are moderate. An increase in the rate of airflow from the lungs produces suction (the Bernoulli phenomenon) which completes the closure of the vocal folds. There follows an increase in subglottal pressure: when this becomes greater than the forces of adduction, it forces its way through, pushing the vocal folds to the sides. The opening and closing cycle of the glottis is thus repeated in a semi-periodic fashion, as long as tension, airflow and air pressure are sufficient to maintain the vibration of the vocal folds. In modal voice, the necessary minimum subglottal pressure is 2-3 cm H2O. In normal speech, it is 10-15 cm H2O. The transglottal passage of air varies between 50 and 350 cc/s. As for airflow rate above the glottis, the jet of air can reach 2,000-5,000 cc/s at its opening. In this phonatory mode, f0 is essentially controlled by subglottal pressure and the activity of the vocalis, the thyroarytenoid and cricothyroid muscles. Voicing occurs in every language in the world. It is a feature of the production of vowels and voiced consonants.
2.8.1.2. Voicelessness
Voicelessness is a phonation type as frequently used as modal voice. It is characterized by abduction of the vocal folds and absence of vibrations. The glottis is largely open, although less so than during normal respiration. Airflow is laminar; its rate is of the order of 200-300 cc/s. The vocal folds do not vibrate, nor is there any generation of sound. This phonation type allows voiceless consonants to be contrasted with voiced consonants, as in the minimal pairs p/b, t/d, s/z, f/v, etc.
2.8.1.3. Breathy voice
Breathy voice is distinguished from voicelessness by the degree of glottal width, which is greater in voicelessness.
The vocal folds are closed along their whole length, but they vibrate without appreciable contact. Adductive tension is minimal. Medial compression is low. The constriction of the glottis prevents laminar airflow. The rate of airflow is much higher than in modal voice (900-1,000 cc/s) and turbulent flow appears. This process is accompanied by a hiss of air. Describing this
as “aspiration”, as is often the case in phonology, is somewhat inaccurate as it is really a current of egressive air. Before and after the release of a plosive or fricative followed by a vowel, it sometimes happens that the vocal folds do not begin to vibrate immediately and that a hiss can be heard. This phenomenon, improperly called aspiration, is fairly widespread as a linguistic distinction: notably, it distinguishes /pàa/ “forest” from /pʰaa/ “to separate” in Thai. In an aspirated segment, the vocal folds are separated during release: they cannot vibrate. When a non-aspirated voiceless consonant is followed by a vowel, the vocal folds vibrate immediately after the release of the consonant. In a non-aspirated voiced consonant, the vocal folds vibrate during the held phase and continue to vibrate after articulatory release. Aspiration is incompatible with voicing. Certain phonetic descriptions postulate a series of four-way contrasts which include an aspirated voiced consonant. This is true of Hindi, for example. Pandit (1957) prefers to class this type of phoneme as a murmur. The vibratory pattern of the vocal folds is specific to this type of sound and is in no way comparable to the vibratory cycle of vocal folds habitually found in voiced consonants.
2.8.1.4. Murmur
The arytenoids are parted and only the vocal fold ligament can vibrate. Adductive tension is weak. Medial compression varies from moderate to strong. Transglottal airflow rises to 300-400 cc/s. The sound is described perceptually as “breathy and whispered”. To designate such sounds “voiced aspirates” is doubly false: to do so is to use neither the term “aspirate” nor the term “voiced” in the sense in which it is normally used for other phonemes. This class of sounds is found in Bantu languages. It is also found in many Indian languages; thus, in Gujarati, [bar] means “12”, whereas [b̤ar] means “burden”.
2.8.1.5. Laryngealization
The arytenoid cartilages are firmly adducted.
Adductive tension is very strong, as is medial compression. Longitudinal tension is weak. The arytenoids are turned inwards, leaving only a small part of the vocal ligament able to vibrate anteriorly. Airflow is weak and flow rate reaches around 12-20 cc/s. It produces a series of small semi-regular puffs of air. The resulting sound is low in frequency and poor in harmonics; it is a raucous sound also known and defined as “creaky voice” or “vocal fry”.
In Margi, /bábá/ means “placed” and /bàbà/ means “hard”. Lango also has examples of a laryngealized vowel which forms a phonological contrast with a simple vowel: /lee/ (“animal”) and /lḛḛ/ (“chop”).
2.8.1.6. Glottal closure
The arytenoids are firmly pressed together. The vocal folds are completely closed for their whole length in order to produce a plosive articulation. In Tagalog, this state of the vocal folds has a phonological function in that /kazo:n/ (meaning “straw”) contrasts with /kaʔo:n/ (meaning “box”). In French, the glottal stop only has a stylistic effect. It is usually found preceding initial vowels, for example in commands: [ʔavozarm] (“à vos armes!”); [ʔalt] (“halte!”). A glottal stop can even replace a voiceless consonant. This is very common in the Cockney variety of English where it is an allophone of /p/, /t/ and /k/ in word-final position.
2.8.1.7. Whisper
The vocal folds are closed, while in contrast the arytenoids are separated and the intercartilaginous glottis is open. Transglottal airflow is weak: 25-30 cc/s. Whispering can have no phonological role except by contrast with voicelessness. This seems to be the case for the final position in Wolof.
Figure 2.18. Main glottal states: 1) Wide opening, laminar airflow (200-300 cm3): voicelessness. 2) Adduction/abduction of the vocal folds, airflow rate from 50 to 350 cm3, subglottal pressure from 5 to 30 cm H2O: modal voice. 3) Complete closure of the glottis: glottal stop. 4) Arytenoids apart, vibration of the ligamental part, airflow from 300 to 400 cm3: murmur. 5) Arytenoids close together, slight opening of the ligamental part, small periodic burst of air (12-20 cm3), low frequency (40-60 Hz): creak. 6) Constriction, turbulent airflow (25-30 cm3): voiceless glottal fricatives: whisper. 7) Narrow opening, vocal fold vibration and noise generation, very important airflow (900-1,000 cm3): breathy voice
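The seven glottal states and their airflow values can be gathered into a small lookup structure. The sketch below simply transcribes the ranges given in the caption of Figure 2.18; the class and function names are illustrative, and the glottal stop is given no range because the caption states none.

```python
# Lookup table of the glottal states listed in the caption of Figure 2.18.
# Airflow ranges are copied from the caption (in cm3, as printed there);
# the names GlottalState and states_compatible_with are illustrative.
from typing import List, NamedTuple, Optional, Tuple


class GlottalState(NamedTuple):
    name: str
    configuration: str
    airflow_cm3: Optional[Tuple[float, float]]  # None when the caption gives no value


GLOTTAL_STATES = [
    GlottalState("voicelessness", "wide opening, laminar airflow", (200, 300)),
    GlottalState("modal voice", "adduction/abduction of the vocal folds", (50, 350)),
    GlottalState("glottal stop", "complete closure of the glottis", None),
    GlottalState("murmur", "arytenoids apart, ligamental part vibrating", (300, 400)),
    GlottalState("creak", "arytenoids close together, slight ligamental opening", (12, 20)),
    GlottalState("whisper", "constriction, turbulent airflow", (25, 30)),
    GlottalState("breathy voice", "narrow opening, vibration and noise", (900, 1000)),
]


def states_compatible_with(airflow_cm3: float) -> List[str]:
    """Names of the states whose quoted airflow range contains the given value."""
    return [
        state.name
        for state in GLOTTAL_STATES
        if state.airflow_cm3 is not None
        and state.airflow_cm3[0] <= airflow_cm3 <= state.airflow_cm3[1]
    ]


if __name__ == "__main__":
    for flow in (15, 250, 320, 950):
        print(flow, states_compatible_with(flow))
```

Because several of the quoted ranges overlap, a single airflow value may be compatible with more than one state. Airflow alone therefore does not identify a phonation type; the configuration of the vocal folds and arytenoids has to be taken into account as well.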
2.8.2. Tone and intonation
The possibility of varying the frequency of opening and closing the vocal folds is exploited in languages for the purpose of organizing discourse. Intonation allows several elements to be organized into a hierarchy. For this purpose, prosody interacts with phonology, morphology and syntax, and is parallel to the semantic structure of the spoken phrase. Fundamental frequency variation can also apply to segmental elements only and has a role in distinguishing between pairs of words according to the level or the direction of evolution of laryngeal frequency (see Chapter 6).
2.8.3. Glottal articulation
When considering states of the glottis, we indicated that glottal closure along the whole length of the vocal folds was possible. Esling (2006) reports the existence of epiglottic plosives and fricatives in Agul, a Caucasian language. These consonants have sometimes been described as pharyngeals; this is tenable for the fricative, but it is physiologically impossible to achieve closure at the level of the epiglottis and the pharyngeal wall. It is therefore a matter of aryepiglottic consonants in which the epiglottis is a passive articulator, with the false vocal folds as the active articulator.
2.9. Phonetic features
There are no languages which simultaneously make use of more than three glottal states in the same system. Most languages have only two such contrasts: voiced/voiceless and aspirated/non-aspirated. For a phonological description, it is therefore unnecessary to distinguish seven features. One phonetic property is common to all the states we have described: they all rely on the vocal folds being more or less approximated, i.e. on glottal constriction. Chomsky and Halle (1968), following the lead of Lieberman and Ladefoged, also retain the feature of heightened subglottal pressure.
These two features should allow the glottal source of the phonemes of languages to be defined linguistically on an objective, graded phonetic basis. Lisker and Abramson (1971) criticize this view of matters and draw the attention of phoneticians and phonologists to voice onset time (VOT), i.e. the moment at which the vocal folds begin to vibrate relative to articulatory release. Their study of 14 languages (1964) brings them to conclude that the contrasts of voicing, aspiration and force of articulation could be sensibly replaced by a contrast based on the temporal control of glottal opening and
closure. In English, for example, /p/ and /b/ are usually devoiced and only distinguished by VOT. For /b/, the vocal folds begin to vibrate immediately after release, whereas for /p/ there is a certain amount of delay before voicing occurs. The problem with this description becomes apparent with languages that employ three-way contrasts as in Korean, for example (Kim, 1970). How can VOT be distinguished from “aspiration”? The study of Burushaski (Marchal et al., 1977) suggests that the notion of VOT has no validity in this language. Finally, what about languages where there are four-way contrasts, as in the case of Hindi? The only solution would be to reintroduce the notion of graded opposition/contrast, on condition that it has a strong explanatory power. It is necessary to distinguish between the physical continuum, systematic phonological typology and the need for explanatory power in phonetic features. Where VOT is concerned, it appears that to consider temporal control that is impossible to demonstrate is to be far removed from physiological events. VOT does not make it clear what types of constraint are at work. It cannot account for the phenomena of combinatory phonetics that are understandable in terms of the features of subglottal pressure and glottal constriction. This is not to say that such delay does not exist or that there is no specific timing at which vocal fold vibration begins, but that other identifiable phenomena may be responsible for the observed delay. We think that excessive simplification is a mistake, and that phonetic description does not need to be minimally redundant. It is too easy to succumb to the temptation of hasty generalizations which, when found wanting for one language or another, naturally lead to ad hoc descriptive procedures.
In the same vein, the tone switching encountered for vowels following certain consonants in a number of languages (Hombert et al., 1979) results from a particular state of the vocal folds, subglottal pressure, configuration of the vocal tract and the sequence of sub- and supra-glottal events. This is also true for the presence of aspiration at consonantal release (Kim et al., 2005), which does not originate in itself from a disparity in the times at which the vocal folds begin to vibrate. It is in order to avoid this kind of pitfall that we believe it is essential to conduct research into the physiology of phonation, which lends itself to linguistic formalization. Close attention to the groups of muscles involved can shed light on the consequences of certain actions on other organs or articulators, whereas an atomistic approach neglects the dynamic aspect. In fact, it must be borne in mind that man speaks by means of organs which have certain limits. One way of envisaging how this is done is to situate the process in a physiological theory of speech production and to invoke biomechanical and aerodynamic models.
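For readers unfamiliar with the measure criticized in this section, voice onset time can be sketched as a simple difference between two time points. The example below is illustrative only: the three category labels follow the usual lead/short-lag/long-lag terminology, but the 30 ms boundary and the function names are assumptions made for the example and are not values given in the text.

```python
# Sketch of voice onset time (VOT) as the interval between articulatory
# release and the onset of vocal fold vibration. The 30 ms boundary between
# short and long lag is a hypothetical value chosen for illustration.

def voice_onset_time_ms(release_ms: float, voicing_onset_ms: float) -> float:
    """VOT in ms: negative when voicing starts before the release."""
    return voicing_onset_ms - release_ms


def vot_category(vot_ms: float, short_lag_limit_ms: float = 30.0) -> str:
    """Three-way classification by timing alone (boundary is an assumption)."""
    if vot_ms < 0:
        return "voicing lead"
    if vot_ms <= short_lag_limit_ms:
        return "short lag"
    return "long lag"


if __name__ == "__main__":
    # Hypothetical measurements: (release time, voicing onset time) in ms.
    for release, onset in ((100.0, 40.0), (100.0, 110.0), (100.0, 170.0)):
        vot = voice_onset_time_ms(release, onset)
        print(f"VOT {vot:+.0f} ms: {vot_category(vot)}")
```

A classification of this kind yields at most three categories along a single time axis and says nothing about glottal constriction or subglottal pressure, which is one way of seeing why four-way systems such as that of Hindi are problematic for a description based on timing alone.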
Chapter 3
Articulation: Pharynx and Mouth
In the previous chapters, we saw how the current of air generated by the respiratory system could be transformed at the level of the larynx by various adjustments of the vocal folds. It is at the level of the supralaryngeal cavities that the articulatory organs give speech sounds their characteristics and definitive properties. All the organs that have a role in this final stage of speech production have a primarily biological function: they enable the absorption, mastication and transport of food. Their utilization for speech is a secondary function. The property common to all these organs is their great mobility and thus the possibilities they offer for rapid modification of the vocal tract: changing the size of the oral cavity and the pharyngeal cavity, connecting with the nasal cavity, adding a labial resonance. The articulators act either by allowing air to pass relatively freely (vowel articulation) or, through a mesh of constrictions, creating phenomena of turbulence and frictional sounds (consonantal articulation). Finally, consonantal articulation can also be created when the articulators completely block the passage of air, as for plosives. The bucco-pharyngeal cavity lies at the top of the larynx and extends to the lips (see Figure 3.1). It is bounded on top by the palatal bone, at the back by the pharyngeal wall and at the bottom by the jaw and the hyoid bone. We will examine in turn the articulatory roles of the different structures and organs of the oral cavity and the pharynx.
Figure 3.1. Principal anatomical landmarks of the vocal tract
3.1. The oral cavity
The palatal region forms the upper boundary of the oral cavity and separates it from the nasal cavities; it consists of the hard palate in front and the velum behind. The boundary at the front is the upper set of teeth. The palatal region is a concave vault, usually divided into three main zones: the alveolar ridge, the hard palate and the soft palate (see Figure 3.2). These zones can in turn be divided into three parts (from front to back: pre-, medio- and post-). There are no precise anatomical boundaries corresponding to the traditional phonetic divisions.
Figure 3.2. Upper oral articulatory locations
The alveolar ridge includes the furrowed part lying immediately behind the incisors. The hard palate extends back from the alveolar ridge and covers the palatal bone: this is the central part. The back part of the palate, the soft palate or velum, extends into the uvula. When the velum is lowered, air can escape through the nose, giving rise to nasal resonance. When the velum is raised and makes firm contact with the pharyngeal wall, the rhinopharyngeal passage is closed: this produces sounds that are purely oral in nature. Two-thirds of the oral cavity is taken up with the tongue.
3.1.1. The tongue
The tongue is a muscular organ which takes up the central part of the floor of the oral cavity. Ovoid in shape, the tongue is flattened at the front and wider and thicker at the back. The tongue has an osteofibrous structure. Its skeleton includes the hyoid bone and two fibrous membranes: the lingual septum, which divides the tongue into two halves, and the hyoglossian membrane. The anterior part of the tongue comprises the apex and the blade. The apex corresponds to the foremost front part, facing the upper teeth vertically. The blade extends on the upper lingual surface behind the apex and extends 15-20 mm to the back.
The dorsum of the tongue can be divided into two parts: the root and the body. The tongue root constitutes the base and extends into the pharynx as far as the epiglottis. The body can itself be divided in two: the pharyngeal section and the oral section. The pharyngeal section is practically vertical and faces the pharyngeal wall. Spurs jut out from it: the lingual tonsils. The oral section starts at the apex, i.e. the most forward and most mobile part, and continues to the terminal sulcus. The oral section of the tongue is covered by a thick mucous membrane which adheres to the underlying muscle network. The central part of the dorsum of the tongue is more rounded; it is a little less mobile than the apex. Although there are no precise boundaries, it is customary to divide this lingual region into three zones from front to back as follows: the pre-dorsal, medio-dorsal and post-dorsal zones (see Figure 3.3). Its surface is uneven because it is entirely covered with tiny protrusions: the papillae, seat of the sensation of taste.
Figure 3.3. Subdivisions of the tongue
The complex network of entangled muscles which either run through or are attached to the tongue allow it to adopt a wide variety of shapes and to move rapidly inside the oral cavity.
The tongue is important for mastication, deglutition and for articulating speech sounds. This last function defines our angle on presenting lingual activity. In fact, the tongue defines for all practical purposes the shape of the oral vocal tract during the production of phonemes. The shape of the oral cavity and the position of the tongue form the two most useful parameters for describing the articulation of speech sounds in phonetics.
3.1.1.1. The tongue muscles
There are 19 tongue muscles (nine paired, one unpaired). They emerge from the lower jaw, the hyoid bone, the hard palate, the styloid process and the lingual septum, and are attached to the mucous membrane under the surface of the tongue. The intrinsic muscles alter the shape of the tongue, while the extrinsic muscles move it within the mouth¹ (see Figure 3.4 and Tables 3.1 and 3.2).
3.1.1.1.1. Intrinsic muscles
The superior longitudinal muscle
The superior longitudinal is a superficial tongue muscle lying immediately above the lamina propria of the dorsum. It is attached to the small horns of the hyoid bone and by a few fibers to the epiglottal ligament and runs along the length of the tongue from the root to the tip where it is attached to the mucous membrane. It runs above the transverse muscle. At the side, its fibers join those of the styloglossus, the hyoglossus and the inferior longitudinal. The superior longitudinal shortens the tongue and raises the tip for /t/, /l/ and /n/ (Hardcastle, 1976; Masanobu et al., 2000) in synergy with the inferior longitudinal.
The inferior longitudinal muscle
The inferior longitudinal originates from the small horn of the hyoid bone together with some of the fibers belonging to the genioglossus and hyoglossus muscles. It runs along the whole of the sides and underneath part of the tongue. It stops before the tip of the tongue and mingles with fibers from the genioglossus, the hyoglossus and the styloglossus in the mucous membrane.
The inferior longitudinal retracts and lowers the tongue. It combines with the genioglossus and the hyoglossus for the release of an apical plosive. It is also involved in the articulation of high front vowels and velar consonants in lowering the tip and rounding the dorsum of the tongue.
1 For a description of the histological structure of the tongue and a detailed exposition of the complex relationship between the muscles of the tongue, see Takemoto (2001).
The transverse muscle

The transverse muscle runs from the median lingual septum towards the body of the mucous membrane between the superior and inferior longitudinal muscles and the genioglossus muscles. The transverse raises the edges of the tongue and helps to form the central groove needed to articulate /s/ and /ʃ/. Stone et al. (2004) distinguish two functional parts of the transverse muscle: front and back sections supporting the actions of the genioglossus. For Miyawaki et al. (1974) and Wilhelms-Tricarico (1996), the transverse muscle is responsible for the gesture of protrusion accompanying plosives and alveolar fricatives, in combination with the vertical and superior longitudinal muscles. Buchaillard (2007) argues, on the basis of simulation studies, that the transverse muscle has a role in postural control, and that when activated it limits the deformation of the tongue in the transverse dimension and increases the efficiency of other muscles in the sagittal plane.

The vertical muscle

This muscle takes its name from the orientation of its fibers, which run vertically from several parts of the lingual septum, from the lower side of the tongue and from the mucous membrane of the tongue. Some of the lateral fibers mingle with those of the transverse. The vertical muscle flattens the tongue; it is active in forming the vowel /i/ and contributes to the articulation of alveolar plosives.

The amygdaloglossus

This is a very slender muscle originating on the outer side of the tonsillar capsule. Its fibers go deep into the body of the tongue. The amygdaloglossus raises the root of the tongue.

The pharyngoglossus

This muscle is a bundle of the superior constrictor of the pharynx. Its fibers mesh with those of the styloglossus, the inferior longitudinal and the genioglossus on the side of the tongue. The pharyngoglossus pulls the tongue backwards and raises the dorsum of the tongue towards the soft palate.
For alveolar and palatal plosives, it keeps the sides of the tongue on the palate.
Table 3.1. Lingual motions and speech gestures due to the contraction of the intrinsic tongue muscles
Figure 3.4. Movements of the tongue following the contraction of the principal intrinsic and extrinsic tongue muscles
3.1.1.1.2. The extrinsic muscles

The genioglossus

This muscle arises from the inner mandibular surface at the symphysis. Its anterior fibers run upwards and forwards towards the tip of the tongue and mesh with the inferior longitudinal, the hyoglossus and some of the fibers of the styloglossus. The middle fibers insert into the mucous membrane of the tongue's dorsum and into the hyoglossal membrane. The posterior fibers run horizontally backwards and are anchored in the front face of the hyoid bone and the inner surface of the base of the epiglottis.

The action of the genioglossus is complex in that each series of fibers has a particular function (Miyawaki et al., 1974). The posterior fibers raise the hyoid bone and draw the tongue upwards and forwards to articulate /l/ (Perkell, 1974). The middle fibers bring the tongue forward and flatten it in the velar region. The anterior fibers draw the tip of the tongue down and back in synergy with the inferior longitudinal (Perrier et al., 2003) and help to create the characteristic groove of the vowel /i/ (Baer et al., 1988). For Stone et al. (2004), the genioglossus therefore functions as if the anterior and posterior parts were independent entities. This view should however be tempered by the findings of an experiment by Honda and Fujinu (2000), in which articulation was altered by means of a pseudo-palate. They observed a clear compensatory movement during the articulation of /ʃ/ and /tʃ/ in the contexts of /i/ and /a/, but were unable to attribute this to a difference in the amplitude of the EMG signals from the anterior and posterior sections of the genioglossus.

The palatoglossus

The palatoglossus originates from the palatine aponeurosis. Its fibers run downwards and laterally, forming the anterior pillars of the fauces in front of the tonsils. It is generally described as the lowering muscle of the velum. The palatoglossus works in conjunction with the styloglossus and in opposition to the hyoglossus to raise and bulge the posterior part of the tongue to articulate velar consonants. Because of the mechanical links between the tongue and the velum, tongue movements can affect the position of the velum. When lowered, the tongue can cause the velo-pharyngeal channel to open, thus contributing to the nasalization of open vowels.

The styloglossus

The styloglossus runs downwards from the styloid process of the temporal bone towards the inferior sides of the tongue. The two fiber bundles of this muscle join the two edges of the tongue at the base of the skull. The lower fibers run towards the lingual septum across the hyoglossus and the inferior longitudinal. The upper fibers are the most important part of the styloglossus. They run along the body of the tongue to its tip, where they mesh with the fibers of the inferior longitudinal. The styloglossus is the chief tongue-raising muscle. It draws the tongue back and up (Perrier et al., 2003). It works in conjunction with the genioglossus for velar articulations such as /k/ and /g/. The hyoglossus muscle is its antagonist for adjusting the vowel aperture. Baer et al. (1988) found substantial styloglossus EMG activity in high back vowels such as /ɔ/, /o/ and /u/.

The hyoglossus

This muscle originates in the side of the anterior face of the hyoid bone and in its greater horn. The anterior fibers run upwards and forwards and end in the mucous membrane of the tongue-tip. The middle and posterior fibers run towards the root of the tongue and mesh with the styloglossus and the lower part of the inferior longitudinal. The hyoglossus muscle raises the hyoid bone. When the hyoid bone is fixed, the hyoglossus lowers and retracts the tongue. In conjunction with the styloglossus, it helps to produce back vowels.
Table 3.2. Lingual motions and speech gestures due to the contraction of the extrinsic tongue muscles

MUSCLES | ORIGIN | COURSE | INSERTION | ACTION | SPEECH
Genioglossus, anterior (GGA) | Superior mental spine of the mandible (chin) | Upwards towards the tip of the tongue | Tip of the tongue | Contracts and depresses the tip | Release of stops
Genioglossus, posterior (GGP) | Superior mental spine of the mandible (chin) | Backwards towards hyoid bone | Superior longitudinalis; transverse | Draws the tongue forward | Front articulation
Hyoglossus | Greater horn and side of hyoid bone | Vertically | Lateral margins of the tongue | Lowers the tongue; retracts sides | Release of constrictions; grooving
Chondroglossus (part of hyoglossus) | Lesser horn of hyoid bone | Vertically | Intrinsic muscular fibers of the tongue | Depresses the tongue | Role in singing; front vowels
Styloglossus | Lower end of styloid process | Downwards and anteriorly | Side and undersurface of the tongue; interdigitates with longitudinalis inferior | Elevates and retracts the tongue | [k, g]; production of most vowels
Palatoglossus | Oral surface of soft palate | Downwards and laterally | Side of tongue and soft palate | Raises back part of the tongue, bulges the dorsum, narrows fauces | Velar articulation
3.1.2. Tongue control

The tongue is an exceptional anatomical structure in that, apart from the heart, it is one of the few organs that consist almost exclusively of muscular tissue. This gives it extreme suppleness and flexibility in its movements, enabling it to effect very fast changes in the configuration of the buccopharyngeal part of the vocal tract (Engwall, 2003; Iskarous, 2005). As we have also seen, the muscular structure of the tongue is extremely complex: extrinsic muscles attached to external structures allow changes in its position while intrinsic muscles alter its form. During speech, the length of the tongue can increase by 200% and decrease by 160% (Napadow et al., 1999). Because of the interdigitation of the lingual muscles (Takemoto, 2001), it is difficult to attribute any given tongue-shape to a particular action of a specific muscle (Kumada et al., 1998).

From a biomechanical point of view, the tongue is considered to be a muscular hydrostat, i.e. the tongue has a fixed, incompressible volume, such that a distortion in one part of the system affects at least one other part (Parthasarathy, 2007). Furthermore, tongue behavior is linked to the movements of the hyoid bone and of the mandible.

Several studies have tried to extract a limited set of parameters to account for tongue movements. Using X-ray images of vowels pronounced by five speakers and a factor analysis, Harshman et al. (1977) found two factors, corresponding respectively to protrusion and retraction and to a functional divide between the tip and dorsum of the tongue. Maeda and Honda (1994) have developed a two-dimensional model of vowel articulation based on two physiological axes, which has subsequently been related to a two-factor model of vowels. In this model, the protrusion/retraction factor is related to the opposing contractions of the anterior genioglossus and the styloglossus (Honda, 1996; see Figure 3.5).
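The constant-volume constraint of a muscular hydrostat, mentioned above, can be illustrated with a deliberately crude sketch. It assumes the tongue body behaves like a cylinder of fixed volume; the cylinder shape, the function name and the dimensions are all hypothetical and chosen only for illustration, not taken from the studies cited.

```python
import math

def radius_after_length_change(length_cm, radius_cm, new_length_cm):
    """Radius a cylinder of fixed volume must take when its length changes."""
    # Volume is held constant: this is the incompressibility assumption.
    volume = math.pi * radius_cm ** 2 * length_cm
    return math.sqrt(volume / (math.pi * new_length_cm))

# Hypothetical rest dimensions (illustration only, not measured values).
rest_length, rest_radius = 8.0, 2.0

shortened = radius_after_length_change(rest_length, rest_radius, 6.0)
lengthened = radius_after_length_change(rest_length, rest_radius, 10.0)

print(f"shortened to 6 cm   -> radius {shortened:.2f} cm")
print(f"lengthened to 10 cm -> radius {lengthened:.2f} cm")
```

Shortening the cylinder forces it to thicken and lengthening forces it to thin, which is the sense in which a distortion in one part of the system necessarily affects another part.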
Although two factors apparently provide an adequate explanation of vowel systems, studies that include consonants have identified a third factor, usually attributed to an independent movement of the tip of the tongue (Sanguinetti et al., 1998). Stone and Lundberg (1996) describe three-dimensional models mimicking the surface of the tongue using ultrasound data of vowels and prolonged consonants. They found three distinct contours: a rise in front, a rise at the back and a depression in the middle. They also hypothesize a fourth pattern consisting of simultaneous raising of the tip and the dorsum of the tongue, used for supporting the retroflex [Ƅ].
Figure 3.5. Orthogonal relationships between some extrinsic tongue muscles (from Honda, 1996)
Nguyen et al. (1994, 1996), using EPG frames, agree with Stone et al. (2004) on two factors relating to the tip and the dorsum of the tongue. Gerard et al. (2003) and Buchaillard (2007) explain local changes in the surface of the tongue by using finite element models, and relating them to the motor command of the underlying muscle groups. The elementary motor patterns seem to correspond to a selection of two of the four extrinsic muscles. According to the data of Baer et al. (1988), the following groups of muscles are involved in the realization of the following vowels:
– /i/: coactivation of the anterior and posterior parts of the genioglossus;
– / /: coactivation of the anterior genioglossus and the hyoglossus;
– /u/: coactivation of the posterior genioglossus and the styloglossus;
– /ɑ/: coactivation of the hyoglossus and the styloglossus.

The orthogonal organization of the muscles of the tongue gives rise to the idea that there is a close relationship between patterns of muscular contraction and acoustic output. Maeda and Honda (1994) therefore used the EMG signals from these muscles as the input parameters for Maeda's (1990) model and derived vowel formants from them. They obtained an acceptable correlation between the measured formants of the natural vowels and those of the synthetic vowels thus obtained.

3.2. The pharynx

The pharynx is a passageway linking the buccal cavity with the esophagus and the nasal cavity with the larynx (see Figure 3.6). It serves both the respiratory and digestive systems. The pharynx is a musculo-membranous conduit. It runs vertically from the base of the skull to the level of the sixth cervical vertebra. The pharynx looks like an irregularly-shaped barrel, opening out at the top, wide in the middle and narrower at the base. Its average length is of the order of 15 cm. When the pharynx contracts, its lower edge rises. The total length of the pharynx can shorten by 3 cm. Its diameter is about 4.5 cm in the middle, and 4 cm at the greater horn of the hyoid bone. The pharynx then becomes steadily narrower to reach 2 cm at its base. The pharynx can be divided into three parts: the rhino-pharynx, the oro-pharynx and the hypopharynx.
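To give a feel for the pharyngeal dimensions quoted above, the following sketch converts them into a relative shortening and into approximate cross-sectional areas. The areas rest on a strong simplifying assumption that is not made in the text, namely that each section is circular; the constant and function names are illustrative only.

```python
import math

# Figures quoted in the text, all in cm.
PHARYNX_LENGTH = 15.0
MAX_SHORTENING = 3.0
DIAMETERS = {
    "middle": 4.5,
    "greater horn of hyoid": 4.0,
    "base": 2.0,
}

def circular_area(diameter_cm):
    """Area of a circle with the given diameter (assumption: circular section)."""
    return math.pi * (diameter_cm / 2) ** 2

# Fraction of the resting length lost when the pharynx contracts fully.
shortening_fraction = MAX_SHORTENING / PHARYNX_LENGTH

# Approximate cross-sectional area at each level, in cm^2.
areas = {level: circular_area(d) for level, d in DIAMETERS.items()}

print(f"maximum shortening: {shortening_fraction:.0%} of resting length")
for level, area in areas.items():
    print(f"{level}: about {area:.1f} cm^2")
```

Under the circular assumption, the narrowing from 4.5 cm to 2 cm in diameter corresponds to roughly a fivefold reduction in area, since area scales with the square of the diameter.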
Figure 3.6. Posterior view of the pharynx and its associated muscles (from Zemlin, 1968)
3.2.1. The rhino-pharynx

The rhino-pharynx or naso-pharynx is sometimes called the cavum. It is bounded at the top by the base of the skull. Its forward part opens out into the upper opening of the nasal cavity. The lower part extends to the soft palate and its sides to the pharyngeal ostium of the Eustachian tube. The Eustachian tube connects the middle ear to the pharynx and relieves pressure on the eardrum. The rhino-pharynx is connected to the oro-pharynx by an isthmus.

3.2.1.1. The nasal passage

The importance of the nose and the nasal cavity for speech production becomes fairly clear as soon as any kind of nasal problem occurs: during colds the voice becomes nasal, and the temporal organization of speech is disrupted when it is difficult to breathe through the nose.

The nasal passage extends from the nostril to the pharynx. It is divided into two parts separated by the nasal septum. The nasal passage is constricted just above the nostrils and then widens into the main passage, which has a large cross-section. Its main function is to warm, humidify and filter the inhaled air. From the point of view of pulmonary ventilation, the nasal passage presents a considerable resistance to airflow. When the oral inhalatory airflow is 10-12 l/s, the maximum nasal inhalatory airflow cannot exceed 2 l/s. The speed of airflow in the anterior nasal constriction is 12 to 15 m/s, and falls to 1 m/s in the main passage.

The parallel activation of the oral cavity and nasal passages is controlled by velar activity. When the velum is lowered, it opens a channel between the airway and the nasal cavities. Connecting the nasal cavities and the vocal tract produces nasalization, which is exploited by several languages (French, Portuguese and Yoruba, for example) to produce a set of nasal vowels and consonants contrasting with oral vowels and consonants.

3.2.1.2. The velum

The velum is a flexible membranous extension of the hard palate. It comprises a skeletal structure (a fibrous blade), muscles and a mucous membrane. The anterior part of the velum is attached to the posterior part of the palatal vault; the superior side is attached to the base of the skull by two muscular bundles. Their fibers run downwards on both sides of the nasal cavity to the lateral borders of the velum and insert into the tongue and the pharynx. The velum is continued sideways by two folds: the anterior and posterior faucal pillars. A small muscular appendage, the uvula, is attached to the posterior end of the velum.

3.2.1.2.1. The muscles of the velum

The muscles of the pharynx are responsible for the contraction of the pharynx and the rising motion of the larynx which must occur during swallowing to ensure that food boluses pass through and descend the esophagus. The up-down and front-back modifications required to effect this movement are also exploited for phonetic and linguistic purposes.
Because of its situation, the pharynx can be compared to a pipe with adjustable length and width, leading to the upper cavity of the larynx. It serves as the first resonator for laryngeal sound. Velar movement is stimulated by three groups of muscles: raising muscles, lowering muscles and tensor muscles (see Figure 3.7 and Table 3.3).
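The nasal airflow figures quoted in section 3.2.1.1 can be related through the continuity relation Q = v × A (volume flow equals speed times cross-sectional area). The sketch below derives the areas implied by those figures. It assumes, as the text does not state explicitly, that the quoted speeds apply at the maximum nasal flow of 2 l/s and that the speed is uniform across each section; the function name is hypothetical.

```python
def implied_area_cm2(flow_l_per_s, speed_m_per_s):
    """Cross-sectional area implied by Q = v * A, returned in cm^2."""
    flow_cm3_per_s = flow_l_per_s * 1000.0   # 1 litre = 1000 cm^3
    speed_cm_per_s = speed_m_per_s * 100.0   # 1 m = 100 cm
    return flow_cm3_per_s / speed_cm_per_s

MAX_NASAL_FLOW = 2.0  # l/s, as quoted in the text

# Anterior constriction: 12 to 15 m/s; main passage: 1 m/s.
for label, speed in [("constriction at 12 m/s", 12.0),
                     ("constriction at 15 m/s", 15.0),
                     ("main passage at 1 m/s", 1.0)]:
    area = implied_area_cm2(MAX_NASAL_FLOW, speed)
    print(f"{label}: implied total area {area:.2f} cm^2")
```

On these assumptions the anterior constriction would be 12 to 15 times narrower in area than the main passage, which is consistent with the description of the nasal passage as a considerable resistance to airflow.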
Raising muscles

The levator veli palatini

This is a very large and powerful velar muscle. The levator veli palatini arises from the apex of the petrous portion of the temporal bone and the medial wall of the Eustachian tube cartilage. The fibers course downwards and forwards along the upper pharyngeal wall. Muscle fibers from both sides interdigitate in the velum to form the median raphe of the velum. As indicated by numerous studies using EMG (Fritzell, 1969; Horiguchi and Bell-Berti, 1987), endoscopy, MRI (Ettema et al., 2002) and EMA (Rossato et al., 2006), this muscle is responsible for raising the velum and closing the velopharyngeal passage (Katz et al., 1990). Kuenzel (1977), together with Benguerel et al. (1977), observed that the velum rises further for voiceless plosives than for voiced plosives, thus confirming the earlier findings of Bell-Berti and Hirose (1975). Kuehn and Moll (1998) have likewise noted that the level of activity of the levator veli palatini varies according to the phonemic context. Vaissière (1988), using the X-ray microbeam system, has furthermore shown that the elevation of the velum and the temporal organization of its movements are influenced by prosody: accentual structure and rate of speech.

The musculus uvulae

The musculus uvulae is a small spindle-shaped muscle. At the top, it is mainly attached to the palatine aponeurosis. It courses medially and posteriorly along the length of the soft palate and inserts into the mucous membrane of the uvula and into the pharyngeal constrictors. Its contraction retracts the uvula and helps to close the passage from the oropharynx to the nasopharynx, in conjunction with the levator veli palatini and the pharyngeal constrictor muscles. It has a potential role in the formation of the rolled uvular [R].

The tensor muscle

The tensor veli palatini

The tensor veli palatini is a triangular muscle with bony attachments to the lower surface of the sphenoid bone of the skull and the lateral wall of the Eustachian tube. Its fibers course downwards and anteriorly. They wind around the hamulus of the medial pterygoid plate and spread out along the palatine aponeurosis. The tensor veli palatini spreads and tenses the soft palate. Its action helps in closing off the velopharyngeal port. Little is known about its exact role in phonation.
Lowering muscles

The palatoglossus

This muscle has been described in detail in the section on tongue muscles. It is attached to the lower part of the palatal aponeurosis, runs down the anterior faucal pillar and inserts into the posterolateral side of the tongue. The palatoglossus closes the oropharyngeal isthmus. When the tongue is fixed, the palatoglossus helps to draw the velum down. Since the tongue and the velum are connected in this way, it is not surprising that the openness of vowels should have an effect on the elevation of the velum, as reported by Moll (1963), Fritzell (1969), Ushijima and Sawashima (1972) and Bell-Berti et al. (1979). Clearly, this should be considered in conjunction with the fact that high vowels are less prone to nasalization.

The palatopharyngeus

This is a long thin muscle originating from both the anterior hard palate and the midline of the soft palate, below the levator veli palatini. Its fibers join the stylopharyngeus, insert into the posterior border of the thyroid cartilage and spread over the lateral wall of the pharynx. The palatopharyngeus works in synergy with the palatoglossus. When the larynx and the pharyngeal wall are fixed, the palatopharyngeus lowers the velum. When the velum is fixed, it helps to raise the thyroid cartilage and can be thought of as an extrinsic muscle of the larynx.
Figure 3.7. Muscles of the pharynx and of the velopharyngeal port: 1) orifice of Eustachian tube; 2) tensor palatini; 3) levator palatini; 4) uvular muscle; 5) glossopalatine muscle; 6) salpingopharyngeus; 7) superior constrictor; 8) palatoglossus
Table 3.3. Elevator, tensor and depressor muscles of the velum

MUSCLES | ORIGIN | COURSE | INSERTION | ACTION | SPEECH
ELEVATORS
Levator palatini | Apex of inferior surface of temporal bone; medial wall of Eustachian tube | Downwards and forwards | Palatine aponeurosis of the velum | Elevates and retracts the velum | Closes the velopharyngeal port for non-nasal articulation
Musculus uvulae | Posterior nasal spine of the palatine bones | Medially | Mucosa of the uvula | Shortens and stiffens the uvula | Helps to close off the nasal cavity
TENSOR
Tensor palatini | Sphenoid bone; lateral Eustachian tube wall | Downwards and forwards, lateral to the levator | Tendon around the pterygoid hamulus; palatine aponeurosis | Tenses and flattens the soft palate |
DEPRESSORS
Palatoglossus | Palatine aponeurosis; oral surface of soft palate | Downwards and laterally | Side of the tongue | Elevates posterior part of the tongue; narrows fauces | Depresses velum; nasalization
Palatopharyngeus | Palatine aponeurosis; soft palate | Downwards and laterally | Upper border of thyroid cartilage, posterior wall of the pharynx | Depresses soft palate; elevates larynx (if soft palate fixed) | Depresses velum; nasalization
3.2.1.3. Coupling between oral and nasal cavities

The lowering of the soft palate connects the nasal and oral cavities. This coupling enables nasalization: the production of nasal vowels and consonants. It translates acoustically as a general lowering of the amplitude of the spectrum, attenuation and an increase in bandwidth of f1, and the presence of poles and zeros at different frequencies according to vowel quality. The flattening of the spectrum in the f1-f2 region is one of the essential characteristics of nasalization.

3.2.2. The hypopharynx and the oropharynx

The hypopharynx or laryngopharynx extends from the cricoid cartilage to the top of the epiglottis. It is connected to the esophagus. The lateral walls form the aryepiglottic folds. It is the crossroads of the airways and the digestive tube.

The oropharynx is the portion of the pharynx that is posterior to the oral cavity. It reaches from the uvula to the upper border of the epiglottis and the root of the tongue. The posterior and lateral walls are formed by the superior and middle pharyngeal constrictors.

3.2.2.1. Musculature of the oropharynx and the hypopharynx

The vertical and antero-posterior dimensions of the pharynx can be altered by the position of the tongue and by the actions of two groups of pharyngeal muscles: the constricting and the raising muscles (see Table 3.4).

3.2.2.1.1. The constricting muscles

The pharyngeal constricting muscles are distinguished according to their location:
– the superior constrictor;
– the middle constrictor;
– the inferior constrictor.²

They are nested within each other from the top down, like three cone-shaped cups (see Figures 3.8 and 3.9). All their fibers join in the midline posteriorly as the pharyngeal raphe.
2 In laryngectomees, the inferior constrictor can function as a pseudo-epiglottis, modulating the esophageal voice by adjusting the opening of the esophagus.
Figure 3.8. The constrictor muscles of the pharynx
The superior constrictor arises from the medial pterygoid plate and the pterygoid hamulus. Its fibers insert into the midline of the pharyngeal raphe. The middle constrictor originates from the stylohyoid ligament and the greater and lesser horns of the hyoid bone. The function of the middle and superior constrictor muscles is to bring the posterior wall of the pharynx forward and to bring the lateral walls closer to each other, thus narrowing the pharyngeal tube.

The inferior constrictor comprises:
1) a thyroid bundle inserted on the outer side of the thyroid cartilage;
2) a cricothyroid bundle attached to a fibrous arch between the lower edge of the thyroid cartilage and the lower edge of the cricoid cartilage;
3) a cricoid bundle inserted on the lower edge of the cricoid cartilage, where the arch and the process meet.

The peculiarity of this muscle is that it can both constrict the pharynx and raise the larynx. It thus has an indirect role in adjusting the tension of the vocal folds.

3.2.2.1.2. The raising muscles

The pharynx has six raising muscles, three on each side: the stylopharyngeus, the palatopharyngeus and the salpingopharyngeus.

The stylopharyngeus

This muscle is inserted on the base of the styloid process and is divided into several bundles:
– a pharyngeal bundle spread over the intrapharyngeal aponeurosis of the oropharynx;
– an epiglottic bundle which forms the pharyngo-epiglottic fold;
– a thyroid bundle attached to the superior horn of the thyroid cartilage on its upper edge;
– a cricoid bundle inserted on the upper edge of the cricoid cartilage.

Contraction of this muscle raises the pharynx and the larynx.

The palatopharyngeus

This muscle arises from the soft palate and joins with the middle constrictor. When the thyroid cartilage and the pharyngeal wall are fixed, contraction of the palatopharyngeus lowers the soft palate. If the soft palate is fixed, it raises the larynx and enlarges the pharynx.

The salpingopharyngeus

The salpingopharyngeus arises from the inferior cartilage of the auditory tube in the nasal cavity. It consists of three bundles:
1) a palatal bundle attached to the upper side of the palatine aponeurosis;
2) a pterygoid bundle;
3) a tubal bundle.

These join and course downwards into the posterior faucal pillar and insert into the lateral walls of the pharynx and the fibers of the inferior constrictor muscle. The salpingopharyngeus draws the pharyngeal walls laterally upwards. It may act synergically with other pharyngeal muscles in closing off the velopharyngeal port.

It can be seen that the constricting muscles, like the raising muscles, ensure both directly and indirectly a mechanical link between the velum, the pharyngeal walls, the tongue and the larynx, in addition to their primary functions. There are thus strong mechanical interdependencies between these structures, and any single action by one of them is liable to affect the others.
Figure 3.9. Lateral view of the pharynx and of the constrictor muscles
Stylopharyngeus
Salpingopharyngeus
- Thyropharyngeal
Inferior pharyngeal constrictor - Cricopharyngeal
Middle pharyngeal constrictor
MUSCLES Superior pharyngeal constrictor
Lower margin of Eustachian tube Styloid process
Side of the cricoid cartilage Thyroid lamina
Median pharyngeal raphe
INSERTION Median raphe Pharyngeal spine
Downwards
Downwards
Up and backwards
the pharynx
thyroid cartilage
Upper border of thyroid cartilage Joins posterior
Constricts lower part of the pharynx Elevates the lateral pharyngeal wall Elevates and opens
Sphincteric action
Narrows the diameter of the pharynx
ACTION Pulls the pharyngeal wall forward. Forms a ridge (Passavant) Assists in V.P. sealing
Pharyngeal raphe
Obliquely downwards Esophagus
Up and back around the pharyngeus
COURSE Backwards Interdigitates with the palatopharyngeus
Table 3.4. Muscles of the oro- and hypopharynx
ORIGIN Posterior margin of the medial pterygoid plate Pterygoid mandibular raphe Alveolar process of the mandible Greater horn of the hyoid bone
Pharyngeal articulation Assists in V.P. closing [ATR]
Esophageal speech
Associated with tenseness
SPEECH Oral articulation
3.2.3. The role of the pharynx in speech

If phoneticians have paid detailed attention to the anatomical and muscular description of the pharynx, it is because alterations of the size and shape of the pharyngeal cavity play an important part in determining voice quality. For example, Takemoto et al. (2006) considered that the hypopharyngeal and ventricular cavities, with the piriform sinuses, are responsible for the fourth formant and have a role in determining speaker characteristics. Moreover, these cavities form the basis of phonological contrasts: the feature (+/- expanded pharynx) is used in phonology to specify certain phonemes.

The expansion of the pharyngeal cavity can occur in several ways. The root of the tongue can, for example, move forwards. This movement, often suggestive of tension, is sometimes indicated by another feature, e.g. (+/- advanced tongue root). Variation in the volume of air in the pharynx is also related to the elevation of the larynx, designated (+/- lowered larynx) by Chomsky and Halle (1968). Additionally, the volume can be varied by movement of the pharyngeal wall. These factors have all been related to tension, and some researchers see in them physiological correlates of the feature (+/- tense). Finally, nasality is made possible by the lowering of the velum.

Having recalled the consequences of velar movements and the vertical movements of the larynx, we will offer a summary of the main studies focusing on movement in the tongue root and in the pharyngeal wall, and we will try to set out the varying opinions on the controversial question of the feature of tension.

3.2.3.1. Nasality

Compared with other processes involved in speech production, the oronasal process is simple: it depends on the ability to move the velum up or down. The upward movement depends chiefly on the action of the levator veli palatini, whereas the downward movement is essentially due to the palatoglossus.
When the velum is raised and supported on the pharyngeal wall, it closes the velopharyngeal port and forms an occlusion. All the air used in phonation must then pass through the mouth and the speech-sounds are described as oral. In the case of oral plosives, there is in fact a double occlusion: one occlusion at the place of lingual articulation and one velopharyngeal occlusion. When the velum is lowered, air can escape through the nose. There are only two possible outcomes: if there is a complete occlusion in the oral cavity, air can only escape through the nose and the sound is nasal. If there is no occlusion, air will flow through both the nose and the mouth. The velopharyngeal opening must be at least 0.2 cm² for nasality to be perceived (Warren et al., 1993).
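The oronasal logic described above reduces to two binary conditions (velum raised or lowered, oral occlusion present or absent) plus the perceptual threshold of 0.2 cm² cited from Warren et al. (1993). A minimal sketch follows; the function names and the return labels are hypothetical and serve only to restate the paragraph.

```python
# Threshold cited in the text from Warren et al. (1993), in cm^2.
PERCEPTION_THRESHOLD_CM2 = 0.2

def air_path(velum_lowered: bool, oral_occlusion: bool) -> str:
    """Route available to the airstream for a given articulatory setting."""
    if not velum_lowered:
        # Velopharyngeal port closed: oral sounds; plosives add a second closure.
        return "none (double occlusion)" if oral_occlusion else "mouth only"
    # Velopharyngeal port open.
    return "nose only" if oral_occlusion else "nose and mouth"

def nasality_perceived(vp_opening_cm2: float) -> bool:
    """True when the opening reaches the cited perceptual threshold."""
    return vp_opening_cm2 >= PERCEPTION_THRESHOLD_CM2

print(air_path(velum_lowered=False, oral_occlusion=True))   # oral plosive closure
print(air_path(velum_lowered=True, oral_occlusion=True))    # nasal consonant
print(air_path(velum_lowered=True, oral_occlusion=False))   # nasalized sound
print(nasality_perceived(0.1), nasality_perceived(0.3))
```

The two functions are kept separate because the text treats the air route and the perception of nasality as distinct questions: a small opening can let air into the nose without the sound being heard as nasal.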
3.2.3.1.1. Vowels

The velopharyngeal opening is smaller for nasal vowels than the opening observed during respiration, in which the velum is lowered close to the dorsum of the tongue. Its position during speech is about halfway between its positions for breathing and for complete occlusion. For vowels, the height of the velum varies inversely with the openness of the vowel. For vowels such as /ɛ̃/ and /œ̃/, the velum is higher than for more open vowels such as /ɑ̃/ and /ɔ̃/. The most common hypothesis is that this fact is explained by mechanical constraints.

Differences in the size of the velopharyngeal opening cannot be the sole factor explaining differences in the perception of the degree of nasalization of vowels. Oral airflow, air pressure and the tension of the pharyngeal walls are other factors that play a part in producing nasal resonance. The timing of the articulatory events that lead up to the connecting of the oral and nasal cavities is also a determinant in some languages (Lacerda and Head, 1963; Rossato et al., 2006) and can be more important in this respect than the degree of opening. Amelot and Michaud (2006) show that there is no direct relationship between velar movement and nasal airflow, which is affected by the relative impedance of the nasal passage and the oral tract.

Velar movement is relatively slow compared with tongue movement, and the two can get out of phase: such asynchrony can result in various effects of partial or total assimilation. In some cases, vowel nasality continues into the following consonant and gives the segment some nasality in its initial phase. This phenomenon is also found in Provençal, as the persistence of a homorganic consonantal continuation of the kind known to have been present in Old French. It is not peculiar to French. Ohala (1991, 1996) also finds an epenthetic nasal segment in Hindi, between a word-final vowel and an initial plosive consonant in the following word.
It also happens that the nasal character of a segment is passed from one segment to another in a phonetic sequence: this is an assimilation phenomenon with variable consequences. The study of coarticulation in nasality has been the goal of much research, making use of such varied technology as electromyography, aerophonometry, X-ray microbeam, ultrasound and more recently MRI (Chafcouloff and Marchal, 1999). This work has looked at the organization in space and time of coarticulation. In particular, its effects have been compared in languages that differ in their phonological inventory (Solé, 1992). In a study comparing this phenomenon in six different languages, Clumeck (1976) observed that the velum is lowered earlier in American English and Brazilian Portuguese than in French, Chinese, Swedish and Hindi. In particular, these studies have helped to demonstrate the importance of the oral/nasal distinction in the phonological system of each language.
In English, where the phonological contrast between oral and nasal vowels does not exist, the presence of a nasal consonant in an utterance can result, through anticipation and perseveration, in a nasal quality over a long stretch, and in nasalization of several vowels. In French, the extent of coarticulation is limited because of the need to preserve the phonological distinction between oral and nasal vowels. Prosodic organization also plays a decisive part, the phonemes of syllables in accentually strong positions being less susceptible to assimilation (Farnetani, 2007). With the exception of apparently a single language, the Chinantec of Oaxaca spoken in Mexico (Merrifield, 1963), a gradual systemic contrast between oral vowels, weakly nasalized vowels and strongly nasalized vowels does not appear to have been sufficiently stable to be retained in the vowel systems of the world’s languages.
3.2.3.1.2. Consonants
The plosives
It should first be understood that it is possible for all oral consonants to contrast with a homorganic nasal consonant. This is a very productive opposition, since it applies to 97% of the consonants that form part of the UPSID database. Thus, we have the series /p, b/, /t, d/, /k, g/ corresponding to the series /m, n, ŋ/, between which the sole articulatory difference is the opening of the rhinopharyngeal channel. The occlusion in the oral cavity is in the same place for both sets: at the two lips for /m/, in the dental and alveolar region for /n/, and at the soft palate for /ŋ/. The passage of air through the nasal cavity constitutes the only difference. These consonants are traditionally classed as plosives: this is true of the oral version of them, but it would be more precise to consider the nasals as continuants insofar as the nasal resonance is continuous.
French has retained the dorsopalatal /ɲ/ as in /aɲo/ as distinct from the sequence /n + j/, since it is a single phonological entity realized in a single articulatory movement in a single syllable. The sound /ŋ/ which used to exist in French is no longer heard except as a regional feature in the south, in the context of a nasalized /g/ as in “mangue mûre”, for example, or in English loanwords: “bowling”, “parking”, etc. One notable case concerns the so-called syllabic “n” as in English “button” or “sudden”. The velopharyngeal closure can be maintained completely up to the /n/; the plosive release is accompanied by the rapid opening of the velopharyngeal port, which allows the nasal cavity to be flooded with air, which escapes through the nose.
Because of the pre-existing high level of intra-oral pressure, the plosive release is more like a nasal explosion. Wolof and certain Australian languages such as Aranda, Arapana and Wailpi (Wurm, 1972) make a phonological distinction between plosive consonants depending on whether the release is nasal or oral.
Fricatives
There is no set of nasal fricatives corresponding to the set of oral fricatives. The nasalization of fricatives essentially depends on context and provides no distinctive features. Catford (1977) observes that nostrils are flexible and can therefore be considered as an articulator. Restriction of the nasal passage can produce turbulence in the airflow around the nostrils. Tibetan distinguishes between voiced “nostril” fricatives as opposed to unvoiced “nostril” fricatives.
Prenasalization
The existence of nasalization preceding a consonant is well-attested in several languages. Prenasalization can be attributed to the anticipation of the opening of the velopharyngeal port in preparation for an upcoming nasal phoneme. Numerous studies on coarticulation have examined conditions (depending on the language) in which the velum can be lowered without involving phonetic confusion. It is a different matter when prenasalization characterizes a class of phonemes in which nasalization is followed by a homorganic plosive in the same syllable, resulting in a single phoneme. This type of prenasalized consonant is fairly common, particularly in Polynesian languages. Ladefoged (1971) suggests that prenasalization should be retained as a distinctive feature of the graded type.
3.2.3.2. The pharynx as acoustic filter
The pharynx is the first resonating chamber, and in this capacity it has its own mode of resonance. As a coupled oscillator, it accentuates and dampens some harmonics.
In a way, it is possible to think of the pharynx as a filter with an ability to transfer depending on the amount of impedance brought to bear on the larynx: an amount that varies according to the length of the pharyngeal tube. Moreover, Takemoto et al. (2006) have shown that the hypopharynx, and in particular the piriform sinuses and the vestibular folds of the larynx, play an important part in determining the timbre of vowels. These authors consider that the pharyngolaryngeal cavity is responsible for a specific formant (F4) around 3 kHz, and that the
cavities bounded by the piriform sinuses produce attenuation of the spectrum between 4 and 5 kHz. According to Sundberg (2003), the area function of the pharynx must be at least 6.1 times larger than the area function of the laryngeal cavity in order to generate the singing formant around 2.8 kHz. The ventricles must open equally widely. There is another phenomenon with a complementary role in the continual balance between laryngeal source and the resonating qualities of the supraglottal cavities. The raising or lowering of the larynx alters the length of the pharyngeal cavity; the shape of the latter also depends on related movements of the pharyngeal walls and the tongue (involving the levator veli palatini and various constrictors which contribute to the raising of the larynx). The vertical movements of the larynx also help to change the laryngeal source, which will more or less adapt to the new geometry of the pharynx. Tension in the pharyngeal walls and the loss of energy (as heat) also have acoustic consequences that remain to be explained. As yet, there seems to be no definitive answer to these two problems.
3.2.3.2.1. Consequences of change in the vertical dimension of the pharynx
Lowering the larynx increases the length of the pharyngeal cavity. This involves a lengthening of the entire vocal tract. Larynx lowering results in the lowering of all formant values: this is more exaggerated for formants which may be considered as resonances of the posterior cavity. For most vowels, the drop in the first formant (F1) is of the order of 5-6%. For /u/ and /ɑ/ the effect is greater than for /a/. The second formant generally drops by 8% for high front vowels such as /i/ and /y/. For /u/, the effect is less marked than for /a/. The third formant remains fairly stable for all vowels except /u/, in which frequency drops. The fourth formant undergoes a moderate drop of 5%. The drop is less for /ɔ/ and /i/ and a little more marked for /a/ and /ɑ/.
In percentage terms, F2 is the formant most affected by laryngeal lowering. The net result of this movement is to bring F3 closer to F4. This phenomenon of reduced distance between F3 and F4 has also been observed in the singing voice by Sundberg (2003), who considers this narrowing of the difference between the two upper formants as the principal characteristic of sung vowels when produced with a lowered larynx. The raising of the larynx has the physical effect of shortening the vocal tract and acoustic effects opposite to those just mentioned.
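The general direction of these effects can be illustrated with the textbook idealization of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / 4L. The sketch below is only an approximation under stated assumptions: the tube lengths (17.5 cm and 18.5 cm) and the speed of sound (35,000 cm/s) are illustrative values, not measurements from the studies cited, and the function name is hypothetical. A uniform tube predicts the same percentage drop for every formant, so it cannot reproduce the vowel-specific and formant-specific differences reported above.

```python
# Assumed speed of sound in warm, moist air, in cm/s (illustrative value).
SPEED_OF_SOUND_CM_S = 35_000.0


def tube_formants(length_cm: float, n_formants: int = 4) -> list[float]:
    """Resonances (Hz) of a uniform tube closed at one end: (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]


# Hypothetical vocal tract lengths: neutral larynx vs. larynx lowered by 1 cm.
neutral = tube_formants(17.5)
lowered = tube_formants(18.5)

for i, (f_neutral, f_lowered) in enumerate(zip(neutral, lowered), start=1):
    drop_percent = 100.0 * (f_neutral - f_lowered) / f_neutral
    print(f"F{i}: {f_neutral:.0f} Hz -> {f_lowered:.0f} Hz ({drop_percent:.1f}% lower)")
```

Under these assumptions, a 1 cm lengthening lowers every resonance by about 5%, which is of the same order as the F1 and F4 drops quoted in the text.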
3.2.3.3. Movement of the pharyngeal walls
3.2.3.3.1. Pharyngeal articulation
No known language has occlusives articulated in the pharyngeal zone, nor are there any pharyngeal nasal consonants. Pharyngeal fricatives are relatively rare. Arabic has some voiceless pharyngeal fricatives which contrast with laryngealized pharyngeals and murmured pharyngeals. Pharyngeal constriction generally constitutes a secondary articulation. It is accomplished either by retraction of the tongue or by the contraction of the constricting pharyngeal muscles which reduce its diameter, or by a combination of the two. In Arabic and other Semitic languages, pharyngealization is the exponent of emphasis. In Khoisan languages, this articulation is sometimes described as epiglottal, in view of the fact that constriction occurs at the level of the epiglottis. These languages also possess a series of vowels described as “strident vowels”: they are produced with strong pharyngeal constriction and such compression of the vocal folds that only the interligamentary part vibrates.
3.2.3.3.2. Active or passive expansion and the question of a tension feature
The first observations of pharyngeal wall movement were reported by Harrington (1944) after his studies on velar closure. Using radiocinematography, both Perkell (1969) and Kent and Moll (1969) noticed that variations in the anteroposterior diameter were associated with the voiced/voiceless distinction, i.e. that the oropharynx is wider during the held phase of occlusives when they are voiced than when they are voiceless. This expansion of the pharyngeal cavity for voiced occlusives is attested by all subsequent research (Bell-Berti, 1973 and 1975; Minifie et al., 1974). It is generally agreed among those who subscribe to the myoelastic theory of phonation that this expansion allows for a difference of air pressure at the level of the glottis which is enough to sustain the vibration of the vocal folds during the held phase of voiced stops³.
3 Several theoretical studies have been undertaken to determine the maximum length of the held phase of a voiced occlusive. The maximum expansion of the pharynx (taking into account the anteroposterior movement of the tongue, the retraction of the side and rear walls and the lowering of the larynx) can increase the volume by 10 ml. Each extra 10 ml of volume allows glottal vibrations to continue for 10 ms. A maximum extension of vibrations for 100 ms can be attributed to the expansion of the pharyngeal cavity. Additionally, there is the role of the lowering of the velum and the raising of the hard palate, which is not inconsiderable, according to Bell-Berti (1973).
However, agreement disappears when it comes to deciding whether this phenomenon is due to active causes or whether there is simply a passive enlargement of the pharynx caused by intra-oral pressure acting on the walls of the vocal tract. The latter position has been espoused by Perkell, who considers that the increased volume of the oropharynx occurs as a result of strong air pressure associated with weak muscular tension. The pharyngeal space therefore alters involuntarily. The hypothesis of there being a motor mechanism that expands the oropharynx has been proposed by Kent and Moll (1969) and Rothenberg (1977), among others. They suggest that the two diameters (anteroposterior and vertical) are modified by the action of a muscle pulling the hyoid bone, and therefore the larynx, backwards. They argue that this would have the effect of pulling the tongue root forwards and down, thus increasing the anteroposterior diameter of the pharyngeal cavity.
Dickson and Dickson (1995) have considered how the combination of the following muscles could explain the movement of the pharyngeal wall: palatoglossus, superior constrictor, musculus uvulae and levator veli palatini. They have suggested that the levator veli palatini is crucially responsible for the movements of the pharyngeal lateral wall during speech. They describe a mechanism which enables the levator veli palatini to move the lateral walls inwards at the level of the torus tubarius. There are two main reasons for rejecting this hypothesis: first of all, it is not certain whether lateral movements of 10-12 mm at the level of the velo-pharyngeal port could be achieved purely as a result of action by the levator veli palatini. Furthermore, if we postulate this movement towards the torus, and assume that the levator is responsible for moving the upper part of the pharyngeal wall, we would expect maximum displacement to occur at this location. However, Skolnick’s (1970) data show that this is not the case.
The research we have cited clearly demonstrates that the greatest median displacements occur well below the levator veli palatini. The inference is that the fibers of the upper constrictor play a not insignificant role of drawing the pharyngeal walls inwards during velo-pharyngeal closure. Fritzell (1969) had already found a significant correlation in activity between the levator palatini and the upper constrictor during velo-pharyngeal closure. Several authors (Podvinek, 1952; Bosma and Fletcher, 1962) have described the mechanism whereby the contraction of the pharyngo-staphyline muscle stimulates a medial displacement of the pharyngeal walls. Fritzell (1969), using electromyography, shows that pharyngo-staphyline activity is not significantly correlated with velar closure but co-occurs consistently with the vowel /a/. Fritzell’s conclusions that the palatopharyngeus draws the pharyngeal walls medially for the
vowel /a/ seem to co-occur with the movements of the pharyngeal lateral walls at the level of the jaw. Ultrasound data show that the lower part of the walls move medially during open vowels (Zagzebski, 1975). It is interesting to note that such displacements and pharyngo-staphyline activity in open vowels do not agree with the conclusions of Minifie, who found, for example, more activity for /i/. However, Minifie is extrapolating the movements from EMG activity potentials; he does not observe them directly. In his 1974 experiment, he researched the activity of constricting muscles (i.e. a different group of muscles). In view of the potential role of constrictors in the lowering and raising of the larynx, we may wonder what his research reveals about the relationship between the vowel, the lateral walls position and the height of the larynx. These discrepancies between results and interpretations in research on the nature of movement in the pharyngeal walls indicate how difficult it is to associate a particular muscular activity with the specific movement of a single structure at the level of the pharynx. To do this, we would have to see the whole result of the multiple relations that exist between the various pharyngeal structures, in particular the result of the many repercussions of the activities of constricting and raising muscles on the position of the velum and the pharyngeal walls and on the tension of the vocal folds. It seems reasonable to conclude that the activity of the pharynx and the movements of its walls depend to some extent on phonetic context, and that greater or lesser muscular tension at any given spot is the result of a very specific motor activity required for the production of a certain phoneme or class of phonemes, e.g. the need for orality or nasality on the one hand or the need to adjust the height of the larynx on the other. 
In support of this hypothesis, it can be seen that the articulatory movements of the central part of the oro-pharynx are clearly different from the movements of the upper walls of the pharynx (Minifie et al., 1970; Magen et al., 2003). Displacements at this level tend to show a correlation between an increase in pharyngeal volume for high vowels and a decrease in pharyngeal volume for low vowels. The position of the lower walls for consonants thus tends to depend crucially on the vowel environment. Conversely, when it comes to velo-pharyngeal closure, it is the consonant that is chiefly responsible for the movements of the pharyngeal walls. The search for a correlate of tension in the activity of pharyngeal muscles has not ended. Currently, the position of the tongue root in relation to the pharyngeal wall is retained in articulatory descriptions. The ATR (Advanced Tongue Root) feature is thought to account for a series of oppositions between vowels and consonants, regardless of the active or passive nature of the mechanisms involved in the movements of the pharyngeal walls.
Chapter 4
Articulation: The Labio-Mandibular System
Among the features that specify vowels, two are particularly associated with the labio-mandibular system. These are stricture and labialization. For vowels, the aperture is large and airflow is laminar without significant turbulence. Open stricture is made with an open approximation of the articulators; it is largely determined by the angle of the jaws. The feature of labialization, which is also used to specify certain classes of consonants, encompasses several articulatory dimensions. The lips can take up a variety of configurations between extreme separation and closure. In addition, they can be projected forwards or pressed together in various degrees. Above all, taking into account that these two articulators are independent, any position of the lips can co-occur with any position of the tongue. In practice, because of a number of anatomical, physiological and aerodynamic constraints, the languages of the world favor a particular set of labial configurations (Linker, 1982). In this chapter we will review the anatomical conditions which govern the activity of the lips and jaws and identify the muscle groups involved in producing their movements. The lips are easy to see and thus lend themselves to direct observation. For this reason, they have been the subject of a large number of phonetically detailed studies involving many techniques and different tools: still photography, high-speed video, and devices for optical monitoring or for measuring strength of movement. Very often, imagery methods such as X-rays, ultrasound, X-ray microbeams and electromagnetometry have been used in association with other tools such as electropalatography or electromyography or even with techniques for measuring
disruption to normal activity, such as blocking mastication or modifying a resistance applied to jaw movement. The upper and lower lips are mobilized by sets of independent muscles and they can move individually. The lower lip has the most important musculature. It follows the movement of the jaw and is the more mobile of the two. On the whole, research has shown a certain local individual variability. We will report only the best established and most general of these tendencies.
4.1. The lips: anatomical and functional description
The lips are two mobile musculo-membranous folds surrounding the buccal orifice. Each lip has an anterior or cutaneous side, a posterior or mucous side (the endolabial side) and a free border (the exolabial side). The ends of the lips join to form the corners of the mouth. Under the dermis, the muscles are closely bound with the cutaneous layer in which they originate. Two groups of muscles can be distinguished: dilatory muscles and constricting muscles (see Figure 4.1 and Table 4.1a, b). In this chapter we will describe the principal muscles according to their function and mobilization during speech production (see Figure 4.2).
Figure 4.1. Frontal view of the facial muscles
4.1.1. Lip closure
4.1.1.1. The orbicularis oris
This is the most important of the lip muscles and comprises most of the thickness of the lips. Honda et al. (1995) distinguish two muscular layers in the orbicularis: a deep layer encircling the mouth, and a more superficial layer running along the vermilion zone. The upper and lower fibers of the muscle converge upon the lips. The orbicularis makes up most of the muscular part of the lips and it acts as a sphincter. It allows the lips to adduct by enabling the lowering of the upper lip and the raising of the lower lip. Also concerned with the act of lip closure are the masseter, the temporalis and the internal pterygoid, as well as the levator anguli oris (see below). The triangularis muscle lowers the upper lip. This combination of activity is seen in oral closure during the bilabial stops /p/, /b/ and /m/.
4.1.1.2. The compressor labii
This muscle is named after its function; it comprises muscle fibers coursing from front to back around the buccal orifice, and compresses the lips from front to back. Its role in phonation requires further research.
4.1.2. Lip protrusion
Lip protrusion is effected by the activity of the orbicularis (deep layer) and the mentalis muscle.
4.1.2.1. The mentalis
The mentalis is made up of two small bundles set in the deep muscular layer on either side of the median line of the chin. These bundles originate under the lower incisor and run downwards; they insert into the skin of the chin. The chief function of the mentalis is to raise the lower lip. Gentil and Gay (1986) dispute that this is its main function, holding instead that its activity results in a backwards swing (eversion).
4.1.3. Lip rounding
Lip rounding is effected by drawing the corners of the mouth together through the action of the orbicularis sphincter. It is required, for example, for the following
set of vowels /y, œ, ø, ɑ, ɔ, o, u/. Lip protrusion may be variable and is limited by the action of the buccinator, the risorius and major zygomatic muscles. The degree of lip rounding also depends on the raising and lowering of the jaw.
4.1.4. Raising the upper lip
4.1.4.1. The levator labii superioris alaeque nasi
This muscle consists of a narrow band of fibers. They extend from the external side of the upper maxillary process to the upper lip. Its action elevates the wings of the nose (alae nasi) and raises the middle part of the upper lip. The levator labii superioris alaeque nasi is activated during the release phase of bilabial plosives (Öhman et al., 1965).
4.1.4.2. The levator labii superioris
The levator labii superioris is a thin quadrangular facial muscle. It arises from the lower margin of the orbit and inserts into the skin and muscle of the upper lip. It elevates and everts the upper lip.
4.1.4.3. The zygomaticus minor
The minor zygomatic muscle is inserted into the side of the zygomatic bone parallel to the deep elevator, and attaches to the underside of the skin of the upper lip. It draws the upper lip upwards and outwards. With the buccinator, the major zygomatic and the muscles raising the corners of the mouth, it controls the constriction between the lower lip and the upper incisors, thus playing an important role in the production of the labiodental consonants /f/ and /v/.
4.1.5. Lowering the lower lip
4.1.5.1. The depressor labii inferioris
The depressor labii inferioris is a flat quadrangular muscle situated on the side of the chin and rising obliquely towards the lower lip. At the top it unites with the muscle from the opposite side of the chin on the median line. This is the chief lowering muscle for the lower lip, which it draws downwards and outwards. It has an important role to play in the release of bilabial consonants such as /p/, /b/ and /m/, with some variation in activity according to the consonant.
Giot (1977) observes that in French, its activity is more important for /p/ than for /b/, and interprets this as a difference in the articulatory strength. The depressor labii
inferioris is assisted in its function by the depressor anguli oris and the muscles which lower the jaw, as well as by the anterior suprahyoid muscles.
4.1.6. Lip spreading
4.1.6.1. The buccinator
The buccinator is a flat muscle with an irregular quadrangular shape. It is inserted on the anterior edge of the pterygomandibular ligament, and on the lateral surface of the alveolar process of the maxilla and mandible in the region of the last molars. Its fibers cross over at the corners of the mouth. The main function of this powerful muscle is to retract the angles of the mouth. It is opposed by the orbicularis and the mentalis muscles. It has a role in the production of the spread vowels /i, e/ and the consonants /f, v, ɸ/.
4.1.6.2. The risorius
This facial muscle is also called the “laughter” muscle. The risorius is a flat muscle originating at the level of the masseter, running parallel to the lips and ending at the corners of the mouth. It reinforces the action of the buccinator (Öhman et al., 1965). The buccinator and risorius combine to produce the alveolar fricatives /ʃ, ʒ, ɸ/. The major zygomatic acts synergistically.
4.1.7. Lowering the corners of the mouth
4.1.7.1. The depressor anguli oris
This muscle is triangular in shape. It originates on the anterior part of the lower jaw and inserts into the orbicularis at the corners of the mouth. The depressor anguli oris lowers the corners of the lips and probably has a role in the opening of the buccal orifice during the production of high vowels and in the release of stop consonants (Öhman et al., 1965). It also opposes the levator anguli oris muscle in the production of fricatives.
4.1.7.2. The platysma
The platysma is a very broad slender muscle extending from the anterior part of the sternum to the lower jaw and the cheek. It acts with the depressor anguli oris to lower the corners of the mouth. It can also contribute to protrusion for the closed vowels /u, y/.
Figure 4.2. Action of the principal muscles of the face on the lips: 1) levator labii alaeque nasi; 2) levator labii superioris; 3) zygomatic minor; 4) zygomatic major; 5) risorius; 6) buccinator; 7) depressor anguli oris; 8) depressor labii inferioris; 9) levator anguli oris; 10) orbicularis oris; 11) mentalis
Table 4.1. The muscles of the lips (a and b)

(a)
MUSCLES | ORIGIN | COURSE | INSERTION | ACTION | ROLE IN SPEECH
Orbicularis oris | Midline anterior surface of maxilla and mandible | Circular | Mucous membrane of the margin of the lips and the raphe | Adducts the lips; narrows the orifice of the mouth; purses the lips | Labial articulation [p, b, m, f, v]; protrusion [o, u, oe, 6]
Levator labii superioris alaeque nasi | Upper frontal process of the maxilla | Downwards and laterally | Lateral part of the nostril and upper lip | Elevates the upper lip and wing of nose | Bilabial closure release
Levator labii superioris | Medial of the infraorbital margin | Downwards | Skin and muscles of the upper lip | Raises and everts the upper lip | Labio-dental articulation
Zygomaticus minor | Facial surface of the zygomatic bone | Inferiorly and medially | Skin of the upper lip and orbicularis oris | Elevates and everts the upper lip | [f, v]
Zygomaticus major | Anterior surface of the zygomatic bone | Inferiorly and medially | Modiolus at the angle of the mouth; orbicularis oris of upper lip | Elevates and draws the angles of the mouth laterally | [s, z]
Levator anguli oris | Canine fossa below the infraorbital foramen | Downwards and highly laterally | Angle of the mouth, intermingling with fibers of zygomatics and O.O. | Elevates the angle of the mouth and upper lip | Assists in closure of bilabial articulation

(b)
MUSCLES | ORIGIN | COURSE | INSERTION | ACTION | ROLE IN SPEECH
Depressor anguli oris | Oblique line of the mandible | Upwards | Modiolus | Depresses the angle of the lips |
Depressor labii inferioris | Mandible, near mental foramen | Upwards and medially | Skin of lower lip; blends with O.O. | Depresses the lower lip | Release of bilabial consonants
Mentalis | Mental symphysis | Upwards | Skin of the chin | Everts and protrudes lower lip | Protrusion [u, y]
Platysma | Upper pectoral and deltoid regions | Superiorly and anteriorly | O.O.; lower border of mandible; skin of lower face | Draws down and laterally the angles of the mouth | Protrusion of upper lip [u, y]
Buccinator | Alveolar processes of the maxillary bone; pterygomandibular raphe | Forwards | Modiolus; fibers of O.O. | Draws the lips back; pulls angles of the mouth laterally | Spreading [i, e]; labio-dental and bilabial fricatives
Risorius | Outer rami of the mandible | Medially | Modiolus | Draws the angle of the mouth laterally | Spreading [i, e]
4.1.8. Raising the corners of the mouth
4.1.8.1. The levator anguli oris
This muscle, which is deep, flat and triangular in shape, courses down from the canine fossa towards the upper lip and inserts into the orbicularis at the corners of the mouth. Its action raises the corners of the mouth and the upper lip. Insofar as some of its fibers also insert into the lower lip, the levator anguli oris can also help to draw the lower lip upwards. It opposes the depressor anguli oris for fine adjustments and is reputed to have a role in the closure of labiodental consonants and to contribute to closure in bilabial stops.
4.1.8.2. The zygomaticus major
Like the risorius, this muscle is also known as a laughter or grinning muscle, because of the facial expression it produces. The major zygomatic is a long, flattened, superficial muscle which extends backwards from the minor zygomatic muscle and the zygomatic bone to the corners of the mouth. Its chief function is to draw the corners of the mouth sideways and upwards. Its raising acts in conjunction with the levator anguli oris muscle. Phonetically, it has a role in lip spreading and contributes to the production of the vowels /i, e/ and consonants such as /f/ and /v/.
Close inspection of muscular activity in the lips suggests a large amount of interdependence between them. Strong coordination is necessary to achieve the fine adjustment of lip position required for different phonemes. Indeed, even if we could relate particular muscle activity more directly with specific movements in given directions of the lips or the corners of the mouth, it nevertheless remains true that their synergetic and antagonistic patterns of activity must be strictly controlled to achieve the desired articulatory result (see Table 4.2).
Several influences are at play, resulting as much from the position of the articulators when at rest as from the constraints associated with the anticipation of upcoming phonemes or the effects of perseveration, and particularly related to the inertia of the organs involved. This is especially relevant to the fact that many electromyographic studies devoted to the labial muscles have managed to produce results that are quite dissimilar and even, in some cases, contradictory. Studies of coarticulation in labiality have been a revelation in this respect, demonstrating the complexity of coordination in speech production; at the same time, it must be borne in mind that phonemes vary according to the phonological inventory of the language to which they belong (Boyce, 1990, Farnetani and Recasens, 1999). Considering all these factors, the jaws and the position of the mandible play a particularly crucial role.
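The functional grouping used in sections 4.1.1 to 4.1.8 can be summarized as a simple lookup structure. The sketch below is only an illustration: the dictionary and function names are hypothetical, and each muscle is listed only under the section heading where it is described, so synergists mentioned in passing in the running text are deliberately left out.

```python
# Muscles listed under each functional heading of sections 4.1.1 to 4.1.8.
LIP_MUSCLES_BY_FUNCTION = {
    "lip closure": ["orbicularis oris", "compressor labii"],
    "lip protrusion": ["orbicularis oris", "mentalis"],
    "lip rounding": ["orbicularis oris"],
    "raising the upper lip": [
        "levator labii superioris alaeque nasi",
        "levator labii superioris",
        "zygomaticus minor",
    ],
    "lowering the lower lip": ["depressor labii inferioris"],
    "lip spreading": ["buccinator", "risorius"],
    "lowering the corners of the mouth": ["depressor anguli oris", "platysma"],
    "raising the corners of the mouth": ["levator anguli oris", "zygomaticus major"],
}


def functions_of(muscle: str) -> list[str]:
    """Return, in alphabetical order, the functions under which a muscle is listed."""
    return sorted(
        function
        for function, muscles in LIP_MUSCLES_BY_FUNCTION.items()
        if muscle in muscles
    )


def muscles_for(function: str) -> list[str]:
    """Return the muscles listed for a function, or an empty list if unknown."""
    return list(LIP_MUSCLES_BY_FUNCTION.get(function, []))
```

Looking up the orbicularis oris in this way returns three functions (closure, protrusion and rounding), which reflects the central role the text gives to this muscle.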
Table 4.2. Main functions of the lip muscles
MUSCLES (rows): Orbicularis oris; Levator labii superioris alaeque nasi; Levator labii superioris; Zygomaticus minor; Zygomaticus major; Risorius; Levator anguli oris; Depressor anguli oris; Depressor labii inferioris; Mentalis; Platysma; Buccinator
FUNCTIONS (columns): CLOSING (lips); RAISING (upper lip, mouth); LOWERING (lower lip, angles of mouth); RETRACTING (angles of mouth); ROUNDING (lips); PROTRUDING (lips)
4.2. The jaw

The jaw comprises two bony arches in which the teeth are implanted: the upper jaw consists of 13 bones altogether and the lower jaw of one single bone: the lower maxillary. The upper jaw is fixed. Only the lower jaw or mandible can be moved. The main part of the mandible is a U-shaped arch. The round part of the arch faces forward and constitutes the chin area. From the open side of the arch, two bony extensions rise to join the temporal bones at the level of the temporo-maxillary joints in front of the ear. The temporo-maxillary joints are capable of three principal movements: 1) raising and lowering the lower jaw; 2) protruding and retracting; 3) moving laterally. The lower jaw maintains close muscular links with both the hyoid bone and the tongue. Its movements are thus capable of affecting the position of the tongue and, to a lesser extent, that of the larynx.

4.2.1. Muscles of the lower jaw

The prime function of the jaw muscles is to enable mastication. Their actions are to raise, to lower, to project and to retract the lower jaw, as well as to move it sideways (Van Eijden et al., 1997) (see Figure 4.3). The lower jaw can move in several directions under the action of the chewing muscles, the temporalis, the external and internal pterygoids and the suprahyoid muscles (see Tables 4.3, 4.4 and 4.5).

4.2.1.1. The lateral or external pterygoid

The lateral pterygoid is a short, thick, flattened muscle running from the pterygoid process and the greater wing of the sphenoid bone to the temporo-maxillary joints. Its action produces lowering and protrusion of the mandible, as used in the articulation of /s/ (Van Riper and Irwin, 1958).
Figure 4.3. Lateral view of the mandible and directions of its displacements due to the contraction of various mandibular muscles
4.2.1.2. The internal pterygoid

The internal pterygoid is a thick quadrangular muscle inside the external pterygoid. It extends from the pterygoid fossa of the sphenoid bone of the skull to the inner side of the angle of the jaw. Its fibers run downwards and backwards. It works with the masseter to raise the jaw. It works against the anterior suprahyoid muscles to help produce /f/ and /v/.

4.2.1.3. The temporalis muscle

The temporalis is a long, paired, triangular muscle of the forehead. It originates in the lower parts of the sides of the skull, approximately at the level of the temples, and extends behind the ears. Its fibers course forwards and converge near their point of insertion at the level of the coronoid process. The temporalis acts with the masseter and the internal pterygoid to raise the lower jaw. Because of its mildly posterior course, it plays a role in the protrusion of the mandible. It works against the depressors of the mandible during production of /f/ and /ʃ/ and front vowels.

4.2.1.4. The masseter

The masseter is a superficial muscle, which is short, thick and powerful. It originates in the zygomatic arch and runs obliquely downwards and backwards. It is the most powerful muscle used in mastication. Its action has the effect of raising the mandible. The masseter intervenes to adjust vocalic openness and more particularly to achieve bilabial closure.
Digastricus, anterior belly
Origin: inner surface of the mandible
Course: downwards and posteriorly
Insertion: intermediate tendon near the lesser horn of the hyoid bone
Action: if the hyoid bone is fixed, depresses the mandible. Otherwise, draws the hyoid bone up and forwards. Elevates the larynx, stretching the vocal folds. Brings the tongue forwards and upwards for alveolar and front high vowels

Digastricus, posterior belly
Origin: mastoid process of the skull
Course: downwards and anteriorly
Insertion: intermediate tendon near the lesser horn of the hyoid bone
Action: brings the tongue backwards and upwards for velar articulation

Geniohyoid
Origin: anterior inner surface of the mandible, near the symphysis
Course: downwards and posteriorly
Insertion: anterior body of the hyoid bone
Action: if the hyoid bone is fixed, depresses the mandible. Otherwise, raises the hyoid bone upwards and forwards

Mylohyoid
Origin: inner surface of the mandible
Course: downwards and medially
Insertion: tendinous raphe, floor of the mouth
Action: if the hyoid bone is fixed, depresses the mandible. Otherwise, raises the hyoid bone forwards and upwards

Lateral pterygoid
Origin: lateral surface of the lateral pterygoid plate
Course: backwards
Insertion: pterygoid fossa, temporomandibular joint
Action: depresses and protracts the mandible

Platysma
Origin: skin of the lower neck
Course: obliquely upwards
Insertion: mandible, skin of the lower face
Action: draws the corners of the mouth inferiorly; depresses the mandible

Table 4.3. The mandibular depressors
Table 4.4. The mandibular elevators
4.2.2. The suprahyoid muscles

The work of the mastication muscles is supplemented by the activity of the suprahyoid muscles: the digastricus, the mylohyoid and the geniohyoid. The suprahyoid muscles lower the hyoid bone. They also contribute to the retraction and lowering of the lower jaw (Westbury, 1988; Van Eijden and Koolstra, 1998). According to Baer et al. (1988), they play a major role in the production of front vowels. When, however, the position of the mandible is fixed by contraction of the mandibular elevators, these muscles instead draw the hyoid bone upwards and forwards and raise the larynx.

[Table 4.5: the cell marks (+) could not be recovered from the source layout.]
Muscles (rows): Internal pterygoid; Masseter; Temporalis; External pterygoid; Geniohyoid; Digastricus; Mylohyoid; Genioglossus.
Movements (columns): raising; lowering; protruding; retracting; gliding.
Table 4.5. Possible movements of the mandible under the influence of the mandibular and suprahyoid muscles
4.3. Linguistic functions of lip movement

Protrusion, spreading and rounding of the lips are used to produce distinct classes of vowels and consonants. Lips may be primary or secondary articulators.

4.3.1. Vowels

Four features are commonly used to specify vowels: degree of aperture, nasality, frontness and labiality. In the world’s languages, it has been observed that high vowels have a greater tendency to be labialized than low vowels. Similarly, back vowels are more often labialized than front vowels. Keyser and Stevens (2006) see these tendencies as enhancing perceptual differences between vowels. However, this rule is not without exceptions: French, for example, has the contrast between /i/, a non-rounded front vowel and /y/, a front vowel that is rounded but nevertheless lacks labial protrusion. Similarly, Swedish phonology has a couple of high back vowels which are labialized, and /u/ [+rounded] contrasts with Ÿ [-rounded]. The corners of the mouth are vertically compressed and the articulation is everted. In fact, as mentioned previously, the lips are capable of adopting widely varying configurations which do not easily translate into binary oppositions such as [±rounded]. Moreover, this factor makes it difficult to compare the phonological and phonetic systems of various languages. Lips may be more or less separated, rounded, projected or pressed against one another. It would surely be useful to quantify the vertical dimension of labial protrusion as opposed to the compression of the corners of the mouth and even of the cheeks. Then, we could question the usefulness of the feature [±compressed] as opposed to the feature [±rounded]. Abry et al. (1979) measured 12 lip position parameters for French vowels and concluded by means of factorial analysis that two parameters are sufficient to characterize them: the area of lip opening and the relationship between the labial axes (vertical opening versus horizontal opening).
Lip area alone would explain 100% of the [±rounded] variation. Linker (1982), in his comparative photographic and acoustic study of labiality in English, Cantonese, Finnish, French and Swedish, shows that these languages differ in their use of the labiality parameter. Labial protrusion is not sufficient to show the difference between them on its own:
– one factor, horizontal opening, is all that is required for English;
– two factors, horizontal opening and frontal area, are needed for Cantonese;
– three factors are needed for Finnish, French and Swedish, but not the same ones:
Finnish: horizontal opening, frontal area, vertical opening; horizontal opening and frontal area/protrusion
French: horizontal opening and frontal area/protrusion; vertical opening – protrusion of lower lip/horizontal opening
Swedish: horizontal opening, frontal area, protrusion; horizontal opening and frontal area/protrusion; vertical opening, frontal area, protrusion
4.3.2. Consonants

The articulatory processes which are set in motion to produce bilabial consonants, or which include lip movement as a secondary articulator in their realisation, are complex because they rely on the participation of several relatively independent articulatory subsystems (Löfqvist, 2005). According to their mode and place of articulation, we will discuss stop consonants and fricatives in turn.

4.3.2.1. Stop consonants

4.3.2.1.1. Primary articulators

Bilabials

Occlusion by lip closure produces a series of stop consonants found in almost all the languages of the world. These consonants may be oral and voiced (e.g. /b/), voiceless (e.g. /p/), or even nasalized (e.g. /m/). The adduction of the lips is effectively achieved by the action of the orbicularis with the assistance of the masseter, the temporalis, the internal pterygoid and the levator anguli oris muscles. Typical closure duration varies between 50 and 150 ms. During oral occlusion for /p/ and /b/, intra-oral pressure increases, but not to the point where it forces release of the closure. Release occurs as a result of a specific gesture initiated mainly by the depressor labii inferioris which lowers the bottom lip. This is followed up by the depressor anguli oris and the platysma. The upper lip elevators also take part in the opening gesture (Leanderson, 1971). It must be borne in mind that the muscular forces in lip movements in speech are quite weak, representing less than 20% of the potential maximum of the forces of contraction. Hinton and Arokiasamy (1997) demonstrated that in lip occlusion in normal speech, interlabial pressure for /p/ was of the order of 10.5% of the interlabial pressure that could be exerted with the assistance of the jaw, and of 14.6% if the jaw was blocked. Several studies have demonstrated greater EMG
activity for voiceless consonants than for voiced consonants. Similarly, changes in activity related to suprasegmental organization have been reported. Increases in airflow output and intensity correlate with increases in electromyographic activity in the orbicularis, the levator anguli oris, the depressor anguli oris, the buccinator and the depressor labii inferioris (Gay et al., 1974; Wohlert and Hammen, 2000). This translates as an increase in the speed and amplitude of lip movement (Schulman, 1989; Dromey and Ramig, 1998) and can also be associated with increases in intraoral pressure. The gesture of labial closure is often cited in support of theories of coproduction. This gesture is indeed representative of the phenomena set in motion at the peripheral level and is a good example of the control mechanisms that are enabled by coordinative structures. Experiments in the mechanical perturbation of the activity of the jaw during the production of bilabial stops (by blocking its movement or creating variations in the resistive charge applied to it) stimulate very rapid movements (around 40 ms) in the upper lip by way of compensation or recovery, especially in the held phase, in a way that a central control mechanism would not be able to achieve. Ito et al. (2000) propose a model of mechanical connection between the upper lip and the jaw which accounts neatly for the phenomena observed in continuous speech.

Labiodentals

Hockett (1955) and Ladefoged (1971) report the existence of stop consonants realized by the closure of the lower lip on the upper incisors. These labiodentals would contrast with bilabial stops in some languages. It seems that this type of contrast, which appears to occur rarely, might yield to a distinction between a bilabial stop and an affricate.

4.3.2.1.2. Secondary articulators

Insofar as lip movement can be relatively independent of the tongue, the lips are also capable of modifying the principal articulation by their shape.
According to the place of articulation by the tongue, it is thus possible to identify labiodental, labioalveolar, labiopalatal and labiovelar stops.

4.3.2.2. Fricatives

4.3.2.2.1. Primary articulators

Bilabials

The orbicularis brings the lips very close together. The constriction thus created in the labial channel produces friction noise. In Spanish, there is the consonant [β], an allophone of /b/ when intervocalic. In African languages such as Ewe, /β/ contrasts with /v/ and /f/ and also with its voiceless homorganic counterpart /ɸ/, also encountered in several Caucasian languages.

Labiodentals

The lower lip is slightly raised and its internal surface rests on the tip of the upper incisors. Air escapes laterally. The lower orbicularis, buccinator and risorius combine to achieve this closing movement. For Gentil and Boë (1979), the adduction between the lower lip and the upper teeth also involves the upper orbicularis, the minor zygomatic, the levator anguli oris and the major zygomatic muscles, which raise the corners of the mouth. This articulation is very widespread: it is found in French and English, and also in Portuguese where there are contrasts both between /f/ and /v/ and between them and /p, b/. The inverse form, with lips everted, allows for finer distinctions; these are exploited by Bantu languages (Guthrie, 1948).

Labiopalatals and labiovelars

/w/ is a labiovelar semi-consonant and forms part of the inventory of a great number of languages, including French and English. /ɥ/, a labiopalatal semi-consonant, contrasts with the labiovelar /w/ in French. Lip protrusion is as for closed rounded vowels.

4.3.2.2.2. Secondary articulators

Post-alveolars

The orbicularis of the lips is chiefly responsible for protrusion during the production of /ʃ/ and /ʒ/. Lip separation is adjusted by the deep and superficial muscles involved in elevating the upper lip, the minor zygomatic and the depressor labii inferioris.

Alveolars

The muscles which draw the corners of the mouth backwards for /s/ and /z/ are the buccinator and the risorius.
4.4. Motor coordination between the lips and the lower jaw

The role of the jaw in speech production has been the object of research with some very far-reaching consequences. This can be explained by the relative ease with which the jaw and its movements can be observed, with the aid of various techniques: devices with strain-gauges, to analyze the limits of opening and closing and to evaluate the physical forces in play; and tracking procedures using imaging techniques such as X-rays, film, video or electromagnetometry. As far as the articulatory description of languages is concerned, the most useful data has been produced by radiofilms (Wiolland, 1971; Maeda, 1990). The interesting aspect of the radiofilms is that the jaw and several other articulators, such as the tip of the tongue, the dorsum and the lips, can be examined at the same time. The research shows that there is significant variability among speakers, and suggests a major role for prosodic structure, which was later confirmed by MRI studies (Lee et al., 2006). As far as coordination between articulators is concerned, the main finding is a strong correlation between tongue height and the angle of jaw opening (Sanguinetti et al., 1998), which has allowed articulatory synthesis to be simplified in that only one command for aperture is required. A very strong correlation between lip and jaw movements was also found. For a precise gesture, the lips mobilize the coordinated activity of about 10 muscles. These function interdependently on the principle of motor equivalence (see Figure 4.4). A similar interdependence applies to the lower jaw: during mouth closure, at least six muscles work together, although it is not possible to establish an unambiguous causal relationship between the activity of any given muscle and a particular movement.
Blair and Smith (1986) and Lucero and Löfqvist (2005) have shown that the same vowel can be produced by different combinations of muscular activity, thus demonstrating the great plasticity of commands and the play of motor equivalents set in motion during speech production. Folkins and Zimmerman (1982), in an EMG study involving a fixed mandible, have firmly established the peripheral nature of lip-jaw coordination. Studies of speech impairment have similarly shown that the activity of the lips and the jaw are closely coordinated. It is therefore legitimate to refer to the lips and the jaw as a definite labio-mandibular system1.
1 For the decoupling of the lips and jaw and their respective roles, see Westbury et al. (2002).
[Figure 4.4 labels. Task: labial closure. Articulators: upper lip (depressing the upper lip), lower lip (raising the lower lip), mandible. Muscles: depressor anguli oris, depressor labii inferioris, orbicularis oris inferior, levator anguli oris, mentalis, geniohyoid, mylohyoid, digastric, external pterygoid, masseter, temporalis, internal pterygoid.]
Figure 4.4. Hierarchical control of the raising of the lower lip by a coordinative structure
Stone and Vatikiotis-Bateson (1995) aim to treat coarticulation as an effect of speech perturbation and show that a front-back impaired dimension can be compensated by up-down vertical displacement. They conclude from their ultrasound and EMG studies that it is necessary to retain only two degrees of freedom: aperture (as a joint effort by tongue, lips and jaw) and the shape of the vocal tract. Finally, one study concerns the mechanical link between the jaw and the larynx provided by the hyoid bone. Hoole and Kroos (1998) observe an inverse correlation between larynx height and labial protrusion, whereas Lim et al. (2006) find an inverse correlation between the fundamental frequency and the degree of jaw opening. It should also be noted that lip protrusion results in a lengthening of the vocal tract which could be compensated by a lowering of the larynx if forward movement of the lips is not possible (Saltzmann et al., 1998). A great number of studies have been conducted within the framework of the model of task dynamics (Hawkins, 1992) and the hypothesis that the behavior of
articulators in speech production is not different from what is observed in the control of precise movements such as locomotion and prehension, and that this can be tested and partially validated. The temporal organization of articulatory events is considered to be hard-wired in “gestures” which are themselves defined by dynamic equations. These studies therefore aim to examine the timing and phasing of jaw activity in relation to vowel and consonant targets, and in relation to the activity of various articulators such as the tongue, the lips and the larynx. The paradigm of articulation speed has allowed testing of the robustness of articulatory strategies and the role of interactions between several articulators (De Nil and Abbs, 1991; Harrington et al., 1995; Vaxelaire and Sock, 1996). Finally, several studies have examined in detail the activity of the jaw and the lips in the light of their modelling. Beautemps et al. (2001) show that a reduced number of degrees of freedom enable variation in articulatory data to be taken into account: two are sufficient for the jaw, three for the lips, four for the tongue and one for the larynx. This approach is interesting because it can integrate the synergy between articulators in one model and thus reduce the number of dimensions necessary to describe it. In addition, the validity of these parameters can be tested when used as an input to an articulatory synthesizer.
Chapter 5
Elements of Articulatory Typology
In the preceding chapters we have expounded the anatomical and physiological bases for speech production. We have described how the respiratory, laryngeal and digestive systems can be recruited to emit sounds. In this chapter we present a synthetic view of how languages exploit all these resources for linguistic ends. We indicate how a description of the sounds of the world’s languages can be organized on the basis of speech production. These elements of phonetic description based on articulatory considerations can also serve to account for the variability of speech at various levels, whether idiosyncratic, interindividual, expressive or pathological. These elements will constitute the premises of a phonetic theory of performance. We will consider in turn aerodynamic mechanisms, phonatory modes and articulation.

5.1. Aerodynamic mechanisms

5.1.1. Pulmonary initiation

5.1.1.1. Egressive airflow

The airflow used in speech production is initiated by an organ or an articulator. The lungs constitute the main initiator. During exhalation, air flows out of the body and the flow is controlled by the respiratory muscles: this is egressive airflow. The vast majority of phonemes are produced by the modulation of egressive pulmonary air (see Table 5.1).

[Table 5.1 reproduces the pulmonic consonant chart of the International Phonetic Alphabet; the symbols in the cells could not be recovered from the source layout.]
Places of articulation (columns): Bilabial; Labiodental; Dental; Alveolar; Postalveolar; Retroflex; Palatal; Velar; Uvular; Pharyngeal; Glottal.
Manners of articulation (rows): Plosive; Nasal; Trill; Tap or Flap; Fricative; Lateral fricative; Approximant; Lateral approximant.
Where symbols appear in pairs, the one to the right represents a voiced consonant. Shaded areas denote articulations judged impossible.
Table 5.1. Consonants produced with pulmonic initiation
5.1.1.2. Ingressive airflow

Sounds produced in the inhalatory phase, as in the “oh” of surprise, are relatively rare. The current of air goes from outside the oral cavity inwards: it is thus called ingressive. Languages do not make linguistic use of this ingressive mode of the pulmonary initiator; it is a paralinguistic manifestation in sound of states of surprise or lively emotion.

5.1.2. The larynx

5.1.2.1. Ejectives

The larynx is mobile and can be raised or lowered like a piston. When there is oral occlusion, raising the larynx has the effect of compressing the trapped air as if behind a dam; the resulting increase in intra-oral pressure increases the explosive sound when the occlusion is released. This ejection mechanism may be heard in palatal stop consonants. It is fairly frequent in unvoiced velar and uvular stop consonants. It is present in approximately 18% of the world’s languages (Ladefoged and Maddieson, 1996). It can also be heard in fricatives and clicks.

5.1.2.2. Implosives

Lowering the larynx during the held phase of a stop consonant increases the size of the bucco-pharyngeal cavity and lowers the intra-oral pressure: in extreme and rapid lowering, it can even create negative pressure and produce an air intake. Sindhi has a series of homorganic stop consonants including voiced, voiceless, aspirated voiceless, breathy voiceless and implosive. Uduk, a Saharan language, has contrasts between implosive and ejective stops that may be voiced, voiceless, aspirated voiceless, bilabial and alveolar.

5.1.3. The supralaryngeal articulators

The occurrence of a double occlusion in the vocal tract allows the creation of a closed cavity. If the size of this cavity is altered, two things can happen:
– expansion involves reduction of pressure inside the cavity. On release, air will be drawn in and will rush into the cavity, producing an ingressive airflow;
– conversely, if the size of the cavity is reduced, an increase in pressure will result, and on release this will produce egressive airflow.
This is known as the velaric airstream mechanism (see Figure 5.1).
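The relationship between cavity size and pressure described above can be illustrated with a minimal sketch based on Boyle's law. It assumes a sealed cavity at constant temperature; the function names and the cavity volumes are hypothetical values chosen for illustration and are not measurements from the text.

```python
ATMOSPHERIC_KPA = 101.3  # approximate sea-level pressure, used as the outside reference


def pressure_after_volume_change(p_initial_kpa, v_initial_ml, v_final_ml):
    """Boyle's law for a sealed cavity at constant temperature: p1 * v1 = p2 * v2."""
    if v_initial_ml <= 0 or v_final_ml <= 0:
        raise ValueError("volumes must be positive")
    return p_initial_kpa * v_initial_ml / v_final_ml


def airflow_direction_on_release(p_cavity_kpa, p_outside_kpa=ATMOSPHERIC_KPA):
    """Air moves from high to low pressure once the occlusion is released."""
    if p_cavity_kpa < p_outside_kpa:
        return "ingressive"
    if p_cavity_kpa > p_outside_kpa:
        return "egressive"
    return "none"


# Hypothetical cavity of 2 ml sealed at atmospheric pressure.
expanded = pressure_after_volume_change(ATMOSPHERIC_KPA, 2.0, 3.0)    # cavity enlarged
compressed = pressure_after_volume_change(ATMOSPHERIC_KPA, 2.0, 1.5)  # cavity reduced

print(airflow_direction_on_release(expanded))
print(airflow_direction_on_release(compressed))
```

Under these assumptions, enlarging the cavity lowers the pressure below the outside reference and yields ingressive flow on release, while reducing it yields egressive flow.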
In ingressive airflow, as in egressive airflow, there is a slight sound at the moment of release: the spectral characteristics of this sound are determined by the articulatory place of release. The sounds produced by this mechanism are known as clicks (see Table 5.2). Posterior occlusion is achieved by the tongue against the hard palate or the soft palate. According to where the anterior occlusion is made and released, different clicks are heard: bilabial (the sound of a kiss); dental (the sound used to encourage horses); lateral alveolar; and unvoiced, voiced and nasalized post-alveolar clicks. These sounds have phonological status in several African languages such as Zulu and Xhosa.
Figure 5.1. Velaric airstream mechanism: production of a dento-palatal click (from Ladefoged, 1975)
Consonants (non-pulmonic)
Clicks: ʘ bilabial; ǀ dental; ǃ (post)alveolar; ǂ palatoalveolar; ǁ alveolar lateral.
Voiced implosives: ɓ bilabial; ɗ dental/alveolar; ʄ palatal; ɠ velar; ʛ uvular.
Ejectives: ’ (examples: p’ bilabial; t’ dental/alveolar; k’ velar; s’).
Table 5.2. Implosive, ejective consonants and clicks
Such sounds are equally familiar to us despite having no phonological status in French or English. In fact, we produce them fluently in the course of uttering two stop consonants such as /tk/ or /kt/ in utterances such as “fat cat” or “tactics”, in which a double occlusion will occur during part of the held phase of the two stop consonants (see Figure 5.2).
Figure 5.2. Coproduction of [t] and [k] in [katkar] resulting in the production of an ingressive alveo-palatal click (from Marchal, 1987)
5.2. Phonatory modes

In Chapter 2, concerning the larynx, we saw that up to 10 phonatory modes can be discerned according to laryngeal adjustment: the position of the arytenoids, the degree of opening in the intercartilaginous and interligamentary glottis, vocal fold tension and subglottal pressure. While remembering that no language makes systematic use of more than three of these modes, we recall below the characteristics of the main seven and give examples of the linguistic and paralinguistic uses to which they are put in the languages of the world.
The principal modes are:
– modal voice;
– voicelessness;
– breathy voice;
– murmur;
– laryngealization;
– whisper;
– glottal occlusion.

5.2.1. Voicing or modal voice

Modal voice constitutes the most natural mode of vocal fold vibration. The arytenoids are drawn together and the vocal folds are closed along their whole length, with a medium degree of tension. Subglottal pressure and Bernoulli’s effect produce a rhythmical opening and closing of the vocal folds and the vibrations continue as long as the transglottal pressure difference remains adequate. Voicing is a characteristic of vowels. It is also a feature of the series of voiced consonants which contrast with the series of voiceless homorganic consonants that are present in most of the world’s languages.

5.2.2. Voicelessness

The vocal folds are spread, and the glottis is wide open, although slightly less so than for normal exhalation. Air escapes freely between them and the airflow is laminar: this is the mode used for the production of voiceless consonants. According to Maddieson (1984), when only one series of stop consonants is present in a language, they will be voiceless.

5.2.3. Breathy mode

The breathy mode is characterized by separated arytenoids, slightly less than for voicelessness. The airflow is greater than for voicelessness. Subglottal pressure may also be stronger owing to increased exhalatory effort: this is no doubt the foundation for the practice adopted by certain phonologists of citing a single feature (+increased subglottal pressure) to describe phonemes realized in this phonatory mode. Owing to
the configuration of the larynx, airflow becomes turbulent. Although it is an egressive airstream, it is sometimes unfortunately described as “aspirated”. Voicing and “aspiration” cannot co-occur, contrary to the impression given by certain transcriptions. The three possible contrasts comprise: (voiced), (unaspirated voiceless) and (aspirated voiceless). “Aspiration” can be involved in only one part of the held phase of a consonant, and in fact can only occur at the moment of release in stop consonants. Several authors, following Lisker and Abramson (1964), prefer to distinguish between aspirated and non-aspirated homorganic consonants on the basis of the delay in the onset of voicing after release of the held phase: this is known as voice onset time (VOT). This temporal approach allows them to consider, as part of the same set, specific consonants as lenis, fortis, voiced, unvoiced, aspirated and nonaspirated. On this basis of temporal organization related to voicing, five gradations in the use of the VOT parameter can be distinguished in languages:
– continuous voicing: VOT is non-existent;
– partial voicing: VOT is expressed as a negative value corresponding to the duration of the voiced interval;
– voicing occurring at the moment of release: VOT is zero;
– voicing occurring after a period of brief aspiration: VOT is expressed as a positive duration value;
– voicing occurring after a period of long aspiration: VOT has a long positive value.

In a phonetic sequence, the relative timing of articulatory gestures at laryngeal level with supraglottal gestures may, as articulatory phonology would claim, be the basis of phonological distinctions. Aspiration may also occur before the beginning of a voiceless stop when it is preceded by a nasal vowel or a liquid. Pre-aspiration can be distinctive in intervocalic position in certain languages such as Icelandic or Gaelic as well as Native American languages, e.g. Ojibwa.
In other languages, aspiration may only be an allophonic variation in certain positions. This is what occurs in English when a syllable-initial voiceless stop is characterized by a phase of release described as “aspirated”, e.g. the /p/ of /pan/ sounding as [pʰan].
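The five gradations of VOT listed earlier in this section can be expressed as a small classification sketch. The function name, the use of None to represent continuous voicing, and the 35 ms boundary between brief and long aspiration are assumptions made for illustration; the text gives no numerical boundary, and real boundaries vary across languages and places of articulation.

```python
def classify_vot(vot_ms, long_lag_threshold_ms=35.0):
    """Map a voice onset time (ms, relative to release) to one of five gradations.

    vot_ms is None when voicing never stops, so that no onset can be measured.
    long_lag_threshold_ms is a hypothetical boundary, not a value from the text.
    """
    if vot_ms is None:
        return "continuous voicing"
    if vot_ms < 0:
        return "partial voicing"      # voicing starts before the release
    if vot_ms == 0:
        return "voicing at release"
    if vot_ms < long_lag_threshold_ms:
        return "brief aspiration"
    return "long aspiration"


# Hypothetical measurements in milliseconds.
for value in (None, -80.0, 0.0, 15.0, 70.0):
    print(value, "->", classify_vot(value))
```

In practice a measured VOT is rarely exactly zero, so an analysis of real data would probably replace the equality test with a small tolerance window.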
Aspiration can also be employed as a stylistic device: exasperation or contempt can be expressed by aspirating initial stops, as in words like “pillock” or “coward” addressed to the recipient of verbal abuse.

5.2.4. The murmur

The arytenoids are well parted. Airflow is high. Vocal fold tension is weak and closure is incomplete. When voiced, the sound is accompanied by a breathy noise. Ladefoged (1971) suggests that most Indo-Aryan languages have a series of stop consonants with a murmured release in addition to those series which are distinguished by voicing, voicelessness and aspiration. Murmured consonants are also found in Bantu languages, as well as in numerous Indian languages. For example, in Gujarati [bar] means “twelve”, whereas [b̤ar] denotes “burden”. Similarly, voiced nasals can be distinguished from murmured nasals in Shona. The murmur is a fairly inefficient phonatory mode as far as making best use of the larynx is concerned, because a significant proportion of air escapes without being modulated. The voice thus produced is relatively weak.

5.2.5. Laryngealization or “creaky” mode

In this phonatory mode, the arytenoids are pressed so closely together that the back parts of the vocal folds are crushed together and only the ligamentary part can vibrate. The sound is raucous and features a low fundamental frequency. Laryngealization is an alternative term, as is vocal fry or glottal fry. In both French and English, laryngealization can have a stylistic role as an expression of boredom. Laryngealized voice can, however, also be a symptom of phonatory disorder. In this respect, it is interesting to note that what appears as linguistically significant in one language may be considered pathological in other linguistic communities; it also raises the difficult question of the boundary between what can be considered normal and what can be considered atypical.
In particular, it poses the delicate problem of how to define objectively the distinction between paranormality and pathology, using physical parameters. Hausa, Bura and Margi include semivowels and voiced and laryngealized stops in their phonological inventory. For example, in Margi /bibi/, “placed” differs from /bjbj/, “hard”. In Lango, a series of voiced vowels contrasts with a series of laryngealized vowels (Ladefoged, 1964): /lee/ means “animal”, for example, while /lḛḛ/ means “ax”.
5.2.6. Whisper mode
The vocal folds are closed anteriorly; the arytenoids, however, are separated and the intercartilaginous glottis is open. Transglottal airflow is weak. Whispering can only have a phonological function when it contrasts with voicelessness; this appears to be what happens for stops and fricatives in final position in Wolof.
5.2.7. Glottal closure
The vocal folds are completely adducted along their whole length, producing a complete glottal closure. In Tagalog, this state of the vocal folds serves a phonological function, since /kaʔo:n/ “straw” is meaningfully contrasted with /kaho:n/, “box”. In French, as well as in English, the glottal stop only achieves a stylistic effect. It can be found in front of initial vowels, for example, in commands such as “On your marks!” or “As you were!”.
5.3. Articulation
In the previous section, we have seen how the airflow generated by the respiratory system can be transformed at the level of the larynx by various adjustments of the vocal folds. It is at the supralaryngeal level that the articulatory organs give speech sounds their characteristic features and definitive properties. All the organs that have a role to play in this last stage of speech production have a primary biological function: ensuring the mastication, transport and absorption of food. Their use in speech is a secondary function. The one property common to all these organs is their great mobility, and thus the opportunity they offer of varying the configuration of the vocal tract: modifying the size of the oral cavity, of the pharyngeal cavity, connecting with the nasal cavities, as well as coupling and resonating with the lips. The term “articulation” denotes all the movements of the articulatory organs which have a role in forming speech sounds.
5.3.1. The dimensions of the articulatory description of speech sounds
The sounds of speech can be described under seven principal dimensions:
– an operating dimension to describe the way in which articulators control the passage of phonatory air in the vocal tract, i.e. articulatory mode;
– a vertical dimension to characterize the degree of opening or closure of the vocal tract, i.e. aperture;
– a horizontal dimension to describe the region where the main constriction occurs from the phonetic point of view, i.e. the place of articulation;
– a transverse dimension to indicate the shape of the channel in which the air runs and the way in which the air bypasses an obstacle in the vocal tract, either in the center or at the sides;
– a temporal dimension;
– a kinetic dimension to characterize articulatory dynamics and their timing with respect to the various phases of articulatory gestures;
– a dimension to describe the energy co-opted by the production system, i.e. the strength of articulation.
5.3.1.1. Articulatory mode
The articulatory mode takes into account the resistance to the flow of phonatory air. Three cases can be distinguished: free passage of air; partially free passage; and totally blocked passage.
5.3.1.1.1. Free passage of air
When nothing hinders the passage of air, the only function of the vocal tract configuration is to modulate the phonatory air while favoring the resonance of certain harmonic frequencies. The bucco-pharyngeal cavity “filters” the laryngeal source and gives rise to resonant sounds, in particular those of vowels.
5.3.1.1.2. Presence of a partial obstacle
If there is a partial obstacle, air cannot escape freely. The obstacle narrows the air passage, i.e. causes a constriction. Consonants produced with this vocal tract configuration are called constrictives, e.g. /f, s, ʃ, v, z, ʒ/.
5.3.1.1.3. Presence of a complete block
Where the flow of air is completely blocked by the presence of a barrier in the vocal tract, the sound is cut off. Consonants featuring such a complete occlusion are termed “occlusives” or “stops”. This class of consonants is found in all languages, e.g. /p, t, k, b, d, g/.
5.3.1.2. Aperture
The aperture (from the Latin apertum, “open”) feature designates the minimal width of the air channel at the place of articulation.
This meaning is slightly different from the way in which the term is used to refer to the openness of vowels
or consonants. For vowels, “aperture” usually describes the distance which separates the highest point of the tongue dorsum from the hard palate. For consonants, aperture refers to the cross-section of the vocal tract: this feature therefore includes the vertical as well as transverse dimensions of the restriction of the air channel at every point of the vocal tract. In order of increasing size of aperture, there are occlusives (with no aperture), fricatives (constrictives featuring the noise of friction), voiced approximants (constrictives with no audible friction noise) and finally vowels, which can be high, mid-high, mid-low and low.
5.3.1.3. Place of articulation
The articulatory system comprises the lower jaw; the lips; the teeth; the tip, the dorsum and the root of the tongue; the velum; the uvula; the pharynx; and the larynx. Traditionally, there is a distinction between the active articulators, which modify the vocal tract configuration, and the passive articulators, which are the target of the articulatory gestures. The movements of the active articulators vary the shape and the size of the air channel and create the aerodynamic and acoustic conditions necessary to produce phonemes. The idea of the place of articulation refers to the passive articulators or targets. The notion of “place” should not be taken too literally: it corresponds rather to an area or a target at which the articulator is aimed. Articulations are described by reference to the six main mobile parts of the vocal tract (the active articulators) and their targets:
1) the lips involved in the production of labial gestures;
2) the tip of the tongue for laminal and coronal1 articulation;
3) the dorsum of the tongue for dorsal articulation;
4) the root of the tongue and the epiglottis associated with root articulations;
5) the pharynx for pharyngeal articulation;
6) the larynx which can be mobilized for glottal articulations.
The passive articulators constitute the targets of the articulatory gestures. Twelve targets or places of articulation can be usefully distinguished (see Figure 5.3) on the basis of anatomical, morphological and functional criteria:
1 The term “coronal” refers to articulation realized with the edges or the crown of the tongue, whereas “laminal” designates an articulation realized by the front part of the tongue blade.
1) the lips or labial zone; 2) the teeth or dental zone; 3) the alveolar ridge or alveolar zone; 4) the post-alveolar zone; 5) the pre-palatal zone; 6) the hard palate or palatal zone; 7) the post-palatal zone; 8) the velum or velar zone; 9) the uvula or uvular zone; 10) the pharynx or pharyngeal zone; 11) the epiglottis or epiglottal zone; 12) the glottis, glottal zone or laryngeal zone.
Figure 5.3. Schematic view of the 12 major articulatory locations
5.3.1.4. The transverse dimension
The neutral shape of the tongue is the regular convex curve of the body of the tongue. The surface can be altered to create a local depression in the tongue blade.
This results in the formation of a narrow furrow. Such a constriction in the channel of the airway accelerates the flow of phonatory air and produces turbulent jets. This shape of the surface of the front part of the tongue is typical of the production of /s/. A concave curve in the body of the tongue can also be seen during /ʃ/. Air can pass through the center of such a restricted passage as in /s/ or /ʃ/. It can also, where there is a median support for the tongue, escape over the sides and form a lateral articulation, as is the case for /l/. Today, fMRI shows us that the activity of the tongue muscles plays an important role in shaping the characteristic lingual furrow of certain constrictives. The shape of this depression at the center of the tongue allows us to distinguish more finely between different articulations that seem very similar when seen in profile. We should also note that the linguopalatal contacts are seldom symmetric (Marchal and Espesser, 1989).
5.3.1.5. The temporal dimension of articulation
The realization of phonemes unfolds in time. Essentially, three steps are identifiable: an initial phase, a central phase and a phase of release or transition towards the succeeding segments. This means that to talk of an articulation as a “state” of the articulators at any given moment is a rather simplified version of reality. It would be more appropriate to talk about the articulatory target, a notion which takes the dynamic character of articulation better into account. The duration of articulatory movement will depend on the distance that has to be travelled by the groups of articulators involved from the start of articulation, and will vary in proportion to the effective speed of those organs. The latter will vary according to the degree of innervation in the muscle, depending on the type as well as the number of the muscle fibers recruited for the movement (see Table 5.3).
Finally, the coordination of the articulatory gestures of the different systems and subsystems responsible for the production of speech sounds deserves particular attention, because the durational differences between the various movements which result, and their effect on the flow of phonatory air and the acoustic signal may explain a great number of phonological pseudo-phenomena, as we found when discussing VOT.
Table 5.3. Relative duration of specific speech gestural tasks
The tip of the tongue is very mobile and can move very rapidly; the velum, on the other hand, is slower. In the same way, the lips and the jaw, although they belong to the system that is involved in labial closure, have specifically different speeds of movement. At all events, the speed of articulation is largely determined by prosodic organization. Given this, the asynchronicity of articulatory gestures constitutes an incontrovertible basic fact. Duration can also be exploited to distinguish classes of sounds; for example, some sounds are lengthenable while others cannot be. Vowels can be prolonged freely; their duration is limited mainly by the reserve of phonatory air. Voiced phonemes can also be prolonged for as long as the difference between subglottal pressure and intra-oral pressure can be maintained to keep the vocal folds vibrating. This explains why voiceless phonemes can last longer than voiced phonemes. A voiceless stop can be prolonged for a very long time; this cannot be the case for voiced stops.
For constrictives, the consonant can be prolonged for as long as the characteristic noise of friction normally associated with it can still be perceived. Nasal vowels can reputedly last longer than oral vowels, and amongst the latter there is an order of duration which is linked to the degree of aperture. These durational differences are intrinsic to the vowel. On the other hand, there exist a number of phonemes for which duration constitutes a distinctive feature: for example, semi-consonants cannot be prolonged without losing their identity. Consonants that are trilled or rolled, like /r/ or /ʀ/, are momentary occlusions. Flaps are distinguished by a single movement which has no stopping point; as for clicks, their characteristic sound comes solely from the release of an occlusion during their realization.
5.3.1.6. Strength of articulation
It is quite difficult to achieve an objective measure of articulatory strength2, which is sometimes confused with the notion of articulatory tension. It cannot be related with any certainty to the activity of any one muscle or even to a well-defined set of given muscles. It is a rather vague notion which concerns the greater or lesser physiological activity involved in the production of a sound. The articulatory strength of a phoneme has phonetic consequences for all neighboring phonemes. It is through the measured effect of a phoneme on its neighbors that it has been possible to establish a ranking of the articulatory strength of phonemes.
2 In a series of articles on this subject, Malécot (1955, 1966, and 1970) tries by different techniques to evaluate articulatory strength. There has also been an attempt (McGlone and Proffit, 1972) to measure the force exerted by the tongue on the palate and the dental arcade, using pressure-sensitive electrodes placed on an artificial palate.
Chapter 6
The Articulatory Description of Vowels and Consonants
The languages of the world make specific use of the physiological and anatomical possibilities of the articulatory organs, and exploit only part of the gestures and articulatory positions that humans can produce. Without aiming to be exhaustive, which would be beyond the scope of this book, we will describe some articulations which are widely used for linguistic purposes, going from the lips to the glottis. For a more complete view, see Ladefoged and Maddieson (1996) or Laver (2002). Following established convention, articulations are designated by combining the name of the active articulator with the name of the place of articulation (see Table 6.1).
Table 6.1. Places of articulation, articulators and articulatory classification of speech sounds
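The naming convention described above (active articulator plus place of articulation) can be sketched in a few lines of Python. This is only an illustration: the lookup tables are hypothetical and are built from labels that appear in the running text of this chapter (apicodental, laminopostalveolar, dorsopalatal and so on), not from Table 6.1 itself, and the function name `articulation_label` is an assumption.

```python
# Hypothetical lookup tables. The combining forms and zone names are taken
# from labels used in the running text of this chapter, not from Table 6.1.
ACTIVE_PREFIX = {
    "apex": "apico",
    "blade": "lamino",
    "underside of blade": "sublamino",
    "dorsum": "dorso",
}

PLACES = {"labial", "dental", "alveolar", "postalveolar", "prepalatal", "palatal"}


def articulation_label(active: str, place: str) -> str:
    """Combine an active articulator with a place of articulation."""
    if active not in ACTIVE_PREFIX:
        raise ValueError(f"unknown active articulator: {active}")
    if place not in PLACES:
        raise ValueError(f"unknown place of articulation: {place}")
    # The label is the combining form of the articulator followed by the zone.
    return ACTIVE_PREFIX[active] + place


if __name__ == "__main__":
    print(articulation_label("apex", "dental"))
    print(articulation_label("dorsum", "palatal"))
```

The sketch covers only the lingual articulators; labial labels such as “exolabial” follow a different pattern and would need their own table.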
6.1. Vowels
6.1.1. Mode
There is no absolute criterion for distinguishing between vowels and consonants on an articulatory basis. Despite numerous efforts to do so (Straka, 1965), it is very difficult to distinguish, for example, /i/ from /j/ on this single criterion, and to indicate the articulatory threshold between vowel and semi-consonant exactly. The essential and obvious difference for the speakers of a given language is provided by the function of this class of phonemes in that language. The vowel constitutes the pivot of the syllable. One vowel can constitute a syllable all on its own, but this is not true for consonants: the latter must necessarily be associated with a vowel that “sounds with” it. The vowel is produced with an open air channel, with no major constriction and without generating any friction noise. It must be a lengthenable sound, with primary articulation that is central, oral, dorso-palatal or linguo-pharyngeal.
6.1.2. Articulatory region/zone
The classic phonetic approach is to class vowels according to the position of the tongue in the oral cavity, taking the position and shape of the lips into account in describing them. The articulatory region for vowels, as defined in a horizontal dimension, refers to the place of the top of the dome of the tongue in relation to the palatal vault. This can be either more or less towards the front or the back of the oral cavity: it is thus usual to distinguish between “front” vowels (e.g. /i/, /a/), “central” vowels (/œ/, /ø/) and “back” vowels (/u/, /ɔ/). As for the lips, they can be spread, rounded or protruded, in various degrees. The parameter of labiality is both the easiest to see and the easiest to control. It is thus perfectly possible to pass easily from /i/ to /y/ just by protruding the lips. It should also be noted that there is a certain correlation between the height of the tongue and labiality. High vowels are reputedly less susceptible to rounding than low vowels. The rounding of /u/ is less marked than the rounding of /ɔ/. Moreover, for back vowels, the corners of the mouth are drawn towards the inside of the mouth as if for a pout (inner rounding), whereas for front vowels, rounding traditionally takes the form of a vertical compression (outer rounding).
6.1.3. Vocalic aperture
The “vocalic aperture” is the term designating the vertical distance which separates the dome of the tongue and the palate. It refers to the degree of convex curve in the surface of the tongue and its relative height in the oral cavity. On this basis, it is possible to distinguish high vowels such as /i/ and /u/, mid-high vowels such as /e/ and /o/, mid-low vowels such as /ɛ/ and /ɔ/ and low vowels such as /a/ and /ɑ/. This, then, is a classification based on a vertical dimension and not on the area at the place of articulation, as for consonants.
The degree of aperture in vowels is governed both by changes in tongue height and by the raising and lowering of the jaw. Synergistic effects arise: compensation between the two articulators in the precise control of tongue height in a given phonetic context, and internal and external constraints governing the two articulators.
6.1.4. The vowel space: cardinal vowels
Jones (1962), following Bell (1879), had the idea of conceiving a theoretical vowel space with limits corresponding to articulatory positions beyond which no vowel sound could be produced. This space bounded by the so-called “cardinal” vowels universally functions as a theoretical framework within which it is possible to specify the place and properties of the vowels of specific languages as uttered by particular speakers at a given moment. From the highest point at the front to the lowest point at the back, the space is delimited by the vowels /i/ and /ɑ/. Starting with /i/, if the aperture is increased at equal distances, the vowels /e/, /ɛ/ and /a/ will be found to occur in the front part of the mouth. This preliminary catalog of vowels at the boundary of the vowel space thus comprises five vowels, to which can be added, by raising the tongue from the position of /ɑ/, the equidistant vowels /ɔ/, /o/ and /u/. The eight vowels thus identified at the limits of the vowel space constitute the primary cardinal vowels1 (see Figure 6.1).
Figure 6.1. The primary cardinal vowels
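The construction of the primary cardinal vowels described above can be laid out as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name `CardinalVowel`, the four aperture labels and the numbering from 1 to 8 (which follows common practice for Jones's system rather than anything stated in this section) are choices made here, not part of the text.

```python
from dataclasses import dataclass

# Aperture labels in order of increasing opening (assumed labels).
APERTURES = ["high", "mid-high", "mid-low", "low"]


@dataclass(frozen=True)
class CardinalVowel:
    number: int    # conventional numbering, an assumption of this sketch
    symbol: str    # IPA symbol
    aperture: str  # one of APERTURES
    region: str    # "front" or "back"


def primary_cardinal_vowels() -> list:
    """Build the eight primary cardinal vowels in the order given in the text."""
    front = ["i", "e", "ɛ", "a"]  # aperture increasing in equal steps from /i/
    back = ["ɑ", "ɔ", "o", "u"]   # tongue raised in equal steps from /ɑ/
    vowels = []
    for n, symbol in enumerate(front, start=1):
        vowels.append(CardinalVowel(n, symbol, APERTURES[n - 1], "front"))
    for k, symbol in enumerate(back):
        # Aperture decreases again as the tongue is raised at the back.
        vowels.append(CardinalVowel(5 + k, symbol, APERTURES[3 - k], "back"))
    return vowels


if __name__ == "__main__":
    for vowel in primary_cardinal_vowels():
        print(vowel.number, vowel.symbol, vowel.aperture, vowel.region)
```

Treating aperture as four discrete steps is a simplification; the cardinal positions are defined as auditorily and articulatorily equidistant reference points, not as measured values.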
1 For an alternative representation of vowels deduced from the description of consonants, see Catford (1977).
An additional set of cardinal vowels can be obtained by inverting the position of the lips and adding a central series. This complementary set comprises the secondary cardinal vowels. The complete set of cardinal vowels is shown in Figure 6.2. The cardinal vowel system is a tool for describing vowels on an articulatory and auditory basis.
Figure 6.2. Schematic representation of the vowels in the vocalic space. Where symbols appear in pairs, the one to the right represents a rounded vowel
6.1.5. The temporal dimension
Vowels may differ in their intrinsic duration through the conditions of their production and because of various aerodynamic constraints. All other things being equal, closed vowels are shorter than open vowels. Equally, the duration of vowels is affected by their phonetic environment, their position in the syllable, the syllable-type and the place of the syllable in the structure of the prosodic phrase.
Vowels may be freely prolonged up to the limit of their duration, which essentially depends on the reserve of breath and the ability to maintain enough transglottal pressure to sustain vibration of the vocal folds. Some languages make phonological use of the duration parameter for vowels. The distinction between short and long vowels is a contrast of quantity. The length relationship between these two types varies, according to the language, in the region of 30-60%. Such a system operates in Czech and Serbo-Croat: /i i:/, /e e:/, /o o:/, /a a:/, /u u:/. It should also be noted that the quantity contrast is often accompanied by a difference in vowel timbre. Thus, when Lehiste (1970) cites a ternary opposition of length for vowels in Estonian, it raises the question of the perceptual role of vowel timbre for distinguishing vowels as short, normal and long.
6.1.6. Dynamic aspects
6.1.6.1. Changes in vowel configuration: diphthongs
Diphthongs are long vowels which change their timbre during emission. They constitute a single segment produced with two successive targets. It is therefore possible to characterize diphthongs according to: a) the direction of their articulatory glide; b) the degree of aperture from the first to the second target; and c) the perceptual importance of the two timbres:
– Closing diphthongs: aperture diminishes: /ae, ai, œy, ɑu/.
– Opening diphthongs: aperture increases: /iɛ, yœ, uɔ/.
– Rising diphthongs: the second element of the diphthong is the more important.
– Falling diphthongs: the first element is the more important.
Diphthongs are present in the phonological inventory of 30% of languages. Among these, /ai/ and /ɑu/ are the most frequent and occur in 65-75% of cases. French lost its diphthongs in the 12th and 13th centuries. They are now found only in certain regional varieties of French, as allophones of long vowels.
In Quebec French, there is often a diphthongization of long vowels or of vowels lengthened by a following /r, z, v, ʒ/ in closed syllables, e.g. /ai, œy, ɑu/ as in [pair], [bœyr], [pɑut] (père, beurre, pâte).
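The closing/opening distinction rests only on how aperture changes from the first target to the second, so it can be sketched as a comparison of two aperture ranks. The Python below is a minimal illustration, assuming the four aperture classes of section 6.1.3; the names `APERTURE_RANK`, `VOWEL_APERTURE` and `classify_diphthong`, and the fallback label "level" for two targets of equal aperture, are hypothetical and not taken from the text.

```python
# Aperture classes ranked from most closed to most open (section 6.1.3).
APERTURE_RANK = {"high": 0, "mid-high": 1, "mid-low": 2, "low": 3}

# Aperture of the example vowels listed in section 6.1.3.
VOWEL_APERTURE = {
    "i": "high", "u": "high",
    "e": "mid-high", "o": "mid-high",
    "ɛ": "mid-low", "ɔ": "mid-low",
    "a": "low", "ɑ": "low",
}


def classify_diphthong(first: str, second: str) -> str:
    """Return 'closing', 'opening' or 'level' for a two-target vowel."""
    start = APERTURE_RANK[VOWEL_APERTURE[first]]
    end = APERTURE_RANK[VOWEL_APERTURE[second]]
    if end < start:
        return "closing"  # aperture diminishes towards the second target
    if end > start:
        return "opening"  # aperture increases towards the second target
    return "level"        # assumed label: no change of aperture class


if __name__ == "__main__":
    for pair in [("a", "i"), ("ɑ", "u"), ("i", "ɛ"), ("u", "ɔ")]:
        print("".join(pair), classify_diphthong(*pair))
```

The rising/falling distinction cannot be derived this way, since it depends on the perceptual prominence of each element rather than on aperture.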
Triphthongs are vowels whose configuration changes greatly and quickly during emission, and in which three distinct timbres can be discerned. This type of vowel tends to be fairly rare as a phoneme, and often appears rather as a stylistic variant of diphthongs. This is the case in RP pronunciation of words like “flower”: [flaʊə] or “fire”: [faɪə].
6.1.6.2. Variation in fundamental frequency: tones
In certain languages, such as Chinese, Vietnamese, Zulu and Yoruba, there are phonological elements that are not segments like phonemes, but nevertheless have a similar distinctive function. They exploit variation in fundamental frequency for phonemic distinctiveness. Thus, two morphemes composed of the same segmental phonemes can mean two different things according to the pitch of the voice during vowel production. In Yoruba, for example, the lexicon includes the following minimal pair: /ɔkɔ/ “husband” as opposed to /ɔkɔ̀/ “car”. In /ɔ̀/ the laryngeal height is low; less so for /ɔ/. This indication of pitch is called a “tone”. There are several types of tone. It is customary to distinguish between static and dynamic tones.
Static tones
One pitch level is contrasted with another pitch level in the same system. Thus, in Vietnamese, /mā/ pronounced with a high stable fundamental frequency means “devil”, whereas /ma/ with a low stable fundamental frequency means “in order to”. There is thus a contrast between high pitch and low pitch. Languages such as Yoruba also have a mid-range pitch.
Dynamic tones
Meanings conveyed by fundamental frequency variation also allow dynamic tones to be contrasted: these can be rising or falling. In Chinese, /lí/, “pear” is contrasted with /lì/, “chestnut”. Similarly, there are rising-falling tones [^] and falling-rising tones [ˇ].
Static tones and dynamic tones
Varying combinations of level and direction of pitch can also be exploited for phonological purposes.
Thus, in the low level, Vietnamese distinguishes between /má/ with a rising pitch (“tomb”) and /mà/ with a falling pitch (“day”). Mpi has the following tone distinctions: low rising, low, mid-high rising, mid pitch, high rising and high, using both phonatory modes.
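The difference between static and dynamic tones can be illustrated by labelling the direction of a fundamental frequency track. The Python sketch below is a rough illustration only: the function name `tone_contour`, the 5 Hz tolerance and the sample values in hertz are all invented for the example, and real tone analysis would work on measured, normalized F0 curves.

```python
def tone_contour(f0_hz, tolerance_hz=5.0):
    """Label an F0 track as 'static', 'rising', 'falling' or a compound.

    f0_hz is a sequence of fundamental frequency samples in hertz.
    Steps smaller than tolerance_hz (an arbitrary, assumed threshold)
    are treated as no change.
    """
    directions = []
    for previous, current in zip(f0_hz, f0_hz[1:]):
        delta = current - previous
        if abs(delta) <= tolerance_hz:
            continue  # too small to count as pitch movement
        direction = "rising" if delta > 0 else "falling"
        # Record a direction only when it differs from the last one seen.
        if not directions or directions[-1] != direction:
            directions.append(direction)
    if not directions:
        return "static"
    return "-".join(directions)


if __name__ == "__main__":
    print(tone_contour([120, 121, 120]))  # level pitch
    print(tone_contour([110, 130, 150]))  # pitch goes up
    print(tone_contour([110, 140, 110]))  # up, then down
```

A known limitation of this sketch is that a slow drift made of many small steps, each below the tolerance, is still labelled static; it also says nothing about pitch level (high, mid or low), which would require a reference range for the speaker.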
Our ears are not accustomed to discriminating between tonal variations that are not present in our own language. Moreover, the realization of tones is subject to certain variations: glottalization in high tones, nasalization in low tones and contextual modifications depending on the surrounding consonants. Articulatory studies can substantiate hypotheses on various questions of historical phonetics associated particularly with tonogenesis and the relationship between segmental and suprasegmental features (Hombert et al., 1979).
6.1.7. Secondary articulation
Certain secondary articulations can accompany the realization of vowels. The main types are nasalization, retroflexion, pharyngealization and tension.
6.1.7.1. Nasalization
The lowering of the velum allows phonatory air to escape both through the oral cavity and through the nose via the nasal cavities. The nasal resonance thus created is the distinguishing feature of a class of nasal vowels that contrasts with a class of oral vowels in some languages such as French, Polish or Portuguese. In French, there are four nasal vowels: /ɛ̃/, a mid-open front vowel; /œ̃/, a mid-open central vowel; /ɔ̃/, a mid-open back vowel; and /ɑ̃/, an open back vowel.
6.1.7.2. Retroflexion
Retroflexion consists of an open approximant articulation realized by raising the back of the tip of the tongue, which takes on a concave retroflex shape. It affects laminal or apico-postalveolar and sublamino-postalveolar or pre-palatal articulations.
6.1.7.3. Pharyngealization
Pharyngealization derives from a state of compression in the pharynx which occurs at the same time as the primary articulation. It involves a narrowing of the pharynx in the antero-posterior dimension which gives a “hurried” or “strangled” effect to the vowel. Pharyngealized vowels contrast with non-pharyngealized vowels of the same type in several Caucasian languages.
6.1.8. Tension
The tense-lax distinction, often employed in phonological and phonetic descriptions, is hard to define. It must not be confused, as it often is, with the
voiced/unvoiced distinction, nor with the fortis/lenis distinction, often used to account for or label seemingly similar phenomena. It depends neither on difference of intra-oral pressure nor on the greater muscular activity suggested by electromyographic data. Originally it was defined as a “broad/narrow” opposition: “narrow” vowels being more convex. Generative phonology also viewed the relative height of subglottal pressure as a correlate of tension. Recent research has relied on variations in tongue root position as indicators of tension, particularly in its degree of advancement. Advancing the root of the tongue flattens the curve of the back part of the tongue and increases the diameter of the hypo- and oro-pharynx. For the same position of the body of the tongue, some languages distinguish two classes of vowels on the basis of the distance that separates the root of the tongue from the pharyngeal wall. This distinction is described in the phonetic literature as “vowel harmony”. In pursuit of this topic, Ladefoged (1964) devoted an X-ray study to the vowels of Igbo, a West African language, and as far as he is concerned the (broad-narrow) distinction remains.
6.2. Consonants
Consonants cannot stand alone. As their name consonna suggests, they “sound with”, i.e. they precede or follow a vocalic element.
6.2.1. Articulation mode
6.2.1.1. Stops
Stops are characterized by a total blocking of air in the oral cavities. If the rhinopharyngeal channel is closed, no sound is produced: there is instead a period of silence lasting as long as the closure. When the velum is lowered, air can escape through the nasal cavities and the held phase of the stop is accompanied by a nasal resonance or murmur. This is the basis for distinguishing between voiced and voiceless oral stops such as /b, d, g, ɢ, p, t, k, q/, and nasal stops such as /m, n, ŋ, ɴ/ which in fact are continuants.
6.2.1.2. Constrictives
Unlike stops, constrictives are produced when the airway is only partially restricted. Two cases can be distinguished according to the degree of constriction. When the constriction is weak, the airflow remains essentially laminar and this configuration is favorable to resonance. The term “approximant” is often used. Not
all consonants realized with this degree of approximation of the articulators are necessarily voiced, but they are not perceptually accompanied by the sound of friction. This, for example, is true of the semi-consonants /j, w, ɥ/ and liquids such as /l, r, ɹ/. When the obstacle to the free flow of air is considerable, the airflow becomes turbulent and audible friction noise can be heard. The resulting form of frication noise depends on the degree of constriction, the form of the constriction, the amount of air exhaled, the speed of airflow, the degree of absorbency or reverberation of the obstacle and its surrounding tissue, and the shape of the cavity traversed by the turbulent air below the place of constriction. Because of the auditory impression produced by these consonants, they are sometimes described as fricative consonants; this is the case with /f, v, s, z, ʃ, ʒ/.
6.2.2. Description of consonantal articulations
We will present our description of consonantal articulations beginning with the active articulators and showing their intended targets.
6.2.2.1. The lips
The upper and lower lips are very visible anatomical structures, easily observable, and labial articulations have therefore been accorded particular attention on the part of phoneticians. Catford (1977) distinguishes the most external part of the lips (exo) from the part which, when at rest, faces the alveolar ridge and the teeth (endo), these being responsible for exolabial and endolabial articulations respectively.
Exolabial
These articulations correspond to the bringing together of the external edges of the upper and lower lips involved in the production of the fricative bilabials /ɸ, β/ and the bilabial stops /p, b, m/. [β] is the most common realization of /b/ in the intervocalic position in Spanish.
Endolabial
The upper and lower lips are rounded and projected forwards for labialized back vowels and for /w/.
Le Muire and Ohuallachain (1966) report a linguistic contrast between endoendolabial and exoexolabial stops in Erse.
Endolabiodental
The outer half of the lower lip makes contact with the lower edges of the upper teeth. This articulation can be seen in the Hindi /ʋ/. Ladefoged and Maddieson (1996) are unsure of the existence of true labiodental stops, although this has occasionally been reported for Tongua.
6.2.2.2. The teeth
Bi-dental
The lower teeth press onto the upper teeth. The tongue is flattened and the lips have no part to play. Air escapes between the teeth. Voiced and voiceless approximants can be produced in this way. A voiceless bi-dental constrictive is attested in Shapsugh, a Caucasian dialect.
6.2.2.3. The tongue
The tongue is the main articulator. Its three parts constitute largely independent articulators: these comprise the front part, including the apex and the blade; the dorsum; and the root.
6.2.2.3.1. The apex
The apex is the vertical foremost part of the tongue. It faces the teeth and extends back for about 5-10 mm of the upper surface of the tip of the tongue. The apex is extremely mobile. It is responsible for the following articulations:
– Apicolabial
The apex presses on the upper lip to realize an apicolabial stop: this has been observed by Ladefoged in a South American language, Umotina.
– Apicodental, apicoalveolar
The tip of the tongue touches the backs of the upper teeth and at the same time the alveolar ridge immediately behind the upper incisors. This is the most common realization of /t, d, n/ in English.
– Apicopostalveolar
The apex moves towards the most posterior part of the alveolar ridge. One sound that is characteristic of this articulation is the English [ɹ].
The stops /t, d/ that are realized in this area are retroflex, i.e. the tip of the tongue bends upwards; these are found in Hindi.
6.2.2.3.2. The blade
The blade of the tongue is situated on the upper surface of the front part of the tongue. It extends 15-20 mm backwards from the apex. The tongue blade is also extremely mobile. Articulations realized with the blade of the tongue are called “laminal”:
– Laminodental
This articulation is realized with the tongue blade behind the upper incisors and the tip just under the top edge of the lower incisors.
– Laminodentoalveolar
The tip of the tongue presses on the upper edge of the lower incisors and the blade points towards the alveolar ridge to produce the French /t, d, n/ and /l/, /s/, /z/. Laminodentoalveolar stops are easily affricated: this is what one hears, for example, in the French of Quebec, where /t, d/ affricate to /ts, dz/ before /i/ and /y/.
– Laminopostalveolar
The blade of the tongue touches the most posterior part of the alveolar ridge to realize the articulation of /ʃ, ʒ/.
– Sublaminopostalveolar, sublamino-prepalatal
The underside of the tongue blade points upwards in the manner of a retroflex consonant towards the postalveolar region, also known as the prepalatal area, for the realization of the stops /ʈ, ɖ/ in Tamil and several other Dravidian languages (Balasubramanian, 1972; see Figure 6.3).
Butcher and Tabain (2004) observed in Arrernte a contrast between consonants with laminodental, apicoalveolar, apicopostalveolar and laminopostalveolar articulations.
Figure 6.3. Tamil voiceless sublamino-postalveolar [ʈ] and voiced nasal sublamino-postalveolar [ɳ]. Radiograms by Balasubramanian (1972), cited by Laver (2002)
6.2.2.3.3. The body or dorsum of the tongue
The body or dorsum of the tongue is the part of the tongue that begins about 4 cm back from the forward extremity, i.e. behind the blade, and extends back to the epiglottis. When at rest, the front part of the dorsum faces the hard palate and the rear part the velum. The movements of the front and back of the body of the tongue are interdependent:
– Dorsoprepalatal
The tongue dorsum moves towards the forward arch of the palate to make closure. All types of consonants – plosives, fricatives and approximants – can be realized by the tongue dorsum in this area of the palate, e.g. the Polish fricatives /ɕ, ʑ/.
– Dorsopalatal
This articulation occurs in the highest part of the hard palate and produces stops such as /c, ɟ/, fricatives such as /ç, ʝ/, the lateral /ʎ/ and the approximant /j/.
– Dorsovelar
The tongue dorsum extends towards the soft palate and makes contact in order to realize the stops /k, g/, which are part of the inventory of an enormous number of languages, and the dorsovelar nasal /ŋ/, found in English words such as “camping”. If the tongue dorsum moves away from the soft palate, it is possible to produce the voiced and voiceless fricatives /ɣ/ and /x/, and the semi-consonant /w/.
– Dorso-uvular
This articulation is characteristic of the voiceless stop of Arabic: /q/; it is also part of the voiceless ~ voiced ~ aspirated series of Burushaski: /q, ɢ, qʰ/. It also produces the fricatives /ʀ, ʁ/; /ʁ/ corresponds to what is commonly known as the Parisian “r”. /ʀ/ is realized by vibrations of the uvula on the dorsum of the tongue.
6.2.2.3.4. The tongue root
The root of the tongue refers to the most posterior part of the tongue. The epiglottis moves with it:
– Radicopharyngeal
When the tongue root draws back towards the pharyngeal wall, it produces a narrowing of the pharyngeal conduit. Danish uses this articulation for its pharyngeal “r”. Drawing back the tongue root is also associated with the pharyngealization of various phonemes. Conversely, the forward projection of the tongue root has been considered as the parameter of articulatory tension.
6.2.2.4. The pharynx
– Pharyngo-pharyngeal
The action of the constrictor muscles of the pharynx can alter the cross-section of the oropharynx, produce a lateral compression and move the faucal pillars forwards. They can also draw the larynx upwards: this gesture produces the articulation of /ħ/ and /ʕ/.
6.2.2.5. The larynx
Apart from its phonatory function, the larynx can also realize two types of articulation: one at the level of the vestibular folds, the other at glottis level.
A stop produced by means of bringing the vestibular folds together contrasts with the glottal type of closure in some Caucasian languages:
– Glottal
The vocal folds are pressed close together to make a complete closure and this is followed by an abrupt release of the blockage. This type of glottal closure is known as a glottal stop. In English it is often a stylistic variant for vowel attack which reinforces delimitation, as in the exclamation “ʔabsolutely ʔevery ʔidiot ʔasks that!”. In very many languages, the glottal stop is part of the phonetic inventory and is used for distinctive purposes. This is the case in Tagalog, where /kaʔo:n/, “reporter”, contrasts with /kaho:n/, “box”.
6.2.2.6. Double articulations
Certain phonemes are realized by two articulations produced in areas far apart by different articulators. /w/, for example, is realized by lip rounding and protrusion and by a movement by the tongue dorsum in the velar area. /ɥ/ is characterized by a labial projection and a constrictive articulation by the tongue dorsum in the palatal area.
6.2.2.7. Dynamic aspects
The dynamics of articulatory gestures must be taken into account in distinguishing articulatory modes. We have seen that certain consonants are realized by complete closure of the air channel. There is also a category of sounds produced by a series of momentary closures at several points of the vocal tract, giving them a quality of beating and rolling.
6.2.2.7.1. Trills, flaps and taps
A trill is characterized by a series of brief complete closures followed by immediate release. This is, for example, the case for the apical /r/, where the tongue tip executes a series of closures on the alveolar ridge. The tip of the tongue is held loosely near the alveolar ridge and set in vibration by the action of the airstream. For the /ʀ/, the uvula vibrates on the tongue dorsum. Such an “r” is described as trilled.
Other phonemes can also be produced by a series of brief closures realized by one articulator: at the level of bilabial closure, for example, the pressure of air contained within the lips can change to separate them and stimulate a movement of lip
vibration that is reinforced by Bernoulli’s effect, which makes them come together through the “sucking” effect caused by the loss of pressure. Other segments are produced by a continuous movement and the contact between the two articulators is momentary. The active articulator strikes another on its way back to its rest position. It hits the passive articulator in passing. In contrast to the repeated movement described for trills, the flap occurs once. The distinctive feature of this type of articulation is its dynamic aspect: it cannot be prolonged. The English “r” in careful pronunciation is produced by a flap; this is also true, for example, of the American English intervocalic [t] which is found in “button” or “sudden”. This type of consonant can be produced at various points along the vocal tract and involve different articulators. In Margi, a West African language, there is a labiodental flap. Retroflex flaps also exist. A tap is another case of a rapid articulation where one articulator is thrown against another in a ballistic action. It is characterized by a single contact. The languages of the world include in their phonemic inventory tapped stops as well as tapped fricatives². One series of consonants with weak constriction and characterized by a dynamic movement away from or towards an approximant are those designated semi-consonants or semi-vowels. Finally, one category of phonemes in which the dynamic character is essential for their phonetic identity is the series of affricates.
6.2.2.7.2. Affricates
Stop consonants feature the sound of plosion at the moment of release or the sound of friction generated at the place of articulation: in general, a brief sound. There is also a category of stops in which the release phase is prolonged and the sound of friction gives the impression that it is a sequence of two phonemes: a plosive followed by a constrictive. Distinguishing between one and two phonemes purely on temporal grounds is quite difficult.
Phonological considerations have to be taken into account to distinguish between a plosive followed by a fricative and an affricate such as the German /pf/, or the /ts/ and /dz/ in a great number of languages.
2 For a detailed description of the transitional aspects of tapped, flapped and trilled articulations, see Laver (2002, part IV).
The term “affricate” is used for phonemes produced by two consecutive articulatory gestures, beginning with closure and ending with constriction, realized in the context of producing a single phonological unit. This phenomenon could also be described as a plosive released into a homorganic fricative within the same syllable. The release could occur in the middle as with /s, z/ or at the sides as in /tla/, /dla/, or even with nasal accompaniment as in /tna/, /dna/. This type of release is not due to articulatory considerations or aerodynamics, as in the case of the allophones of the laminoalveolar stops /ts, dz/ found before /i/ and /y/ in Quebec French. Instead, the release is planned and voluntary, in order to distinguish it from simple stops and affricated stops in the same language. This type of distinction is very productive in some languages: see, for example, Ladefoged and Maddieson, 1996. Chipewyan, an Athabaskan language, possesses a nearly exhaustive series of articulatory possibilities of this type.
6.2.2.8. The temporal dimension
We recalled previously that vowels and consonants could differ in their intrinsic duration because of the very conditions of their production and the articulatory constraints and variations in aerodynamics. We also saw that there are various contextual variations depending on the position of the phoneme in the phrase, its phonetic environment and the prosodic structure. Certain languages make phonological use of the parameter of duration for both vowels and consonants. It is therefore possible to have, phonologically speaking, a series of normal vowels and a series of long vowels, normal consonants and long consonants. Long consonants can give the auditory impression that they have doubled their length from an articulatory point of view, hence the term “geminates” with which they are sometimes designated.
6.2.2.8.1. Quantity opposition in consonants
Several languages such as Estonian, Italian, Japanese and even Breton make use of quantity opposition for consonants. The relationship between the length of constriction and the length of the held phase for constrictives and short and long stops can vary by 50-300% in careful speech. It is debatable whether a long or geminate consonant in one syllable (we are excluding the case where two identical consonants meet at a morpheme boundary) is produced by one or by two articulatory movements. It is possible that there are two successive articulations with a certain variation of tension between the two. Lehiste et al. (1973) observed two peaks of EMG activity
in the orbicularis oris muscle during the production of /pp/ in Estonian. The work of Kraehemann (2007) with electropalatography and of Smith (1995) with X-ray microbeam found no trace of a double stop, but both show a difference of duration in the held phase. For Bothorel (1982), the length distinction in the Breton spoken at Argol is neutralized and carried by the timbre and duration of the vowel preceding both the simple consonant and the geminate. His interpretation essentially follows Falc’hun (1951, 1965) in suggesting that the distinction between simple consonants and geminates in Breton is a distinction between strong and weak consonants.
6.2.2.9. Articulatory strength
The concept of articulatory strength is often invoked to explain phonetic changes or to take account of differences in segmental production. Voiced consonants are therefore often considered weak and voiceless consonants strong. However, the meaning given to this term varies considerably from author to author; for some it refers to global respiratory effort, while for others it is the tension with which phonation or an articulation is realized. The term “tension” is ambiguous enough in itself, referring as much to the sensation of effort as to the degree of activity of a muscle or group of muscles involved in a particular articulation. For Jakobson et al. (1952), tense phonemes are articulated more distinctly and with greater pressure than their lax counterparts. They suggest that the “fortis” phonemes are characterized by greater air pressure behind the point of articulation and by greater duration resulting from the greater distance from the resting (or neutral) position of the articulators. Instrumental studies have shown that duration is indeed a correlate of the [voiced-unvoiced] distinction. Does this in fact mean that voiceless sounds are produced with more force than their voiced counterparts? Studies using electromyography have often arrived at conflicting results.
What activity should be measured? Activity of the lips, of the tongue muscles, tension of the pharyngeal walls? How is it possible to take into account individual variability as well as the variability between speakers? Intra-oral pressure during vowel-consonant sequences has been measured in a great number of languages. Mean maximal values for voiced and unvoiced consonants correlate well. Unvoiced stops have a higher value than their voiced counterparts. Malécot (1970) considers the [fortis-lenis] feature to be the synaesthetic interpretation of intrabuccal air pressure. Butcher (2004) found significant differences of air pressure and airflow for pairs of stops in aboriginal languages.
There are relatively few studies in this area devoted to articulation itself. Straka (1965), using palatography, showed that the amount of tongue-palate contact increased with articulatory strength. Simon (1967), like Rochette (1973), showed with the aid of radiofilms that the width of the lingual contact area varied according to the voiced/voiceless distinction. Using electropalatography, Engstrand (1989) noticed that there was a larger contact area and greater support for voiceless stops in Swedish. Reis and Espesser (2006) came to the same conclusion for Brazilian Portuguese. Marchal (1983), using EPG data, shows that the fortis/lenis distinction appears to be more relevant than the voicing opposition for French stops since it continues after an assimilation of phonatory modes. Speed of articulatory closure can also be a good indicator of articulatory strength. McLean and Clay (1995) show that speed of lip closure for a postvocalic /p/ is greater than for /b/. Task dynamics provide an interesting paradigm for “revisiting” this question of articulatory strength.
Chapter 7
Coarticulation and Co-production
Speech production, an extremely complex phenomenon, is the means by which a message is made audible. This message comprises a linear series of symbolic objects: lexemes, sets of phonemes. In linguistic theory, phonemes are represented by matrices of phonetic features. These features are abstract units; they are atemporal and independent of context. The segment thus possesses a single canonical form at the abstract level. Speech is produced by a series of articulatory gestures which are realized in time by means of organs which are subject to their own dynamic constraints. The passage from the discontinuous domain of phonemes and features to the sounds of speech, an articulatory and acoustic continuum, is characterized by enormous surface variability. It has long been recognized that phonemes are influenced on the one hand by perseveration and on the other by anticipation as they emerge in utterance. The lip rounding associated with the vowel /u/ in a word like “stew” begins as early as the tongue raising for the production of /s/. The lip protrusion for the /u/ in “spoon” is maintained during the /n/. In French it is clear that a voiceless consonant causes a tendency to devoicing in the preceding voiced consonant, as in /mɛdəsɛ̃/, where deletion of /ə/ produces /mɛd̥sɛ̃/ and the /d/ is consequently lacking in sonority. Conversely, in “paquebot”, the /k/ becomes voiced under the influence of the following /b/ to give /pak̬bo/. Nasality too can be transferred from one phoneme to the next, as in the case of /mɛtənɑ̃/ becoming /mɛtnɑ̃/, or even /mɛnɑ̃/ after the complete nasalization of the /t/. In the German word “geben”, the place of
articulation of the nasal consonant may be assimilated to the preceding plosive, /n/ becoming /m/ as a consequence of the vowel elision. This contextual influence also applies to vowels separated by consonants, e.g. in French the /tɛty/ of “têtu” becomes /tety/ and the /ɛde/ of “aider” becomes /ede/. In several cases, such developments are thought to underlie historical changes that emerge in the lexicon (Ohala, 1993). Sometimes described as a phenomenon of phoneme-to-phoneme “fusion”, these modifications have been given several different labels, e.g. “epenthesis”, “accommodation”, “adaptation”, “assimilation” and “harmony”. D. Jones (1962) speaks of “similitude”, whereas Sievers, as early as 1876¹, was already talking about articulation overlap. Following Menzerath and Lacerda² (1933), the notion of coarticulation became current and phoneticians began to describe the superficial variability of phonetic segments in the spoken chain as the expression of the effects of coarticulation. The theoretical question that arises is whether the observed phenomena result from a linguistic cognitive process or are produced through the interaction of all the various anatomical constraints and physiological mechanisms (Tatham and Morton, 2006). Is coarticulation a universal phenomenon that operates in the same way in every language or is it specific to each language? Are some languages more susceptible to it than others? Do some phonemes resist it? Is there a gradation of contextual effects? Are all articulators equally involved? Is there a basic difference between the effects of anticipation and those of perseveration? Finally, does prosody play a part in the manifestation of coarticulation effects? Various theories have been advanced to take account of surface variability in speech. They divide into two groups: theories of translation and theories of action.
1 Sievers, Grundzüge der Lautphysiologie (1876, p. 103): “Three or four articulatory gestures compete for the production of [mi]. To avoid it, tongue raising towards the target of /i/ can start during the hold phase of /m/; lip protrusion can also be produced then without contradicting the nasal and labial properties (of the consonant) …”
2 Menzerath (1933, p. 65): “For the production of [pu], there is clear evidence that the articulation of both sounds happens most of the time simultaneously. The lip gesture for /u/ as well as the tongue movement for /p/. I characterize it as ‘KOARTIKULATION’ or ‘SYNKINESE’ and postulate a new principle that can be formulated as follows: the articulation of each sound starts as early as it can.”
7.1. Translation models
The fact that communication by speech is possible implies that phonetic variability is not random and that speakers and hearers share a knowledge of the same semiological system. In other words, the construction of speech must be accompanied by important indicators that are sufficiently distinct from each other. Such indicators are allowed a certain latitude in their realization, but their variability has to be limited to avoid giving rise to confusion. It is from this working hypothesis that the idea arises that there must be an underlying absolute invariance in these indicators. The matrix of these physical characteristics gives rise to a definition of a segment as an abstract invariant linguistic unit. The segment is a secondary unit, constructed from a matrix of features; its invariance is drawn from the absolute and invariant nature of its features. Traditional linguistic theories have therefore postulated two levels of description: an abstract level, comprising phonemes and segments, and a concrete level, consisting of sounds and allophones. Phonetic features belonging to the abstract level are invariant by nature. Generally speaking, phonetic theory agrees that the principle of such a division is sound. The problem then arises of explaining how the phoneme becomes an allophone during production and how the allophone becomes a phoneme as far as perception is concerned. As a general rule, theories of translation aim to explain the steps by which a more or less ideal form (the canonical form) becomes a variable articulatory and acoustic signal as a series of transformations. In these models, speech is conceived as a series of transitions from one state to another, or a series of translations from one type of representation into another:
Information (Message) I → R1 → R2 → … → Rj → … → Rk
Information (Message) I′ ← Rn ← Rn-1 ← … ← Rm ← … ← R1
In a translation model, the focus is on how the message between speaker and hearer undergoes a series of transformations. Each box in the above tables represents the handling of a transformation and its translation into another representation. This is governed by an automatic sequential process leaving little room for constructive activity by the speaker. Speech production is thus a linear series of translations of a single phrase into a set of representations in the following way:
Semantic representation
Deep structure
Structural representation
Surface structure
Phonetic representation
Phonetic signal
The exact nature of the different levels of representation depends on the particular theory adopted. To focus more closely on the transition from phonological representation to acoustic signal, the above model becomes:
Grammar → Phonological representation → Phonetic representation → Articulatory continuum
↓
Acoustic continuum
↓
Grammar ← Phonological representation ← Central and peripheral auditory representation (perceived)
It thus appears that the translation theory is also valid for perception. The schema below shows the model in reverse:
Phonetic signal → Phonetic representation → Surface structure → Deep structure → Semantic representation
7.1.1. From plan to execution
The process of coarticulation has been described as a transition through a series of states by Daniloff and Hammarberg (1973) and Hammarberg (1976). The phonetic process is characterized as “the assignment of phonetic effects to phonological causes” (Hammarberg, p. 356) or a conversion of phonological features into articulatory transitions (Kent and Minifie, 1977). Liberman et al. (1967) postulate a set of conversion rules to account for the transition from phonological representation to acoustic signal, and from a sequence of discrete, static, context-free segments to continuous, dynamic phonetic behavior. This means that speech production implies the dismantling of the phonological edifice, whereas perception implies its reassembly. The essential part of reorganization (dismantling of the phonological framework) occurs when the muscular contractions shape the vocal tract according to the application of articulatory rules. Phonemes are translated into extrinsic allophones which are transformed into continuous behavior. Under the influence of motor theory, a large body of opinion requires the invariants to be coded as patterns of motor commands. Since there is no invariant in the acoustic signal, it has been sought in the articulatory configurations. However, as there are no invariants there either, it was thought that they could, or should, be found at the motor level. This is the reason for innumerable studies of coarticulation using electromyography but, as becomes clearer every day, peripheral commands are far from invariant. Speech production apparently requires just the opposite. Motor commands must be flexible in order to take into account the various possible points of departure for the articulatory organs. Moreover, articulatory movement is not solely the result of muscular impulses but also of reactive forces.
This was amply demonstrated by several studies on the jaw, which have made very clear the relationship between the departure point of the lower jaw, the resistive load applied to it and EMG activity in the masseter. In the end, the result must be relatively equivalent and correspond more or less to a stereotype.
7.1.2. Feature spreading
The question consists of discovering the transformation operations that take into account the transition of the highest level of symbolic representation into muscular commands which allow information that is relevant from a linguistic point of view
to be preserved in speech. This aim, to describe the mechanisms of production without losing sight of their linguistic function, will involve seeing the phenomena of coarticulation as an expression of the co-production phenomena of certain phonetic features. The theory of the articulatory syllable (Kozhevnikov and Chistovich, 1965) is a good example of the school that influenced the work of Öhman (1966) and Perkell (1969). This theory postulates a vocalic continuum, upon which consonantal articulations are locally inserted. Coarticulation would be maximal during CnV sequences (where Cn = a certain number of consonants) and minimal in other types of syllable. The work of Carney and Moll (1971), which shows that in American English C1V1C2V2 sequences (where C is a fricative) the V2 movements coincide with the movement of C2 consonantal closure, follows this theoretical model, as does the work of Benguerel and Cowan (1975) on labial protrusion in French, and that of Bladon and Carbonaro (1978) on Italian; see also Sussmann et al. (1973) and Butcher and Weiher (1976). Another form of traditional model is the bottom-up model by Henke (1966) which proposes a forward scanning technique: “look ahead planning”. The idea is that future instructions are taken into account in current planning as long as their execution does not countermand gestures already in progress. Anticipatory coarticulation is seen as an active cognitive (phonological) process, whereas carryover coarticulation is a mere consequence of the sluggishness of the articulators. A large number of studies, especially those using EMG, have taken this model as a point of departure. Contradictory results have emerged: some see the transfer of a feature as taking place only if its realization does not prevent the production of a sequence of segments, while others consider that the duration of coarticulation is limited and is time-locked. Finally, it appears that results vary according to the language.
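The forward-scanning idea can be illustrated with a small sketch. It is a toy and not Henke's actual program: it assumes that each segment is either specified (`True` or `False`) or unspecified (`None`) for a single feature, and it lets an unspecified segment anticipate the value of the nearest following segment that is specified. The function name `look_ahead`, the feature name `round` and the three-segment coding of “stew” are illustrative assumptions; the condition that spreading must not countermand gestures already in progress is reduced here to never overwriting a specified value.

```python
from typing import Optional

# A segment is (symbol, {feature: True | False | None}); None = unspecified.
Segment = tuple[str, dict[str, Optional[bool]]]


def look_ahead(segments: list[Segment], feature: str) -> list[Optional[bool]]:
    """Return the value of `feature` on each segment after forward scanning."""
    values = [features.get(feature) for _, features in segments]
    upcoming = None  # value of the nearest specified segment to the right
    for i in reversed(range(len(values))):
        if values[i] is None:
            values[i] = upcoming  # anticipate the upcoming specification
        else:
            upcoming = values[i]  # specified values are never overwritten
    return values


# Hypothetical coding of "stew": only the vowel is specified for rounding.
stew = [("s", {"round": None}), ("t", {"round": None}), ("u", {"round": True})]
print(look_ahead(stew, "round"))
```

With this coding the call returns `[True, True, True]`: rounding spreads leftwards onto both consonants, as in the “stew” example given at the start of this chapter. An unspecified segment at the end of the sequence stays `None`, because the sketch models anticipation only; carryover, which this model attributes to articulator sluggishness, would need a separate mechanism.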
Such variation in behavior could arise from the inventory or density of each phonological system (Recasens, 1999). The need to preserve the distinctive character of certain articulatory properties might win out over the inertia of the articulatory system or over the laws of economic governance of the production system. Manuel (1987), moreover, gives formal recognition to perceptual constraints. Lindblom’s (1990) adaptive variability system is the most sophisticated from this point of view. It appeals to two basic principles: that of economy, and of listener facilitation. The principle of economy demands that expenditure of energy should be minimized; this can result in “undershooting” the articulatory target. The facilitation
principle is inspired by the need to ensure the intelligibility of the message. For Lindblom, speech production is therefore organized on a continuum which goes from hypo- to hyper-articulation. Coarticulation, in this view, is no longer considered as the product of smoothing occasioned by the inertia of the articulatory organs, but is seen as a continual adaptation to the needs of clarity and perceptual contrast imposed by the communication situation. In this sense, speech production is output oriented. To reconcile these different approaches, Keating (1988) proposes a model in which segments are represented by adjustable windows. These windows are the targets. They possess a range of acceptable values rather than single values placed along a phonetic dimension. The “windows model” has an advantage over the other models in that it allows contextual variations to constitute the rule rather than the exception. Languages may differ as to the width of the window assigned to a particular type of segment. Window width can be determined according to a certain number of factors; these can consist of perceptual exigencies, such as those discussed by Manuel and Krakow (1984) and Manuel (1987), but can also depend on strictly articulatory considerations such as the specific jaw position required for the acoustically satisfactory production of a particular segment, e.g. /s/ (Keating, 1983). When the windows have been assigned to the segments, they are linked by a form of interpolation. For those targets which do not have unique values for one dimension of a given window, there are serious implications for the way in which interpolation can be realized. Keating suggests that the interpolation is controlled not locally but by a function which determines the values of the phonetic dimensions of the features of every segment of every language. Keating’s model nevertheless follows the same paradigm as the theories of translation and suffers from the same conceptual limits.
7.1.3. Limits to translation theories
If it is thought, as in translation theories, that the differences between abstract segments and the elements of the speech stream are irreducible, we are obliged to admit that linguistic units are different from production units which in turn do not correspond with perception units. In addition, the units on which perception operates are impoverished by comparison with linguistic units. It is clear, in fact, that not all features are realized in speech, and of those that are, realization is not complete.
This idea makes it difficult to explain some phenomena of the acquisition of the spoken language. Is it plausible that we should be able to acquire something which has no real existence? Linell (1982) poses precisely this question and reminds us that for linguistic rules to be learnt, they must derive from permanent facts: “One should recall … that linguistic rules or conditions must be learnable, which means that they must be based on facts that are socially accessible and observable” (p. 44). How can the child learn forms he has never encountered? The concept of abstract forms which are only ever realized in debased form also begs the question of their evolution. Did they evolve because they were transformed or are they transformed because of the way they evolved? It is quite difficult to logically imagine how a form evolves by choosing not to resemble its model, rather than the reverse. The more natural tendency would be to reduce the distance. In a translation model, two levels can be approximately identified: an upper level at which abstract entities are selected and ordered, and a lower level where they are translated into articulatory gestures. The division into two independent levels is based on the fact that it is possible to have several equivalent expressions of the same idea. The problem is to know how motor gestures are programmed and at what level. First of all, these models consider that anything which is relevant to planning as opposed to execution will be carried out at the upper level, and that all anatomical and physiological constraints are taken care of at this level. In other words, the processes of adjustment implied by the continuous nature of speech production are phonological processes. However, how can we account, in this hypothesis, for the phenomena of quasi-instantaneous adjustments that can be observed when normal production is disturbed and (in particular) in cases of sensory deprivation? 
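The window model discussed in section 7.1.2 is said there to share these limits, but it is concrete enough to be sketched numerically. In the illustration below each segment receives a window, i.e. a range of acceptable values on one phonetic dimension, and a contour is sought that stays inside every window while moving as little as possible between neighbours. The relaxation procedure, the function names and the 0-1 “velum height” scale are all assumptions made for the example; the text does not give the form of Keating's interpolation function, so this is only a stand-in for it.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Force `value` into the closed interval [low, high]."""
    return max(low, min(high, value))


def path_through_windows(windows: list[tuple[float, float]],
                         sweeps: int = 50) -> list[float]:
    """Pick one value per window so that neighbouring values stay close.

    Each sweep moves every value to the average of its neighbours and then
    clamps it back into its own window (a simple relaxation, assumed here).
    """
    values = [(low + high) / 2 for low, high in windows]  # start at midpoints
    for _ in range(sweeps):
        for i, (low, high) in enumerate(windows):
            neighbours = values[max(i - 1, 0):i] + values[i + 1:i + 2]
            if neighbours:
                average = sum(neighbours) / len(neighbours)
                values[i] = clamp(average, low, high)
    return values


# Hypothetical windows on a 0-1 velum-height scale (0 = lowered, 1 = raised):
# a nasal (narrow, low), a vowel (wide) and an oral stop (narrow, high).
windows = [(0.0, 0.2), (0.0, 1.0), (0.8, 1.0)]
contour = path_through_windows(windows)
print([round(value, 2) for value in contour])
```

On these invented windows the contour settles at approximately 0.2, 0.5 and 0.8. The vowel's wide window lets its value be set entirely by its neighbours, while the two narrow windows hardly move; this is the sense in which the model treats contextual variation as the rule rather than the exception.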
A further anomaly in the translation model is to be found in the absence of a principle of control over articulatory movements which would limit the number of degrees of freedom. It would appear that each change of vocal tract configuration occurs as if it were the result of an infinity of independent commands. Since each movement is itself conditioned by the original state of contraction in each muscle, by the starting position of the organ and by the constraints of inertia or of facilitation, the model implies not only that each muscle is commanded individually but also that the central motor level must be kept constantly informed of the state of each motor unit and of all the relevant environmental factors, so that its orders can be adjusted to achieve the desired end. This notion of articulatory control in the form of a closed loop is completely unrealistic from a physiological point of view. It is generally agreed that the amount of information that can be transmitted over the neural networks for each motor unit is of the order of 50 bits per second; it is materially impossible to encode all the information required by such a
mechanism. We have not even allowed for the time constraints which are, in their turn, also incompatible with the observed temporal phenomena. From a theoretical point of view, this control mechanism introduces a logical contradiction which has long remained almost unnoticed: how is it possible to describe the segment as an abstract entity and coarticulation as a phonological phenomenon; how can we postulate absolute invariants and, in the same description, take account of the variability of commands, all at the same level? This is a fundamental error of logic. There is a conflation of the notions of assimilation (a phonological process) and coarticulation (a physiological mechanism). The two aspects are not irreconcilable, as long as it is possible to distinguish the levels at which the mechanisms are applied and the nature of the units they command.
Then there is the problem of control of time. There can only be extrinsic chronology in such a model; this means that aspects of duration which are not part of the domain of the segment are superimposed and are the object of an external control. This idea of timing makes it extremely awkward to explain the reorganization of articulation which is linked to changes in speech rate.
7.2. Action models
The question of the temporal organization of speech is central to the problem of coarticulation. It has been the focus of attention in the most recent theories. In the phonological form of a given word, there appears to be no chronology. Only the sequential order of syllables and segments is indicated. Phonological representations are organized as a linear sequence of discrete segments, as if they were a string of orthographic characters or phonetic symbols (and this has not been without methodological consequences).
Fowler (1980) says: “Segments in a planned sequence are discrete in the sense that their boundaries are straight lines perpendicular to the time axis, so that the terminus of one segment is the beginning of the next segment” (p. 116). Each segment is a matrix of simultaneous features. The organization is more spatial than temporal. Is it possible to envisage coarticulation in this way: as the translation of a spatial configuration into a temporal behavior? Translation theories are the direct descendants of the long American behaviorist tradition in which man is represented not as someone who acts in a meaningful fashion in a social context, but as one who undergoes events.
The work of Neisser (1976) on vision inspired theories which attach great importance to control mechanisms. He insists in particular on coordination as opposed to individual commands. In line with the tenets of articulatory phonology (Browman and Goldstein, 1992; Fougeron, 2005), we believe that the essential properties of linguistic segments and the actual properties of speech sounds are not incompatible. We will briefly review some general principles governing the control of coordinated movements and we will then try to see how a model based on coordinative structures could provide a more adequate account of speech production and explain its variability.
7.2.1. Control of coordinated movement
Bernstein (1967), Grillner (1976) and Turvey (1977, 1978) observed that the relationship between the command issued to a muscle and the movement produced is equivocal. A command will have different effects on a given organ according to its state, nature and pre-existing forces. Control over a movement consists of constraining a string of forces running in a particular direction. The movement results from a set of complex active and reactive forces. There cannot therefore be any one-to-one relationship between the distinctive features of a phonological segment and the contraction of particular muscles. According to Zemlin (1997), the palatoglossus can, for example, both lower the velum and raise the back part of the tongue; the mylohyoid can raise the hyoid bone and pull the larynx upwards, or lower the jaw. The action of a muscle will depend on the relative state of the muscles and the relative positions of the different articulators. There is thus necessarily a contextual variability, and we should not be surprised if studies aiming to correlate phonetic segments with electromyographic invariants produce negative results and contradictions. Such enterprises are doomed to failure by reason of the very nature of the phenomena they detail. This is not the only source of contextual variability.
The muscles of the vocal tract are innervated by the cranial nerves whose nuclei are situated in the medulla oblongata. Neuromotor organization essentially rests on a system of cortical fibers which innervate the interneurons of the brain stem. A few fibers link the cranial nerves directly with the cortical motor areas. The interneuronal networks are considered as the nodes for the integration of information (Evarts, 1971). They are not the passive receptors of supraspinal commands. It would be more precise to say that orders from the supraspinal level constitute one influence among several on the state of the interneuronal networks. Such a state of organization means that the idea of direct cortical control over the muscles is even less likely. These constant
interactions at the level of interneurons mean that the supraspinal influences can have only an organizational role and not an executive role in the strict sense.
7.2.2. Degrees of freedom
What we have just reviewed about neurosensory and motor organization is obviously relevant to phonation and articulation. In fact, if we wished to describe nervous and muscular activity in detail with reference to the production of a phonetic sequence, we would have to draw up a practically infinite catalog to take account of the state of the organism at any given moment, supposing that this were technically possible, given that about 100 muscles, some tens of thousands of motor units and some tens of millions of muscular fibers are involved. The number of degrees of freedom resulting from this would be such that an organization of this type could not be controlled. Yet the notion of control and fine adjustment for a given task is one of the essential properties of speech production. This presumes that there is a system and a mechanism for control. For there to be such a system, the number of degrees of freedom must be limited so that it is possible to account for the necessary coordination by a limited set of parameters.
On the subject of systematicity, one example will illustrate this question of degrees of freedom: at two extremes, we have a table made of wood and a table made of sand. The rigid object is not a system: it cannot function as such. It has no degrees of freedom. Nor can the table of sand function as a system because the position of each grain of sand is fluid, and an “object” cannot be described by a set of functional relations. For there to be a system, a large number of degrees of freedom has to be controllable by a small number of “supervisors”. To date, the approach attempted has been to reduce description, retaining only those physical aspects which show a distinctive function while excluding all properties that appear redundant.
This strategy has allowed the postulation of distinctive features (Jakobson et al., 1952) and phonetic features (Chomsky and Halle, 1968). Having done this, we have still not satisfactorily settled the question of control of movement. It is however at the level of this principle that the number of degrees of freedom can be most efficiently reduced. It would be a fairly substantial error of categorization to imply that the description of features could at any given moment account for production. Articulatory and acoustic manifestations are in fact only parameters acting on substantives: they are not attributes in their own right. Before any emission of a phonic sequence there is a set of anatomical, physiological and acoustic constraints that apply to all speakers. In this sense, we are talking about universals which condition the set of “possible” productions. Such universals include, among others, a special mode of respiration, the phonatory setting of the
larynx, the alternation of vowels and consonants, the temporal constraints of segments, etc. These constraints can be expressed as an equation. If we have a circle, it would of course be possible to define the position of every point on its circumference. It would be a long and painstaking business which would not take into account the relationship which unites all these points. There is a general equation of the circle, viz:
$(x - h)^2 + (y - k)^2 = r^2$
Any individual circle is characterized by the values accorded to the parameters h, k and r (its center and radius), while x and y range over the points of its circumference. These parameters are not substantive but attributive properties of the circle. All that needs to be specified in a phonetic plan or phonological representation (in Linell’s (1982) sense) are those aspects of speech production which are constantly idiosyncratic. However, care must be taken not to consider the idiosyncratic features of particular segments as things in themselves, but rather as attributes, so that there is then little sense in describing them out of their context. We are now going to try to show how the notion of coordinative structures makes it possible to propose a theory of direct production and thus account more naturally for coarticulation.
7.2.3. Coordinative structures
The need for coordination of muscular activity is directly linked to the heterogeneity of neuromotor commands and influences. To achieve a precise movement, taking into account the multiple constraints attendant on a given articulator, there must be a control procedure which harmonizes the different elements of muscular activity, including the ways in which muscles inhibit or facilitate functions; in a word, a procedure which controls the whole set of kinetic influences. Given the excessive number of degrees of freedom, some theorists felt that control of all nervous impulses could not be managed at the highest level. Studies of muscular neurophysiology (Easton, 1972; Turvey et al., 1978) have confirmed that this intuition is well-founded.
They have given evidence of the functional grouping of muscles. These results have cleared the way for theories of so-called “direct action”. The muscular systems are grouped in such a way that the achieving of a precise function can be relatively autonomous. Because their essential property is that they control and coordinate, they have been termed “coordinative structures”. A system such as a coordinative structure which achieves a function
incorporates an optimal balance between its freedom to undergo change and the limitations imposed on its freedom (Pattee, 1973). It is a specialized structure by reason of the limitation that it places selectively on its degrees of freedom, but it is not constrained to the point of rigidity. Its function consists of achieving coherent activity. A coordinative structure is built on the basis of privileged relations between certain muscles with a view to facilitating or inhibiting particular excitations. These are modulated to promote the realization of equivalent acts. The ability to define classes of equivalence is an absolutely essential property and gives a coordinative structure its flexibility. A single muscular organization can govern activities with superficially different properties (Laboissière et al., 1996; Mooshammer et al., 2006) in order to ensure cohesion between the “gestures” of several articulators. An act is controlled by the functional assembly effected by coordinative structures. In no case is there any question of temporal concatenation. From the moment that a coordinative structure is initiated, the procedure for control is minimal because each coordinative structure accomplishes its task autonomously. These muscle systems can be thought of as entities, and commands are addressed to them as units rather than to individual muscles.
7.3. Towards a direct theory of speech production
We begin with the principle that articulatory control is fundamentally similar to control over other types of acts. In this context we will try to show, with the help of some examples, how the use of a concept of coordinative structures allows us to account for surface variability and to define the linguistic “segment” by the same descriptors as used for the sounds of the speech stream.
The essential problem lies, as we have recognized, in the implicit recognition of the idea of the segment and its underlying invariance; whether that is expressed as a target or as a canonical form, the question remains basically the same. It could be expressed in the following terms: is it possible for segments that are dynamic and coarticulated to possess properties of invariance? A coordinative structure is a functional relationship between muscles which can be described in terms of an equation. An equation suggests something that is not context-sensitive. Properties which are not context-sensitive are not invariant, properly speaking, but equivalent. The equation describes the dimensions that can be taken by different examples, and it defines the movements that are authorized. The invariant properties correspond to the constraints on the possible movements; each movement is therefore defined by a constraint equation and a set of parameters which will modify the final outcome. These parameters constitute the distinctive features. According to this scheme, several movements can be superficially different
but possess an underlying dynamic invariance. Because of the lack of a one-to-one correspondence between motor commands, articulation and acoustics, it is not surprising that this invariance cannot be maintained. Moreover, the very nature of the coordinative structure, integrating as it does a great quantity of afferent and efferent information and supervising the execution of a set of equivalent movements, gives rise to the thought that this invariance is more relative than absolute.
This interpretation will benefit from the support of some examples of immediate readjustment after perturbation of the normal speech production situation. Lindblom and Sundberg (1971) and Lindblom et al. (1979) report that speakers can produce vowels that are acoustically normal without practice or tuition, and even in the absence of auditory feedback, when holding a bite-block between their teeth. On the subject of rounded vowels, Riordan (1977), together with Savariaux et al. (1995), found that where protrusion is prevented, speakers can nevertheless produce acoustically “normal” vowels at the first attempt with no difficulty.
It is very difficult to explain such immediate readjustments in the framework of traditional theories of speech production. It would be necessary to suppose that there would be continual comparisons between the ideal state of the different organs according to the central model and their actual state, and we would then have to imagine a correction process issuing from the center and modifying the different commands. The delays suggested by such a mechanism would be very long, in view of the great quantity of information and the limits on traffic along the nerve fibers. Moreover, and more importantly, there is a one-to-many correspondence between an error signal and the conditions which produced it. Finally, speech production succeeds in spite of the absence of auditory feedback or sensory control.
If, on the other hand, we postulate a mechanism of control and execution such as a coordinative structure, the explanations are much easier. The coordinative structure effectively defines a class of equivalent acts. The functional grouping of muscles adjusts itself according to the state of each of its components. With a constraint equation and one fixed variable, the other variables are not free to vary but have to take on values which respect the equation. Invariance can exist only in the relationships. As Bothorel (1983; p. 121, translated from the French) very properly observes: “The non-linear interaction of articulators with their specific degrees of freedom is produced by a remarkable ‘internal autocontrol’ in which a high level of synergy and economy of movement makes for the realization of phonetic operative articulations, to the point where they are a dynamic set of indicators (more or less efficient, more or less precise, more or less coordinated) which are integrated into phonetic classes and serve as a basis for the definition of distinctive features.”
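The compensation argument above can be made concrete with a toy constraint equation. The sketch below is illustrative only: the additive constraint (jaw, lower-lip and upper-lip contributions summing to a required closure), the weights and the millimetre values are assumptions, not measurements or a model taken from the literature cited here. It shows how fixing one variable, as a bite-block fixes the jaw, forces the remaining variables to take on values that still respect the equation.

```python
# Toy constraint equation (hypothetical): jaw + lower_lip + upper_lip = required
# Each value is the contribution, in mm, of one articulator to a lip closure.

# Assumed relative shares of the free articulators (illustrative values only).
WEIGHTS = {"jaw": 2.0, "lower_lip": 1.0, "upper_lip": 1.0}


def distribute_closure(required, weights, fixed=None):
    """Return contributions that sum to `required`.

    `fixed` maps articulator names to clamped contributions (for example the
    jaw held by a bite-block). The free articulators share what remains in
    proportion to their weights.
    """
    fixed = dict(fixed or {})
    free = {name: w for name, w in weights.items() if name not in fixed}
    if not free:
        raise ValueError("at least one articulator must remain free")
    remaining = required - sum(fixed.values())
    total_weight = sum(free.values())
    result = dict(fixed)
    for name, weight in free.items():
        result[name] = remaining * weight / total_weight
    return result


if __name__ == "__main__":
    # Unperturbed production: all three articulators are free.
    print(distribute_closure(12.0, WEIGHTS))
    # Bite-block condition: the jaw contributes nothing, the lips compensate.
    print(distribute_closure(12.0, WEIGHTS, fixed={"jaw": 0.0}))
```

In both conditions the contributions sum to the same required closure: what stays constant is the relationship, not the value of any single articulator.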
The constraint relations between the components of the vocal tract create an oscillatory system, i.e. a system with an intrinsic goal which it achieves regardless of its point of departure because of its dynamic configuration.
7.3.1. The coordinative structures of speech
The concept of “coordinative structure” is central to explaining the capability of the speech production system to produce classes of equivalent acts. Coordinative structures can be defined as “highly evolved task-specific ensembles of neuromuscular and skeletal components constrained to act as a single unit” (Kelso, 1988, p. 205). Articulation, just like any other skilled movement, is the product of the activity of the organizing “coordinative structures” of muscle groups. They consist of a functional grouping of muscles. The degrees of freedom for each muscle are limited by its membership of the structure. The relationships between the individual muscles can be expressed by functions whose parameters are set to achieve specific tasks. The relationship between the parameters is relatively invariant. The role of the speech act is to provide a set of values to the parameters not already given by the current state of the system. In speech, therefore, we deem the function of the component controlled by the highest levels of representation to be that of selecting the appropriate equations and adjusting the free parameters. The kinematic details are not part of the internal representation. In general, speech production can be described as the functional assembling of several specialized coordinative structures. The organizational systems based on the coordinative structures control:
– the respiratory mode;
– laryngeal adjustment;
– the length of the vocal tract;
– vowel/consonant alternation.
7.3.1.1. The respiratory system
Studies by Ladefoged (1967) and Anthony (1982) have contributed to showing that respiration during phonation is different from normal respiration.
There is a change in the relative duration of inhalation and exhalation, and the lengthening of the latter is mainly due to the prolonged activity of the inhalation muscles at the start of the exhalation phase. In normal respiration, the lungs are passively relaxed during exhalation. In speech, exhalation must be lengthened and airflow very precisely controlled to maintain stable subglottal pressure. At the start of
exhalation, the external intercostals remain active so that the lungs do not return too rapidly to their normal size as a result of the elastic forces exerted by the pulmonary tissue and the ribs. The effect of the external intercostals is therefore to slow down the subsiding of the ribcage. Studies by Adam and Munro (1973) and Marchal (1988) show that the coordinated activity of the inhalation and exhalation muscles is prolonged during the whole of the exhalation phase. This muscular synergy, which takes advantage of opposing effects, corresponds to a coordinative structure, functioning as a specific system tailored to speech production. Moreover, Huber (2007), on the topic of sound pressure level variation, shows that the respiratory kinematic patterns during connected speech are task-oriented.
7.3.1.2. The laryngeal system
Wyke (1974) identified three servo-systems in the larynx which influence the state of tension and the length of the vocal folds in action: the evaluation of subglottal pressure by means of receptors in the mucous membrane; the state of tension in the intrinsic muscles; and the movement of the cartilages as measured by mechanoreceptors in the ligaments. These three systems function simultaneously and harmonize their responses. They can react to structures at higher levels defining the phonatory mode appropriate to different registers: falsetto voice, chest voice, murmur, whisper, etc.
7.3.1.3. Vocal tract length
The instant readjustment phenomena observed in the experiments involving perturbation of lip protrusion provided indirect proof of the existence of a mechanism of control over vocal tract length. This structure implies a functional relationship between the height of the larynx and the degree of labial protrusion either required or realized, and thus the degree of jaw-opening.
7.3.1.4. Vowel-consonant alternation
On the basis of specific schemas of muscular organization, it is possible to distinguish between vowels and consonants. It is fair to recall that this idea had already been defended by Straka (1965), who observed a difference of behavior between vowels and consonants according to variations of articulatory strength. As far as the motor innervation of the tongue and the vocal tract is concerned, it is known that vowels, which show the slow transformations of the vocal tract, are
essentially produced by the extrinsic muscles, whereas consonants are produced by the faster intrinsic muscles. Perkell (1974) thinks that vowel control must be essentially determined by acoustic considerations and by taking into account the muscular tension in the intrinsic muscles of the tongue, whereas control of consonants relies on the evaluation of air pressure and on tactile feedback. This, then, implies a fundamental qualitative difference between these two classes of phonemes. A temporal overlap of activity in the two systems should not blind us to the fact that vowels and consonants are qualitatively discrete, if not temporally. From an acoustic point of view, the products of these two types of muscular activity are also different in nature: they produce either a periodic signal or noise. As pointed out by Fowler (1980), vowels and consonants can be distinguished because they are qualitatively different from the acoustic as well as the articulatory point of view: “They are separate from consonants (and hence their serial ordering with respect to consonants can be detected by a perceiver) because the organizational invariants for vowels perpetrate a different kind of gesture and acoustic event than those for consonants” (p. 413). One coordinative structure would define the vocalic state and be responsible for the vocal tract configuration required for vowel production, whereas a different coordinative structure would define the consonantal state, characterized by a more local action. It would be reasonable to suppose that since all vowels share a set of common characteristics, there is articulatory continuity at the level of vowel production: it would be scarcely ergonomic and very slow if the whole process were re-initialized for each new vowel. This hypothesis is known as the theory of coproduction. 
Numerous studies (Öhman, 1966) have shown that this theory not only accounts for interaction between vowels within a syllable but also, despite Chistovich’s thesis, beyond the syllable. In English, a co-production unit includes two accented vowels. To date, it has mainly been thought that consonants were interruptions in a chain of vowels. Take, for example, Perkell (1969): “The production of a consonant can be thought of as being a gesture superimposed on the continuously varying vowel-producing system” (p. 65) or Öhman (1966): “A VCV utterance of the kind studied here can, accordingly, not be regarded as a linear sequence of three successive gestures. We have clear evidence that the stop consonant gestures are actually superimposed on a context-dependent vowel substrate that is present during all of the consonant gesture” (p. 165). Fowler (1977) had a similar notion: “It seems appropriate to characterize these effects as evidence of co-production, that is of a superimposition of the trajectory for the unstressed vowel on an on-going trajectory from stressed vowel to stressed vowel” (p. 160).
Gay (1977) voices a different opinion: “An intervocalic consonant has more than a passive effect on the tongue body movement …” (p. 72). It has been cautiously thought that consonantal action was limited and did not have an effect beyond the bounds of the preceding and following vowels. Our position (Marchal, 1988) is that the idea of co-production should be expanded to include the domain of consonants. We have in fact been able to show that in sequences of two stop consonants, the spatio-temporal organization of articulatory events reveals a decided overlap of articulatory gestures between C1 and C2. We have seen in particular how the preparation for C2 may commence before the complete tract closure required by C1. The influence of C2 would thus extend over V1 in the sequence V1C1C2V2. Independent work by Rossi (1977) on acoustic coarticulation arrived at the same conclusion and clearly demonstrated a reciprocal influence of C1 over C2 in a C1VC2 sequence. The progressive influence of C0 over C1 in the sequence C0V1C1C2V2 has also been demonstrated by our electropalatographic data and gives support to the hypothesis of consonantal articulatory continuity. Such a hypothesis cannot be constructed on similar lines to the co-production of vowels since the activity of extrinsic muscles is far more localized; however, it is possible to suppose that the consonantal coordinative structure is never totally inactive. It could similarly contribute to fine-grained secondary distinctions among vowels (Hardcastle, 1976). It is finally possible to admit that the scope of the programming unit encompasses more than the domain of the syllable.
7.3.2. The supervisory system
Coordinative structures are embedded. The embedding structure decides the parameters of the embedded structure. The organization of coordinative structures could be represented as a tree diagram showing the hierarchical ordering of muscular activity (see Figure 7.1).
The synchronization and coordination of the muscular coordinative structures are effected by a supervisory system. This determines the length and shape of the vocal tract, and the activity of the lips, jaw and soft palate. The lower coordinative structures differentiate articulations: the way in which this operates is that when an activity has been organized, lower level units carry out the necessary actions through innate or acquired reflexes. To achieve this end, a hierarchy of the coordinative structures must be established. This necessitates defining the functional relationship between muscles inside the control unit. The hierarchy enables coordination to happen, but is not responsible for individual muscle commands.
Figure 7.1. Hierarchical speech motor control by embedded coordinative structures
[Tree diagram not reproduced. Node labels, in order of appearance: Phonetic plan, Supervisor, Coordinative structures, Respiration, Supralaryngeal, Larynx, Phonatory mode, Chest, Falsetto, (Head?), Articulator network, Extrinsic muscles, Intrinsic muscles, Vowels, Ant., Post.]
The benefit of the theory of direct production is its ability to recognize underlying relative invariance. It reconciles phonetic exponency with phonological interpretation by relying on phonic units defined by a set of dynamic features to describe linguistic reality. It adopts the notion of the existence of coordinative structures and uses their functional embedding capabilities to explain the realization of classes of phonemes. For both vowels and consonants, the invariants consist of
the functional organizations of the musculature. They can never be identified with individual muscle states. Phonemes must be defined by these functional relationships which are dynamic by nature and not by abstract targets thought to represent static canonical forms. Coordinative structures define a set of constraints which direct and constrain speech production. Their essential property is to generate classes of equivalent movements. Phonetic facts must be interpreted according to the conditions of their production. Linguistic behavior is the end product of interaction between the functions (communicative, cognitive and social) which language must provide and the biological bases (brain, nervous system, speech organs, hearing organs, etc.). Autosegmental phonology was no stranger to this notion/problem and was also capable of seeing itself as the goal of phonetics; as Goldsmith (1976) puts it: “Autosegmental phonology is an attempt to provide a more adequate understanding of the phonetic side of the linguistic representation (…); it suggests that the phonetic representation is composed of a set of several simultaneous sequences of segments, and more concretely it is a theory of how the various components of the articulatory apparatus, i.e. the tongue, the lips, the larynx, the velum are coordinated” (p. 23). Similarly, articulatory phonology or gestural phonology formulates the organizational principles of coordinated movement in speech in dynamic rather than physiological terms (Browman and Goldstein, 1977). An articulatory “gesture” is an abstract characterization of articulatory movements in which actions are coordinated to achieve a specific task. Each task is precisely defined by the parameters of a set of equations from a model of task dynamics. 
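As a minimal sketch of what “a set of equations from a model of task dynamics” can look like, the example below treats a single gesture as a critically damped mass-spring system whose rest position is the constriction target. This second-order form is a common choice in task-dynamic modelling, but no equations are given in the text, so the function name, the stiffness value, the unit mass and the arbitrary units are all assumptions made for illustration. The point it shows is that the target is reached from different starting positions without any trajectory being specified in advance.

```python
import math


def simulate_gesture(start, target, stiffness=100.0, duration=1.0, dt=0.001):
    """Integrate x'' = -k (x - target) - b x' with unit mass.

    The damping b = 2 * sqrt(k) makes the system critically damped, so the
    tract variable x approaches the target without oscillating. All values
    are hypothetical and unit-free.
    """
    damping = 2.0 * math.sqrt(stiffness)
    position, velocity = start, 0.0
    for _ in range(int(round(duration / dt))):
        acceleration = -stiffness * (position - target) - damping * velocity
        velocity += acceleration * dt   # semi-implicit Euler step
        position += velocity * dt
    return position


if __name__ == "__main__":
    # Same gesture parameters, two different points of departure.
    for start in (0.0, 20.0):
        print(start, round(simulate_gesture(start, target=10.0), 2))
```

Only the parameters (target and stiffness) belong to the gesture; the kinematic detail of each trajectory falls out of the dynamics and the starting state.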
When the control structure is established, the equations determine the articulatory coordination so that the gestural task of forming a particular constriction can be achieved independently of peripheral constraints. The same structure is used for the description of both the phonological contrast and the articulatory action. Articulatory phonology offers a unifying operational vision of speech production and thus goes beyond the traditional distinction between phonetics and phonology. It treats articulatory timing as an intrinsic property of gestures and, within this framework, treats coarticulation as a phenomenon of co-production inherent in gestural dynamics.
Speech is a behavior that should be studied in its own right. By respecting this condition, it is possible to formulate a more comprehensive theory of linguistic phonetics. In it, phonological representation is a plan for dynamic behavior, as opposed to a sequence of abstract static “objects”. “Behavior” is not the phonetic effect of phonological causes: it is produced and intentionally determined by the speaker in a structured and coordinated way.
7.4. The nature of coarticulation phenomena
Coarticulation is generally defined in the literature as the influence of one phonetic segment on another. At first sight, such a definition seems too general to refute, with the possible exception of the term “phonetic”. Is coarticulation indeed the effect of phonetic-physiological accommodation, or is it rather the effect of a phonological assimilation process? All authors agree in recognizing the phoneme as an abstract entity. By contrast, the allophone lends itself to frankly contradictory interpretations. It is sometimes seen as a concrete phonetic entity and sometimes as an abstract mental phenomenon. Its nature determines the nature of the process by which a phoneme becomes an allophone.
7.4.1. The allophone as phonetic entity
This is the most commonly held view. According to this notion, the allophone corresponds to the phoneme as realized in the speech stream. The phoneme never occurs as itself in speech. During production, the phoneme is encoded as an allophone. During perception, the allophone is decoded to discover the phoneme. The phoneme has a canonical form. It is static, discrete and atemporal. It is a category imposed by the mind on the acoustic continuum. The allophone is above all characterized by its distance from this ideal form, as imposed by articulatory constraints and the unfolding of the speech act.
7.4.2. The allophone as phonological entity
Does it make sense to say that [t] and [ts] are two variants of /t/ in Quebecois French? It would appear so. What is the basis for this answer? Our knowledge of the phonology of this variety of French. In its linguistic system, these are not two distinct phonemes. Is there a basic difference between [t] and [ts]?
No, because I can only make a judgment as to their function by referring to the linguistic system. If the phoneme /t/ is a category, /ts/ can only be described as a subcategory; but there is no essential basic difference between them. In fact, if we talk of categories and subcategories we are dealing with mental entities. Nothing in the physiological continuum allows us to define what is the “pattern” and what the “example”. The example must follow from the pattern, not the other way round.
7.4.3. Coarticulation: a redundant concept?
The view that there is no basic difference between phoneme and allophone seems epistemologically well-founded. Is there then any reason to continue to distinguish between coarticulation and phonological assimilation? At first sight it would appear not, and this is effectively the conclusion reached by Hammarberg (1982): “There is no reason to posit two different kinds of assimilation processes, one phonological and one phonetic (coarticulation)” (p. 125). He considers, quite logically, that a process which transforms one entity into another of the same nature cannot itself be basically different in nature. If phonemes and allophones are mental entities, coarticulation must be a mental entity. In support of his thesis, Hammarberg adduces the following:
a) that it is impossible to find discrete entities in the acoustic signal or in articulation;
b) that segmentation is a mental process;
c) that transformation occurs before execution.
The bitter battle between Hammarberg and Fowler on the subject of coarticulation is well known. In a curious turn of events, Fowler (1980) also condemns the concept of coarticulation as redundant, but not for the same reasons: in her theory, phonemes are four-dimensional entities realized intact in articulation. The fourth dimension is time, an intrinsic property of the segment: “… I believe that it can account for coarticulatory and other timing effects more plausibly and adequately than the views developed within the extrinsic timing framework. Its major advantage is in incorporating the dimension of time into the specification of a phonological segment with the consequence that the ideal or canonical form is considered to be executed unaltered in an utterance” (p. 131).
Articulatory phonology is no stranger to this interpretation. Is the concept of coarticulation therefore unnecessary? We do not think so, and will try to demonstrate this by recalling some facts that are forgotten or ill-understood by the adherents of phonetic or phonological theories. It seems that, by wishing to be too systematic, the opposing theories do not adequately consider:
a) the linguistic aspect of speech;
b) the material realities that underlie speech production.
The physiological approach bases its case on the existence of a set of constraints and physiological properties (such as inertia of articulatory organs; co-production; coordinative structures) to explain surface variability. This approach does not doubt the existence of underlying relative invariance and explains all “surface manifestations” by the temporal extension of features and by asynchronous but coordinated execution in the functional muscle systems and articulatory gestures. If this hypothesis is true, the phenomena of coarticulation (Hoole et al., 1993) should be the same for all speakers, regardless of the languages they speak. However, this is clearly not the case, as many examples have amply shown. The hypothesis is most notably contradicted by the way the place of articulation of /k/ changes before /i/ and /u/ in French and in English. It is known that the vowel has an important influence in English. The case is similar for the rules governing vowel harmony: how can it be explained that they apply in Hungarian but not in English? Everywhere there are instances of how phonetic changes reflect the phonological structure of a given language, and these cannot be explained solely by a set of anatomical, physiological and motor predispositions, even if it could be proved that they exist. Such predispositions do not suffice to explain the changes because they result in changes only in some languages and not in others. Not all allophones can thus be considered as the products of the same physically conditioned process. There is also a final objection: is it possible to falsify such a hypothesis? Is it not tempting to relegate to some unobservable neurological level all contradictory or awkward facts that do not fit the theory?
As for the supporters of the mentalist approach, they forget that speech is produced, transmitted and perceived.
Yet there exists a set of properties that decides whether communication will succeed or fail. In the last analysis, the proponents of this theory are not interested in the material indicators that condition the operation of communication because they consider: 1) that segments do not exist in the speech stream, and therefore not allophones either; and 2) that perception derives purely from a mental mechanism. How does speech get from the abstract to the concrete level and vice versa? There is no coherent answer. They thus lapse into futile abstraction. Moreover, their reasoning on coarticulation is fundamentally flawed, because it falls into the trap of circularity:
1) coarticulation is inferred: yes, but how?;
2) the segment is defined as a non-coarticulated entity;
3) the allophone is a coarticulated entity.
So, what is coarticulation? The mechanism which explains the passage from one to the other? This poses a problem with at least three aspects:
– the definition of a segment;
– the recognition of physical constraints;
– the absence of a model of performance.
7.4.4. Criteria for evaluating a model of coarticulation
It seems reasonable to suppose that a theory of coarticulation should give some indication as to which effects derive from the linguistic level and which from physiological constraints. The first type will be peculiar to each language while the second type will be universal. The danger of an approach limited to consideration of a single language is that it is tempting to relegate phenomena that are hard to explain to a level of neuromuscular organization that is still not thoroughly understood. The risk of circularity in explanations is great. Another pitfall is to treat features that occur regularly in one language as a linguistic fact when they are only the expression of physiological constraints. The problematic consequences of working on a single language are illustrated in automatic speech recognition, when the coarticulation rules of one language are borrowed for the interpretation of another language, with a consequent lowering of the recognition rate. Therefore, it seems sensible to tackle the problem of evaluating a coarticulation model by starting with a study comparing the contextual effects of voicing, labiality, nasality and lingual coarticulation in several languages. Such a study would need to go beyond a theoretical discussion of the rules which might apply for the purposes of synthesis and automatic speech recognition and be able to contribute in an independent manner to the evaluation of models.
This would enable us to keep the number of templates (reference units) to a minimum, to increase the quality of speech synthesis by improving transitions between phonemes (Carré, 2004) and to improve speech recognition by orienting research into phonotypes toward phonetic contexts.
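The criterion proposed in this section (language-specific effects versus universal ones) can be sketched as a simple decision rule. The code below is only an illustration under stated assumptions: the function name, the labels and the placeholder language names are hypothetical, and no real cross-linguistic observations are encoded.

```python
def classify_effect(observed_in):
    """Classify one contextual effect from cross-linguistic observations.

    `observed_in` maps a language name to True or False, depending on
    whether the effect was observed in that language.
    """
    if len(observed_in) < 2:
        # A single language cannot separate linguistic from physiological causes.
        return "undetermined"
    if all(observed_in.values()):
        return "candidate universal"
    if any(observed_in.values()):
        return "candidate language-specific"
    return "not observed"


# Placeholder observations, for illustration only.
print(classify_effect({"language_1": True}))
print(classify_effect({"language_1": True, "language_2": False}))
```

The "undetermined" outcome mirrors the warning above: with data from one language only, the rule refuses to decide rather than attribute the effect to either level.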
7.5. Interpretation of coarticulation phenomena
The starting point is the principle that control over phonation and articulation is fundamentally similar to the type of control exerted over other acts. This control is the product of activity by the “coordinative structure” organizing the activity of muscle groups. These structures group muscular activity in such a way that the degrees of freedom of each muscle are limited by its belonging to the structure: these constraints can be expressed by the equations which define the parameters of the structure and the relationships between these parameters as relatively invariant. Speaking consists of providing a set of values for parameters that are not already supplied by the current state of the system. In speech, the function of the component controlled by the highest levels of representation can be considered to be the selection of appropriate equations and the adjustment of the free parameters. The interest of a direct theory seems to lie in its recognition of an underlying relative invariance. Coarticulation is then considered (Bell-Berti and Krakow, 1991) as a property of the process of co-production. Allophonic variations in the speech stream are then explained by the interaction of articulators while the essential properties of segments remain unaltered. The various theories of coarticulation in fact express fundamental differences in the appreciation of the nature of coarticulation phenomena.
7.6. Conclusion
The research of action physiologists has shown that commands governing movement do not correspond to individual central commands for each muscle, but rather to the initialization of certain groups of muscles. In their theory, these functional muscular groupings have been given the name of coordinative structures. The essential property of the coordinative structure is to be able to produce classes of equivalent movement. Invariance is not absolute but relative.
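The idea that a coordinative structure produces classes of equivalent movement, with invariance that is relative rather than absolute, can be pictured with a toy redistribution rule. The sketch below is an assumption made for illustration, not a model of measured compensation: the articulator names, the weights and the closure value are hypothetical.

```python
def distribute_closure(required_closure, weights, blocked=()):
    """Share a closing movement among the articulators that are free to move.

    `weights` gives each articulator's relative share of the task. Blocked
    articulators contribute nothing and the remaining shares are rescaled
    so that the task-level goal is still met.
    """
    free = {name: w for name, w in weights.items() if name not in blocked}
    total = sum(free.values())
    if total <= 0:
        raise ValueError("no free articulator can contribute to the task")
    moves = {name: 0.0 for name in weights}
    for name, w in free.items():
        moves[name] = required_closure * w / total
    return moves


# Hypothetical bilabial closure of 10 mm shared by three articulators.
weights = {"jaw": 0.5, "lower_lip": 0.3, "upper_lip": 0.2}
print(distribute_closure(10.0, weights))
print(distribute_closure(10.0, weights, blocked=("jaw",)))
```

In both calls the individual contributions differ while their sum stays the same, which is the sense in which the two movements belong to one class: the invariant lies in the task, not in any single articulator.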
This concept of control of activity for coordinated movement can be applied to speech production. This would involve a set of coordinative structures to organize the execution of such stereotypes as respiration mode, laryngeal adjustment, the vowel-consonant distinction, length of vocal tract, aperture, etc. The features of segments would correspond to the free parameters of the production functions of the various macroclasses. We therefore feel it necessary to define three levels:
1) a phoneme level;
2) an allophone level;
3) a phone level.
The phonological rules of each language would apply to phonemes and produce allophones. Allophones would be described by the set of properties and parameters enabling their unaltered realization in speech. Such a plan would correspond to a detailed physiological description of the speech act. Linguistic forms and rules would have to be considered not merely as mental objects but also, and especially, as conditions of speech production. Phones would correspond to the resulting manifestations in the speech stream. They derive from the operation of the mechanisms set up by the coordinative structures. Co-production is thus one of the properties of the coarticulation process.
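The first two of the three levels proposed above can be illustrated with the Quebec French example from section 7.4.2. The sketch below assumes the commonly described conditioning environment, in which /t/ is realized as [ts] before the high front vowels /i/ and /y/. The function names and the toy transcriptions are hypothetical, and the phone level, which depends on the coordinative structures, is deliberately left out.

```python
HIGH_FRONT_VOWELS = {"i", "y"}  # assumed conditioning environment


def quebec_affrication(phonemes):
    """Language-specific rule: /t/ becomes [ts] before a high front vowel."""
    allophones = []
    for index, phoneme in enumerate(phonemes):
        following = phonemes[index + 1] if index + 1 < len(phonemes) else None
        if phoneme == "t" and following in HIGH_FRONT_VOWELS:
            allophones.append("ts")
        else:
            allophones.append(phoneme)
    return allophones


def no_affrication(phonemes):
    """A variety without the rule leaves the same phonemes unchanged."""
    return list(phonemes)


# Toy phoneme sequences for "tu" and "tout".
print(quebec_affrication(["t", "y"]), quebec_affrication(["t", "u"]))
print(no_affrication(["t", "y"]))
```

The same phoneme sequence yields different allophones depending on which rule set is applied, which is the point of treating the rule as part of the linguistic system rather than as a purely physiological effect.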
Bibliography
ABBERTON, E. and FOURCIN, A. (1997), “Electrolaryngography”, in Ball, M.J. and Code, C. (Eds.), Instrumental Clinical Phonetics, London, Whurr, pp. 119-148. ABRY, C., BOË, L.J., GENTIL, M., DESCOUT, R. and GRAILLOT, P. (1979), “La géométrie des lèvres en Français. Protrusion vocalique et protrusion consonantique”, 10ièmes Journées d’études sur la parole, Grenoble, GALF, pp. 99-111. ADAM, C. and MUNRO, R.R. (1973), “The relationship between internal intercostal muscle activity and pause placement in the connected utterance of native and non-native speakers of English”, Phonetica 28, pp. 227-250. AMELOT, A. and MICHAUD, A. (2006), “Effets aérodynamiques du mouvement du velum: le cas des voyelles nasales du français”, 26èmes Journées d’études sur la parole, Dinard, IRISA, pp. 247-250. ANTHONY, J.K.F. (1982), “Breathing and Speaking”, Proceedings of the Modification of Respiration for Speech Workshop, Wetherby, British Library. ARNOLD, G.E. (1961), “Physiology and pathology of the cricothyroid muscle”, Laryngoscope 71, pp. 687-753. ATKINSON, J.E. (1978), “Correlation analysis of the physiological features controlling fundamental frequency”, Journal of the Acoustical Society of America 63, pp. 211-222.
BAER, T., ALFONSO, P. and HONDA, K. (1988), “Electromyography of the tongue muscles during vowels in /epVp/ environment”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 22, pp. 7-19. BAKEN, R. (1992), “Electroglottography”, Journal of Voice 6, pp. 98-110. BAKEN, R. (1995), “Between organization and chaos: a different view of the voice”, in BELL-BERTI, F. and RAPHAEL, L. (Eds.), Producing Speech: Contemporary Issues, New York, American Institute of Physics, pp. 233-245. BALASUBRAMANIAN, T. (1972), The Phonetics of Colloquial Tamil, PhD Thesis, University of Edinburgh. BELL, A.G. (1879), “Vowel theories”, American Journal of Otology 1, pp. 163-180. BELL-BERTI, F. (1973), “The velopharyngeal mechanism: an electromyographic study”, Haskins Laboratories Status Report on Speech Research, Supplement, pp. 1-159. BELL-BERTI, F. and HIROSE, H. (1975), “Palatal activity in voicing distinctions: a simultaneous fiberoptic and electromyographic study”, Journal of Phonetics 3, pp. 69-74. BELL-BERTI, F., BAER, T., HARRIS, K.S. and NIIMI, S. (1979), “Coarticulatory effects of vowel quality on velar elevation”, Phonetica 36, pp. 187-193. BELL-BERTI, F. and KRAKOW, R.A. (1991), “Anticipatory velar lowering: a coproduction account”, Journal of the Acoustical Society of America 90, pp. 112-123. BENGUEREL, A.P. (1973), “Corrélats physiologiques de l’accent en français”, Phonetica 27, pp. 21-35. BENGUEREL, A.P. and COWAN, H.A. (1975), “Coarticulation of upper lip protrusion in French”, Phonetica 30, pp. 41-55. BENGUEREL, A.P., HIROSE, H., SAWASHIMA, M. and USHIJIMA, T. (1977), “Velar coarticulation in French: a fiberscopic study”, Journal of Phonetics 4, pp. 137-150.
BERNSTEIN, N. (1967), The Coordination and Regulation of Movement, London, Pergamon. BINAZZI, B., LANINI, B., BIANCHI, R., ROMAGNOLI, I., NERINI, M., GIGLIOTTI, F., DURANTI, R., MILIC-EMILI, J. and SCANO, G. (2006), “Breathing patterns and kinematics in normal subjects in speech, singing and loud whispering”, Acta Physiologica Scandinavica 186, pp. 233-246. BLADON, R. and CARBONARO, E. (1978), “Lateral consonants in Italian”, Journal of Italian Linguistics 3, pp. 43-55. BLAIR, C. and SMITH, A. (1986), “EMG recording in human lip muscles: can single muscles be isolated?”, Journal of Speech and Hearing Research 29, pp. 256-266. BOSMA, J. and FLETCHER, S.G. (1962), “The upper pharynx. A review, Part II, physiology”, Annals of Oto Rhinol laryngology 71, pp. 134-157. BOTHOREL, A. (1980), “Déplacements de l’os hyoïde et Fo”, in Boë, L.J., DESCOUT, R. and GUÉRIN, B. (Eds.), Larynx et parole, Grenoble, GALF, pp. 183-212. BOTHOREL, A. (1982), Etude phonétique et phonologique du breton parlé à Argol, Lille, Atelier national de reproduction des thèses. BOTHOREL, A. (1983), “Contraintes physiologiques et indices articulatoires”, Speech Communication 2, pp. 119-122. BOUHUYS, A., PROCTOR, D.F. and MEAD, J. (1966), “Kinetic aspects of singing”, Journal of Applied Physiology 28, pp. 483-496. BOYCE, S. (1990), “Coarticulatory organization for lip rounding in Turkish and English”, Journal of the Acoustical Society of America 88, pp. 2584-2595. BROWMAN, C. and GOLDSTEIN, L. (1992), “Articulatory phonology: an overview”, Phonetica 49, pp. 155-180. BROWMAN, C. and GOLDSTEIN, L. (1997), “The gestural phonology model”, in HULSTIJN, W., PETERS, F. and VAN LIESHOUT, P. (Eds.), Speech Production: Motor Control, Brain Research and Fluency Disorders, Amsterdam, Elsevier, pp. 55-71.
BUCHAILLARD, S. (2007), Activations musculaires et mouvements linguaux: Modélisation en parole naturelle et en parole pathologique, PhD Thesis, Joseph Fourier University, Grenoble. BUTCHER, A. (2004), “Fortis/Lenis revisited one more time: the aerodynamics of some oral stop contrasts in three continents”, Clinical Linguistics and Phonetics 18, pp. 547-557. BUTCHER, A. and TABAIN, M. (2004), “On the back of the tongue: dorsal sounds in Australian languages”, Phonetica 61, pp. 22-52. BUTCHER, A. and WEIHER, E. (1976), “An electropalatographic investigation of coarticulation in VCV sequences”, Journal of Phonetics 4, pp. 59-74. CARNEY, P.J. and MOLL, K.L. (1971), “A cinefluorographic investigation of fricative consonant-vowel articulation”, Phonetica 23, pp. 193-202. CARRÉ, R. (2004), “From an acoustic tube to speech production”, Speech Communication 42, pp. 227-240. CATFORD, J.C. (1977), Fundamental Problems in Phonetics, Edinburgh, Edinburgh University Press. CHAFCOULOFF, M. and MARCHAL, A. (1999), “Velopharyngeal coarticulation”, in HARDCASTLE, W.J. and HEWLETT, N. (Eds.), Coarticulation, Cambridge, Cambridge University Press, pp. 69-104. CHOMSKY, N. and HALLE, M. (1968), The Sound Pattern of English, New York, Harper & Row. CLEMENTS, G.N. (2005), “The role of features in speech sound inventories”, in CAIRNS, C. and RAIMY, E. (Eds.), Phonological Representations and Architecture, Cambridge, Mass, MIT Press, pp. 1-20. CLUMECK, H. (1976), “Patterns of soft palate movements in six languages”, Journal of Phonetics 4, pp. 337-351. COLLIER, R. (1975), “Physiological correlates of intonation patterns”, Journal of the Acoustical Society of America 58, pp. 249-255. DANILOFF, R. and HAMMARBERG, B. (1973), “On defining coarticulation”, Journal of Phonetics 1, pp. 239-248.
DE NIL, L. and ABBS, J. (1991), “Influence of speaking rate on the upper lip, lower lip and jaw peak velocity sequencing during bilabial closing movements”, Journal of the Acoustical Society of America 89, pp. 845-849. DEMOLIN, D., DELVAUX, V., METENS, T. and SOQUET, A. (2003), “Determination of the velum opening for French nasal vowels by magnetic resonance”, Journal of Voice 17, pp. 654-667. DICKSON, D.R. and DICKSON, M. (1995), Anatomical and Physiological Bases of Speech, Butterworth-Heinemann Medical. DRAPER, M.H., LADEFOGED, P. and WHITTERIDGE, D. (1959), “Respiratory muscles in speech”, Journal of Speech and Hearing Research 2, pp. 16-27. DROMEY, C. and RAMIG, L.O. (1998), “The effect of lung volume on selected phonatory and articulatory variables”, Journal of Speech and Hearing Research 41, pp. 491-502. EASTON, T.A. (1972), “On the normal use of reflexes”, American Scientist 60, pp. 591-599. ENGSTRAND, O. (1989), “Towards an electropalatographic specification of consonant articulation in Swedish”, PERILUS 10, pp. 115-156. ENGWALL, O. (2003), “Combining MRI, EMA and EPG measurements in a three-dimensional tongue model”, Speech Communication 41, pp. 303-329. ERIKSON, D., LIBERMAN, M. and NIIMI, S. (1977), “The geniohyoid and the role of the strap muscles”, Haskins Laboratories Status Report on Speech Research SR-49, pp. 103-110. ERIKSON, D., BAER, T. and HARRIS, K.S. (1983), “The role of the strap muscles in pitch lowering”, in BLESS, D.M. and ABBS, J.H. (Eds.), Vocal Fold Physiology, San Diego, College Hill Press, pp. 281-285. ESLING, J.H. (2006), “States of the glottis”, in BROWN, K. (Ed.), Encyclopedia of Language and Linguistics, Oxford, Elsevier, pp. 129-132. ETTEMA, S.L., KUEHN, D.P., PERLMAN, A.L. and ALPERIN, N. (2002), “Magnetic resonance imaging of the levator palatini during speech”, Cleft Palate Craniofacial Journal 39, pp. 130-144.
EVARTS, E.V. (1971), “Feedback and corollary discharge: a merging of the concepts”, Neurosciences Research Program Bulletin 9, pp. 86-112. FAABORG-ANDERSEN, K. (1957), “Electromyographic investigation of intrinsic laryngeal muscles in humans”, Acta Physiologica Scandinavica 41, pp. 1-149. FALCH’HUN, F. (1951), Le système consonantique du breton, Rennes, Plihon. FALCH’HUN, F. (1965), “L’énergie articulatoire des occlusives”, Phonetica 13, pp. 31-36. FARNETANI, E. (2007), “Coarticulation and connected speech processes”, in HARDCASTLE, W.J. and LAVER, J. (Eds.), A Handbook of Phonetic Science, Oxford, Blackwell, pp. 371-404. FARNETANI, E. and RECASENS, D. (1999), “Coarticulation models in recent speech production theories”, in HARDCASTLE, W.J. and HEWLETT, N. (Eds.), Coarticulation, Cambridge, Cambridge University Press, pp. 31-65. FENN, W.O. and RAHN, H. (1964), Handbook of Physiology, Respiration I, Washington, American Physiological Society. FERREIN, A. (1741), “De la formation de la voix de l’homme”, Mémoire de l’Académie royale des sciences, pp. 409-432. FINK, B.R. (1962), “Tensor mechanisms in the human larynx”, Acta Otol. Rhinol. Laryngol. 71, pp. 591-600. FINK, B.R. (1975), The Human Larynx, a Functional Study, New York, Raven Press. FOLKINS, J.W. and ZIMMERMAN, G.N. (1982), “Lip and jaw motor control during speech: responses to perturbation of lower-lip movement prior to bilabial closure”, Journal of the Acoustical Society of America 71, pp. 1225-1233. FONAGY, I. (1958), “Elektrophysiologische Beiträge zur Akzentfrage”, Phonetica 2, pp. 12-58. FOUGERON, C. (2005), “La phonologie articulatoire: une introduction”, in NGUYEN, N., WAUQUIER-GRAVELINES, S. and DURAND, J. (Eds.), Phonologie et Phonétique, Paris, Hermes, pp. 265-290.
FOURCIN, A. (1974), “Laryngographic examination of vocal fold vibration”, in Wyke, B. (Ed.), Ventilatory and Phonatory Control Systems, Oxford, Oxford University Press, pp. 315-333. FOWLER, C. (1977), Timing Control in Speech Production, Bloomington, Indiana University Linguistics Club. FOWLER, C. (1980), “Coarticulation and theories of extrinsic timing control”, Journal of Phonetics 8, pp. 113-133. FRITZELL, B. (1963), “An electromyographic study of the movements of the soft palate in speech”, Folia Phoniatrica 15, pp. 307-311. FRITZELL, B. (1969), “The velopharyngeal muscles in speech: an electromyographic and cineradiographic study”, Acta Oto Laryngologica Suppl 250, pp. 1-81. GARCIA, M. (1855), “Observations on the human voice”, Proceedings of the Royal Society of London 7, pp. 399-410. GAUFFIN, J. and HAMMARBERG, B. (Eds.) (1991), Vocal Fold Physiology, San Diego, CA, Singular Publishing Group. GAY, T., HIROSE, H., STROME, M. and SAWASHIMA, M. (1972), “Electromyography of the intrinsic laryngeal muscles during phonation”, Annals of Oto Rhinol Laryngology, pp. 401-409. GAY, T., USHIJIMA, T., HIROSE, H. and COOPER, F. (1974), “Effect of speaking rate on labial consonant-vowel articulation”, Journal of Phonetics 2, pp. 47-63. GAY, T. (1977), “Coarticulation in some consonant-vowel and consonant cluster-vowel syllables”, in LINDBLOM, B. and ÖHMAN, S. (Eds.), Frontiers of Speech Communication Research, London, Academic Press, pp. 69-76. GENTIL, M. and BOË, L.J. (1979), Les lèvres et la parole: données anatomiques et aspects physiologiques, Grenoble, Université des Langues et Lettres. GENTIL, M. and GAY, T. (1986), “Neuromuscular specialization of the mandibular motor system: speech versus non-speech movements”, Speech Communication 5, pp. 69-82.
GÉRARD, J.M., WILHELMS-TRICARICO, R., PERRIER, P. and PAYAN, Y. (2003), “A 3D dynamical biomechanical tongue model to study speech motor control”, Recent Research and Developments in Biomechanics 1, pp. 49-64. GIOT, J. (1977), “Etude comparative de syllabes accentuées et prétoniques du français sur les plans articulatoire et acoustique”, Travaux de l’Institut de Phonétique de Strasbourg 9, pp. 89-169. GIOVANNI, A., HEIM, C., DEMOLIN, D. and TRIGLIA, J.M. (2000), “Estimated subglottal pressure in normal and dysphonic subjects”, Annals of Oto Rhinol Laryngology, pp. 500-504. GIOVANNI, A., OUAKNINE, M. and TRIGLIA, J.M. (1999), “Determination of the largest exponents of the vocal signal: application to unilateral laryngeal paralysis”, Journal of Voice 13, pp. 341-354. GOLDSMITH, J. (1976), Autosegmental Phonology, Cambridge, Indiana University Linguistics Club. GRILLNER, S. (1976), “Locomotion in vertebrates: central mechanisms and reflex interaction”, Physiological Review 55, pp. 247-304. GUTHRIE, M. (1948), The Classification of the Bantu languages, Oxford, Oxford University Press. HALLE, P. (1994), “Evidence for tone-specific activity of the sternohyoid muscle in modern standard Chinese”, Language and Speech 37, pp. 103-123. HAMMARBERG, B. (1976), “The metaphysics of coarticulation”, Journal of Phonetics 4, pp. 353-363. HAMMARBERG, R. (1982), “On redefining coarticulation”, Journal of Phonetics 10, pp. 123-137. HARDCASTLE, W.J. (1976), Physiology of Speech Production, London, Academic Press. HARDCASTLE, W.J. and HEWLETT, N. (1999), Coarticulation, Cambridge, Cambridge University Press. HARDCASTLE, W.J. and LAVER, J. (1997), Instrumental Clinical Phonetics, Oxford, Blackwell.
HARDCASTLE, W.J. and LAVER, J. (2007), The Handbook of Phonetic Sciences, Oxford, Blackwell. HARRINGTON, R. (1944), “A study of the mechanics of velopharyngeal closure”, Journal of Speech and Hearing Disorders 9, pp. 325-345. HARRINGTON, J., FLETCHER, J. and ROBERTS, C. (1995), “Coarticulation and the accented/unaccented distinction: evidence from jaw movement data”, Journal of Phonetics 23, pp. 305-322. HARSHMAN, R., LADEFOGED, P. and GOLDSTEIN, L. (1977), “Factor analysis of tongue shapes”, Journal of the Acoustical Society of America 62, pp. 693-707. HAWKINS, S. (1992), “An introduction to task dynamics”, in Docherty, G.J. and Ladd, D.R. (Eds.), Papers in Laboratory Phonology II. Gesture, Segment, Prosody, Cambridge, Cambridge University Press, pp. 9-25. HENKE, W. (1966), Dynamic Articulatory Model of Speech Production using Computer Simulation, PhD Thesis, MIT. HERTEGARD, S., GAUFFIN, J. and LINDESTAD, P. (1995), “A comparison of sub-glottal and intra-oral pressure measurements during phonation”, Journal of Voice 9, pp. 149-155. HINTON, V.A. and AROKIASAMY, W.M. (1997), “Maximum interlabial pressures in normal speakers”, Journal of Speech and Hearing Research 40, pp. 400-404. HIRANO, M. (1982), Clinical Examination of Voice, New York, Springer. HIROSE, H. and GAY, T. (1972), “The activity of the intrinsic laryngeal muscles in voicing control: an electromyographic study”, Phonetica 25, pp. 140-164. HIROSE, H., SIMADA, Z. and FUJIMURA, O. (1970), “An electromyographic study of the activity of the laryngeal muscles during speech utterances”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 4, pp. 9-25. HIXON, T.J., MEAD, J. and GOLDMAN, M.D. (1976), “Kinematics of the chest wall during speech production: function of the thorax, rib cage, abdomen, and lung”, Journal of Speech and Hearing Research 19, pp. 297-356.
HOCKETT, C. (1955), A Manual of Phonology, Baltimore, Waverly. HOMBERT, J.M., OHALA, J.J. and EWAN, W.G. (1979), “Phonetic explanations for the development of tones”, Language 55, pp. 37-58. HONDA, K. (1983), “Variability analysis of laryngeal muscle activities”, in TITZE, I.R. and SCHERER, R. (Eds.), Vocal Fold Physiology: Biomechanics, Acoustics, and Phonatory Control, Denver, The Denver Center for the Performing Arts, pp. 286-297. HONDA, K. (1988), “Various laryngeal mechanisms in controlling the voice fundamental frequency”, Journal of the Acoustical Society of America 84, pp. S82. HONDA, K. (1996), “Organization of tongue articulation for vowels”, Journal of Phonetics 23, pp. 39-52. HONDA, K. and FUJINU, A. (2000), “Articulatory compensation and adaptation for unexpected palate shape perturbation”, 6th International Congress of Spoken language Processing, Beijing, pp. 170-173. HONDA, K., HIRAI, H., ESTILL, J. and TOKHURA, Y. (1995), “Contribution of vocal tract shape to voice quality: MRI data and articulatory modeling”, in FUJIMURA, O. and HIRANO, M. (Eds.), Vocal Fold Physiology, San Diego, CA, Singular Publishing Group, pp. 23-38. HONDA, K., HIRAI, H., MASAKI, S. and SHIMADA, Y. (1999), “Role of vertical larynx movement and cervical lordosis in fo control”, Language and Speech 42, pp. 401-411. HONDA, K., KURITA, T., KAKITA, Y. and MAEDA, S. (1995), “Physiology of the lips and modeling of lip gestures”, Journal of Phonetics 23, pp. 243-254. HONDA, K., TAKEMOTO, H., KITAMURA, T., FUJITA, F. and TAKANO, S. (2004), “Exploring human speech production mechanisms by MRI”, IEEE Transactions, Information and Systems E87-D, pp. 1050-1058. HONG, K.H., KIM, H.H. and KIM, Y.H. (2001), “The role of the pars recta and pars obliqua of the cricothyroid muscle in speech production”, Journal of Voice 15, pp. 512-518.
HOOLE, P., NGUYEN, N. and HARDCASTLE, W.J. (1993), “A comparative investigation of coarticulation in fricatives: electropalatographic, electromagnetic and acoustic data”, Language and Speech 36, pp. 235-260. HOOLE, P. and KROOS, K. (1998), “Control of larynx height in vowel production”, 5th International Conference of Spoken Language Processing, Sydney, Australian Speech Science and Technology Association, pp. 531-534. HORIGUCHI, S. and BELL-BERTI, F. (1987), “The velotrace: a device for monitoring velar position”, Cleft Palate Journal 24, pp. 104-111. HOSHIKO, M.S. (1960), “Sequence of action of breathing muscles during speech”, Journal of Speech and Hearing Research 3, pp. 291-297. HOSHIKO, M.S. (1962), “Electromyographic investigation of the intercostal muscles during speech”, Archives of Physical Medical Rehabilitation 43, pp. 115-119. HUBER, J.E. (2007), “Effect of cues to increased sound pressure level on respiratory kinematic patterns during connected speech”, Journal of Speech, Language and Hearing Research 50, pp. 621-634. ISKAROUS, K. (2005), “Patterns of tongue movement”, Journal of Phonetics 33, pp. 363-383. ISSHIKI, N. (1964), “Regulatory mechanism of voice intensity variations”, Journal of Speech and Hearing Research 7, pp. 17-29. ISSHIKI, N. (1965), “Vocal intensity and flow rate”, Folia Phoniatrica 17, pp. 92-104. ITO, T., GOMI, H. and HONDA, M. (2000), “Model of the mechanical linkage of the upper lip-jaw articulatory coordination”, 6th International Conference on Spoken Language Processing, Beijing, pp. 889-892. JAKOBSON, R., FANT, G. and HALLE, M. (1952), Preliminaries to Speech Analysis: The Distinctive Features and their Correlates, Boston, MIT Press. JONES, D. (1962), An Outline of English Phonetics, Cambridge, Heffer.
KAKITA, Y. and HIKI, S. (1974), “A study of laryngeal control for voice pitch based on anatomical model”, Speech Communication Seminar, Stockholm, pp. 45-54. KATZ, W.F., MACHETANZ, J., ORTH, U. and SCHÖNLE, P. (1990), “A kinematic analysis of anticipatory coarticulation in the speech of anterior aphasic subjects using electromagnetic articulography”, Brain and Language 3, pp. 555-575. KEATING, P.A. (1983), “Comments on the jaw and syllable structure”, Journal of Phonetics 11, pp. 401-406. KEATING, P.A. (1988), “Underspecification in phonetics”, Phonology 5, pp. 275-292. KEATING, P.A. (1988), “The window model of coarticulation: articulatory evidence”, UCLA Working Papers in Phonetics 69, pp. 3-29. KELSEY, C.A., WOODHOUSE, R.J. and MINIFIE, F. (1969), “Coarticulation in the pharynx”, Journal of the Acoustical Society of America 46, pp. 1016-1018. KENT, R. and MOLL, K. (1969), “Vocal-tract characteristics of the stop cognates”, Journal of the Acoustical Society of America 46, pp. 1549-1555. KENT, R. and MINIFIE, F. (1977), “Coarticulation in recent speech production models”, Journal of Phonetics 5, pp. 115-134. KEYSER, S.J. and STEVENS, K.N. (2006), “Enhancement and overlap in the speech chain”, Language 82, pp. 33-63. KIM, C.W. (1970), “A theory of aspiration”, Phonetica 21, pp. 107-116. KIM, H., HONDA, K. and MAEDA, S. (2005), “Stroboscopic-cine MRI study of the phasing between the tongue and the larynx in the Korean three-way phonation contrast”, Journal of Phonetics 33, pp. 1-26. KITAJIMA, K. and FUJITA, F. (1990), “Estimation of sub-glottal pressure with intra-oral pressure”, Acta Otolaryngologica 109, pp. 473-478. KOZHEVNIKOV, V. and CHISTOVICH, L. (1965), Speech: Articulation and Perception, Washington DC, translated and distributed by Joint Publications Research Services.
KRAEHENMANN, A. (2007), “Non-neutralizing quantity in word-initial consonants: articulatory evidence”, 16th International Conference of Phonetic Sciences, Saarbrücken, pp. 465-468. KRAKOW, R. (1999), “Physiological organization of syllables: a review”, Journal of Phonetics 27, pp. 23-54. KUEHN, D.P. and MOON, J.B. (1998), “Velopharyngeal closure force and levator palatini activation levels in varying phonetic contexts”, Journal of Speech, Language and Hearing Research 41, pp. 51-62. KUENZEL, H.J. (1977), “Photoelektrische Untersuchung zur Velumhöhe bei Vokalen: Erste Anwendungen des Velographen”, Phonetica 34, pp. 352-370. KUMADA, M., TODD, R., BELL-BERTI, F., NIITSU, M., HIROSE, H. and NIIMI, S. (1998), “Functions of the muscles of the tongue during speech”, Journal of the Acoustical Society of America 104, pp. 1819-1820. KUSUYAMA, T., FUKUDA, H., SHIOTANI, A., NAKAGAWA, H. and KANZAKI, J. (2001), “Analysis of vocal fold vibration by x-ray stroboscopy with multiple markers”, Otolaryngology – Head and Neck Surgery 124, pp. 317-322. LABOISSIÈRE, R., OSTRY, D. and FELDMAN, A.G. (1996), “Control of multimuscle systems”, Biological Cybernetics 74, pp. 373-384. LACERDA, A.D. and HEAD, B. (1963), Analise de sons nasais e sons nasalizados do Portugues, Coimbra, University of Coimbra. LADEFOGED, P., DRAPER, M.H. and WHITTERIDGE, D. (1957), “Respiratory muscles in speech”, Journal of Speech and Hearing Research 2, pp. 16-27. LADEFOGED, P. (1962), “Subglottal activity during speech”, 4th International Congress of Phonetic Sciences, Helsinki, Mouton, pp. 73-91. LADEFOGED, P. (1964), A Phonetic Study of West African Languages, Cambridge, Cambridge University Press. LADEFOGED, P. (1967), Three Areas of Experimental Phonetics, London, Oxford University Press.
LADEFOGED, P. (1971), Preliminaries to Linguistic Phonetics, Chicago, University of Chicago Press. LADEFOGED, P. (1993), A Course in Phonetics, 3rd edition, London, Harcourt Brace Jovanovich. LADEFOGED, P. and MADDIESON, I. (1996), Sounds of the World’s Languages, Oxford, Blackwell. LADEFOGED, P. and MCKINNEY, N.P. (1963), “Loudness, sound pressure and subglottal pressure in speech”, Journal of the Acoustical Society of America 35, pp. 454-460. LAVER, J. (1991), The Gift of Speech, Edinburgh, Edinburgh University Press. LAVER, J. (2002), Principles of Phonetics, Cambridge, Cambridge University Press. LEANDERSON, R., SUNDBERG, J. and VON EULER, C. (1987), “Breathing muscle activity and subglottal pressure dynamics in singing and speech”, Journal of Voice 1, pp. 258-261. LEANDERSON, R., PERSSON, A. and OHMAN, S. (1971), “Electromyographic studies of facial muscle activity in speech”, Acta Oto Laryngologica 72, pp. 361-369. LEBRUN, Y. (1966), “Sur l’activité du diaphragme au cours de la phonation”, La Linguistique 2, pp. 71-78. LECUIT, V. and DEMOLIN, D. (1998), “The relationship between intensity and subglottal pressure with controlled pitch”, 5th International Congress of Spoken Language Processing, Sydney, Australian Acoustical Society, pp. 3079-3082. LEE, S., BYRD, D. and KRIVOKAPIĆ, J. (2006), “Functional data analysis of prosodic effects on articulatory timing”, Journal of the Acoustical Society of America 119, pp. 1666-1671. LEE, J.S., KIM, E., SUNG, M.W., KIM, K.H., SUNG, M.Y. and PARK, K.S. (2001), “A method for assessing the regional vibratory pattern of vocal folds by analyzing the video recording of stroboscopy”, Medical and Biological Engineering and Computing 39, pp. 273-278.
LEHISTE, I. (1970), Suprasegmentals, Cambridge, MA, MIT Press. LEHISTE, I., MORTON, K. and TATHAM, M. (1973), “An instrumental study of consonant gemination”, Journal of Phonetics 1, pp. 131-148. LE MUIRE, A. and OHUALLACHAIN, C. (1966), Bunchursa Foghraiochta, Dublin, Oifig an tSoláthair. LENNEBERG, E.H. (1967), Biological Foundations of Language, New York, Wiley. LEVELT, W.J.M. (1989), Speaking: From Intention to Articulation, Cambridge, MA, MIT Press. LIBERMAN, A.M., COOPER, F.S., SHANKWEILER, D. and STUDDERT-KENNEDY, M. (1967), “Perception of the speech code”, Psychological Review 74, pp. 431-461. LIEBERMAN, P. (1965), Intonation, Perception and Language, Cambridge, MIT Press. LIEBERMAN, P. (1968), “Direct comparison of subglottal and esophageal pressure during speech”, Journal of the Acoustical Society of America 43, pp. 1157-1164. LIEBERMAN, P. (1977), Speech Physiology and Acoustic Phonetics, London, Macmillan. LIM, M., LIN, E. and BONES, P. (2006), “Vowel effect on glottal parameters and the magnitude of jaw opening”, Journal of Voice 20, pp. 46-54. LINDBLOM, B. and SUNDBERG, J. (1971), “Acoustical consequences of lip, tongue, jaw, and larynx movement”, Journal of the Acoustical Society of America 50, pp. 1166-1179. LINDBLOM, B. (1990), “Explaining phonetic variation: a sketch of the H&H theory”, in HARDCASTLE, W.J. and MARCHAL, A. (Eds.), Speech Production and Speech Modelling, Dordrecht, Kluwer, pp. 403-439. LINDBLOM, B., LUBKER, J. and GAY, T. (1979), “Formant frequencies of some fixed-mandible vowels and a model of speech-motor programming by predictive simulation”, Journal of Phonetics 7, pp. 147-162.
LINDBLOM, B. and SUNDBERG, J. (2005), The Human Voice in Speech and Singing, Berlin, Springer. LINELL, P. (1982), “The concept of phonological form and the activities of speech production and speech perception”, Journal of Phonetics 10, pp. 37-72. LINKER, W. (1982), Articulatory and Acoustic Correlates of Labial Activity in Vowels: A Cross-Linguistic Study, Los Angeles, CA, UCLA Working Papers in Phonetics. LISKER, L. and ABRAMSON, A. (1964), “A cross-language study of voicing in initial stops: acoustical measurements”, Word 20, pp. 384-422. LISKER, L. and ABRAMSON, A. (1971), “Distinctive features and laryngeal control”, Language 47, pp. 767-785. LÖFQVIST, A. (2005), “Lip kinematics in long and short stop and fricative consonants”, Journal of the Acoustical Society of America 117, pp. 858-878. LÖFQVIST, A. and GRACCO, L.C. (1999), “Interarticulator programming in VCV sequences: lip and tongue movements”, Journal of the Acoustical Society of America 105, pp. 1864-1876. LUCERO, J. and LÖFQVIST, A. (2005), “Measures of articulatory variability in VCV sequences”, ARLO 6, pp. 80-84. MACKAY, I.R. (1977), “Tenseness in vowels: an ultrasonic study”, Phonetica 34, pp. 325-351. MACNEILAGE, P. and SHOLES, G. (1964), “An electromyographic study of the tongue during vowel production”, Journal of Speech and Hearing Research 7, pp. 209-232. MADDIESON, I. (1984), Patterns of Sound, Cambridge, Cambridge University Press. MAEDA, S. (1990), “Compensatory articulation during speech: evidence from the analysis and synthesis of vocal tract shapes using an articulatory model”, in HARDCASTLE, W.J. and MARCHAL, A. (Eds.), Speech Production and Speech Modelling, Dordrecht, Kluwer, pp. 131-150.
MAEDA, S. and HONDA, K. (1994), “From EMG to formant patterns of vowels: the implication of vowel systems and spaces”, Phonetica 51, pp. 17-29. MAGEN, H.S., KANG, A.M., TIEDE, M.K. and WHALEN, D.H. (2003), “Posterior pharyngeal wall position in the production of speech”, Journal of Speech, Language and Hearing Research 46, pp. 241-251. MALÉCOT, A. (1955), “An experimental study of force of articulation”, Studia Linguistica 9, pp. 35-44. MALÉCOT, A. (1966), “Mechanical pressure as an index of ‘force of articulation’”, Phonetica 14, pp. 169-180. MALÉCOT, A. (1970), “The Lenis/Fortis opposition: its physiological parameters”, Journal of the Acoustical Society of America 47, pp. 1588-1592. MANUEL, S. (1987), Acoustic and Perceptual Consequences of Vowel-to-Vowel Coarticulation in three Bantu Languages, PhD Thesis, Yale University. MANUEL, S. and KRAKOW, R. (1984), “Universal and language particular aspects of vowel-to-vowel coarticulation”, Haskins Laboratories Status Report on Speech Research 77/78, pp. 69-78. MARCHAL, A. (1976), “Quelques notions de physiologie pulmonaire appliquées à la description de l’accent d’insistance en Français”, in SÉGUINOT, A. (Ed.), L’accent d’insistance, Montreal, Didier, pp. 93-121. MARCHAL, A. (1983), “The Fortis-Lenis distinction in stops”, Speech Communication 2, pp. 111-118. MARCHAL, A. (1988), “Contrôle de la respiration dans la phonation”, Folia Phoniatrica 40, pp. 1-11. MARCHAL, A. (1988), “Co-production: Evidence from EPG data”, Speech Communication 7, pp. 287-295. MARCHAL, A. and CARTON, F. (1980), “La pression sous-glottique: mesure et relation avec l’intensité et la fréquence fondamentale”, in BOË, L.J., DESCOUT, R. and GUÉRIN, B. (Eds.), Larynx et parole, Grenoble, Université des Langues et Lettres, pp. 315-327.
MARCHAL, A. and ESPESSER, R. (1989), “L’asymétrie des appuis linguopalatins”, Journal d’Acoustique 2, pp. 53-57. MARCHAL, A., TIFFOU, E. and WARREN, R. (1977), “A propos du ‘VOT’: le cas du Bourouchaski”, Phonetica 34, pp. 40-53. MARTIN, F., THUMFART, W.F., JOLK, A. and KLINGHOLZ, F. (1990), “The electromyographic activity of the posterior cricoarytenoid muscle during singing”, Journal of Voice 4, pp. 25-29. MASAKI, S., TIEDE, M., HONDA, K., SHIMADA, Y., FUJIMOTO, I., NAKAMURA, Y. and NINOMIYA, N. (1999), “MRI-based speech production study using a synchronized sampling method”, Journal of the Acoustical Society of Japan 20, pp. 375-379. MASANOBU, K., SHINOBUTA, M., HONDA, K., SHIMADA, Y., MORI, K. and NIIMI, S. (2000), “Function of tongue-related muscles during speech: a tagging MRI movie study”, Japan Journal of Logopedics and Phoniatrics 41, pp. 170-178. MCCLEAN, M. and CLAY, J.L. (1995), “Activation of lip motor units with variations in speech rate and phonetic structure”, Journal of Speech and Hearing Research 38, pp. 772-782. MCCLEAN, M.D. (2000), “Patterns of orofacial movement velocity across variation in speech rate”, Journal of Speech, Language and Hearing Research 43, pp. 205-216. MCGLONE, R. and PROFFIT, W. (1972), “Correlations between functional lingual pressure and oral cavity size”, Cleft Palate Journal 9, pp. 229-235. MEAD, J. and BUNN, J.C. (1974), “Speech as breathing”, in WYKE, B. (Ed.), Ventilatory and Phonatory Control Systems, London, Oxford University Press, Chapter 3. MENZERATH, P. and LACERDA, A.D. (1933), Koartikulation Steuerung und Lautabgrenzung, Bonn, Ferdinand Dümmlers Verlag. MERRIFIELD, W.R. (1963), “Palantla Chinantec syllable types”, Anthropological Linguistics 5, pp. 1-16.
MEYNADIER, Y. (2003), Interaction entre prosodie et (co)articulation linguopalatale en français, Doctorate, University of Aix-Marseille 1. MINIFIE, F., ABBS, J., TARLOW, A. and KWATERSKI, M. (1974), “EMG activity within the pharynx during speech production”, Journal of Speech and Hearing Research 17, pp. 497-504. MINIFIE, F., HIXON, T.J., KELSEY, C.A. and WOODHOUSE, R.J. (1970), “Lateral pharyngeal wall movement during speech production”, Journal of Speech and Hearing Research 13, pp. 584-594. MINIFIE, F., HIXON, T.J. and WILLIAMS, F. (Eds.), (1973), Normal Aspects of Speech, Hearing and Language, Englewood Cliffs, Prentice Hall. MIYAWAKI, K. (1974), “A study on the musculature of the human tongue”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 8, pp. 23-50. MOLL, K. (1962), “Velopharyngeal closure in vowels”, Journal of Speech and Hearing Research 5, pp. 30-37. MOOSHAMMER, C., HOOLE, P. and GEUMAN, A. (2006), “Interarticulator cohesion within coronal consonant production”, Journal of the Acoustical Society of America 120, pp. 1028-1039. MÜLLER, J. (1837), “Von der Stimme und Sprache”, Handbuch der Physiologie des Menschen, Koblenz, Holscher, pp. 133-245. NAPADOW, V.J., CHEN, Q., WEDEEN, V.J. and GILBERT, R.J. (1999), “Intramural mechanics of tongue in association with physiological deformations”, Journal of Biomechanics 32, pp. 1-12. NARAYANAN, S., NAYAK, K., LEE, S., SETHY, A. and BYRD, D. (2004), “An approach to real-time magnetic resonance imaging for speech production”, Journal of the Acoustical Society of America 115, pp. 1771-1776. NAWKA, T. and ANDERS, L.C. (1996), Die auditive Bewertung heiserer Stimmen nach dem RBH-System, Stuttgart, Georg Thieme Verlag. NEISSER, U. (1976), Cognition and Reality, San Francisco, Appleton-Century-Crofts.
NGUYEN, N., HOOLE, P. and MARCHAL, A. (1994), “Regenerating the spectral shape of /s/ and /S/ from a limited set of articulatory parameters”, Journal of the Acoustical Society of America 96, pp. 33-39. NGUYEN, N., MARCHAL, A. and CONTENT, A. (1996), “Modeling tongue-palate contact patterns in the production of speech”, Journal of Phonetics 24, pp. 77-97. NI CHAISAIDE, A. (1985), Preaspiration in Phonological Stop Contrasts, PhD Thesis, University of North Wales. NI CHAISAIDE, A. and GOBL, C. (2007), “Voice source variation”, in HARDCASTLE, W.J. and LAVER, J. (Eds.), The Handbook of Phonetic Sciences, Malden, MA, Blackwell, pp. 427-461. NIIMI, S. (1979), “The pharyngeal wall movement during speech”, 9th International Congress of Phonetic Sciences, Copenhagen, pp. 204. OHALA, J. (1993), “Coarticulation and phonology”, Language and Speech 36, pp. 155-171. OHALA, J. (1996), “Connected speech in Hindi”, Arbeitsberichte, Institut für Phonetik und Digitale Sprachverarbeitung, Universität Kiel 31, pp. 75-82. OHALA, J. and HIROSE, H. (1970), “The function of the sternohyoid muscle in speech”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 4, pp. 41-44. OHALA, J. and OHALA, M. (1991), “Nasal epenthesis in Hindi”, Phonetica 48, pp. 207-220. ÖHMAN, S. (1966), “Coarticulation in VCV utterances: spectrographic measurements”, Journal of the Acoustical Society of America 39, pp. 151-168. ÖHMAN, S. (1967), “Numerical model of coarticulation”, Journal of the Acoustical Society of America 41, pp. 310-320. ÖHMAN, S., LEANDERSON, R. and PERSSON, A. (1965), “Electromyographic studies of facial muscles during speech”, Quarterly Progress and Status Report, Royal Institute of Technology 3, pp. 1-11.
PANDIT, P.B. (1957), “Nasalisation, aspiration and murmur in Gujarati”, Indian Linguistics 17, pp. 165-172. PATHASARATHY, V., STONE, M., NESSAIVER, M. and PRINCE, J.L. (2007), “Understanding tongue motion from tagged magnetic resonance images using harmonic phase MRI”, Journal of the Acoustical Society of America 106, pp. 2974-2982. PATTEE, H.H. (1973), “The physical basis and origin of hierarchical control”, in PATTEE, H.H. (Ed.), Hierarchy Theory: The Challenges of Complex Systems, New York, Braziller, pp. 71-108. PERKELL, J. (1969), Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study, Cambridge, MA, MIT Press. PERKELL, J. (1974), “A physiologically oriented model of tongue activity in speech production”, 8th International Congress of Phonetic Sciences, Leeds. PERRIER, P., PAYAN, Y., ZANDIPOUR, M. and PERKELL, J. (2003), “Influences of tongue biomechanics on speech movements during the production of velar consonants: a modeling study”, Journal of the Acoustical Society of America 114, pp. 1582-1599. PIKE, K. (1955), Phonetics, Ann Arbor, University of Michigan Press. PIQUET, J. and DECROIX, G. (1956), “Etude expérimentale peropératoire du rôle de la pression sous-glottique sur la vibration des cordes vocales”, Comptes rendus de l’Académie des Sciences, pp. 1223-1225. PLANT, R.L. and YOUNGER, R.M. (2000), “The interrelationship of subglottic air pressure, fundamental frequency, and vocal intensity during speech”, Journal of Voice 14, pp. 170-177. PODVINEK, S. (1952), “The physiology and pathology of the soft palate”, Journal of Laryngology and Otology 66, pp. 452-461. POLETTO, C.J., VERDUN, L.P., STROMINGER, R. and LUDLOW, C.L. (2004), “Correspondence between laryngeal vocal fold movement and muscle activity during speech and non-speech gestures”, Journal of Applied Physiology 97, pp. 858-866.
PORT, R.F. and O’DELL, M.L. (1985), “Neutralization of syllable-final voicing in German”, Journal of Phonetics 13, pp. 455-471. PROCTOR, D.F. (1980), Breathing, Speech and Song, New York, Springer. RASP, O., LOSHELLER, J., DOELLINGER, M., EYSHOLD, U. and HOPPE, V. (2006), “The pitch rise paradigm: a new task for real-time endoscopy of nonstationary phonation”, Folia Phoniatrica et Logopaedica 58, pp. 175-185. RECASENS, D. (1999), “Lingual coarticulation”, in HARDCASTLE, W.J. and HEWLETT, N. (Eds.), Coarticulation: Theory, Data and Techniques, Cambridge, Cambridge University Press, pp. 80-104. REIS, C. and ESPESSER, R. (2006), “Estudo Eletropalatografico de Fones Consonantais e Vocalicos do Português Brasileiro”, Estudos da Lingua (gem) 3, pp. 181-204. RIORDAN, C.J. (1977), “Control of vocal tract length in speech”, Journal of the Acoustical Society of America 62, pp. 998-1002. ROCHETTE, C.E. (1973), Les groupes de consonnes en Français: étude de l’enchaînement articulatoire à l’aide de la radiocinématographie et de l’oscillographie, Quebec, Laval University Press. ROHRER, F. (1925), “Physiologie der Atembewegung”, Handbuch der Normalen und Pathologischen Physiologie, Berlin, Springer, pp. 70-127. ROSSATO, S., TEIXEIRA, A. and FERREIRA, L. (2006), “Les nasales du portugais et du français: une étude comparative sur les données EMMA”, Journées d’études sur la parole, Dinard. ROSSI, M. (1977), “Les traits acoustiques”, La Linguistique 13, pp. 63-82. ROTHENBERG, M. (1977), “Measurement of airflow in speech”, Journal of Speech and Hearing Research 20, pp. 155-176. ROUBEAU, B., CHEVRIE MULLER, C. and ARABIA-GUIDET, C. (1987), “Electromyographic study of the changes of voice registers”, Folia Phoniatrica 39, pp. 280-289.
ROUBEAU, B., CHEVRIE MULLER, C. and GUILLY, J.L.S. (1997), “Electromyographic activity of strap and cricothyroid muscles in pitch change”, Acta Oto Laryngologica 117, pp. 459-464. SALTZMAN, E. and KELSO, S. (1987), “Skilled actions: a task dynamic approach”, Psychological Review 94, pp. 84-106. SALTZMAN, E., LÖFQVIST, A., KAY, B., KINSELLA-SHAW, J. and RUBIN, P. (1998), “Dynamics of intergestural timing: a perturbation study of lip-larynx coordination”, Experimental Brain Research 123, pp. 412-424. SALTZMAN, E. and MUNHALL, K. (1989), “A dynamical approach to gestural patterning in speech production”, Ecological Psychology 1, pp. 333-382. SANGUINETTI, V., LABOISSIÈRE, R. and OSTRY, D. (1998), “A dynamical biomechanical model for neural control of speech production”, Journal of the Acoustical Society of America 103, pp. 1615-1627. SAVARIAUX, C., PERRIER, P. and ORLIAGUET, J.P. (1995), “Compensation strategies for the perturbation of the rounded vowel /u/ using a lip tube: a study of the control space in speech production”, Journal of the Acoustical Society of America 98, pp. 2428-2442. SAWASHIMA, M., KAKITA, Y. and HIKI, S. (1973), “Activity of the extrinsic laryngeal muscles in relation to Japanese word accent”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 7, pp. 19-25. SAWASHIMA, M. and HIROSE, H. (1980), “Laryngeal gestures in speech production”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 14, pp. 29-51. SCHOENTGEN, J. (2001), “Stochastic models of jitter”, Journal of the Acoustical Society of America 109, pp. 1631-1650. SCHULMAN, R. (1989), “Articulatory dynamics of loud and normal speech”, Journal of the Acoustical Society of America 85, pp. 295-312. SCHUTTE, H. (1992), “Integrated aerodynamic measurements”, Journal of Voice 6, pp. 127-134.
SCHUTTE, H., SVEC, J.G. and FRANTISEK, S. (1998), “Application of videokymography”, Laryngoscope 108, pp. 1206-1210. SHIPP, T., DOHERTY, E.T. and MORRISSEY, P. (1979), “Predicting vocal frequency from selected physiological measures”, Journal of the Acoustical Society of America 66, pp. 678-684. SIEVERS, E. (1876), Grundzüge der Lautphysiologie zur Einführung in das Studium der indogermanischen Sprachen, Leipzig, Breitkopf und Hartel. SIMADA, Z. and HIROSE, H. (1970), “The function of the laryngeal muscles in respect to the word accent distinction”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 4, pp. 27-40. SIMADA, Z., NIIMI, S. and HIROSE, H. (1991), “On the timing of the sternohyoid muscle activity associated with accent in the Kinki dialect”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 25, pp. 39-45. SIMON, P. (1967), Les consonnes françaises, mouvements et positions articulatoires à la lumière de la radiocinématographie, Paris, Klincksieck. SJÖLANDER, P. and SUNDBERG, J. (2004), “Spectrum effects of subglottal pressure variation in professional baritone singers”, Journal of the Acoustical Society of America 115, pp. 1270-1273. SKOLNICK, M.L. (1970), “Videofluoroscopic examination of the velopharyngeal portal during phonation in lateral and base projections. A new technique for studying the mechanics”, Cleft Palate Craniofacial Journal 7, pp. 803-816. SLIFKA, J. (2003), “Respiratory constraints on speech production: starting an utterance”, Journal of the Acoustical Society of America 114, pp. 3343-3353. SLIS, I.M. and COHEN, A. (1969), “On the complex regulating the voiced-voiceless distinction”, Language and Speech 12, pp. 80-102; 137-155. SMITH, C. (1995), “Prosodic patterns in the coordination of vowel and consonant gestures”, in Connel, B. and Arvaniti, A. (Eds.), Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV, Cambridge, Cambridge University Press, pp. 205-222.
SOLÉ, M. (1992), “Phonetic and phonological processes: the case of nasalization”, Language and Speech 35, pp. 29-43. SPHRINTZEN, R., MACCALL, G., SKOLNICK, M.L. and LENCIONE, R. (1974), “A three dimensional cinefluorographic analysis of velopharyngeal closure during speech and non-speech activities in normals”, Cleft Palate Craniofacial Journal 11, pp. 418-422. STETSON, R.H. (1951), Motor Phonetics: A Study of Speech Movements in Action, 2nd edition, Amsterdam, North Holland. STEVENS, K.N. (1972), “The quantal nature of speech: evidence from articulatory-acoustic data”, in DAVIS, D. and DENES, D. (Eds.), Human Communication: A Unified View, New York, McGraw-Hill, pp. 51-66. STEVENS, K.N. (1989), “On the quantal nature of speech”, Journal of Phonetics 17, pp. 3-45. STONE, M., EPSTEIN, M. and ISKAROUS, K. (2004), “Functional segments in tongue movement”, Clinical Linguistics and Phonetics 18, pp. 507-521. STONE, M. and LUNDBERG, A. (1996), “Three-dimensional tongue surface shapes of English consonants and vowels”, Journal of the Acoustical Society of America 99, pp. 3728-3737. STONE, M. and VATIKIOTIS-BATESON, E. (1995), “Trade-offs in tongue, jaw, and palate contributions to speech production”, Journal of Phonetics 23, pp. 81-100. STRAKA, G. (1965), L’album phonétique, Quebec, Laval University Press. STRENGER, F. (1960), “Methods for direct and indirect determination of the subglottic air pressure”, Studia Linguistica 14, pp. 98-112. STRICK, H. and BOVES, L. (1992), “Control of fundamental frequency, intensity and voice quality in speech”, Journal of Phonetics 20, pp. 15-25. STRICK, H. and BOVES, L. (1995), “Downtrend in Fo and Psb”, Journal of Phonetics 23, pp. 203-220. SUNDBERG, J. (1995), “Vocal fold vibration patterns and modes of phonation”, Folia Phoniatrica 47, pp. 218-228.
SUNDBERG, J. (2003), “Research on the singing voice in retrospect”, Speech Music and Hearing Laboratory-Quarterly Progress and Status Report, Stockholm 45, pp. 11-22. SUNDBERG, J., ANDERSON, M. and HULTQVIST, C. (1999), “Effects of a subglottal pressure variation on professional baritone singers”, Journal of the Acoustical Society of America 105, pp. 1965-1971. SUNDBERG, J., JOHANSSON, C., WILLBRAND, H. and YTTERBERGH, C. (1987), “From sagittal distance to area: a study of transverse vocal tract cross sectional area”, Phonetica 44, pp. 76-90. SUSSMAN, H.M., MACNEILAGE, P. and HANSON, R.J. (1973), “Labial and mandibular dynamics during the production of bilabial consonants”, Journal of Speech and Hearing Research 16, pp. 397-420. TAKANO, S., HONDA, K., MASAKI, S., SHIMADA, Y. and FUJIMOTO, I. (2003), “Translation and rotation of the cricothyroid joint revealed by phonation-synchronized high resolution MRI”, Eurospeech, Geneva, pp. 2397-2400. TAKEMOTO, H. (2001), “Morphological analyses of the human tongue musculature for three-dimensional modelling”, Journal of Speech, Language and Hearing Research 44, pp. 95-107. TAKEMOTO, H., ADACHI, S., KITAMURA, T., MOKHTARI, P. and HONDA, K. (2006), “Acoustic roles of the laryngeal cavity in vocal tract resonance”, Journal of the Acoustical Society of America 120, pp. 2228-2238. TAKEMOTO, H., HONDA, K., MASAKI, S., SHIMADA, Y. and FUJIMOTO, I. (2006), “Measurement of temporal changes in vocal tract area function from 3D Cine-MRI data”, Journal of the Acoustical Society of America 119, pp. 1037-1049. TATHAM, M. and MORTON, K. (2006), Speech Production and Perception, New York, Palgrave-Macmillan. TIEDE, M. (1996), “An MRI-based study of pharyngeal volume contrasts in Akan and English”, Journal of Phonetics 24, pp. 399-421. TITZE, I.R. (1989), “On the relation between subglottal pressure and fundamental frequency in phonation”, Journal of the Acoustical Society of America 85, pp. 901-906.
TITZE, I.R. (1992), “Vocal efficiency”, Journal of Voice 6, pp. 135-138. TITZE, I.R. (1994), Principles of Voice Production, Englewood Cliffs, New York, Prentice Hall. TITZE, I.R., BAKEN, R.J. and HERZEL, H. (1993), “Evidence of chaos in vocal fold vibration”, in TITZE, I.R. (Ed.), Vocal Fold Physiology, San Diego, CA, Singular Publishing Group, pp. 143-188. TITZE, I.R., LUSCHEI, E.S. and HIRANO, M. (1989), “Role of the thyroarytenoid muscle in regulation of fundamental frequency”, Journal of Voice 3, pp. 213-224. TURVEY, M.T. (1977), “Preliminaries to a theory of action with reference to vision”, in SHAW, R. and BRANSFORD, J. (Eds.), Perceiving, Acting and Knowing: Toward an Ecological Psychology, Hillsdale, Lawrence Erlbaum, pp. 211-265. TURVEY, M.T., SHAW, R. and MACE, W. (1978), “Issues in the theory of action: degrees of freedom, coordinative structures and coalitions”, in REQUIN, J. (Ed.), Attention and Performance, Hillsdale, Lawrence Erlbaum, pp. 557-595. USHIJIMA, T. and SAWASHIMA, M. (1972), “Fiberscopic observation of velar movements during speech”, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo 6, pp. 25-38. VAISSIÈRE, J. (1988), “Prediction of velum movement from phonological specifications”, Phonetica 45, pp. 122-139. VAN DEN BERG, J. (1955), “On the role of the laryngeal ventricle in voice production”, Folia Phoniatrica 7, pp. 57-69. VAN DEN BERG, J. (1956), “Direct and indirect determination of the mean subglottic pressure”, Folia Phoniatrica 8, pp. 1-24. VAN DEN BERG, J. (1958), “Myoelastic-aerodynamic theory of voice production”, Journal of Speech and Hearing Research, pp. 227-244. VAN EIJDEN, T.M. and KOOLSTRA, J. (1998), “A model for mylohyoid muscle mechanics”, Journal of Biomechanics 31, pp. 1017-1024.
VAN EIJDEN, T.M., KORFAGE, J.A. and BRUGMAN, P. (1997), “Architecture of the human jaw-closing and jaw-opening muscles”, The Anatomical Record 248, pp. 464-474. VAN RIPER, C. and IRWIN, J.V. (1958), Voice and Articulation, Englewood Cliffs, Prentice Hall. VATIKIOTIS-BATESON, E. and OSTRY, D. (1995), “An analysis of the dimensionality of jaw motion in speech”, Journal of Phonetics 23, pp. 101-119. VAXELAIRE, B. and SOCK, R. (1996), “A cineradiographic and acoustic study of velar gestures in French consonant sequences as a function of speech rate”, 1st ESCA Tutorial and Research Workshop on Speech Production Modelling: From Control Strategies to Acoustics, Autrans, pp. 65-68. WARREN, D.W., DALSTON, R.M. and MAYO, R. (1993), “Aerodynamics of nasalization”, in HUFFMAN, M.K. and KRAKOW, R.A. (Eds.), Phonetics and Phonology, Nasals, Nasalization and the Velum, San Diego, Academic Press, pp. 119-144. WESTBURY, J. (1988), “Mandible and hyoid bone movements during speech”, Journal of Speech and Hearing Research 31, pp. 405-416. WESTBURY, J., LINDSTROM, M. and MCCLEAN, M. (2002), “Tongue and lips without jaws: a method for decoupling speech movements”, Journal of Speech, Language and Hearing Research 45, pp. 651-662. WILHELMS-TRICARICO, R. (1996), “A biomechanical and physiologically-based vocal tract model and its control”, Journal of Phonetics 24, pp. 23-28. WIOLLAND, F. (1971), “Les mouvements du maxillaire dans la chaîne parlée”, Travaux de l’Institut de Phonétique de Strasbourg 3, pp. 57-119. WOHLERT, A.B. and HAMMEN, V.L. (2000), “Lip muscle activity related to speech rate and loudness”, Journal of Speech and Hearing Research 43, pp. 1229-1239. WURM, S.A. (1972), Languages of Australia and Tasmania, The Hague, Mouton. WYKE, B.D. (1974), “Laryngeal myotatic reflexes and phonation”, Folia Phoniatrica 26, pp. 249-264.
YANAGISAWA, E. and YANAGISAWA, R. (1991), “Laryngeal photography”, Otolaryngological Clinic of North America 1, pp. 999-1022. ZAGZEBSKI, J.A. (1975), “Ultrasonic measurement of lateral pharyngeal wall motion at two levels in the vocal tract”, Journal of Speech and Hearing Research 18, pp. 308-318. ZEMLIN, W.R. (1997), Speech and Hearing Science; Anatomy and Physiology, Englewood Cliffs, NJ, Prentice Hall. ZEROUAL, C., ESLING, J.H. and CREVIER-BUCHMAN, L. (2006), “Etude des adductions/abductions totales et partielles des cordes vocales”, XXVIèmes Journées d’Etudes sur la Parole, Dinard, IRISA, pp. 549-542. ZINKIN, N.I. (1958), Les mécanismes de la parole, Moscow, Académie des sciences pédagogiques (in Russian).
Index
A
abdominal, 3, 5, 6, 7, 14
abduction, 27, 52, 60
accent, 14, 20
activity potentials, 95
adduction, 27, 53, 57, 60, 111, 113
aerodynamic, 11, 14, 51, 54, 57, 58, 97
aerophonometry, 89
affricates, 148
air channel, 126, 127, 134, 147
airflow, 15, 16, 27, 50, 53, 59, 60, 61, 62, 79, 89, 91, 97, 112, 117, 119, 120, 122, 124, 125, 141, 142, 150, 167
airway, 79
allophone, 62, 112, 155, 173, 174, 175, 177
allophonic variation, 123
alveolar, 66, 67, 70, 90, 101, 119, 120, 128, 140
alveolar ridge, 66, 67, 142, 143, 144, 147
alveolars, 113
alveoli, 3
amygdaloglossus, 70
anticipation, 90, 91, 104, 153, 154
aperture, 73, 97, 110, 114, 115, 125, 126, 131, 135, 136, 138, 177
apex, 33, 68, 80, 143, 144
apicoalveolar, 143, 144
apicodental, 143
apicolabial, 143
apico-postalveolar, 140, 143, 144
approximant, 127, 140, 141, 143, 145, 148
articulators, 64, 65, 97, 104, 110, 111, 112, 113, 114, 116, 119, 125, 127, 129, 135, 142, 143, 147, 148, 150, 154, 158, 162, 165, 166, 177
articulatory mode, 100, 125, 126, 131, 150, 151, 168
articulatory target, 129, 158
arytenoid cartilages, 27, 28, 31, 33, 38, 39, 51, 52, 61
arytenoid, 25, 27, 28, 31, 32, 33, 34, 36, 37, 38, 39, 51, 52, 56, 61
arytenoids, 29, 30, 31, 33, 37, 39, 61, 62, 121, 122, 124, 125
aspiration, 4, 5, 52, 61, 63, 64, 123, 124
assimilation, 89, 151, 154, 161, 173, 174
asynchronicity, 130
attributes, 163, 164
B
beating, 147
Bernoulli, 52, 53, 54, 57, 60
  effect, 53, 57, 122, 148
bi-dental, 143
bilabial, 99, 100, 104, 108, 111, 112, 119, 120, 142, 147
bilabials, 111, 112
Body-Cover theory, 53
breathy, 60
breathy mode, 122
breathy voice, 122
buccinator, 100, 101, 112, 113
bucco-pharyngeal cavity, 119, 126

C
cardinal vowels, 136, 137
chest voice, 18, 19, 36, 56, 57, 58
chronology, 161, 172
clicks, 119, 120, 131
coarticulation, 89, 91, 104, 115, 153, 154, 157, 158, 159, 161, 164, 170, 172, 173, 174, 175, 176, 177, 178
consonantal continuation, 89
constriction, 14, 46, 60, 63, 64, 79, 93, 100, 112, 126, 129, 134, 141, 142, 148, 149, 172
constrictives, 126, 127, 129, 131, 141, 146, 149
control mechanism, 15, 112, 161, 163, 166, 168
coordination, 15, 104, 114, 115, 129, 162, 163, 164, 170, 172
coordinative structures, 16, 112, 162, 164, 165, 166, 167, 168, 169, 170, 171, 172, 175, 177, 178
co-production, 112, 153, 158, 169, 170, 172, 175, 177, 178
corners of the mouth, 98, 99, 100, 101, 104, 110, 113, 135
coronal, 127
cricoarytenoid, 31, 33, 35, 36, 38, 51
cricoarytenoid joints, 33, 51
cricoid cartilage, 16, 25, 29, 30, 31, 32, 34, 36, 39, 43, 46, 51, 83, 85
cricothyroid muscle, 33, 35, 36, 50, 51

D, E
degrees of freedom, 115, 116, 160, 163, 164, 166, 167, 177
depressor labii inferioris, 100, 101, 111, 112, 113
diaphragm, 3, 4, 6
digastric, 44
digastricus, 46, 109
diphthongs, 138, 139
diplophonia, 57
dorsal muscles, 7
dorsopalatal, 145
dorsoprepalatal, 145
dorso-uvular, 146
dorsovelar, 90, 146
dorsum, 68, 69, 70, 72, 75, 77, 89, 114, 127, 143, 145, 146, 147
duration, 4, 11, 56, 111, 123, 129, 130, 131, 137, 138, 149, 150, 158, 161, 167
egressive, 61, 117, 119, 120, 123
ejectives, 119
electro-glottography, 51
electromagnetometry, 98, 114
electromyography, 51, 89, 94, 98, 150, 157
electropalatography, 98, 149, 151
endoscopy, 51, 80
epiglottis, 24, 25, 26, 28, 32, 34, 38, 39, 43, 50, 63, 68, 72, 83, 93, 127, 128, 145, 146
exoexolabial, 142
explosion, 91
explosive, 119
external intercostals, 6, 7, 13, 15, 168

F, G, H
f0, 15, 16, 18, 19, 36, 38, 39, 44, 46, 48, 58, 60
falsetto register, 57
feedback, 17, 166, 169
flaps, 131, 147, 148
flow, 5
fMRI, 129
fortis, 123, 141, 150, 151
fricative, 61, 63, 142, 148, 149, 158
fricatives, 44, 63, 70, 91, 93, 101, 111, 119, 125, 127, 145, 146, 148
fundamental frequency, 19, 116, 124, 139
geminates, 149, 150
genioglossus, 44, 46, 69, 70, 72, 73, 75, 77
geniohyoid, 44, 46, 109
gestures, 116, 123, 126, 127, 129, 130, 133, 147, 148, 153, 158, 160, 165, 169, 170, 172, 175
glottal, 14, 18, 19, 27, 35, 37, 51, 52, 53, 55, 58, 59, 60, 62, 63, 64, 93, 122, 124, 125, 127, 128, 147
glottal closure, 62, 125
glottal fry, 124
glottal occlusion, 60, 122, 147
glottal stop, 62, 125
glottis, 16, 18, 25, 27, 29, 33, 39, 51, 52, 53, 57, 59, 60, 62, 63, 93, 121, 122, 125, 128, 133, 146
hard palate, 66, 67, 69, 79, 81, 93, 120, 127, 128, 145
held phase, 46, 48, 61, 93, 112, 119, 121, 123, 141, 149, 150
homorganic, 89, 90, 91, 113, 119, 122, 123
hyoglossus, 44, 46, 69, 72, 73, 77
hyoid bone, 23, 34, 43, 44, 45, 46, 48, 53, 65, 67, 69, 72, 73, 75, 77, 84, 94, 106, 109, 116, 162

I, J
implosives, 119
inferior constrictor, 83, 84, 86
inferior longitudinal, 69, 70, 72, 73
ingressive, 1, 119, 120
intensity, 4, 15, 16, 18, 19, 36, 38, 39, 56, 58, 112
interarytenoid, 25, 33, 35, 39, 52
interarytenoid muscle, 39
interlabial pressure, 111
internal intercostals, 7, 11, 13, 14, 15
internal pterygoid, 99, 107, 108, 111
interneuronal, 162
intra-oral pressure, 15, 17, 91, 94, 111, 112, 119, 130, 141, 150
invariance, 155, 165, 166, 171, 175, 177
invariant, 155, 157, 161, 162, 165, 167, 171, 177
invariants, 169
jaw, 43, 46, 48, 65, 69, 95, 98, 100, 101, 106, 107, 108, 109, 111, 112, 114, 115, 116, 130, 135, 157, 159, 162, 168, 170
jitter, 59

L
labial projection, 147
labiality, 104, 110, 135, 176
labialization, 97
labioalveolar, 112
labiodental, 100, 104, 112, 143, 148
labiopalatal, 112, 113
labiopalatals, 113
labiovelar, 112, 113
labiovelars, 113
laminal, 60, 122, 127, 144
laminoalveolar, 149
laminodental, 144
laminodentoalveolar, 144
laminopostalveolar, 144
laryngeal
  adjustment, 18, 57, 58, 59, 121, 167, 177
  tension, 15, 19
laryngealization, 60, 61, 122, 124
larynx, 1, 14, 18, 23, 24, 25, 26, 28, 30, 32, 34, 35, 36, 39, 42, 43, 45, 46, 48, 50, 53, 57, 58, 59, 65, 77, 79, 81, 85, 86, 88, 91, 92, 93, 94, 95, 106, 109, 116, 119, 121, 123, 124, 125, 127, 146, 162, 164, 168, 172
lateral
  articulation, 129
  cricoarytenoid, 31, 33, 35, 37, 52
    muscle, 31, 35, 37
  pterygoid, 107
length, 11, 17, 28, 35, 37, 50, 53, 54, 56, 57, 60, 62, 63, 69, 75, 77, 79, 80, 91, 92, 93, 122, 125, 138, 149, 150, 167, 168, 170, 177
lenis, 123, 141, 150, 151
levator anguli oris, 99, 101, 104, 111, 112, 113
levator labii superioris alaeque nasi, 100
levator palatini, 80, 94
levator veli palatini, 80, 81, 88, 92, 94
lips, 53, 56, 58, 65, 90, 97, 98, 99, 100, 101, 104, 110, 111, 112, 113, 114, 115, 116, 125, 127, 128, 130, 133, 135, 137, 142, 143, 147, 150, 151, 153, 168, 170, 172
lower
  jaw, 106
  lingual, 44
  maxillary, 106, 127
  pharyngeal constrictor, 46
lungs, 1, 3, 4, 5, 6, 7, 10, 11, 13, 17, 23, 30, 50, 52, 59, 60, 117, 167, 168

M
maintained, 10, 90
major zygomatic, 100, 101, 104, 113
mandible, 45, 46, 75, 101, 104, 106, 107, 108, 109, 115
masseter, 99, 101, 107, 108, 111, 157
mentalis, 99, 101
middle
  constrictor, 83, 85
  pharyngeal constrictor, 44
minor zygomatic, 100, 104, 113
modal voice, 59, 60, 122
motor units, 163
MRI, 48, 51, 80, 89, 114
murmur, 60, 61, 122, 124, 141, 168
mylohyoid, 44, 46, 109

N, O, P
nasal airflow, 89
nasal cavities, 23, 66, 79, 83, 89, 125, 140, 141
nasality, 88, 89, 95, 110, 153, 176
nasalization, 73, 79, 81, 83, 89, 90, 91, 140, 154
occlusion, 93, 111, 126, 127
occlusives, 93
oesophagus, 4, 17, 23, 83
oral cavity, 16, 23, 65, 66, 67, 68, 69, 79, 83, 88, 90, 119, 125, 135, 140
orbicularis, 99, 101, 104, 111, 112, 113
orbicularis oris, 99
output, 11, 15, 18, 23, 53, 58, 77, 112
output of air, 11, 58
palatal vault, 79, 135
palatoglossus, 73, 81, 88, 94
palatography, 151
palatopharyngeus, 81, 85, 94
perception, 19, 20, 89, 155, 156, 157, 159, 173, 175
perseveration, 90, 104, 153, 154
pharyngeal, 44, 46, 63, 65, 67, 68, 73, 78, 80, 81, 83, 84, 85, 86, 88, 89, 91, 92, 93, 94, 95
pharyngeal wall, 63, 65, 67, 68, 80, 81, 85, 88, 93, 94, 95, 141, 146
pharyngealization, 93, 140, 146
pharyngoglossus, 70
pharyngo-pharyngeal, 146
pharyngo-staphyline muscle, 43, 94
pharynx, 1, 16, 23, 43, 46, 65, 68, 70, 77, 78, 79, 81, 83, 84, 85, 86, 88, 91, 92, 93, 94, 95, 127, 128, 140, 141, 146
phonation, 1, 10, 11, 13, 15, 16, 23, 25, 28, 37, 39, 50, 52, 59, 60, 64, 80, 88, 93, 99, 150, 163, 167, 177
phonatory
  disorder, 124
  modes, 117, 121, 139, 151
phonetic features, 63, 64, 153, 155, 158, 163
physiological effort, 19, 20
piriform sinuses, 88, 91
place of articulation, 88, 111, 112, 126, 127, 133, 135, 148, 154, 175
platysma, 101, 111
plosion, 148
plosive, 18, 44, 46, 61, 62, 63, 65, 69, 70, 80, 88, 89, 90, 91, 100
post-alveolar, 120
posterior cricoarytenoid muscle, 31, 33, 36
post-palatal, 128
pre-palatal, 128, 140
protrusion, 31, 70, 75, 99, 100, 101, 107, 108, 110, 111, 113, 116, 147, 153, 158, 166, 168
Q, R
quantity, 1, 10, 11, 13, 58, 138, 149, 166
radicopharyngeal, 146
radiocinematography, 93
rate, 5, 15, 60, 61, 80
release, 61, 64, 69, 90, 91, 100, 101, 111, 119, 120, 123, 124, 129, 131, 147, 148, 149
respiration, 1, 2, 4, 5, 6, 7, 10, 11, 13, 15, 18, 20, 52, 60, 89, 163, 167, 177
retraction, 48, 75, 80, 93, 109
retroflexion, 140
rib cage, 1, 2, 3, 4, 5, 14, 168
risorius, 100, 101, 104, 113
root, 43, 45, 68, 69, 70, 73, 83, 88, 127, 141, 143, 146
  of the tongue, 43, 45, 70, 73, 83, 88

S
salpingopharyngeus, 85
semi-consonants, 131, 142, 148
semivowels, 124
shimmer, 59
soft palate, 43, 66, 67, 70, 78, 80, 81, 83, 85, 90, 146, 170
speech rate, 161
sternothyroid, 24, 48
stop consonants, 101, 111, 112, 119, 121, 122, 123, 124, 170
stops, 4, 62, 69, 93, 99, 101, 104, 111, 112, 119, 121, 122, 123, 124, 125, 126, 130, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 169, 170
strength of articulation, 126, 131
stroboscopy, 51
styloglossus, 69, 70, 72, 73, 75, 77
stylopharyngeus, 43, 81, 85
subglottal pressure, 11, 15, 16, 17, 18, 19, 20, 50, 52, 53, 57, 58, 60, 63, 64, 121, 122, 130, 141, 167, 168
sublaminopostalveolar, 144
sublamino-prepalatal, 140, 144
superior
  constrictor, 70, 83, 94
  longitudinal, 69, 70
suprahyoid, 46, 101, 106, 107, 109
syllable, 13, 14, 15, 20, 37, 46, 90, 91, 123, 134, 137, 138, 149, 158, 161, 169, 170

T
task dynamics, 116, 172
teeth, 66, 106, 113, 127, 128, 142, 143, 166
temporal bone, 43, 73, 80
temporalis, 99, 106, 107, 108, 111
temporo-maxillary joints, 106, 107
tension, 27, 33, 35, 37, 38, 39, 42, 46, 50, 52, 53, 56, 57, 58, 60, 61, 85, 88, 89, 92, 93, 94, 95, 121, 122, 124, 131, 140, 141, 146, 149, 150, 168, 169
tensor veli palatini, 80
theory of action, 13
thoracic cage, 3
thyroarytenoid, 27, 29, 34, 35, 37, 38, 39, 58, 60
  muscle, 29, 38, 39
thyrohyoid, 24, 31, 34, 44, 48
thyroid cartilage, 24, 28, 29, 30, 31, 32, 34, 36, 38, 39, 43, 46, 48, 51, 56, 81, 84, 85
timing, 64, 89, 116, 123, 126, 161, 172, 174
tip, 69, 72, 73, 75, 77, 113, 114, 127, 130, 140, 143, 144, 147
  of the tongue, 69, 72, 143, 147
tones, 18, 139, 140
tongue, 23, 32, 43, 46, 48, 53, 67, 68, 69, 70, 72, 73, 75, 77, 79, 81, 83, 86, 88, 89, 92, 93, 94, 95, 97, 106, 112, 114, 115, 116, 120, 127, 128, 129, 130, 131, 135, 136, 140, 141, 143, 144, 145, 146, 147, 150, 151, 153, 162, 168, 170, 172
  blade, 127, 128, 144
  root, 53, 68, 88, 94, 95
tonogenesis, 140
transglottal pressure, 15, 19, 46, 52, 122, 138
translation, 36, 154, 155, 156, 159, 160, 161
transverse muscle, 39, 69, 70
triangularis, 99
triphthongs, 139

U, V, W, X, Z
ultrasound, 75, 89, 95, 98, 115
upper lingual, 44
uvula, 67, 79, 80, 83, 127, 128, 146, 147
variability, 59, 98, 114, 117, 138, 150, 153, 154, 155, 158, 161, 162, 165, 175
velum, 43, 66, 67, 73, 79, 80, 81, 86, 88, 89, 91, 93, 95, 127, 128, 130, 140, 141, 145, 162, 172
ventricles of Morgagni, 27
vertical muscle, 70
vestibular, 26, 27, 57, 91
vestibular folds, 146, 147
vocal
  cords, 50
  folds, 19, 26, 27, 28, 33, 34, 35, 36, 37, 38, 39, 42, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 85, 93, 95, 122, 124, 125, 130, 138, 147, 167, 168
  fry, 61, 124
  ligament, 27, 29, 53, 57, 61
  processes, 27, 28, 36, 37, 51
  tract, 14, 17, 19, 58, 59, 64, 65, 69, 75, 79, 92, 94, 115, 116, 119, 125, 126, 127, 147, 148, 157, 160, 162, 167, 168, 169, 170, 177
vocalis, 27, 29, 35, 60
voice
  onset time, 63, 123
  register, 19
voicelessness, 59, 60, 62, 122, 124, 125
voicing, 37, 38, 46, 48, 60, 61, 63, 122, 123, 124, 151, 176
VOT, 63, 64, 123, 129
vowel
  harmony, 141, 175
  timbre, 138
whisper, 60, 62, 122, 125, 168
X-ray, 75, 80, 89, 98, 141, 149
zygomatic arch, 108