Consonants are a major class of sounds occurring in all human languages. Typologically, consonant inventories are richer than vowel inventories. Consonants have been classified according to four basic features. Airstream mechanism is one of these features and describes the direction of airflow in or out of the oral cavity. The outgoing airflow is further separated according to its origin, that is, air coming from the lungs (pulmonic) or the oral cavity (non-pulmonic). Consonants are also grouped according to their phonological voicing contrast, which can be manifested phonetically by the presence or absence of vocal fold oscillations during the oral closure/constriction phase and by the duration from an oral closure release to the onset of voicing. Place of articulation is the third feature and refers to the location at which a consonantal constriction or closure is produced in the vocal tract. Finally, manner of articulation reflects different timing and coordinated actions of the articulators closely tied to aerodynamic properties.
Susanne Fuchs and Peter Birkholz
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on each side of the general divide between generative vs. functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty. 
In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.
Christine Ericsdotter Nordgren
Speech sounds are commonly divided into two main categories in human languages: vowels, such as ‘e’, ‘a’, ‘o’, and consonants, such as ‘k’, ‘n’, ‘s’. This division is made on the basis of both phonetic and phonological principles, which is useful from a general linguistic point of view but problematic for detailed description and analysis. The main differences between vowels and consonants are that (1) vowels are sounds produced with an open airway between the larynx and the lips, at least along the midline, whereas consonants are produced with a stricture or closure somewhere along it; and (2) vowels tend to be syllabic in languages, meaning that they embody a sonorous peak in a syllable, whereas only some kinds of consonants tend to be syllabic. There are two main physical components needed to produce a vowel: a sound source, typically a tone produced by vocal fold vibration at the larynx, and a resonator, typically the upper airways. When the tone resonates in the upper airways, it gets a specific quality of sound, perceived and interpreted as a vowel quality, for example, ‘e’ or ‘a’. Which vowel quality is produced is determined by the shape of the inner space of the throat and mouth, the vocal tract shape, created by the speaker’s configuration of the articulators, which include the lips, tongue, jaw, hard and soft palate, pharynx, and larynx. Which vowel is perceived is determined by the auditory and visual input as well as by the listener’s expectations and language experience. Diachronic and synchronic studies on vowel typology show main trends in the vowel inventories in the world’s languages, which can be associated with human phonetic aptitude.
Lawrence D. Rosenblum
Research on visual and audiovisual speech information has profoundly influenced the fields of psycholinguistics, perception psychology, and cognitive neuroscience. Visual speech findings have provided some of the most important human demonstrations of our new conception of the perceptual brain as being supremely multimodal. This “multisensory revolution” has seen a tremendous growth in research on how the senses integrate, cross-facilitate, and share their experience with one another. The ubiquity and apparent automaticity of multisensory speech has led many theorists to propose that the speech brain is agnostic with regard to sense modality: it might not know or care from which modality speech information comes. Instead, the speech function may act to extract supramodal informational patterns that are common in form across energy streams. Alternatively, other theorists have argued that any common information existent across the modalities is minimal and rudimentary, so that multisensory perception largely depends on the observer’s associative experience between the streams. From this perspective, the auditory stream is typically considered primary for the speech brain, with visual speech simply appended to its processing. If the utility of multisensory speech is a consequence of a supramodal informational coherence, then cross-sensory “integration” may be primarily a consequence of the informational input itself. If true, then one would expect to see evidence for integration occurring early in the perceptual process, as well as in a largely complete and automatic/impenetrable manner. Alternatively, if multisensory speech perception is based on associative experience between the modal streams, then no constraints on how completely or automatically the senses integrate are dictated. There is behavioral and neurophysiological research supporting both perspectives.
Much of this research is based on testing the well-known McGurk effect, in which audiovisual speech information is thought to integrate to the extent that visual information can affect what listeners report hearing. However, there is now good reason to believe that the McGurk effect is not a valid test of multisensory integration. For example, there are clear cases in which responses indicate that the effect fails, while other measures suggest that integration is actually occurring. By mistakenly conflating the McGurk effect with speech integration itself, interpretations of the completeness and automaticity of multisensory integration may be incorrect. Future research should use more sensitive behavioral and neurophysiological measures of cross-modal influence to examine these issues.
One of the most fundamental problems in research on spoken language is to understand how the categorical, systemic knowledge that speakers have in the form of a phonological grammar maps onto the continuous, high-dimensional physical speech act that transmits the linguistic message. The invariant units of phonological analysis have no invariant analogue in the signal—any given phoneme can manifest itself in many possible variants, depending on context, speech rate, utterance position and the like, and the acoustic cues for a given phoneme are spread out over time across multiple linguistic units. Speakers and listeners are highly knowledgeable about the lawfully structured variation in the signal and they skillfully exploit articulatory and acoustic trading relations when speaking and perceiving. For the scientific description of spoken language understanding this association between abstract, discrete categories and continuous speech dynamics remains a formidable challenge. Articulatory Phonology and the associated Task Dynamic model present one particular proposal on how to step up to this challenge using the mathematics of dynamical systems with the central insight being that spoken language is fundamentally based on the production and perception of linguistically defined patterns of motion. In Articulatory Phonology, primitive units of phonological representation are called gestures. Gestures are defined based on linear second order differential equations, giving them inherent spatial and temporal specifications. Gestures control the vocal tract at a macroscopic level, harnessing the many degrees of freedom in the vocal tract into low-dimensional control units. Phonology, in this model, thus directly governs the spatial and temporal orchestration of vocal tract actions.
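The idea that a gesture is defined by a linear second-order differential equation can be illustrated with a minimal sketch. It assumes the critically damped mass-spring (point attractor) form that is commonly used in the Task Dynamic literature, m·x'' + b·x' + k·(x − target) = 0 with b = 2·sqrt(m·k); the abstract itself does not state the damping condition. The function name `gesture_position` and all parameter values (a constriction degree closing from 10 mm to 0 mm, stiffness 400) are hypothetical and chosen only for illustration.

```python
import math


def gesture_position(t, start, target, stiffness, mass=1.0, start_velocity=0.0):
    """Position at time t (seconds) of a critically damped point attractor.

    Solves m*x'' + b*x' + k*(x - target) = 0 with b = 2*sqrt(m*k)
    (critical damping is an assumption of this sketch).
    Closed form: x(t) = target + (A + B*t) * exp(-omega*t),
    where omega = sqrt(k/m), A = start - target, B = start_velocity + omega*A.
    """
    omega = math.sqrt(stiffness / mass)      # natural frequency in rad/s
    offset = start - target                  # initial distance from the target
    slope = start_velocity + omega * offset  # fixed by the initial velocity
    return target + (offset + slope * t) * math.exp(-omega * t)


if __name__ == "__main__":
    # Hypothetical gesture: a tract variable moves from 10 mm toward 0 mm.
    for ms in range(0, 301, 50):
        x = gesture_position(ms / 1000.0, start=10.0, target=0.0, stiffness=400.0)
        print(f"t = {ms:3d} ms   position = {x:.3f} mm")
```

In this sketch the target and the stiffness play the roles of the gesture's spatial and temporal specifications: the target fixes where the movement ends, and the stiffness fixes how quickly it gets there. With zero initial velocity the critically damped system approaches its target without overshooting.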
The morpheme was the central notion in morphological theorizing in the 20th century. It has a very intuitive appeal as the indivisible and invariant unit of form and meaning, a minimal linguistic sign. Ideally, that would be all there is to build words and sentences from. But this ideal does not appear to be entirely adequate. At least at a perhaps superficial understanding of form as a series of phonemes, and of meaning as concepts and morphosyntactic feature sets, the form and the meaning side of words are often not structured isomorphically. Different analytical reactions are possible to deal with the empirical challenges resulting from the various kinds of non-isomorphism between form and meaning. One prominent option is to reject the morpheme and to recognize conceptually larger units such as the word or the lexeme and its paradigm as the operands of morphological theory. This contrasts with various theoretical options maintaining the morpheme, terminologically or at least conceptually at some level. One such option is to maintain the morpheme as a minimal unit of form, relaxing the tension imposed by the meaning requirement. Another option is to maintain it as a minimal morphosyntactic unit, relaxing the requirements on the form side. The latter (and to a lesser extent also the former) has been understood in various profoundly different ways: association of one morpheme with several form variants, association of a morpheme with non-self-sufficient phonological units, or association of a morpheme with a formal process distinct from affixation. Variants of all of these possibilities have been entertained and have established distinct schools of thought. The overall architecture of the grammar, in particular the way that the morphology integrates with the syntax and the phonology, has become a driving force in the debate. If there are morpheme-sized units, are they pre-syntactic or post-syntactic units? 
Is the association between meaning and phonological information pre-syntactic or post-syntactic? Do morpheme-sized pieces have a specific status in the syntax? Invoking some of the main issues involved, this article draws a profile of the debate, following the term morpheme on a by-and-large chronological path from the late 19th century to the 21st century.