William R. Leben
Autosegments were introduced by John Goldsmith in his 1976 M.I.T. dissertation to represent tone and other suprasegmental phenomena. Goldsmith’s intuition, embodied in the term he created, was that autosegments constituted an independent, conceptually equal tier of phonological representation, with both tiers realized simultaneously like the separate voices in a musical score.
The analysis of suprasegmentals came late to generative phonology, even though it had been tackled in American structuralism with the long components of Harris’s 1944 article, “Simultaneous components in phonology” and despite being a particular focus of Firthian prosodic analysis. The standard version of generative phonology of the era (Chomsky and Halle’s The Sound Pattern of English) made no special provision for phenomena that had been labeled suprasegmental or prosodic by earlier traditions.
An early sign that tones required a separate tier of representation was the phenomenon of tonal stability. In many tone languages, when vowels are lost historically or synchronically, their tones remain. The behavior of contour tones in many languages also falls into place when the contours are broken down into sequences of level tones on an independent level or representation. The autosegmental framework captured this naturally, since a sequence of elements on one tier can be connected to a single element on another. But the single most compelling aspect of the early autosegmental model was a natural account of tone spreading, a very common process that was only awkwardly captured by rules of whatever sort. Goldsmith’s autosegmental solution was the Well-Formedness Condition, requiring, among other things, that every tone on the tonal tier be associated with some segment on the segmental tier, and vice versa. Tones thus spread more or less automatically to segments lacking them. The Well-Formedness Condition, at the very core of the autosegmental framework, was a rare constraint, posited nearly two decades before Optimality Theory.
One-to-many associations and spreading onto adjacent elements are characteristic of tone but not confined to it. Similar behaviors are widespread in long-distance phenomena, including intonation, vowel harmony, and nasal prosodies, as well as more locally with partial or full assimilation across adjacent segments.
The early autosegmental notion of tiers of representation that were distinct but conceptually equal soon gave way to a model with one basic tier connected to tiers for particular kinds of articulation, including tone and intonation, nasality, vowel features, and others. This has led to hierarchical representations of phonological features in current models of feature geometry, replacing the unordered distinctive feature matrices of early generative phonology. Autosegmental representations and processes also provide a means of representing non-concatenative morphology, notably the complex interweaving of roots and patterns in Semitic languages.
Later work modified many of the key properties of the autosegmental model. Optimality Theory has led to a radical rethinking of autosegmental mapping, delinking, and spreading as they were formulated under the earlier derivational paradigm.
Phonotactics is the study of restrictions on possible sound sequences in a language. In any language, some phonotactic constraints can be stated without reference to morphology, but many of the more nuanced phonotactic generalizations do make use of morphosyntactic and lexical information. At the most basic level, many languages mark edges of words in some phonological way. Different phonotactic constraints hold of sounds that belong to the same morpheme as opposed to sounds that are separated by a morpheme boundary. Different phonotactic constraints may apply to morphemes of different types (such as roots versus affixes). There are also correlations between phonotactic shapes and following certain morphosyntactic and phonological rules, which may correlate to syntactic category, declension class, or etymological origins.
Approaches to the interaction between phonotactics and morphology address two questions: (1) how to account for rules that are sensitive to morpheme boundaries and structure and (2) determining the status of phonotactic constraints associated with only some morphemes. Theories differ as to how much morphological information phonology is allowed to access. In some theories of phonology, any reference to the specific identities or subclasses of morphemes would exclude a rule from the domain of phonology proper. These rules are either part of the morphology or are not given the status of a rule at all. Other theories allow the phonological grammar to refer to detailed morphological and lexical information. Depending on the theory, phonotactic differences between morphemes may receive direct explanations or be seen as the residue of historical change and not something that constitutes grammatical knowledge in the speaker’s mind.
It has been an ongoing issue within generative linguistics how to properly analyze morpho-phonological processes. Morpho-phonological processes typically have exceptions, but nonetheless they are often productive. Such productive, but exceptionful, processes are difficult to analyze, since grammatical rules or constraints are normally invoked in the analysis of a productive pattern, whereas exceptions undermine the validity of the rules and constraints. In addition, productivity of a morpho-phonological process may be gradient, possibly reflecting the relative frequency of the relevant pattern in the lexicon. Simple lexical listing of exceptions as suppletive forms would not be sufficient to capture such gradient productivity of a process with exceptions. It is then necessary to posit grammatical rules or constraints even for exceptionful processes as long as they are at least in part productive. Moreover, the productivity can be correctly estimated only when the domain of rule application is correctly identified. Consequently, a morpho-phonological process cannot be properly analyzed unless we possess both the correct description of its application conditions and the appropriate stochastic grammatical mechanisms to capture its productivity.
The same issues arise in the analysis of morpho-phonological processes in Korean, in particular, n-insertion, sai-siot, and vowel harmony. Those morpho-phonological processes have many exceptions and variations, which make them look quite irregular and unpredictable. However, they have at least a certain degree of productivity. Moreover, the variable application of each process is still systematic in that various factors, phonological, morphosyntactic, sociolinguistic, and processing, contribute to the overall probability of rule application. Crucially, grammatical rules and constraints, which have been proposed within generative linguistics to analyze categorical and exceptionless phenomena, may form an essential part of the analysis of the morpho-phonological processes in Korean.
For an optimal analysis of each of the morpho-phonological processes in Korean, the correct conditions and domains for its application need to be identified first, and its exact productivity can then be measured. Finally, the appropriate stochastic grammatical mechanisms need to be found or developed in order to capture the measured productivity.
Diane Brentari, Jordan Fenlon, and Kearsy Cormier
Sign language phonology is the abstract grammatical component where primitive structural units are combined to create an infinite number of meaningful utterances. Although the notion of phonology is traditionally based on sound systems, phonology also includes the equivalent component of the grammar in sign languages, because it is tied to the grammatical organization, and not to particular content. This definition of phonology helps us see that the term covers all phenomena organized by constituents such as the syllable, the phonological word, and the higher-level prosodic units, as well as the structural primitives such as features, timing units, and autosegmental tiers, and it does not matter if the content is vocal or manual. Therefore, the units of sign language phonology and their phonotactics provide opportunities to observe the interaction between phonology and other components of the grammar in a different communication channel, or modality. This comparison allows us to better understand how the modality of a language influences its phonological system.
Eystein Dahl and Antonio Fábregas
Zero or null morphology refers to morphological units that are devoid of phonological content. Whether such entities should be postulated is one of the most controversial issues in morphological theory, with disagreements in how the concept should be delimited, what would count as an instance of zero morphology inside a particular theory, and whether such objects should be allowed even as mere analytical instruments.
With respect to the first problem, given that zero morphology is a hypothesis that comes from certain analyses, delimiting what counts as a zero morpheme is not a trivial matter. The concept must be carefully differentiated from others that intuitively also involve situations where there is no overt morphological marking: cumulative morphology, phonological deletion, etc.
About the second issue, what counts as null can also depend on the specific theories where the proposal is made. In the strict sense, zero morphology involves a complete morphosyntactic representation that is associated to zero phonological content, but there are other notions of zero morphology that differ from the one discussed here, such as absolute absence of morphological expression, in addition to specific theory-internal interpretations of what counts as null. Thus, it is also important to consider the different ways in which something can be morphologically silent.
Finally, with respect to the third side of the debate, arguments are made for and against zero morphology, notably from the perspectives of falsifiability, acquisition, and psycholinguistics. Of particular impact is the question of which properties a theory should have in order to block the possibility that zero morphology exists, and conversely the properties that theories that accept zero morphology associate to null morphemes.
An important ingredient in this debate has to do with two empirical domains: zero derivation and paradigmatic uniformity. Ultimately, the plausibility that zero morphemes exist or not depends on the success at accounting for these two empirical patterns in a better way than theories that ban zero morphology.
Daniel Aalto, Jarmo Malinen, and Martti Vainio
Formant frequencies are the positions of the local maxima of the power spectral envelope of a sound signal. They arise from acoustic resonances of the vocal tract air column, and they provide substantial information about both consonants and vowels. In running speech, formants are crucial in signaling the movements with respect to place of articulation. Formants are normally defined as accumulations of acoustic energy estimated from the spectral envelope of a signal. However, not all such peaks can be related to resonances in the vocal tract, as they can be caused by the acoustic properties of the environment outside the vocal tract, and sometimes resonances are not seen in the spectrum. Such formants are called spurious and latent, respectively. By analogy, spectral maxima of synthesized speech are called formants, although they arise from a digital filter. Conversely, speech processing algorithms can detect formants in natural or synthetic speech by modeling its power spectral envelope using a digital filter. Such detection is most successful for male speech with a low fundamental frequency where many harmonic overtones excite each of the vocal tract resonances that lie at higher frequencies. For the same reason, reliable formant detection from females with high pitch or children’s speech is inherently difficult, and many algorithms fail to faithfully detect the formants corresponding to the lowest vocal tract resonant frequencies.
Within the Ryukyuan branch of the Japonic family of languages, present-day Okinawan retains numerous regional variants which have evolved for over a thousand years in the Ryukyuan Archipelago. Okinawan is one of the six Ryukyuan languages that UNESCO identified as endangered. One of the theoretically fascinating features is that there is substantial evidence for establishing a high central phonemic vowel in Okinawan although there is currently no overt surface [ï]. Moreover, the word-initial glottal stop [ʔ] in Okinawan is more salient than that in Japanese when followed by vowels, enabling recognition that all Okinawan words are consonant-initial. Except for a few particles, all Okinawan words are composed of two or more morae. Suffixation or vowel lengthening (on nouns, verbs, and adjectives) provides the means for signifying persons as well as things related to human consumption or production. Every finite verb in Okinawan terminates with a mood element. Okinawan exhibits a complex interplay of mood or negative elements and focusing particles. Evidentiality is also realized as an obligatory verbal suffix.
The function of the voice organ is basically the same in classical singing as in speech. However, loud orchestral accompaniment has necessitated the use of the voice in an economical way. As a consequence, the vowel sounds tend to deviate considerably from those in speech. Male voices cluster formant three, four, and five, so that a marked peak is produced in spectrum envelope near 3,000 Hz. This helps them to get heard through a loud orchestral accompaniment. They seem to achieve this effect by widening the lower pharynx, which makes the vowels more centralized than in speech. Singers often sing at fundamental frequencies higher than the normal first formant frequency of the vowel in the lyrics. In such cases they raise the first formant frequency so that it gets somewhat higher than the fundamental frequency. This is achieved by reducing the degree of vocal tract constriction or by widening the lip and jaw openings, constricting the vocal tract in the pharyngeal end and widening it in the mouth. These deviations from speech cause difficulties in vowel identification, particularly at high fundamental frequencies. Actually, vowel identification is almost impossible above 700 Hz (pitch F5).
Another great difference between vocal sound produced in speech and the classical singing tradition concerns female voices, which need to reduce the timbral differences between voice registers. Females normally speak in modal or chest register, and the transition to falsetto tends to happen somewhere above 350 Hz. The great timbral differences between these registers are avoided by establishing control over the register function, that is, over the vocal fold vibration characteristics, so that seamless transitions are achieved.
In many other respects, there are more or less close similarities between speech and singing. Thus, marking phrase structure, emphasizing important events, and emotional coloring are common principles, which may make vocal artists deviate considerably from the score’s nominal description of fundamental frequency and syllable duration.
Sónia Frota and Marina Vigário
The syntax–phonology interface refers to the way syntax and phonology are interconnected. Although syntax and phonology constitute different language domains, it seems undisputed that they relate to each other in nontrivial ways. There are different theories about the syntax–phonology interface. They differ in how far each domain is seen as relevant to generalizations in the other domain, and in the types of information from each domain that are available to the other.
Some theories see the interface as unlimited in the direction and types of syntax–phonology connections, with syntax impacting on phonology and phonology impacting on syntax. Other theories constrain mutual interaction to a set of specific syntactic phenomena (i.e., discourse-related) that may be influenced by a limited set of phonological phenomena (namely, heaviness and rhythm). In most theories, there is an asymmetrical relationship: specific types of syntactic information are available to phonology, whereas syntax is phonology-free.
The role that syntax plays in phonology, as well as the types of syntactic information that are relevant to phonology, is also a matter of debate. At one extreme, Direct Reference Theories claim that phonological phenomena, such as external sandhi processes, refer directly to syntactic information. However, approaches arguing for a direct influence of syntax differ on the types of syntactic information needed to account for phonological phenomena, from syntactic heads and structural configurations (like c-command and government) to feature checking relationships and phase units. The precise syntactic information that is relevant to phonology may depend on (the particular version of) the theory of syntax assumed to account for syntax–phonology mapping. At the other extreme, Prosodic Hierarchy Theories propose that syntactic and phonological representations are fundamentally distinct and that the output of the syntax–phonology interface is prosodic structure. Under this view, phonological phenomena refer to the phonological domains defined in prosodic structure. The structure of phonological domains is built from the interaction of a limited set of syntactic information with phonological principles related to constituent size, weight, and eurhythmic effects, among others. The kind of syntactic information used in the computation of prosodic structure distinguishes between different Prosodic Hierarchy Theories: the relation-based approach makes reference to notions like head-complement, modifier-head relations, and syntactic branching, while the end-based approach focuses on edges of syntactic heads and maximal projections. Common to both approaches is the distinction between lexical and functional categories, with the latter being invisible to the syntax–phonology mapping. Besides accounting for external sandhi phenomena, prosodic structure interacts with other phonological representations, such as metrical structure and intonational structure.
As shown by the theoretical diversity, the study of the syntax–phonology interface raises many fundamental questions. A systematic comparison among proposals with reference to empirical evidence is lacking. In addition, findings from language acquisition and development and language processing constitute novel sources of evidence that need to be taken into account. The syntax–phonology interface thus remains a challenging research field in the years to come.
The study of second language phonetics is concerned with three broad and overlapping research areas: the characteristics of second language speech production and perception, the consequences of perceiving and producing nonnative speech sounds with a foreign accent, and the causes and factors that shape second language phonetics. Second language learners and bilinguals typically produce and perceive the sounds of a nonnative language in ways that are different from native speakers. These deviations from native norms can be attributed largely, but not exclusively, to the phonetic system of the native language. Non-nativelike speech perception and production may have both social consequences (e.g., stereotyping) and linguistic–communicative consequences (e.g., reduced intelligibility). Research on second language phonetics over the past ca. 30 years has resulted in a fairly good understanding of causes of nonnative speech production and perception, and these insights have to a large extent been driven by tests of the predictions of models of second language speech learning and of cross-language speech perception. It is generally accepted that the characteristics of second language speech are predominantly due to how second language learners map the sounds of the nonnative to the native language. This mapping cannot be entirely predicted from theoretical or acoustic comparisons of the sound systems of the languages involved, but has to be determined empirically through tests of perceptual assimilation. The most influential learner factors which shape how a second language is perceived and produced are the age of learning and the amount and quality of exposure to the second language. A very important and far-reaching finding from research on second language phonetics is that age effects are not due to neurological maturation which could result in the attrition of phonetic learning ability, but to the way phonetic categories develop as a function of experience with surrounding sound systems.