Linguistic change not only affects the lexicon and the phonology of words, it also operates on the grammar of a language. In this context, grammaticalization is concerned with the development of lexical items into markers of grammatical categories or, more generally, with the development of markers used for procedural cueing of abstract relationships out of linguistic items with concrete referential meaning. A well-known example is the English verb go in its function of a future marker, as in She is going to visit her friend. Phenomena like these are very frequent across the world’s languages and across many different domains of grammatical categories. In the last 50 years, research on grammaticalization has come up with a plethora of (a) generalizations, (b) models of how grammaticalization works, and (c) methodological refinements. On (a): Processes of grammaticalization develop gradually, step by step, and the sequence of the individual stages follows certain clines as they have been generalized from cross-linguistic comparison (unidirectionality). Even though there are counterexamples that go against the directionality of various clines, their number seems smaller than assumed in the late 1990s. On (b): Models or scenarios of grammaticalization integrate various factors. Depending on the theoretical background, grammaticalization and its results are motivated either by the competing motivations of economy vs. iconicity/explicitness in functional typology or by a change from movement to merger in the minimalist program. Pragmatic inference is of central importance for initiating processes of grammaticalization (and maybe also at later stages), and it activates mechanisms like reanalysis and analogy, whose status is controversial in the literature. Finally, grammaticalization does not only work within individual languages/varieties, it also operates across languages. 
In situations of contact, the existence of a certain grammatical category may induce grammaticalization in another language. On (c): Even though it is hard to measure degrees of grammaticalization in terms of absolute and exact figures, it is possible to determine relative degrees of grammaticalization in terms of the autonomy of linguistic signs. Moreover, more recent research has come up with criteria for distinguishing grammaticalization and lexicalization (defined as the loss of productivity, transparency, and/or compositionality of formerly productive, transparent, and compositional structures). In spite of these findings, there are still quite a number of questions that need further research. Two questions to be discussed address basic issues concerning the overall properties of grammaticalization. (1) What is the relation between constructions and grammaticalization? In the more traditional view, constructions are seen as the syntactic framework within which linguistic items are grammaticalized. In more recent approaches based on construction grammar, constructions are defined as combinations of form and meaning. Thus, grammaticalization can be seen in the light of constructionalization, i.e., the creation of new combinations of form and meaning. Even though constructionalization covers many aspects of grammaticalization, it does not exhaustively cover the domain of grammaticalization. (2) Is grammaticalization cross-linguistically homogeneous, or is there a certain range of variation? There is evidence from East and mainland Southeast Asia that there is cross-linguistic variation to some extent.
The Kiowa-Tanoan family is a small group of Native American languages of the Plains and pueblo Southwest. It comprises Kiowa, of the eponymous Plains tribe, and the pueblo-based Tanoan languages, Jemez (Towa), Tewa, and Northern and Southern Tiwa. These free-word-order languages display a number of typologically unusual characteristics that have rightly attracted attention within a range of subdisciplines and theories. One word of Taos (my construction based on Kontak and Kunkel’s work) illustrates. In tóm-múlu-wia ‘I gave him/her a drum,’ the verb wia ‘gave’ obligatorily incorporates its object, múlu ‘drum.’ The agreement prefix tóm encodes not only object number, but identities of agent and recipient as first and third singular, respectively, and this all in a single syllable. Moreover, the object number here is not singular, but “inverse”: singular for some nouns, plural for others (tóm-músi-wia only has the plural object reading ‘I gave him/her cats’). This article presents a comparative overview of the three areas just illustrated: from morphosemantics, inverse marking and noun class; from morphosyntax, super-rich fusional agreement; and from syntax, incorporation. The second of these also touches on aspects of morphophonology, the family’s three-tone system and its unusually heavy grammatical burden, and on further syntax, obligatory passives. Together, these provide a wide window on the grammatical wealth of this fascinating family.
Nora C. England
Mayan languages are spoken by over 5 million people in Guatemala, Mexico, Belize, and Honduras. There are around 30 different languages today, ranging in size from fairly large (about a million speakers) to very small (fewer than 30 speakers). All Mayan languages are endangered given that at least some children in some communities are not learning the language, and two languages have disappeared since European contact. Mayas developed the most elaborated and most widely attested writing system in the Americas (starting about 300 BC). The sounds of Mayan languages consist of a voiceless stop and affricate series with corresponding glottalized stops (either implosive or ejective) and affricates, glottal stop, voiceless fricatives (including h in some of them inherited from Proto-Maya), two to three nasals, three to four approximants, and a five-vowel system with contrasting vowel length (or tense/lax distinctions) in most languages. Several languages have developed contrastive tone. The major word classes in Mayan languages include nouns, verbs, adjectives, positionals, and affect words. The difference between transitive verbs and intransitive verbs is rigidly maintained in most languages. The two classes usually, but not always, take the same aspect markers. Intransitive verbs only indicate their subjects while transitive verbs indicate both subjects and objects. Some languages have a set of status suffixes which is different for the two classes. Positionals are a root class whose most characteristic word form is a non-verbal predicate. Affect words indicate impressions of sounds, movements, and activities. Nouns have a number of different subclasses defined on the basis of characteristics when possessed, or the structure of compounds. Adjectives are formed from a small class of roots (under 50) and many derived forms from verbs and positionals. Predicate types are transitive, intransitive, and non-verbal. 
Non-verbal predicates are based on nouns, adjectives, positionals, numbers, demonstratives, and existential and locative particles. They are distinct from verbs in that they do not take the usual verbal aspect markers. Mayan languages are head marking and verb initial; most have VOA flexible order but some have VAO rigid order. They are morphologically ergative and also have at least some rules that show syntactic ergativity. The most common of these is a constraint on the extraction of subjects of transitive verbs (ergative) for focus and/or interrogation, negation, or relativization. In addition, some languages make a distinction between agentive and non-agentive intransitive verbs. Some also can be shown to use obviation and inverse as important organizing principles. Voice categories include passive, antipassive and agent focus, and an applicative with several different functions.
Reduplication is a word-formation process in which all or part of a word is repeated to convey some form of meaning. A wide range of patterns are found in terms of both the form and meaning expressed by reduplication, making it one of the most studied phenomena in phonology and morphology. Because the form always varies, depending on the base to which it is attached, it raises many issues such as the nature of the repetition mechanism, how to represent reduplicative morphemes, and whether or not a unified approach can be proposed to account for the full range of patterns.
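The point that the reduplicated form depends on its base can be made concrete with a minimal sketch. It contrasts copying the whole base with copying only an initial stretch of it. The function names, the five-vowel inventory, and the bases taki and sulo are hypothetical illustrations and are not drawn from any particular language; the full-copy pattern is the type seen in Indonesian buku-buku ‘books.’

```python
# Minimal sketch of two reduplication patterns (illustrative assumptions only).
# VOWELS is a simplifying assumption: a plain five-vowel inventory.
VOWELS = set("aeiou")


def full_reduplication(base: str) -> str:
    """Repeat the entire base (total reduplication)."""
    return f"{base}-{base}"


def initial_reduplication(base: str) -> str:
    """Prefix a copy of the base up to and including its first vowel.

    If the base contains no vowel, the whole base is copied instead.
    """
    for i, segment in enumerate(base):
        if segment in VOWELS:
            return f"{base[: i + 1]}-{base}"
    return f"{base}-{base}"


if __name__ == "__main__":
    # 'taki' and 'sulo' are made-up bases used only to show the copying.
    for base in ("buku", "taki", "sulo"):
        print(base, full_reduplication(base), initial_reduplication(base))
```

In both functions the reduplicative element has no fixed segmental content of its own: its shape is read off the base, which is the property that raises the representational questions mentioned above.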
Chiyuki Ito and Michael J. Kenstowicz
Typologically, pitch-accent languages stand between stress languages like Spanish and tone languages like Shona, and share properties of both. In a stress language, typically just one syllable per word is accented and bears the major stress (cf. Spanish sábana ‘sheet,’ sabána ‘plain,’ panamá ‘Panama’). In a tone language, the number of distinctions grows geometrically with the size of the word. So in Shona, which contrasts high versus low tone, trisyllabic words have eight possible pitch patterns. In a canonical pitch-accent language such as Japanese, just one syllable (or mora) per word is singled out as distinctive, as in Spanish. Each syllable in the word is assigned a high or low tone (as in Shona); however, this assignment is predictable based on the location of the accented syllable. The Korean dialects spoken in the southeast Kyengsang and northeast Hamkyeng regions retain the pitch-accent distinctions that developed by the period of Middle Korean (15th–16th centuries). For example, in Hamkyeng a three-syllable word can have one of four possible pitch patterns, which are assigned by rules that refer to the accented syllable. The accented syllable has a high tone, and following syllables have low tones. Then the high tone of the accented syllable spreads up to the initial syllable, which is low. Thus, /MUcike/ ‘rainbow’ is realized as high-low-low, /aCImi/ ‘aunt’ is realized as low-high-low, and /menaRI/ ‘parsley’ is realized as low-high-high. An atonic word such as /cintallɛ/ ‘azalea’ has the same low-high-high pitch pattern as ‘parsley’ when realized alone. But the two types are distinguished when combined with a particle such as /MAN/ ‘only’ that bears an underlying accent: /menaRI+MAN/ ‘only parsley’ is realized as low-high-high-low while /cintallɛ+MAN/ ‘only azalea’ is realized as low-high-high-high. This difference can be explained by saying that the underlying accent on the particle is deleted if the stem bears an accent. 
The result is that only one syllable per word may bear an accent (similar to Spanish). On the other hand, since the accent is realized with pitch distinctions, tonal assimilation rules are prevalent in pitch-accent languages. This article begins with a description of the Middle Korean pitch-accent system and its evolution into the modern dialects, with a focus on Kyengsang. Alternative synchronic analyses of the accentual alternations that arise when a stem is combined with inflectional particles are then considered. The discussion proceeds to the phonetic realization of the contrasting accents, their realizations in compounds and phrases, and the adaptation of loanwords. The final sections treat the lexical restructuring and variable distribution of the pitch accents and their emergence from predictable word-final accent in an earlier stage of Proto-Korean.
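The Hamkyeng tone-assignment rules described above, together with the contrast with a tone language like Shona, can be sketched as a short procedure. This is a minimal illustration under stated assumptions, not an analysis taken from the article: the function names are hypothetical, accents are given as 1-based syllable positions, and the treatment of atonic words (every non-initial syllable high) is an assumption chosen to reproduce the low-high-high pattern reported for ‘azalea.’

```python
from itertools import product
from typing import List, Optional


def free_tone_patterns(n_syllables: int) -> List[str]:
    """All high/low combinations when every syllable contrasts H vs. L."""
    return ["".join(p) for p in product("HL", repeat=n_syllables)]


def hamkyeng_tones(n_syllables: int, accent: Optional[int]) -> List[str]:
    """Assign H/L tones from the accent position (1-based; None = atonic)."""
    if accent is None:
        # Assumption: an atonic word is low initially and high elsewhere.
        return ["L"] + ["H"] * (n_syllables - 1)
    tones = []
    for position in range(1, n_syllables + 1):
        if position == accent:
            tones.append("H")   # the accented syllable is high
        elif position > accent:
            tones.append("L")   # syllables after the accent are low
        elif position == 1:
            tones.append("L")   # the initial syllable stays low
        else:
            tones.append("H")   # the high tone spreads leftward
    return tones


def combine(stem_len: int, stem_accent: Optional[int],
            particle_len: int, particle_accent: int) -> List[str]:
    """Stem plus accented particle: the particle accent is deleted
    whenever the stem already bears an accent."""
    total = stem_len + particle_len
    if stem_accent is not None:
        return hamkyeng_tones(total, stem_accent)
    return hamkyeng_tones(total, stem_len + particle_accent)


if __name__ == "__main__":
    print(len(free_tone_patterns(3)))   # count of patterns for a trisyllable
    print(hamkyeng_tones(3, 1))         # /MUcike/ 'rainbow'
    print(hamkyeng_tones(3, 2))         # /aCImi/ 'aunt'
    print(hamkyeng_tones(3, 3))         # /menaRI/ 'parsley'
    print(combine(3, 3, 1, 1))          # /menaRI+MAN/
    print(combine(3, None, 1, 1))       # /cintallɛ+MAN/
```

With three syllables the free combination yields eight patterns, whereas the accent-based procedure yields only the patterns that follow from the accent position (or its absence), which is the sense in which tone assignment in a pitch-accent system is predictable.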
Timothy J. Vance
The term rendaku, sometimes translated as sequential voicing, denotes a morphophonemic phenomenon in Japanese. In a prototypical case, an alternating morpheme appears with an initial voiceless obstruent as a word on its own or as the initial element (E1) in a compound but with an initial voiced obstruent as the second element (E2) in a two-element compound. For example, the simplex word /take/ ‘bamboo’ and the compound /take+yabu/ ‘bamboo grove’ (cf. /yabu/ ‘grove’) begin with voiceless /t/, but this morpheme meaning ‘bamboo’ begins with voiced /d/ in /sao+dake/ ‘bamboo (made into a) pole’ (cf. /sao/ ‘pole’). Rendaku was already firmly established in 8th-century Old Japanese (OJ), the earliest variety for which extensive written records exist, and subsequent sound changes have made the alternations phonetically heterogeneous. Many OJ compounds with eligible E2s did not undergo rendaku, and the phenomenon remains pervasively irregular in modern Japanese. There are, however, many factors that promote or inhibit rendaku, and some of these appear to influence native-speaker behavior on experimental tasks. The best known phonological factor is Lyman’s Law, according to which rendaku does not apply to E2s that contain a non-initial voiced obstruent. Many theoretical phonologists endorse the idea that Lyman’s Law is a sub-case of the Obligatory Contour Principle, which rules out identical or similar units if they would be adjacent in some domain. Other well-known factors involve vocabulary stratum (e.g., the resistance to rendaku of recently borrowed E2s) or the morphological/semantic relationship between E2 and E1 (e.g., the resistance to rendaku of coordinate compounds). Some morphemes are idiosyncratically immune to rendaku. Other morphemes alternate but undergo rendaku in some compounds while failing to undergo it in others, even though no known factor is relevant. 
In addition, many individual compounds vary between a form with rendaku and a form without, and this variability is often not reflected in dictionary entries. Despite its irregularity, rendaku is productive in the sense that it often applies to newly created compounds. Many compounds, of course, are stored (with or without rendaku) in a speaker’s lexicon, but the fact that native speakers can apply rendaku not just to existing E2s in novel compounds but even to made-up E2s shows that rendaku as an active process is somehow incorporated into the grammar.
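The prototypical alternation and Lyman’s Law can be illustrated with a minimal sketch. It works on simplified phonemic spellings like those used above and models only two things: voicing of an E2-initial voiceless obstruent, and blocking when E2 contains a non-initial voiced obstruent. The function name and the voicing table are illustrative assumptions; the sketch deliberately ignores vocabulary stratum, compound type, lexical exceptions, and the variation described above, so it overgenerates relative to actual Japanese.

```python
# Simplified phonemic voicing pairs (an assumption of this sketch).
VOICED_COUNTERPART = {"k": "g", "s": "z", "t": "d", "h": "b"}
VOICED_OBSTRUENTS = set("bdgz")


def apply_rendaku(e1: str, e2: str) -> str:
    """Return the compound e1+e2, voicing the first segment of e2
    unless Lyman's Law blocks it."""
    initial, rest = e2[0], e2[1:]
    # Lyman's Law: a non-initial voiced obstruent in E2 blocks rendaku.
    blocked = any(segment in VOICED_OBSTRUENTS for segment in rest)
    if initial in VOICED_COUNTERPART and not blocked:
        e2 = VOICED_COUNTERPART[initial] + rest
    return f"{e1}+{e2}"


if __name__ == "__main__":
    print(apply_rendaku("sao", "take"))   # E2 begins with voiceless /t/
    print(apply_rendaku("take", "yabu"))  # E2 begins with /y/, not an obstruent
    print(apply_rendaku("kami", "kaze"))  # E2 contains non-initial /z/
```

The first call yields the form with /d/ cited above, the second leaves /yabu/ unchanged because its initial segment is not a voiceless obstruent, and the third leaves /kaze/ ‘wind’ unvoiced because of its medial /z/.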
Erich R. Round
The non–Pama-Nyungan, Tangkic languages were spoken until recently in the southern Gulf of Carpentaria, Australia. The most extensively documented are Lardil, Kayardild, and Yukulta. Their phonology is notable for its opaque, word-final deletion rules and extensive word-internal sandhi processes. The morphology contains complex relationships between sets of forms and sets of functions, due in part to major historical refunctionalizations, which have converted case markers into markers of tense and complementization and verbal suffixes into case markers. Syntactic constituency is often marked by inflectional concord, resulting frequently in affix stacking. Yukulta in particular possesses a rich set of inflection-marking possibilities for core arguments, including detransitivized configurations and an inverse system. These relate in interesting ways historically to argument marking in Lardil and Kayardild. Subordinate clauses are marked for tense across most constituents other than the subject, and such tense marking is also found in main clauses in Lardil and Kayardild, which have lost the agreement and tense-marking second-position clitic of Yukulta. Under specific conditions of co-reference between matrix and subordinate arguments, and under certain discourse conditions, clauses may be marked, on all or almost all words, by complementization markers, in addition to inflection for case and tense.
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on each side of the general divide between generative vs. functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty. 
In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.
Connectionism is an important theoretical framework for the study of human cognition and behavior. Also known as Parallel Distributed Processing (PDP) or Artificial Neural Networks (ANN), connectionism advocates that learning, representation, and processing of information in the mind are parallel, distributed, and interactive in nature. It argues for the emergence of human cognition as the outcome of large networks of interactive processing units operating simultaneously. Inspired by findings from neural science and artificial intelligence, connectionism is a powerful computational tool, and it has had profound impact on many areas of research, including linguistics. Since the beginning of connectionism, many connectionist models have been developed to account for a wide range of important linguistic phenomena observed in monolingual research, such as speech perception, speech production, semantic representation, and early lexical development in children. Recently, the application of connectionism to bilingual research has also gathered momentum. Connectionist models are often precise in the specification of modeling parameters and flexible in the manipulation of relevant variables in the model to address relevant theoretical questions; they can therefore provide significant advantages in testing mechanisms underlying language processes.
Corpus Phonology is an approach to phonology that places corpora at the center of phonological research. Some practitioners of corpus phonology see corpora as the only object of investigation; others use corpora alongside other available techniques (for instance, intuitions, psycholinguistic and neurolinguistic experimentation, laboratory phonology, the study of the acquisition of phonology or of language pathology, etc.). Whatever version of corpus phonology one advocates, corpora have become part and parcel of the modern research environment, and their construction and exploitation have been modified by the multidisciplinary advances made within various fields. Indeed, for the study of spoken usage, the term ‘corpus’ should nowadays only be applied to bodies of data meeting certain technical requirements, even if corpora of spoken usage are by no means new and coincide with the birth of recording techniques. It is therefore essential to understand what criteria must be met by a modern corpus (quality of recordings, diversity of speech situations, ethical guidelines, time-alignment with transcriptions and annotations, etc.) and what tools are available to researchers. Once these requirements are met, the way is open to varying and possibly conflicting uses of spoken corpora by phonological practitioners. A traditional stance in theoretical phonology sees the data as a degenerate version of a more abstract underlying system, but more and more researchers within various frameworks (e.g., usage-based approaches, exemplar models, stochastic Optimality Theory, sociophonetics) are constructing models that tightly bind phonological competence to language use, rely heavily on quantitative information, and attempt to account for intra-speaker and inter-speaker variation. This renders corpora essential to phonological research and not a mere adjunct to the phonological description of the languages of the world.