Noun incorporation (NI) is a grammatical construction where a nominal, usually bearing the semantic role of an object, has been incorporated into a verb to form a complex verb or predicate. Traditionally, incorporation was considered to be a word formation process, similar to compounding or cliticization. The fact that a syntactic entity (object) was entering into the lexical process of word formation was theoretically problematic, leading to many debates about the true nature of NI as a lexical or syntactic process. The analytic complexity of NI is compounded by the clear connections between NI and other processes such as possessor raising, applicatives, and classification systems and by its relation with case, agreement, and transitivity. In some cases, it was noted that no morpho-phonological incorporation is discernable beyond perhaps adjacency and a reduced left periphery for the noun. Such cases were termed pseudo noun incorporation, as they exhibit many properties of NI, minus any actual morpho-phonological incorporation. On the semantic side, it was noted that NI often correlates with a particular interpretation in which the noun is less referential and the predicate is more general. This led semanticists to group together all phenomena with similar semantics, whether or not they involve morpho-phonological incorporation. The role of cases of morpho-phonological NI that do not exhibit this characteristic semantics, i.e., where the incorporated nominal can be referential and the action is not general, remains a matter of debate. The interplay of phonology, morphology, syntax, and semantics that is found in NI, as well as its lexical overtones, has resulted in a wide range of analyses at all levels of the grammar. What all NI constructions share is that according to various diagnostics, a thematic element, usually correlating with an internal argument, functions to a lesser extent as an independent argument and instead acts as part of a predicate. 
In addition to cases of incorporation between verbs and internal arguments, there are also some cases of incorporation of subjects and adverbs, which remain less well understood.
Inflection is the systematic relation between words’ morphosyntactic content and their morphological form; as such, the phenomenon of inflection raises fundamental questions about the nature of morphology itself and about its interfaces. Within the domain of morphology proper, it is essential to establish how (or whether) inflection differs from other kinds of morphology and to identify the ways in which morphosyntactic content can be encoded morphologically. A number of different approaches to modeling inflectional morphology have been proposed; these tend to cluster into two main groups, those that are morpheme-based and those that are lexeme-based. Morpheme-based theories tend to treat inflectional morphology as fundamentally concatenative; they tend to represent an inflected word’s morphosyntactic content as a compositional summing of its morphemes’ content; they tend to attribute an inflected word’s internal structure to syntactic principles; and they tend to minimize the theoretical significance of inflectional paradigms. Lexeme-based theories, by contrast, tend to accord concatenative and nonconcatenative morphology essentially equal status as marks of inflection; they tend to represent an inflected word’s morphosyntactic content as a property set intrinsically associated with that word’s paradigm cell; they tend to assume that an inflected word’s internal morphology is neither accessible to nor defined by syntactic principles; and they tend to treat inflection as the morphological realization of a paradigm’s cells. Four important issues for approaches of either sort are the nature of nonconcatenative morphology, the incidence of extended exponence, the underdetermination of a word’s morphosyntactic content by its inflectional form, and the nature of word forms’ internal structure. The structure of a word’s inventory of inflected forms—its paradigm—is the locus of considerable cross-linguistic variation. 
In particular, the canonical relation of content to form in an inflectional paradigm is subject to a wide array of deviations, including inflection-class distinctions, morphomic properties, defectiveness, deponency, metaconjugation, and syncretism; these deviations pose important challenges for understanding the interfaces of inflectional morphology, and a theory’s resolution of these challenges depends squarely on whether that theory is morpheme-based or lexeme-based.
The Kiowa-Tanoan family is a small group of Native American languages of the Plains and pueblo Southwest. It comprises Kiowa, of the eponymous Plains tribe, and the pueblo-based Tanoan languages, Jemez (Towa), Tewa, and Northern and Southern Tiwa. These free-word-order languages display a number of typologically unusual characteristics that have rightly attracted attention within a range of subdisciplines and theories.
One word of Taos (my construction based on Kontak and Kunkel’s work) illustrates. In tóm-múlu-wia ‘I gave him/her a drum,’ the verb wia ‘gave’ obligatorily incorporates its object, múlu ‘drum.’ The agreement prefix tóm encodes not only object number, but identities of agent and recipient as first and third singular, respectively, and this all in a single syllable. Moreover, the object number here is not singular, but “inverse”: singular for some nouns, plural for others (tóm-músi-wia only has the plural object reading ‘I gave him/her cats’).
This article presents a comparative overview of the three areas just illustrated: from morphosemantics, inverse marking and noun class; from morphosyntax, super-rich fusional agreement; and from syntax, incorporation. The second of these also touches on aspects of morphophonology, the family’s three-tone system and its unusually heavy grammatical burden, and on further syntax, obligatory passives. Together, these provide a wide window on the grammatical wealth of this fascinating family.
Young-mee Yu Cho
Due to a number of unusual and interesting properties, Korean phonetics and phonology have been generating productive discussion within modern linguistic theories, starting from structuralism, moving to classical generative grammar, and more recently to post-generative frameworks of Autosegmental Theory, Government Phonology, Optimality Theory, and others. In addition, it has been discovered that a description of important issues of phonology cannot be properly made without referring to the interface between phonetics and phonology on the one hand, and phonology and morpho-syntax on the other. Some phonological issues from Standard Korean are still under debate and will likely be of value in helping to elucidate universal phonological properties with regard to phonation contrast, vowel and consonant inventories, consonantal markedness, and the motivation for prosodic organization in the lexicon.
As might be expected from the difficulty of traversing it, the Sahara Desert has been a fairly effective barrier to direct contact between its two edges; trans-Saharan language contact is limited to the borrowing of non-core vocabulary, minimal from south to north and mostly mediated by education from north to south. Its own inhabitants, however, are necessarily accustomed to travelling desert spaces, and contact between languages within the Sahara has often accordingly had a much greater impact. Several peripheral Arabic varieties of the Sahara retain morphology as well as vocabulary from the languages spoken by their speakers’ ancestors, in particular Berber in the southwest and Beja in the southeast; the same is true of at least one Saharan Hausa variety. The Berber languages of the northern Sahara have in turn been deeply affected by centuries of bilingualism in Arabic, borrowing core vocabulary and some aspects of morphology and syntax. The Northern Songhay languages of the central Sahara have been even more profoundly affected by a history of multilingualism and language shift involving Tuareg, Songhay, Arabic, and other Berber languages, much of which remains to be unraveled. These languages have borrowed so extensively that they retain barely a few hundred core words of Songhay vocabulary; those loans have not only introduced new morphology but in some cases replaced old morphology entirely. In the southeast, the spread of Arabic westward from the Nile Valley has created a spectrum of varieties with varying degrees of local influence; the Saharan ones remain almost entirely undescribed. Much work remains to be done throughout the region, not only on identifying and analyzing contact effects but even simply on describing the languages its inhabitants speak.
The central goal of the Lexical Semantic Framework (LSF) is to characterize the meaning of simple lexemes and affixes and to show how these meanings can be integrated in the creation of complex words. LSF offers a systematic treatment of issues that figure prominently in the study of word formation, such as the polysemy question, the multiple-affix question, the zero-derivation question, and the form and meaning mismatches question.
LSF has its source in a confluence of research approaches that follow a decompositional approach to meaning; it thus defines simple lexemes and affixes by way of a systematic representation, achieved via a constrained formal language that enforces consistency of annotation. Lexical-semantic representations in LSF consist of two parts: the Semantic/Grammatical Skeleton and the Semantic/Pragmatic Body (henceforth ‘skeleton’ and ‘body’ respectively). The skeleton comprises features that are of relevance to the syntax. These features act as functions and may take arguments. Functions and arguments of a skeleton are hierarchically arranged. The body encodes all those aspects of meaning that are perceptual, cultural, and encyclopedic.
Features in LSF are used in (a) a cross-categorial, (b) an equipollent, and (c) a privative way. This means that they are used to account for the distinction between the major ontological categories, may have a binary (i.e., positive or negative) value, and may or may not form part of the skeleton of a given lexeme. In order to account for the fact that several distinct parts integrate into a single referential unit that projects its arguments to the syntax, LSF makes use of the Principle of Co-indexation. Co-indexation is a device needed in order to tie together the arguments that come with different parts of a complex word to yield only those arguments that are syntactically active.
LSF has an important impact on the study of the morphology-lexical semantics interface and provides a unitary theory of meaning in word formation.
Nora C. England
Mayan languages are spoken by over 5 million people in Guatemala, Mexico, Belize, and Honduras. There are around 30 different languages today, ranging in size from fairly large (about a million speakers) to very small (fewer than 30 speakers). All Mayan languages are endangered given that at least some children in some communities are not learning the language, and two languages have disappeared since European contact. Mayas developed the most elaborate and most widely attested writing system in the Americas (starting about 300 BC).
The sounds of Mayan languages consist of a voiceless stop and affricate series with corresponding glottalized stops (either implosive or ejective) and affricates, glottal stop, voiceless fricatives (including, in some languages, h inherited from Proto-Maya), two to three nasals, three to four approximants, and a five-vowel system with contrasting vowel length (or tense/lax distinctions) in most languages. Several languages have developed contrastive tone.
The major word classes in Mayan languages include nouns, verbs, adjectives, positionals, and affect words. The difference between transitive verbs and intransitive verbs is rigidly maintained in most languages. They usually use the same aspect markers (but not always). Intransitive verbs only indicate their subjects while transitive verbs indicate both subjects and objects. Some languages have a set of status suffixes which is different for the two classes. Positionals are a root class whose most characteristic word form is a non-verbal predicate. Affect words indicate impressions of sounds, movements, and activities. Nouns have a number of different subclasses defined on the basis of characteristics when possessed, or the structure of compounds. Adjectives are formed from a small class of roots (under 50) and many derived forms from verbs and positionals.
Predicate types are transitive, intransitive, and non-verbal. Non-verbal predicates are based on nouns, adjectives, positionals, numbers, demonstratives, and existential and locative particles. They are distinct from verbs in that they do not take the usual verbal aspect markers. Mayan languages are head marking and verb initial; most have VOA flexible order but some have VAO rigid order. They are morphologically ergative and also have at least some rules that show syntactic ergativity. The most common of these is a constraint on the extraction of subjects of transitive verbs (ergative) for focus and/or interrogation, negation, or relativization. In addition, some languages make a distinction between agentive and non-agentive intransitive verbs. Some also can be shown to use obviation and inverse as important organizing principles. Voice categories include passive, antipassive and agent focus, and an applicative with several different functions.
Gemma Rigau and Manuel Pérez Saldanya
This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
Catalan is a Romance language closely related to the Gallo-Romance languages. However, from the 15th century onward, it has adopted some linguistic solutions that have brought it closer to the Ibero-Romance languages, due to close contact with Spanish.
Catalan exhibits five main dialects: Central, Northern, and Balearic, which are ascribed to the Eastern dialectal branch; and Northwestern and Valencian, which belong to the Western one. Central, Northern, and Northwestern Catalan are historical dialects that derived directly from the evolution of the Latin spoken in Old Catalonia (the Catalan-speaking territory located on both sides of the Pyrenees). Conversely, Valencian and Balearic are dialects resulting from the territorial expansion of the old Crown of Aragon in the Middle Ages.
As a Gallo-Romance language, Catalan lost all final unstressed vowels different from a (
Some of the most distinctive morphosyntactic features of Catalan are the following:
(1) Catalan is the only Romance language that exhibits a periphrastic past tense expressed by means of the verb anar “go” + infinitive (Ahir vas cantar “Yesterday you sang”). The periphrastic past coexists with a simple past (Ahir cantares “Yesterday you sang”). Conversely, Catalan does not have a periphrastic future with the movement verb go.
(2) Depending on the dialect, proper names may take the definite article (el, la) or a specific personal article (en, na from the vocative Latin forms
(3) Demonstratives show a two-term system in most Catalan dialects: aquí “here” (proximal) / allà or allí “there” (distal); but in Valencian and some Northwestern dialects there is a three-term system. In contrast with other languages with a two-term system, Catalan expresses proximity both to the speaker and to the addressee with the proximal demonstrative (Aquí on jo sóc “Here where I am”; Aquí on tu ets “There where you are”). The demonstrative systems show the same deictic properties as the movement verbs anar “go” and venir “come” in Catalan dialects.
(4) To express possession by means of a pronoun or a determiner, Catalan may use the genitive clitic en (En conec l’autor “I know its author”), the genitive personal pronoun (el nostre fill “our son”), the dative clitic (Li rento la cara “I wash his/her face”) or the definite article (Tancaré els ulls “I will close my eyes”).
(5) Existential constructions may contain the predicate haver-hi “there be,” consisting of the locative clitic hi and the verb haver “have” (Hi ha tres estudiants “There are three students”), the copulative verb ser “be” (Tres estudiants ja són aquí “Three students are already here”) or other verbs, whose behavior can be close to an unaccusative verb when preceded by the clitic hi (Aquí hi treballen forners “There are some bakers working here”).
(6) The negative polarity adverb no “not” may be reinforced by the adverbs pas or cap, in some dialects, and it can co-occur with negative polarity items (ningú “anybody/nobody,” res “anything/nothing,” mai “ever/never,” etc.). These polarity items exhibit negative agreement (No hi ha mai ningú “Nobody is ever here”). However, negative polarity items may express positive meaning in some non-declarative syntactic contexts (Si mai vens, truca’m “If you ever come, call me”).
(7) Catalan dialects are rich in yes-no interrogative and confirmative particles (que, o, oi, no, eh, etc.: (Que) plou? “Is it raining?,” Oi que plou? “It’s raining, isn’t it?”).
Morphological change refers to change(s) in the structure of words. Since morphology is interrelated with phonology, syntax, and semantics, changes affecting the structure and properties of words should be seen as changes at the respective interfaces of grammar.
On a more abstract level, this point relates to linguistic theory. Looking at the history of morphological theory, mainly from a generative perspective, it becomes evident that despite a number of papers that have contributed to a better understanding of the role of morphology in grammar, both from a synchronic and diachronic point of view, it is still seen as a “Cinderella subject” today. So there is still a need for further research in this area.
Generally, the field of diachronic morphology has been dealing with the identification of the main types of change, their mechanisms, and the causes of morphological change, the latter of which are traditionally categorized as internal and external. Some authors take a more general view and state that the locus of change can be seen in the transmission of grammar from one generation to the next (abductive change). Concerning the main types of change, we can say that many of them occur at the interfaces with morphology: changes at the phonology–morphology interface like i-mutation, changes at the syntax–morphology interface like the rise of inflectional morphology, and changes at the semantics–morphology interface like the rise of derivational suffixes. Examples from the history of English (which in this article are sometimes complemented with examples from German and the Romance languages) illustrate that changes sometimes do cross component boundaries, at least once (the linking -s in German has, in the course of its history, even become a prosodic phenomenon). Apart from these interface phenomena, it is common lore to assume morphology-internal changes, analogy being the most prominent example.
A phenomenon regularly discussed in the context of morphological change is grammaticalization. Some authors have posed the question of whether such special types of change really exist or whether they are, after all, general processes of change that should be modeled in a general theory of linguistic change. Apart from this pressing question, further aspects that need to be addressed in the future are the modularity of grammar and the place of morphology.
Some of the basic terminology for the major entities in morphological study is introduced, focusing on the word and elements within the word. This is done in a way which is deliberately introductory in nature and omits a great deal of detail about the elements that are introduced.
Phonotactics is the study of restrictions on possible sound sequences in a language. In any language, some phonotactic constraints can be stated without reference to morphology, but many of the more nuanced phonotactic generalizations do make use of morphosyntactic and lexical information. At the most basic level, many languages mark edges of words in some phonological way. Different phonotactic constraints hold of sounds that belong to the same morpheme as opposed to sounds that are separated by a morpheme boundary. Different phonotactic constraints may apply to morphemes of different types (such as roots versus affixes). There are also correlations between a morpheme’s phonotactic shape and whether it follows certain morphosyntactic and phonological rules; these correlations may track syntactic category, declension class, or etymological origin.
Approaches to the interaction between phonotactics and morphology address two questions: (1) how to account for rules that are sensitive to morpheme boundaries and structure and (2) how to determine the status of phonotactic constraints associated with only some morphemes. Theories differ as to how much morphological information phonology is allowed to access. In some theories of phonology, any reference to the specific identities or subclasses of morphemes would exclude a rule from the domain of phonology proper. These rules are either part of the morphology or are not given the status of a rule at all. Other theories allow the phonological grammar to refer to detailed morphological and lexical information. Depending on the theory, phonotactic differences between morphemes may receive direct explanations or be seen as the residue of historical change and not something that constitutes grammatical knowledge in the speaker’s mind.
Uralic languages are synthetic, agglutinative languages, overwhelmingly suffixing, and they have a rich inflectional morphology in both the nominal and the verbal domain. The Uralic family includes about 30 languages spoken in Europe and in North Eurasia and can be divided into two branches: Finno-Ugric and Samoyed languages. The differentiation of the branches and subgroups is very significant; thus, these general morphological features show a notable variation.
Agglutination is a general feature, but there are some syncretisms, fusions, and suppletions, and all languages have postpositions alongside suffixes.
Nouns and pronouns are inflected for number (singular, plural, and in some languages dual), person, and case but not for gender. All Uralic languages have a case system. However, the number and the nature of cases show a great variety: from three to 18 cases, including grammatical cases (nominative, accusative, and genitive) and other cases, both spatial and non-spatial. A characteristic feature of these languages is the tripartite location system. The system of personal possessive markers is particularly interesting: The person and the number of the possessor and the number of the thing possessed can be marked by suffixes. The morphotactic rules for combining the markers of possession and case differ among the languages. Comparative and superlative adjectives are also formed by inflection.
Verbs are inflected for person/number, tense, and mood. Uralic languages generally do not have the passive voice. A characteristic feature of Ugric languages is the double conjugation of transitive verbs depending on the definiteness of the direct object. As verbal aspect is not an inflectional category, several languages use derivational affixes, namely, a rich system of preverbs in expressing aspect and Aktionsart.
Matthew J. Carroll
The Yam languages are a primary language family spoken in southern New Guinea, in an area that spans around 180 km from west to east and straddles the Indonesian province of Papua and Papua New Guinea.
The Yam languages are morphologically remarkable for their complex verbal inflection, characterized by a tendency to distribute inflectional exponence across multiple sites. Under this pattern of distributed exponence, segmental formatives (that is, affixes) are identifiable, but assigning any coherent semantics to them is often difficult; the inflectional meanings can be determined only once multiple formatives have been combined. This raises interesting theoretical and typological questions about monotonic notions of the morpheme and the isomorphic alignment of meaning and form. Although Yam languages are known for their complex inflectional morphology, they display comparatively impoverished word formation, or derivational morphology.
Nominal inflection is characterized by moderately large inventories of cases, the largest displaying 16 cases. Nouns may also be marked for number but this is typically restricted to certain case values. Verbal paradigms are also large; verbs mark agreement with up to two arguments in person, number, and at times natural gender. Additionally, languages display numerous tense, aspect, and mood values; this typically involves at least two aspect values, multiple past tense values, and some level of grammatical mood marking. Verbs may also be marked for diathesis, direction, and/or pluractionality.
Architecturally, nominal inflection is rather straightforward, with nominals taking case suffixes or clitics and showing few or no inflectional classes. The true complexity lies in the organization of the verbal inflectional system and the prevalence of distributed exponence. While each language exploits distributed exponence in a unique manner, there are a number of architectural generalizations that can be made across the family. The languages display a remarkably similar inflectional template for verbs, and inflectional classes are organized along similar lines. The primary inflectional class divide is between prefixing and ambifixing verbs. Prefixing verbs mark their agreement with a prefix only, while ambifixing verbs mark agreement with a suffix alone in monovalent clauses, or with both a prefix and a suffix in bivalent ones. The verbal template involves these agreement prefixes and suffixes, which also mark tense, aspect, and mood. The most prominent of these are a set of agreement prefixes known as undergoer prefixes, which mark tense, aspect, and mood in a non-transparent or morphomic manner.
It has been an ongoing issue within generative linguistics how to properly analyze morpho-phonological processes. Morpho-phonological processes typically have exceptions, but nonetheless they are often productive. Such productive, but exceptionful, processes are difficult to analyze, since grammatical rules or constraints are normally invoked in the analysis of a productive pattern, whereas exceptions undermine the validity of the rules and constraints. In addition, productivity of a morpho-phonological process may be gradient, possibly reflecting the relative frequency of the relevant pattern in the lexicon. Simple lexical listing of exceptions as suppletive forms would not be sufficient to capture such gradient productivity of a process with exceptions. It is then necessary to posit grammatical rules or constraints even for exceptionful processes as long as they are at least in part productive. Moreover, the productivity can be correctly estimated only when the domain of rule application is correctly identified. Consequently, a morpho-phonological process cannot be properly analyzed unless we possess both the correct description of its application conditions and the appropriate stochastic grammatical mechanisms to capture its productivity.
The same issues arise in the analysis of morpho-phonological processes in Korean, in particular, n-insertion, sai-siot, and vowel harmony. Those morpho-phonological processes have many exceptions and variations, which make them look quite irregular and unpredictable. However, they have at least a certain degree of productivity. Moreover, the variable application of each process is still systematic in that various factors, phonological, morphosyntactic, sociolinguistic, and processing, contribute to the overall probability of rule application. Crucially, grammatical rules and constraints, which have been proposed within generative linguistics to analyze categorical and exceptionless phenomena, may form an essential part of the analysis of the morpho-phonological processes in Korean.
For an optimal analysis of each of the morpho-phonological processes in Korean, the correct conditions and domains for its application need to be identified first, and its exact productivity can then be measured. Finally, the appropriate stochastic grammatical mechanisms need to be found or developed in order to capture the measured productivity.
The Dravidian languages, spoken mainly in southern India and South Asia, were identified as a separate language family between 1816 and 1856. Four of the 26 Dravidian languages, namely Tamil, Telugu, Kannada, and Malayalam, have long literary traditions, the earliest dating back to the 1st century
A typical characteristic of Dravidian, which is also an areal characteristic of South Asian languages, is that experiencers and inalienable possessors are case-marked dative. Another is the serialization of verbs by the use of participles, and the use of light verbs to indicate aspectual meaning such as completion, self- or nonself-benefaction, and reflexivization. Subjects, and arguments in general (e.g., direct and indirect objects), may be nonovert. The copula may likewise be nonovert, except in Malayalam.
A number of properties of Dravidian are of interest from a universalist perspective, beginning with the observation that not all syntactic categories N, V, A, and P are primitive. Dravidian postpositions are nominal or verbal in origin. A mere 30 Proto-Dravidian roots have been identified as adjectival; the adjectival function is performed by inflected verbs (participles) and nouns. The nominal encoding of experiences (e.g., as fear rather than afraid/afeared) and the absence of the verb “have” arguably correlate with the appearance of dative case on experiencers. “Possessed” or genitive-marked N may fulfill the adjectival function, as noticed for languages like Ulwa (a less exotic parallel is the English of-possessive construction: circles of light, cloth of gold). More unusually, perhaps, Kannada instantiates dative-marked N as predicative adjectives. A recent argument that Malayalam verbs originate as dative-marked N suggests both that N is the only primitive syntactic category and that the dative case plays a seminal role.
Other important aspects of Dravidian morphosyntax to receive attention are anaphors and pronouns (not discussed here; see separate article, anaphora in Dravidian), in particular the long-distance anaphor taan and the verbal reflexive morpheme; question (wh-) words and the question/disjunction morphemes, which combine in a semantically transparent way to form quantifier words like someone; the use of reduplication for distributive quantification; and the occurrence of ‘monstrous agreement’ (first-person agreement in clauses embedded under a speech predicate, triggered by matrix third-person antecedents).
Traditionally, agreement has been considered the finiteness marker in Dravidian. Modals, and a finite form of negation, also serve to mark finiteness. The nonfinite verbal complement to the finite negative may give the negative clause a tense interpretation. Dravidian thus attests matrix nonfinite verbs in finite clauses, challenging the equation of finiteness with tense.
The Dravidian languages are considered wh-in situ languages. However, wh-words in Malayalam appear in a pre-verbal position in the unmarked word order. The apparently rightward movement of some wh-arguments could be explained by assuming a universal VO order, and wh-movement to a preverbal focus phrase. An alternative analysis is that the verb undergoes V-to-C movement.
Natural Morphology (NM) is a functionalist theory that aims to account for morphological preferences on the basis of extra-linguistic motivations. It is hierarchically structured in three (partially conflicting) sub-theories. The first sub-theory of universal naturalness (markedness) focuses on cognitive and semiotic principles such as transparency, iconicity, and bi-uniqueness, which are modeled in terms of parametric relations. Within the second sub-theory of typological naturalness, choices on the universal preference parameters are coordinated. The third sub-theory of language-specific naturalness elaborates what is normal in the potential system of a specific language. NM also puts special emphasis on the interface of morphology with other linguistic and non-linguistic components, thereby opening the new fields of morpho-pragmatics, morphonotactics (as a special part of morphonology), and extra-grammatical morphology. A range of gradual clines are designed to assess not only transitions between adjacent components of grammar, but also within morphology between compounding, derivation, and inflection; and for notions such as regularity—sub-regularity—irregularity/suppletion, degrees of productivity, or of headedness. A double model of rivaling input-based productive dynamic morphology and of word-form-based stored static morphology is assumed within language-specific naturalness. Theoretical constructs are supported by ample external evidence, especially from diachrony and psycholinguistics.
Nominalization refers both to the process by which complex nouns are created and to the complex nouns that are derived by that process. Nominalizations common in the languages of the world include event/result nouns, personal or participant nouns (agent, patient, location, etc.), as well as collectives and abstracts. It is common for nominalizations to be highly polysemous. Theoretical issues concerning nominalization typically stem from the question of how to account for this pervasive polysemy. Within generative grammar, both syntactic and lexicalist approaches have been proposed. The issue of polysemy in nominalization has also been of interest within cognitive and functional frameworks. The article considers, finally, the extent to which nominalization is subject to competition and blocking.
Number is the category through which languages express information about the individuality, numerosity, and part structure of what we speak about. As a linguistic category it has a morphological, a morphosyntactic, and a semantic dimension, which are variously interrelated across language systems. Number marking can apply to a more or less restricted part of the lexicon of a language, being most likely on personal pronouns and human/animate nouns, and least on inanimate nouns. In the core contrast, number allows languages to refer to ‘many’ through the description of ‘one’; the sets referred to consist of tokens of the same type, but also of similar types, or of elements pragmatically associated with one named individual. In other cases, number opposes a reading of ‘one’ to a reading as ‘not one,’ which includes masses; when the ‘one’ reading is morphologically derived from the ‘not one,’ it is called a singulative. It is rare for a language to have no linguistic number at all, since a ‘one–many’ opposition is typically implied at least in pronouns, where the category of person discriminates the speaker as ‘one.’ Beyond pronouns, number is typically a property of nouns and/or determiners, although it can appear on other word classes by agreement. Verbs can also express part-structural properties of events, but this ‘verbal number’ is not isomorphic to nominal number marking. Many languages allow a variable proportion of their nominals to appear in a ‘general’ form, which expresses no number information. The main values of number-marked elements are singular and plural; dual and a much rarer trial also exist. Many languages also distinguish forms interpreted as paucals or as greater plurals, respectively, for small and usually cohesive groups and for generically large ones. 
A broad range of exponence patterns can express these contrasts, depending on the morphological profile of a language, from word inflections to freestanding or clitic forms; certain choices of classifiers also express readings that can be described as ‘plural,’ at least in certain interpretations. Classifiers can co-occur with other plurality markers, but not when these are obligatory as expressions of an inflectional paradigm, although this is debated, partly because the notion of classifier itself subsumes distinct phenomena. Many languages, especially those with classifiers, encode number not as an inflectional category, but through word-formation operations that express readings associated with plurality, including large size. Current research on number concerns all its morphological, morphosyntactic, and semantic dimensions, in particular their interrelations, as part of the study of natural language typology and of the formal analysis of nominal phrases. The grammatical and semantic functions of number and plurality are particularly prominent in formal semantics and in syntactic theory.
Within the Ryukyuan branch of the Japonic family of languages, present-day Okinawan retains numerous regional variants which have evolved for over a thousand years in the Ryukyuan Archipelago. Okinawan is one of the six Ryukyuan languages that UNESCO identified as endangered. One of the theoretically fascinating features is that there is substantial evidence for establishing a high central phonemic vowel in Okinawan although there is currently no overt surface [ï]. Moreover, the word-initial glottal stop [ʔ] in Okinawan is more salient than that in Japanese when followed by vowels, enabling recognition that all Okinawan words are consonant-initial. Except for a few particles, all Okinawan words are composed of two or more morae. Suffixation or vowel lengthening (on nouns, verbs, and adjectives) provides the means for signifying persons as well as things related to human consumption or production. Every finite verb in Okinawan terminates with a mood element. Okinawan exhibits a complex interplay of mood or negative elements and focusing particles. Evidentiality is also realized as an obligatory verbal suffix.
Old and Middle Japanese are the pre-modern periods of the attested history of the Japanese language. Old Japanese (OJ) is largely the language of the 8th century, with a modest but still significant number of written sources, most of which are poetry. Middle Japanese is divided into two distinct periods, Early Middle Japanese (EMJ, 800–1200) and Late Middle Japanese (LMJ, 1200–1600). EMJ saw most of the significant sound changes that took place in the language, as well as profound influence from Chinese, whereas most grammatical changes took place between the end of EMJ and the end of LMJ. By the end of LMJ, the Japanese language had reached a form that is not significantly different from present-day Japanese.
OJ phonology was simple, both in terms of phoneme inventory and syllable structure, with a total of only 88 different syllables. In EMJ, the language became quantity sensitive, with the introduction of a distinction between long and short syllables. OJ and EMJ had obligatory verb inflection for a number of modal and syntactic categories (including an important distinction between a conclusive and an (ad)nominalizing form), whereas the expression of aspect and tense was optional. Through late EMJ and LMJ this system changed completely to one without nominalizing inflection, but with obligatory inflection for tense.
The morphological pronominal system of OJ was lost in EMJ, which developed a range of lexical and lexically based terms of speaker and hearer reference. OJ had a two-way (speaker–nonspeaker) demonstrative system, which in EMJ was replaced by a three-way (proximal–mesial–distal) system.
OJ had a system of differential object marking, based on specificity, as well as a word order rule that placed accusative-marked objects before most subjects; both of these features were lost in EMJ. OJ and EMJ had genitive subject marking in subordinate clauses and in focused, interrogative, and exclamative main clauses, but no case marking of subjects in declarative, optative, or imperative main clauses and no nominative marker. Through LMJ, genitive subject marking was gradually circumscribed and a nominative case particle was acquired which could mark subjects in all types of clauses.
OJ had a well-developed system of complex predicates, in which two verbs jointly formed the predicate of a single clause, which is the source of the LMJ and NJ (Modern Japanese) verb–verb compound complex predicates. OJ and EMJ also had mono-clausal focus constructions that functionally were similar to clefts in English; these constructions were lost in LMJ.