The central goal of the Lexical Semantic Framework (LSF) is to characterize the meaning of simple lexemes and affixes and to show how these meanings can be integrated in the creation of complex words. LSF offers a systematic treatment of issues that figure prominently in the study of word formation, such as the polysemy question, the multiple-affix question, the zero-derivation question, and the form and meaning mismatches question.
LSF draws on several lines of research that take a decompositional approach to meaning. It therefore defines simple lexemes and affixes through a systematic representation, stated in a constrained formal language that enforces consistency of annotation. Lexical-semantic representations in LSF consist of two parts: the Semantic/Grammatical Skeleton and the Semantic/Pragmatic Body (henceforth ‘skeleton’ and ‘body’ respectively). The skeleton comprises features that are relevant to the syntax. These features act as functions and may take arguments. Functions and arguments of a skeleton are hierarchically arranged. The body encodes all those aspects of meaning that are perceptual, cultural, and encyclopedic.
Features in LSF are used in (a) a cross-categorial, (b) an equipollent, and (c) a privative way. This means that they are used to account for the distinction between the major ontological categories, may have a binary (i.e., positive or negative) value, and may or may not form part of the skeleton of a given lexeme. In order to account for the fact that several distinct parts integrate into a single referential unit that projects its arguments to the syntax, LSF makes use of the Principle of Co-indexation. Co-indexation is a device needed in order to tie together the arguments that come with different parts of a complex word to yield only those arguments that are syntactically active.
LSF has an important impact on the study of the morphology-lexical semantics interface and provides a unitary theory of meaning in word formation.
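The skeleton and co-indexation described above can be pictured as a small data structure. The following is only a rough sketch under stated assumptions, not LSF's actual formalism: the class names, the feature names `material` and `dynamic`, and the rule "tie the affix's slot to the base's first slot" are illustrative placeholders, since the abstract does not spell out the Principle of Co-indexation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Arg:
    """An argument slot; `index` stays None until co-indexation assigns one."""
    index: Optional[int] = None


@dataclass
class Skeleton:
    """A function (a bundle of binary-valued features) applied to arguments.

    True stands for '+', False for '-'; a feature missing from the dict is
    simply not part of this skeleton (the privative use of features).
    """
    features: Dict[str, bool]
    args: List[Union[Arg, "Skeleton"]] = field(default_factory=list)


def open_args(skel: Skeleton) -> List[Arg]:
    """Collect all argument slots, outermost first, left to right."""
    slots: List[Arg] = []
    for a in skel.args:
        if isinstance(a, Arg):
            slots.append(a)
        else:
            slots.extend(open_args(a))
    return slots


def coindex(affix: Skeleton, base: Skeleton) -> Skeleton:
    """Embed the base under the affix and tie two slots together.

    Assumption (hypothetical simplification): the affix's first slot is
    co-indexed with the base's first slot.
    """
    affix_slot = affix.args[0]
    base_slot = open_args(base)[0]
    affix_slot.index = base_slot.index = 1
    return Skeleton(affix.features, affix.args + [base])


def active_arguments(skel: Skeleton) -> int:
    """Count distinct arguments: co-indexed slots count once."""
    slots = open_args(skel)
    indexed = {s.index for s in slots if s.index is not None}
    unindexed = sum(1 for s in slots if s.index is None)
    return len(indexed) + unindexed


# Hypothetical toy entries: an agentive affix and a two-argument verb base.
affix = Skeleton({"material": True, "dynamic": True}, [Arg()])
base = Skeleton({"dynamic": True}, [Arg(), Arg()])
word = coindex(affix, base)
```

In this toy case the complex word contains three slots, but two of them share an index, so `active_arguments(word)` counts two arguments. This mirrors the idea that co-indexation reduces the arguments of the parts to those that are syntactically active.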
Article
Réka Benczes
The investigation of morphology and lexical semantics is an investigation into the very essence of the semantics of word formation: the meaning of morphemes and how they can be combined to form meanings of complex words. Discussion of this question within the scholarly literature has been dependent on (i) the adopted morphological model (morpheme-based or word-based); and (ii) the adopted theoretical paradigm (such as formal/generativist accounts vs. construction-based approaches)—which also determined what problem areas received attention in the first place.
One particular problem area that has surfaced most consistently within the literature (irrespective of the adopted morphological model or theoretical paradigm) is the so-called semantic mismatch question, which also serves as the focus of the present chapter. In essence, semantic mismatch pertains to the question of why there is no one-to-one correspondence between form and meaning in word formation. In other words, it is very often impossible, out of context, to give a precise account of what a newly coined word might mean based simply on the constituents that the word originates from. The article considers the extent to which the meaning of complex words is (at least partly) based on nondecompositional knowledge, implying that the meaning-bearing feature of morphemes might in fact be a graded affair. Thus, depending on the entrenchment and strength of the interrelations among sets of words, the meaning of the components contributes to a greater or lesser degree to the meaning of a word, suggesting that “mismatches” might be neither unusual nor uncommon.
Article
‘Folk etymology’ and ‘contamination’ each involve associative formal influences between words which have no ‘etymological’ (i.e., historical) connexion. From a morphological perspective, in folk etymology a word acquires at least some elements of the structure of some other, historically unrelated, word. The result often looks like a compound, or a word composed of other, independently existing, words. These are usually (but not necessarily) ‘compounds’ lacking in any semantic compositionality, which do not ‘make sense’: for example, French beaupré ‘bowsprit’, but apparently ‘beautiful meadow’, possibly derived from English bowsprit. Typically involved are relatively long, polysyllabic words, characteristically belonging to erudite or exotic vocabulary, whose unfamiliarity speakers accommodate by replacing portions of the word with more familiar words. Contamination differs from folk etymology both on the formal and on the semantic side, usually involving non-morphemic elements, and acting between words that are semantically linked: for example, Spanish nuera ‘daughter-in-law’, instead of etymologically expected **nora, apparently influenced by the vowel historically underlying suegra ‘mother-in-law’. While there is nothing uniquely Romance about these phenomena, Romance languages abound in them.
Article
Emmanuel Keuleers
Computational psycholinguistics has a long history of investigation and modeling of morphological phenomena. Several computational models have been developed to deal with the processing and production of morphologically complex forms and with the relation between linguistic morphology and psychological word representations. Historically, most of this work has focused on modeling the production of inflected word forms, leading to the development of models based on connectionist principles and other data-driven models such as Memory-Based Language Processing (MBLP), Analogical Modeling of Language (AM), and Minimal Generalization Learning (MGL). In the context of inflectional morphology, these computational approaches have played an important role in the debate between single and dual mechanism theories of cognition. Taking a different angle, computational models based on distributional semantics have been proposed to account for several phenomena in morphological processing and composition. Finally, although several computational models of reading have been developed in psycholinguistics, none of them have satisfactorily addressed the recognition and reading aloud of morphologically complex forms.
Article
Marios Andreou
The category of Personal/Participant/Inhabitant derived nouns comprises a conglomeration of derived nouns that denote among others agents, instruments, patients/themes, inhabitants, and followers of a person. Based on the thematic relations between the derived noun and its base lexeme, Personal/Participant/Inhabitant nouns can be classified into two subclasses. The first subclass comprises derived nouns that are deverbal and carry thematic readings (e.g., driver). The second subclass consists of derived nouns with athematic readings (e.g., Marxist).
The examination of the category of Personal/Participant/Inhabitant nouns allows one to delve deeply into the study of multiplicity of meaning in word formation and the factors that bear on the readings of derived words. These factors range from the historical mechanisms that lead to multiplicity of meaning and the lexical-semantic properties of the bases from which the nouns are derived, to the syntactic context in which derived nouns occur, and the pragmatic-encyclopedic facets of both the base and the derived lexeme.
Article
Jesús Fernández-Domínguez
The onomasiological approach is a theoretical framework that emphasizes the cognitive-semantic component of language and the primacy of extra-linguistic reality in the process of naming. With a tangible background in the functional perspective of the Prague School of Linguistics, this approach holds that name giving is essentially governed by the needs of language users, and hence assigns a subordinate role to the traditional levels of linguistic description. This stance sets the onomasiological framework in opposition to other theories of language, especially generativism, which first tackle the form of linguistic material and then move on to meaning.
The late 20th and early 21st centuries have witnessed the emergence of several cognitive-onomasiological models, all of which share an extensive use of semantic categories as working units and a particular interest in the area of word-formation. Despite a number of divergences, such proposals all confront mainstream morphological research by heavily revising conventional concepts and introducing model-specific terminology regarding, for instance, the independent character of the lexicon, the (non-)regularity of word-formation processes, or their understanding of morphological productivity. The models adhering to such a view of language have earned a pivotal position as an alternative to dominant theories of word-formation.
Article
Natalia Beliaeva
Blending is a type of word formation in which two or more words are merged into one so that the blended constituents are either clipped or partially overlap. An example of a typical blend is brunch, in which the beginning of the word breakfast is joined with the ending of the word lunch. In many cases, such as motel (motor + hotel) or blizzaster (blizzard + disaster), the constituents of a blend overlap at segments that are phonologically or graphically identical. In some blends, both constituents retain their form as a result of overlap, for example, stoption (stop + option). These examples illustrate only a handful of the variety of forms blends may take; more exotic examples include formations like Thankshallowistmas (Thanksgiving + Halloween + Christmas). The visual and auditory amalgamation in blends is reflected on the semantic level. It is common to form blends meaning a combination or a product of two objects or phenomena, such as an animal breed (e.g., zorse, a breed of zebra and horse), an interlanguage variety (e.g., franglais, which is a French blend of français and anglais meaning a mixture of French and English languages), or another type of mix (e.g., a shress is a type of clothes having features of both a shirt and a dress).
Blending as a word formation process can be regarded as a subtype of compounding because, like compounds, blends are formed of two (or sometimes more) content words and semantically either are hyponyms of one of their constituents, or exhibit some kind of paradigmatic relationships between the constituents. In contrast to compounds, however, the formation of blends is restricted by a number of phonological constraints given that the resulting formation is a single word. In particular, blends tend to be of the same length as the longest of their constituent words, and to preserve the main stress of one of their constituents. Certain regularities are also observed in terms of ordering of the words in a blend (e.g., shorter first, more frequent first), and in the position of the switch point, that is, where one blended word is cut off and switched to another (typically at the syllable boundary or at the onset/rime boundary). The regularities of blend formation can be related to the recognizability of the blended words.
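The two formal patterns named in this abstract, clipping at a switch point (brunch, motel) and full overlap (stoption), can be illustrated with a short sketch. It works on spelling only and is not a model of how speakers form blends; the function names and the numeric switch-point parameters are hypothetical and chosen by hand.

```python
from typing import Optional


def clip_blend(first: str, second: str, keep_first: int, drop_second: int) -> str:
    """Join the first `keep_first` letters of `first` with `second`
    minus its first `drop_second` letters (a hand-picked switch point)."""
    return first[:keep_first] + second[drop_second:]


def overlap_blend(first: str, second: str) -> Optional[str]:
    """Join two words at the longest ending of `first` that is also
    the beginning of `second`; return None if there is no such overlap."""
    for size in range(min(len(first), len(second)), 0, -1):
        if first.endswith(second[:size]):
            return first + second[size:]
    return None


brunch = clip_blend("breakfast", "lunch", 2, 1)   # br + unch
motel = clip_blend("motor", "hotel", 3, 3)        # mot + el (shared "ot")
stoption = overlap_blend("stop", "option")        # shared "op", both words intact
no_overlap = overlap_blend("breakfast", "lunch")  # no shared edge segment
```

The switch points are supplied by hand here. Predicting them from syllable structure or from the onset/rime boundary, as the regularities mentioned above suggest, would need phonological information that plain spelling does not provide.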
Article
Corpora are an all-important resource in linguistics, as they constitute the primary source for large-scale examples of language usage. This has been even more evident in recent years, with the increasing availability of texts in digital format leading corpus linguistics more and more toward a “big data” approach. As a consequence, the quantitative methods adopted in the field are becoming more sophisticated and varied.
When it comes to morphology, corpora represent a primary source of evidence to describe morpheme usage, and in particular how often a particular morphological pattern is attested in a given language. There is hence a tight relation between corpus linguistics and the study of morphology and the lexicon. This relation, however, can be considered bi-directional. On the one hand, corpora are used as a source of evidence to develop metrics and train computational models of morphology: by means of corpus data it is possible to quantitatively characterize morphological notions such as productivity, and corpus data are fed to computational models to capture morphological phenomena at different levels of description. On the other hand, morphology has also been applied as an organization principle to corpora. Annotations of linguistic data often adopt morphological notions as guidelines. The resulting information, either obtained from human annotators or relying on automatic systems, makes corpora easier to analyze and more convenient to use in a number of applications.
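As an illustration of how corpus counts can quantify productivity, the sketch below computes one widely used hapax-based measure: the number of word types with a given affix that occur exactly once, divided by the total number of tokens with that affix. The abstract does not name a specific metric, so this choice is an assumption. The token list is invented toy data, and matching a suffix by spelling is a crude stand-in for real morphological annotation.

```python
from collections import Counter
from typing import Iterable


def hapax_productivity(tokens: Iterable[str], suffix: str) -> float:
    """Ratio of once-occurring types to all tokens ending in `suffix`.

    Assumption: a word "has" the suffix if its spelling ends in it,
    which will over-match in real data (e.g. 'witness' for '-ness').
    """
    counts = Counter(t for t in tokens if t.endswith(suffix))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / total


# Invented toy corpus, not real frequency data.
toy_tokens = [
    "happiness", "happiness", "sadness", "kindness",
    "darkness", "darkness", "table",
]
ness_score = hapax_productivity(toy_tokens, "ness")
```

In the toy data six tokens end in -ness and two of the four types (sadness, kindness) occur once, so the ratio is 2/6. With a real corpus the same function would be applied to a tokenized and preferably morphologically annotated word list.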
Article
Ulrich Detges
Cognitive semantics (CS) is an approach to the study of linguistic meaning. It is based on the assumption that the human linguistic capacity is part of our cognitive abilities, and that language in general and meaning in particular can therefore be better understood by taking into account the cognitive mechanisms that control the conceptual and perceptual processing of extra-linguistic reality. Issues central to CS are (a) the notion of prototype and its role in the description of language, (b) the nature of linguistic meaning, and (c) the functioning of different types of semantic relations. The question concerning the nature of meaning is an issue that is particularly controversial between CS on the one hand and structuralist and generative approaches on the other hand: is linguistic meaning conceptual, that is, part of our encyclopedic knowledge (as is claimed by CS), or is it autonomous, that is, based on abstract and language-specific features? According to CS, the most important types of semantic relations are metaphor, metonymy, and different kinds of taxonomic relations, which, in turn, can be further broken down into more basic associative relations such as similarity, contiguity, and contrast. These play a central role not only in polysemy and word formation, that is, in the lexicon, but also in the grammar.
Article
James Myers
Acceptability judgments are reports of a speaker’s or signer’s subjective sense of the well-formedness, nativeness, or naturalness of (novel) linguistic forms. Their value comes in providing data about the nature of the human capacity to generalize beyond linguistic forms previously encountered in language comprehension. For this reason, acceptability judgments are often also called grammaticality judgments (particularly in syntax), although unlike the theory-dependent notion of grammaticality, acceptability is accessible to consciousness. While acceptability judgments have been used to test grammatical claims since ancient times, they became particularly prominent with the birth of generative syntax. Today they are also widely used in other linguistic schools (e.g., cognitive linguistics) and other linguistic domains (pragmatics, semantics, morphology, and phonology), and have been applied in a typologically diverse range of languages. As psychological responses to linguistic stimuli, acceptability judgments are experimental data. Their value thus depends on the validity of the experimental procedures, which, in their traditional version (where theoreticians elicit judgments from themselves or a few colleagues), have been criticized as overly informal and biased. Traditional responses to such criticisms have been supplemented in recent years by laboratory experiments that use formal psycholinguistic methods to collect and quantify judgments from nonlinguists under controlled conditions. Such formal experiments have played an increasingly influential role in theoretical linguistics, being used to justify subtle judgment claims or new grammatical models that incorporate gradience or lexical influences. They have also been used to probe the cognitive processes giving rise to the sense of acceptability itself, the central finding being that acceptability reflects processing ease. 
Exploring what this finding means will require not only further empirical work on the acceptability judgment process, but also theoretical work on the nature of grammar.
Article
Paolo Acquaviva
Number is the category through which languages express information about the individuality, numerosity, and part structure of what we speak about. As a linguistic category it has a morphological, a morphosyntactic, and a semantic dimension, which are variously interrelated across language systems. Number marking can apply to a more or less restricted part of the lexicon of a language, being most likely on personal pronouns and human/animate nouns, and least on inanimate nouns. In the core contrast, number allows languages to refer to ‘many’ through the description of ‘one’; the sets referred to consist of tokens of the same type, but also of similar types, or of elements pragmatically associated with one named individual. In other cases, number opposes a reading of ‘one’ to a reading as ‘not one,’ which includes masses; when the ‘one’ reading is morphologically derived from the ‘not one,’ it is called a singulative. It is rare for a language to have no linguistic number at all, since a ‘one–many’ opposition is typically implied at least in pronouns, where the category of person discriminates the speaker as ‘one.’ Beyond pronouns, number is typically a property of nouns and/or determiners, although it can appear on other word classes by agreement. Verbs can also express part-structural properties of events, but this ‘verbal number’ is not isomorphic to nominal number marking. Many languages allow a variable proportion of their nominals to appear in a ‘general’ form, which expresses no number information. The main values of number-marked elements are singular and plural; dual and a much rarer trial also exist. Many languages also distinguish forms interpreted as paucals or as greater plurals, respectively, for small and usually cohesive groups and for generically large ones. 
A broad range of exponence patterns can express these contrasts, depending on the morphological profile of a language, from word inflections to freestanding or clitic forms; certain choices of classifiers also express readings that can be described as ‘plural,’ at least in certain interpretations. Classifiers can co-occur with other plurality markers, but not when these are obligatory as expressions of an inflectional paradigm, although this is debated, partly because the notion of classifier itself subsumes distinct phenomena. Many languages, especially those with classifiers, encode number not as an inflectional category, but through word-formation operations that express readings associated with plurality, including large size. Current research on number concerns all its morphological, morphosyntactic, and semantic dimensions, in particular their interrelations, as part of the study of natural language typology and of the formal analysis of nominal phrases. The grammatical and semantic functions of number and plurality are particularly prominent in formal semantics and in syntactic theory.
Article
Taro Kageyama
Compound and complex predicates—predicates that consist of two or more lexical items and function as the predicate of a single sentence—present an important class of linguistic objects that pertain to an enormously wide range of issues in the interactions of morphology, phonology, syntax, and semantics. Japanese makes extensive use of compounding to expand a single verb into a complex one. These compounding processes range over multiple modules of the grammatical system, thus straddling the borders between morphology, syntax, phonology, and semantics. In terms of degree of phonological integration, two types of compound predicates can be distinguished. In the first type, called tight compound predicates, two elements from the native lexical stratum are tightly fused and inflect as a whole for tense. In this group, Verb-Verb compound verbs such as arai-nagasu [wash-let.flow] ‘to wash away’ and hare-agaru [sky.be.clear-go.up] ‘for the sky to clear up entirely’ are preponderant in numbers and productivity over Noun-Verb compound verbs such as tema-doru [time-take] ‘to take a lot of time (to finish).’
The second type, called loose compound predicates, takes the form of “Noun + Predicate (Verbal Noun [VN] or Adjectival Noun [AN]),” as in post-syntactic compounds like [sinsya : koonyuu] no okyakusama ([new.car : purchase] GEN customers) ‘customer(s) who purchase(d) a new car,’ where the symbol “:” stands for a short phonological break. Remarkably, loose compounding allows combinations of a transitive VN with its agent subject (external argument), as in [Supirubaagu : seisaku] no eiga ([Spielberg : produce] GEN film) ‘a film/films that Spielberg produces/produced’—a pattern that is illegitimate in tight compounds and has in fact been considered universally impossible in the world’s languages in verbal compounding and noun incorporation.
In addition to a huge variety of tight and loose compound predicates, Japanese has an additional class of syntactic constructions that as a whole function as complex predicates. Typical examples are the light verb construction, where a clause headed by a VN is followed by the light verb suru ‘do,’ as in Tomodati wa sinsya o koonyuu (sae) sita [friend TOP new.car ACC purchase (even) did] ‘My friend (even) bought a new car’ and the human physical attribute construction, as in Sensei wa aoi me o site-iru [teacher TOP blue eye ACC do-ing] ‘My teacher has blue eyes.’ In these constructions, the nominal phrases immediately preceding the verb suru are semantically characterized as indefinite and non-referential and reject syntactic operations such as movement and deletion. The semantic indefiniteness and syntactic immobility of the NPs involved are also observed with a construction composed of a human subject and the verb aru ‘be,’ as in Gakkai ni wa oozei no sankasya ga atta ‘There was a large number of participants at the conference.’ The constellation of such “word-like” properties shared by these compound and complex predicates poses challenging problems for current theories of morphology-syntax-semantics interactions with regard to such topics as lexical integrity, morphological compounding, syntactic incorporation, semantic incorporation, pseudo-incorporation, and indefinite/non-referential NPs.
Article
Rochelle Lieber
Derivational morphology is a type of word formation that creates new lexemes, either by changing syntactic category or by adding substantial new meaning (or both) to a free or bound base. Derivation may be contrasted with inflection on the one hand or with compounding on the other. The distinctions between derivation and inflection and between derivation and compounding, however, are not always clear-cut. New words may be derived by a variety of formal means including affixation, reduplication, internal modification of various sorts, subtraction, and conversion. Affixation is best attested cross-linguistically, especially prefixation and suffixation. Reduplication is also widely found, with various internal changes like ablaut and root-and-pattern derivation less common. Derived words may fit into a number of semantic categories. For nouns, event and result, personal and participant, collective and abstract nouns are frequent. For verbs, causative and applicative categories are well-attested, as are relational and qualitative derivations for adjectives. Languages frequently also have ways of deriving negatives, relational words, and evaluatives. Most languages have derivation of some sort, although there are languages that rely more heavily on compounding than on derivation to build their lexical stock. A number of topics have dominated the theoretical literature on derivation, including productivity (the extent to which new words can be created with a given affix or morphological process), the principles that determine the ordering of affixes, and the place of derivational morphology with respect to other components of the grammar. The study of derivation has also been important in a number of psycholinguistic debates concerning the perception and production of language.