Most linguists have heard of semantic compositionality. Some will have heard that it is the fundamental truth of semantics. Others will have been told that it is so thoroughly and completely wrong that it is astonishing that it is still being taught. The present article attempts to explain all this. Much of the discussion of semantic compositionality takes place in three arenas that are rather insulated from one another: (a) philosophy of mind and language, (b) formal semantics, and (c) cognitive linguistics and cognitive psychology. A truly comprehensive overview of the writings in all these areas is not possible here. However, this article does discuss some of the work that occurs in each of these areas. A bibliography of general works, and some Internet resources, will help guide the reader to some further, undiscussed works (including further material in all three categories).
Francis Jeffry Pelletier
The investigation of morphology and lexical semantics is an investigation into the very essence of the semantics of word formation: the meaning of morphemes and how they can be combined to form the meanings of complex words. Discussion of this question within the scholarly literature has depended on (i) the adopted morphological model (morpheme-based or word-based); and (ii) the adopted theoretical paradigm (such as formal/generativist accounts vs. construction-based approaches), which also determined what problem areas received attention in the first place. One particular problem area that has surfaced most consistently within the literature (irrespective of the adopted morphological model or theoretical paradigm) is the so-called semantic mismatch question, which also serves as the focus of the present article. In essence, semantic mismatch pertains to the question of why there is no one-to-one correspondence between form and meaning in word formation. In other words, it is very frequently not possible out of context to give a precise account of what the meaning of a newly coined word might be based simply on the constituents that the word originates from. The article considers the extent to which the meaning of complex words is (at least partly) based on nondecompositional knowledge, implying that the meaning-bearing feature of morphemes might in fact be a graded affair. Thus, depending on the entrenchment and strength of the interrelations among sets of words, the meaning of the components contributes to a greater or lesser degree to the meaning of a word, suggesting that “mismatches” might be neither unusual nor uncommon.
Computational semantics performs automatic meaning analysis of natural language. Research in computational semantics designs meaning representations and develops mechanisms for automatically assigning those representations and reasoning over them. Computational semantics is not a single monolithic task but consists of many subtasks, including word sense disambiguation, multi-word expression analysis, semantic role labeling, the construction of sentence semantic structure, coreference resolution, and the automatic induction of semantic information from data. The development of manually constructed resources has been vastly important in driving the field forward. Examples include WordNet, PropBank, FrameNet, VerbNet, and TimeBank. These resources specify the linguistic structures to be targeted in automatic analysis, and they provide high-quality human-generated data that can be used to train machine learning systems. Supervised machine learning based on manually constructed resources is a widely used technique. A second core strand has been the induction of lexical knowledge from text data. For example, words can be represented through the contexts in which they appear (called distributional vectors or embeddings), such that semantically similar words have similar representations. Or semantic relations between words can be inferred from patterns of words that link them. Wide-coverage semantic analysis always needs more data, both lexical knowledge and world knowledge, and automatic induction at least alleviates the problem. Compositionality is a third core theme: the systematic construction of structural meaning representations of larger expressions from the meaning representations of their parts. The representations typically use logics of varying expressivity, which makes them well suited to performing automatic inferences with theorem provers. Manual specification and automatic acquisition of knowledge are closely intertwined. 
Manually created resources are automatically extended or merged. The automatic induction of semantic information is guided and constrained by manually specified information, which is much more reliable. And for restricted domains, the construction of logical representations is learned from data. It is at the intersection of manual specification and machine learning that some of the current larger questions of computational semantics are located. For instance, should we build general-purpose semantic representations, or is lexical knowledge simply too domain-specific, and would we be better off learning task-specific representations every time? When performing inference, is it more beneficial to have the solid ground of a human-generated ontology, or is it better to reason directly with text snippets for more fine-grained and gradual inference? Do we obtain a better and deeper semantic analysis as we use better and deeper manually specified linguistic knowledge, or is the future in powerful learning paradigms that learn to carry out an entire task from natural language input and output alone, without pre-specified linguistic knowledge?
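The idea that words can be represented through the contexts in which they appear can be made concrete with a small sketch. The corpus, the window size, and the function names below are assumptions chosen for illustration; real systems use far larger corpora and usually weighted or learned vectors rather than raw counts.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration only.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

VECTORS = context_vectors(CORPUS)
```

In this toy setting, `cosine(VECTORS["cat"], VECTORS["dog"])` comes out higher than `cosine(VECTORS["cat"], VECTORS["fuel"])`, because cat and dog share contexts such as drinks and chases, whereas cat and fuel share none.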
Laura A. Michaelis
Meanings are assembled in various ways in a construction-based grammar, and this array can be represented as a continuum of idiomaticity, a gradient of lexical fixity. Constructional meanings are the meanings to be discovered at every point along the idiomaticity continuum. At the leftmost, or ‘fixed,’ extreme of this continuum are frozen idioms, like the salt of the earth and in the know. The set of frozen idioms includes those with idiosyncratic syntactic properties, like the fixed expression by and large (an exceptional pattern of coordination in which a preposition and adjective are conjoined). Other frozen idioms, like the unexceptionable modified noun red herring, feature syntax found elsewhere. At the rightmost, or ‘open’ end of this continuum are fully productive patterns, including the rule that licenses the string Kim blinked, known as the Subject-Predicate construction. Between these two poles are (a) lexically fixed idiomatic expressions, verb-headed and otherwise, with regular inflection, such as chew/chews/chewed the fat; (b) flexible expressions with invariant lexical fillers, including phrasal idioms like spill the beans and the Correlative Conditional, such as the more, the merrier; and (c) specialized syntactic patterns without lexical fillers, like the Conjunctive Conditional (e.g., One more remark like that and you’re out of here). Construction Grammar represents this range of expressions in a uniform way: whether phrasal or lexical, all are modeled as feature structures that specify phonological and morphological structure, meaning, use conditions, and relevant syntactic information (including syntactic category and combinatoric potential).
M. Teresa Espinal and Jaume Mateu
Idioms, conceived as fixed multi-word expressions that conceptually encode non-compositional meaning, are linguistic units that raise a number of questions relevant in the study of language and mind (e.g., whether they are stored in the lexicon or in memory, whether they have internal or external syntax similar to other expressions of the language, whether their conventional use is parallel to their non-compositional meaning, whether they are processed in similar ways to regular compositional expressions of the language, etc.). This article reviews the similarities and differences between idioms and other sorts of formulaic expressions, the main types of idioms that have been characterized in the linguistic literature, and the dimensions along which idiomaticity lies. Syntactically, idioms manifest a set of syntactic properties, as well as a number of constraints that account for their internal and external structure. Semantically, idioms present an interesting behavior with respect to a set of semantic properties that account for their meaning (i.e., conventionality, compositionality, and transparency, as well as aspectuality, referentiality, thematic roles, etc.). The study of idioms has been approached from lexicographic and computational, as well as from psycholinguistic and neurolinguistic perspectives.
Modification is a combinatorial semantic operation between a modifier and a modifiee. Take, for example, vegetarian soup: the attributive adjective vegetarian modifies the nominal modifiee soup and thus constrains the range of potential referents of the complex expression to soups that are vegetarian. Similarly, in Ben is preparing a soup in the camper, the adverbial in the camper modifies the preparation by locating it. Notably, modifiers can have fairly drastic effects; in fake stove, the attribute fake induces that the complex expression singles out objects that seem to be stoves, but are not. Intuitively, modifiers contribute additional information that is not explicitly called for by the target the modifier relates to. Speaking in terms of logic, this roughly says that modification is an endotypical operation; that is, it does not change the arity, or logical type, of the modified target constituent. Speaking in terms of syntax, this predicts that modifiers are typically adjuncts and thus do not change the syntactic distribution of their respective target; therefore, modifiers can be easily iterated (see, for instance, spicy vegetarian soup or Ben prepared a soup in the camper yesterday). This initial characterization sets modification apart from other combinatorial operations such as argument satisfaction and quantification: combining a soup with prepare satisfies an argument slot of the verbal head and thus reduces its arity (see, for instance, *prepare a soup a quiche). Quantification as, for example, in the combination of the quantifier every with the noun soup, maps a nominal property onto a quantifying expression with a different distribution (see, for instance, *a every soup). Their comparatively loose connection to their hosts renders modifiers a flexible, though certainly not random, means within combinatorial meaning constitution. The foundational question is how to work their being endotypical into a full-fledged compositional analysis. 
On the one hand, modifiers can be considered endotypical functors by virtue of their lexical endowment; for instance, vegetarian would be born a higher-order function from predicates to predicates. On the other hand, modification can be considered a rule-based operation; for instance, vegetarian would denote a simple predicate from entities to truth-values that receives its modifying endotypical function only by virtue of a separate modification rule. In order to assess this and related controversies empirically, research on modification pays particular attention to interface questions such as the following: how do structural conditions and the modifying function conspire in establishing complex interpretations? What roles do ontological information and fine-grained conceptual knowledge play in the course of concept combination?
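The two analyses of vegetarian soup can be contrasted in a small sketch. The toy model and all names in it are assumptions made for illustration. The intersective rule shown as the second option is one common formulation, often called Predicate Modification in the formal semantics literature, and is not necessarily the rule any particular account adopts.

```python
from typing import Callable, List

Entity = str
Pred = Callable[[Entity], bool]  # a predicate: from entities to truth-values

# Toy model; the entities and facts are invented for illustration.
DOMAIN = ["minestrone", "goulash", "salad"]
SOUPS = {"minestrone", "goulash"}
VEGETARIAN = {"minestrone", "salad"}
SPICY = {"minestrone", "goulash"}

def soup(x: Entity) -> bool:
    return x in SOUPS

def vegetarian(x: Entity) -> bool:
    return x in VEGETARIAN

def spicy(x: Entity) -> bool:
    return x in SPICY

# Option 1: the modifier is lexically a function from predicates to predicates.
def vegetarian_functor(head: Pred) -> Pred:
    return lambda x: vegetarian(x) and head(x)

# Option 2: the modifier is a plain predicate; a separate rule combines it
# with the head by intersection (hypothetical helper name).
def modify(modifier: Pred, head: Pred) -> Pred:
    return lambda x: modifier(x) and head(x)

def extension(p: Pred) -> List[Entity]:
    """List the entities in the toy domain that satisfy predicate p."""
    return [x for x in DOMAIN if p(x)]
```

Both options return a predicate of the same type as soup, which is what makes the operation endotypical and lets modifiers iterate: `extension(vegetarian_functor(soup))`, `extension(modify(vegetarian, soup))`, and `extension(modify(spicy, modify(vegetarian, soup)))` all pick out only minestrone in this model. A modifier like fake would not fit the intersective rule in the second option, since a fake stove is not a stove.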
Linking elements occur in compound nouns and derivatives in the Indo-European languages as well as in many other languages of the world. They can be described as sound material or graphemes with or without a phonetic correspondence appearing between two parts of a word-formation product. Linking elements are meaningless by definition. However, in many cases a clear-cut distinction between them and other, meaningful elements (like inflectional or derivational affixes) is difficult to draw. Here, a thorough examination is necessary. Simple rules cannot describe the occurrence of linking elements. Instead, their distribution is fully erratic or at least complex, as different factors including the prosodic, morphological, or semantic properties of the word-formation components play a role and compete. The same holds for their productivity: their ability to appear in new word-formation products differs considerably and can range from strongly (prosodically, morphologically, or lexically) restricted to the virtual absence of any constraints. Linking elements should be distinguished from singular, isolated insertions (cf. Spanish rousseau-n-iano) or extensions of one specific stem or affix (cf. ‑l- in French congo-l-ais, togo-l-ais, English Congo-l-ese, Togo-l-ese). As they link two parts of a word formation, they also differ from word-final elements attached to compounds like ‑(s)I in Turkish as in ana‑dil‑i (mother‑tongue‑i) ‘mother tongue’. Furthermore, they are also distinct from infixes, i.e., derivational affixes that are inserted into a root, as well as from confixes, which are bound but meaningful (lexical) morphemes. Linking elements are attested in many Indo-European languages (Slavic, Romance, Germanic, Baltic languages, and Greek) as well as in other languages across the world. They seem to be more common in compounds than in derivatives. Additionally, some languages display different sets of linking elements in both compounds and derivatives.
The linking inventories differ strongly even between closely related languages. For example, Frisian and Dutch, each of which has five different linking elements, share only two linking forms (‑s- and ‑e-). In some languages, linking elements are homophonous with other (meaningful) elements, e.g., inflectional or derivational suffixes. This is mostly due to their historical development and to the degree of the dissociation from their sources. This sometimes makes it difficult to distinguish between linking elements and meaningful elements. In such cases (e.g., in German or Icelandic), formal and functional differences should be taken into account. It is also possible that the homophony with the inflectional markers is incidental and not a remnant of a historical development. Generally, linking elements can have different historical sources: primary suffixes (e.g., Lithuanian), case markers (e.g., many Germanic languages), derivational suffixes (e.g., Greek), prepositions (e.g., Sardinian and English). However, the historical development of many linking elements in many languages still requires further research. Depending on their distribution, linking elements can have different functions. Accordingly, the functions strongly differ from language to language. They can serve as compound markers (Greek), as “reopeners” of closed stems for further morphological processes (German), as markers of prosodically and/or morphologically complex first parts (many Germanic languages), as plural markers (Dutch and German), and as markers of genre (German).
Huei-ling Lai and Yao-Ying Lai
Sentential meaning that emerges compositionally is not always transparent as a one-to-one mapping from syntactic structure to semantic representation; oftentimes, the meaning is underspecified (morphosyntactically unsupported), not explicitly conveyed via overt linguistic devices. Compositional meaning is obtained during comprehension. The associated issues are explored by examining linguistic factors that modulate the construal of underspecified iterative meaning in Mandarin Chinese (MC). In this case, the factors include lexical aspect of verbs, the interval-lengths denoted by post-verbal durative adverbials, and boundary specificity denoted by preverbal versus post-verbal temporal adverbials. The composition of a punctual verb (e.g., jump, poke) with a durative temporal adverbial, as in Zhangsan tiao-le shi fenzhong (Zhangsan jump-LE ten minute, ‘Zhangsan jumped for ten minutes’), engenders an iterative meaning, which is morphosyntactically absent yet fully understood by comprehenders. Contrastively, the counterpart involving a durative verb (e.g., run, swim), as in Zhangsan pao-le shi fenzhong (Zhangsan run-LE ten minute, ‘Zhangsan ran for ten minutes’), engenders a continuous reading with identical syntactic structure. Psycholinguistically, processing such underspecified meaning in real time has been shown to require greater effort than the transparent counterpart. This phenomenon has been attested cross-linguistically; yet how it is manifested in MC, a tenseless language, remains understudied. In addition, durative temporal adverbials like yizhi/buduandi ‘continuously,’ which appear preverbally in MC, also engender an iterative meaning when composed with a punctual verb, as in Zhangsan yizhi/buduandi tiao (Zhangsan continuously jump, ‘Zhangsan jumped continuously’). Crucially, unlike the post-verbal adverbials that encode specific boundaries for the denoted intervals, these preverbal adverbials refer to continuous time spans without specific endpoints.
The difference in boundary specificity between the two adverbial types, although both are durative, is hypothesized to modulate the processing profiles of aspectual comprehension. Results of the online (timed) questionnaire showed (a) an effect of boundary specificity: sentences with post-verbal adverbials that encode [+specific boundary] were rated lower in the naturalness-rating task and induced longer response times (RTs) in iterativity judgements, as compared to preverbal adverbials that encode [−specific boundary]; (b) in composition with post-verbal adverbials that are [+specific boundary], sentences involving durative verbs elicited lower rating scores and longer RTs in iterativity judgements than the counterparts involving punctual verbs. These results suggest that the comprehension of underspecified iterative meaning is modulated by both cross-linguistically similar parameters and language-specific systems of temporal reference, by which MC exhibits a typological difference in processing profiles. Overall, the patterns are consistent with the Context-Dependence approach to semantic underspecification: comprehenders compute the ultimate reading (iterative versus continuous) by taking both the sentential and extra-sentential information into consideration in a given context.
Chuansheng He and Min Zhang
Numerical expressions are linguistic forms related to numbers or quantities, which directly reflect the relationship between linguistic symbols and mathematical cognition. Featuring some unique properties, numeral systems are somewhat distinguished from other language subsystems. For instance, numerals can appear in various grammatical positions, including adjective positions, determiner positions, and argument positions. Thus, linguistic research on numeral systems, especially the research on the syntax and semantics of numerical expressions, has been a popular and recurrent topic. For the syntax of complex numerals, two analyses have been proposed in the literature. The traditional constituency analysis maintains that complex numerals are phrasal constituents, an analysis that has been widely accepted and defended as a null hypothesis. The nonconstituency analysis, by contrast, claims that a complex numeral projects a complementative structure in which a numeral is a nominal head selecting a lexical noun or a numeral-noun combination as its complement. As a consequence, additive numerals are derived from full NP coordination. On the semantic side, whether numerals denote numbers or sets has been the subject of a long-running debate. The number-denoting view assumes that numerals refer to numbers, which are abstract objects, grammatically equivalent to nouns. The primary issue with this analysis comes from the introduction of a new entity, numbers, into the model of ontology. The set-denoting view argues that numerals refer to sets, which are equivalent to adjectives or quantifiers in grammar. One main difficulty of this view is how to account for numerals in arithmetic sentences.