Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

Idioms and Phraseology

  • M. Teresa Espinal, Center for Theoretical Linguistics, Autonomous University of Barcelona
  • and Jaume Mateu, Center for Theoretical Linguistics, Autonomous University of Barcelona


Idioms, conceived as fixed multi-word expressions that conceptually encode non-compositional meaning, are linguistic units that raise a number of questions relevant in the study of language and mind (e.g., whether they are stored in the lexicon or in memory, whether they have internal or external syntax similar to other expressions of the language, whether their conventional use is parallel to their non-compositional meaning, whether they are processed in similar ways to regular compositional expressions of the language, etc.). This article reviews the similarities and differences between idioms and other sorts of formulaic expressions, the main types of idioms that have been characterized in the linguistic literature, and the dimensions on which idiomaticity lies. Syntactically, idioms manifest a set of syntactic properties, as well as a number of constraints that account for their internal and external structure. Semantically, idioms present an interesting behavior with respect to a set of semantic properties that account for their meaning (i.e., conventionality, compositionality, and transparency, as well as aspectuality, referentiality, thematic roles, etc.). The study of idioms has been approached from lexicographic and computational, as well as from psycholinguistic and neurolinguistic perspectives.


  • Applied Linguistics
  • Linguistic Theories
  • Semantics
  • Syntax

1. Types of Idiomatic Expressions

In the study of language special attention is usually devoted to multi-word expressions that are syntactically complex and fixed to some degree. These expressions are usually referred to by means of the term idiom in the Anglo-Saxon tradition (Fraser, 1970; Fillmore, Kay, & O’Connor, 1988; Gibbs, 1993, i.a.), and by the term phraseme in the Romance and Germanic traditions (Burger, Buhofer, & Sialm, 1982; Burger, Dobrovol’skij, Kühn, & Norrick, 2007; Mel'čuk, 1995, i.a.).

These terms gather expressions that correspond somehow to formulaic language, in the sense that they are multi-word units fixed to some extent, either in form (that is, in their morpho-syntactic properties) or in meaning (considering that it cannot be built by regular principles of grammar). Their idiomaticity has been analyzed either as being stored in the lexicon or placed in the periphery of grammar (Katz & Postal, 1963; Fraser, 1970; Chomsky, 1980), which correlates with the idea of idioms as holistic units in memory, retrieved as a unit in processing (Mueller & Gibbs, 1987; Nenonen, Niemi, & Laine, 2002, i.a.), or as being the expression of creative conceptual metaphors (Lakoff, 1990, 1993; Lakoff & Johnson, 1999; Gibbs, 1994, 1995, 2007; i.a.).

1.1 Some Terminological Distinctions

A rough list of the different forms of idioms and formulaic language mentioned in the literature includes the following categories, here illustrated with English examples for convenience (Gibbs, 1994, 2007; Sailer, 2013):



Sayings (e.g., take it easy, come on!).


Proverbs, which describe recurrent situations (e.g., a bird in the hand is worth two in the bush, a stitch in time saves nine).


Phrasal verbs (e.g., come in, take off).


Binominals, typically of the form X-Conjunction-Y (e.g., by and large, black and white, spick and span, more or less) or of the form Noun-Preposition-Noun, which conveys plurality (e.g., day by day, student after student, face to face).


Phrasal compounds (e.g., red herring, dead-line, accident-prone, shopping center).


Formulaic expressions (e.g., at first sight, once upon a time).


Lexical bundles, identifiable statistically and by intuition as recurrent sequences (e.g., a little bit of, you don’t have to).


Collocations, or conventionalized co-occurrences including those formed by light verbs followed by specific object nouns (e.g., infinite patience, a hard frost, do a favor, give a look, take a step).


Idioms, formulaic constructions whose interpretations are unpredictable from individual lexical meanings under the effects of regular compositional rules. Different types of idioms have been also identified in the literature:

Quasi-idioms, whose meaning includes the meaning of the lexical components plus an additional non-compositional meaning (Mel'čuk, 1995) (e.g., give the breast, start a family).

Phrasal idioms (lexically headed) versus clausal idioms (headed by a sentential functional head: a fixed tense or mood, a modal, obligatory (or impossible) sentential negation, or CP-material such as a complementizer or a wh-phrase) (Horvath & Siloni, 2016, 2017) (e.g., land on one’s feet vs. can’t hold a candle to someone/something).

Idiomatic phrases (which do not distribute their meanings to their components) versus idiomatically combining expressions (whose meanings—while conventional—are distributed among their parts) (Nunberg, Sag, & Wasow, 1994) (e.g., saw logs ‘to sleep’ vs. pull strings ‘to exploit connections’).

Constructional idioms, which are not lexically fixed in full and whose meaning is associated with the whole construction (Jackendoff, 1997b, 2008) (e.g., ‘the way construction’, ‘the X-er, the Y-er’ comparative correlative construction, ‘the V time-away construction’, ‘the V one’s part-of-the-body out/off construction’, ‘the X after X construction’; cf. formal or lexically open idioms, Fillmore et al., 1988).

These terms in no way represent a systematic taxonomy of multi-word units, and in fact describe different sets of idiomatic expressions. Thus, semantic phrasemes (following Mel'čuk, 1995, p. 179) can on the one hand be classified as idioms, collocations, and quasi-idioms. On the other hand, while all idioms are collocations, insofar as they are conventionalized or fixed, most collocations have been claimed not to be idioms insofar as they are compositional (Larson, 2017, p. 400).1 Additionally, some formulaic expressions have been analyzed as grammatical constructions (e.g., let alone, Fillmore et al., 1988), but not all grammatical constructions can be considered members of formulaic language (e.g., ‘the X-er, the Y-er’ comparative correlative construction, Borsley, 2004; Dikken, 2005; Abeillé, Borsley, & Espinal, 2006).

Furthermore, it should be noted that the mere existence of this typology derives from the various dimensions on which idiomaticity lies (Nunberg et al., 1994):

(In)flexibility. The form of idioms may show special syntax, and they appear in a limited number of syntactic structures as compared to regular expressions.

Conventionality. The meaning of idioms cannot be (entirely) predicted on the basis of knowledge of the independent conventions that determine the use of their constituent parts when they appear in isolation from one another.

(Lack of) compositionality. The meaning of idioms cannot be (entirely) composed from the meaning of lexical items and their specific combination predicted from the syntactic structure in which these items appear.

Figuration. The meaning of idioms involves metaphors, hyperboles, and other kinds of cognitive figures, and these figurative meanings appear to show particular processing properties in comparison to the literal uses of these same expressions (Gibbs, 1980, 2007; Cacciari & Tabossi, 1988, 1993).

Finally, it is worth pointing out that, even though idioms are often claimed to be characteristic of informal language and popular speech, most characteristically they are very specific to particular communication systems or registers. Furthermore, they have been said to be “especially useful in terminating a topic because of their distinctive manner of characterizing abstract themes in concrete ways” (Gibbs, 2007, p. 703), which means that they have a significant role in a theory of communication and cognition (Sperber & Wilson, 1995; Vega-Moreno, 2003). As such, idioms also reflect the interaction of the system of language with the emotion system, a system that deals with the assignment of values, positive and negative, including emotions, attitudes, and opinions to some entity, event, or situation (Foolen, 1997).

2. The Syntax of Idioms

Idioms introduce a challenge to linguistic theory due to the fact that they are not easily analyzed under a derivational approach to the theory of grammar, such as the one that has been developed within mainstream generative grammar.2 This challenge follows in part from the way that idioms are usually defined in the literature, conceived of as constituents or series of constituents for which the semantic interpretation is not a compositional function of the formatives it is composed of (Fraser, 1970).

The compositionality issue has been conceived in the generative literature as the central reason for the deficient syntactic behavior of idioms. In accordance with this view, it has been argued that “the reluctance of some idiom parts to undergo certain syntactic operations follows from the fact that idioms are not built up in a compositional manner, because a compound idiomatic expression corresponds to one primitive meaning expression” (Schenk, 1995, p. 253).

This, in turn, relates to the so-called flexibility issue, that is, how the varying degrees of syntactic flexibility can be captured (Sailer, 2013). “Idioms typically appear only in a limited number of syntactic phrases or constructions, unlike freely composed expressions” (Nunberg et al., 1994, p. 492). The proposal made by Nunberg et al. (1994) is to explain the variety of “transformational deficiencies” (Fraser, 1970) of idioms by distinguishing between flexible, decomposable idioms, and fixed, non-decomposable idioms. The former, in Nunberg et al.’s terminology, correspond to idiomatically combining expressions (ICEs) and the latter to idiomatic phrases (IPs). Only ICEs permit, to various degrees, syntactic processes such as passivization, modification and quantification, ellipsis, topicalization, and the like, processes that in the recent history of linguistics have sometimes been analyzed as transformational and sometimes as lexical.3

What has just been pointed out makes explicit a third syntax versus lexicon issue introduced by idioms. If idioms are conceived as multi-word expressions (not as single morpheme or word expressions, cf. Marantz, 1997), the question that arises is how idioms are to be represented in the lexicon, since they are not X⁰ categories. As multi-word constructions, idioms are traditionally conceived to be part of the lexicon; they are generally assumed to be included in a repository of language that lists basic relations between forms of linguistic objects and other properties of these objects. However, because of their common multi-word status, idioms represent a serious challenge for a standard theory of lexical insertion. An alternative analysis, based on a series of correspondence rules between phonological structures, syntactic structures, and conceptual structures, is at the basis of a lexical licensing theory that has been proposed in a seminal work by Jackendoff (1995, 1997a, 2002, a.o.).

Given these challenges and issues, in the rest of this section devoted to the syntax of idioms (cf. Fellbaum, 2015), the following topics will be dealt with:


The categories of idioms.


Syntactic properties: constituency, selection.


Possible and impossible idioms.


Idioms as constructions, with and without a canonical syntax.

2.1 The Categories of Idioms

Studies on idioms usually suffer from two main flaws. Most studies deal exclusively with idioms in English, and most of them deal only with verbal idioms (those whose head is a verb; cf. Marantz, 1984; Fillmore et al., 1988; Nunberg et al., 1994; Croft & Cruse, 2004; Svenonius, 2005; Evans & Green, 2006, a.o.). However, it has to be pointed out that some attempts exist to study idioms both from cross-linguistic and cross-dialectal perspectives (Harwood et al., 2016; Markantonatou & Sailer, 2018).

Unfortunately, from a cross-linguistic perspective one can find basically lexicographic studies that provide lists of form-meaning pairs for various sorts of lexicalized multi-word expressions. In addition to the collection of idioms and their conventionalized meanings, some of these studies contain information regarding the syntactic manipulations each idiom allows, based on the judgements of a single speaker (or a few speakers).

A second common failure extends to the syntactic category of the idioms studied, usually verbal, and sometimes adverbial and/or prepositional.4 We hereby wish to remark that idiomatic expressions are multicategorial, depending on whether the head of the idiom is a lexical or a functional category, with the result that idiomatic expressions can be associated with VP, NP, AP, AdvP, and even with PP, DP, QP, ConjP, CoordP, and S/TP.5

2.2 Syntactic Properties

In this section four issues will be discussed: the internal versus external syntax of idioms (Harwood et al., 2016), the syntactic flexibility and referential availability of DPs (Nunberg et al., 1994; Harwood et al., 2016; Mateu, in press), the continuity constraint (O’Grady, 1998), and the selection constraint (Bruening, 2010, 2017).

It is believed that the internal syntax of idioms is built up through the same regular, compositional, structure-building mechanism that is at the basis of non-idiomatic structures (cf. Fellbaum, 1993; Nunberg et al., 1994; Svenonius, 2005, a.o.). In particular, it has been assumed that, if the object of study is a verbal idiom consisting of a V + a direct object DP (e.g., English spill the beans, kick the bucket), manipulations on the internal syntax of these idioms may consist in zooming in at the level of the DP structure, and exploring the need of the definite article as a contributing trigger for idiomaticity, as well as the possibility of using other determiners (indefinites, demonstratives, possessives, quantifiers, etc.) without altering this idiomaticity.

In relation to this issue, the distinction of Nunberg et al. (1994) between ICEs and IPs has become extremely relevant. An ICE such as spill the beans is characterized by being syntactically flexible and by having a referential DP, which means that the definite article can be replaced by other determiners and that the use of either a definite or an indefinite article constrains the referential properties of the DP while keeping its idiomaticity. Syntactic flexibility and referential availability allow not only for the change of the D, but also for modification, topicalization, and for the mapping of the complement noun into a figurative meaning. By contrast, an IP such as kick the bucket is characterized by not being syntactically flexible and by not having a referential DP, which means that in this case the definite D is not associated with a referential interpretation. Lack of syntactic flexibility and lack of reference go hand in hand, in such a way that only the idiom as a whole, but neither the full DP nor the complement noun, can be mapped onto a figurative meaning.

One way to analyze this distinction is to postulate that an ICE such as spill the beans, which can be associated with both a literal, compositional interpretation and an idiomatic reading ‘divulge the secrets’, has an internal syntax with a DP structure. By contrast, the internal syntax of an IP such as kick the bucket, which can be associated with both a literal, compositional interpretation and the idiomatic meaning ‘die’, would project a DP layer only in the literal interpretation, whereas in the idiomatic reading the complement of the verb would be of the category N. Hence, the definite D in IPs has been said not to introduce a phase and to be semantically vacuous or expletive, so that the nominal expressions in object position have been analyzed as forming a complex predicate with the verb (Espinal, 2001).

The external syntax of idioms is relevant as far as it has been claimed that idioms represent opacity domains (i.e., phases) and, therefore, that there is a size limitation to idioms. In relation to this issue it has been claimed that all parts of a phrasal idiom must be minimally dominated by the same phrasal node (vP, CP) (Svenonius, 2005). Thus, it has been asserted that verbal idioms are composed of material from the vP domain and cannot be composed of material from the TP domain, the general idea being that while material in the TP domain (i.e., Tense, Mood, Aspect, Voice) can embed idioms, verbal idioms are never dependent on such material.6

According to this analysis of the external syntax of idioms, which stems from Chomsky (1980, 1981) and Marantz (1984, 1997), verbal idioms consist of verbal predicates and their arguments, as represented in (1) and (2) (cf. Harwood et al., 2016). Notice that idiomatic patterns consisting of a V + DP theme and a subject + VP are both analyzed as vP.7



Following this analysis, all verbal idioms, even the ones that look like sentential idioms, are generated within the vP domain: verbal idioms are constrained by the clausal-internal phase boundary, which is equivalent to saying that there are locality constraints on idiomatic constructions.

Let us now focus on a much discussed constraint that has been claimed to regulate the internal syntax of idioms, the so-called continuity constraint (O’Grady, 1998), which deals with idioms in terms of dependency relations.

O’Grady shows that, although it has frequently been suggested that idioms might form constituents at some level of representation, this hypothesis finds several counterexamples:

idioms with open genitive positions (e.g., cool one’s heels),

idioms that can take non-idiomatic modifiers (e.g., leave no (legal) stone unturned), and

discontinuous idioms (e.g., show (someone) the door).

Therefore, the conclusion (against Fraser, 1970) is that idioms need not form constituents, but still they are subject to an important grammatical constraint: a head licenses its dependents (i.e., its arguments, modifiers, specifiers) in that its syntactic and semantic properties determine the number and/or type of other elements with which it can or must occur within a local syntactic environment. Consider the continuity constraint in (3) (from O’Grady, 1998, p. 284, (12)).


The proposed analysis reduces idioms to a continuous chain of head-to-head relationships, independently of the categorial status of the idiom (VP, S, etc.).8 This approach seems to have a number of advantages: (i) it straightforwardly accounts for the existence of non-constituent idioms (e.g., get one’s goat), (ii) it also accounts for the optional co-occurrence of some idioms with adjectives, quantifiers, and other types of modifiers (e.g., kick the (filthy) habit, pull (yet more) strings), and (iii) it allows handling discontinuous idioms in general (e.g., bring (something) to light, lead (someone) a merry chase).

Besides the continuity constraint, a principle of idiomatic interpretation has been proposed in the literature (Bruening, 2010, 2017), whose basic idea is that two syntactic constituents, X and Y, can be interpreted idiomatically only if one selects the other.9 In addition, Bruening (2010, p. 532, (25)) postulates the following constraint:


(4) Constraint on Idiomatic Interpretation: If X selects a lexical category Y, and X and Y are interpreted idiomatically, all of the selected arguments of Y must be interpreted as part of the idiom that includes X and Y.

This hypothesis starts from the assumption that non-selected elements, like adjectives, possessors, and the like, can appear in between pieces of idioms because they do not disrupt this selection. Adjectives and adverbs may be part of an idiom (e.g., beat a dead horse), because they select for the projection they adjoin to. Unlike the continuity constraint approach, according to which adjuncts are dependent constituents on nominal or verbal heads, the selection approach postulates that adjuncts select the categories they adjoin to. Thus, in beat a dead horse the verb beat selects an NP headed by the N horse, and dead selects a nominal projection also headed by horse.

Lexical selection can specify a single lexical item, a list of lexical items, or even a semantic class of lexical items, a range of possibilities that is particularly important in the case of idioms that allow some degree of lexical substitution of the selected idiom item.

Furthermore, according to this hypothesis idioms can cross the vP boundary by involving modals, aspect, negation, imperatives, and other functional material. Hence, it is hypothesized that, in addition to lexical material inside the VP, idioms can also cross the CP phase boundary. Thus, an idiom such as the cat (has) got (someone)’s tongue ‘Someone is speechless’ is claimed to involve not only a verb get, a subject cat and an object tongue, but also a matrix Voice head that selects an external argument (the NP headed by cat) and a VP headed by get, as represented in (5) (Bruening, 2017, p. 188, (15)). Notice that in this representation, and consistent with the idea that the selection constraint applies only to lexical categories (see (4)), functional elements such as Determiners and Tense are excluded.


To sum up, in this section the main syntactic properties and constraints that apply to idioms have been synthesized. Still, a question remains: from a syntactic perspective what is an (im)possible idiom? This issue is addressed in section 2.3.

2.3 Possible and Impossible Idioms

The study of the syntax of idiomatic expressions has become relevant not only for identifying permissible idioms but also for predicting those sequences that are considered impossible idioms, because they violate some syntactic requirements. “Possible idiom types tell us about possible selectional configurations, non accidentally non existing idiom types tell us about impossible selectional configurations” (Sportiche, 2005, p. 79). Thus, there are no V + D idioms to the exclusion of the complement N (*kick the N), there are no V + N idioms to the exclusion of the P that selects the N (*fly P the handle), and there are no subject + object idioms to the exclusion of the lexical verb that connects the two (*the shit V the fan). Idioms have to meet this structural description (Sportiche, 2005, p. 81):


According to O’Grady (1998, pp. 287–289), the continuity constraint also makes specific predictions about the types of idioms that cannot occur:


Given that no licensing relation holds between the heads of a subject and a direct object, there should be no idioms with an idiomatic subject and direct object but with an open verb position (e.g., *a wolf in sheep’s clothing V a son of a gun). Within a sentential idiom head-to-head relations crucially involve the verb.


There are no idioms that consist of just a V and a genitive with the following head left open (e.g., *play the devil’s x), because the head of the complement is crucially dependent on the verb (e.g., play the devil’s advocate).


There are no idioms consisting of a V and an NP inside a PP complement, with the choice of preposition left open (e.g., *beat P the bush), because a specific prepositional head is dependent on the verb to form an idiom (e.g., beat around the bush).
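The continuity constraint and the three predictions just listed can be illustrated with a minimal sketch in Python. It rests on simplifying assumptions that are not part of O’Grady’s (1998) own formulation: each word is paired with the head that licenses it, only heads are represented (determiners are omitted), and the fixed parts of an idiom are taken to form a chain when exactly one of them lacks a fixed licensor. The function name forms_chain and the dictionary encoding are hypothetical.

```python
def forms_chain(licensor, fixed):
    """Return True if the fixed words are linked by an unbroken series of
    licensing relations (an assumed approximation of a 'chain').

    licensor: maps each word to the head that licenses it (None for the top head).
    fixed:    the set of words that are lexically fixed by the idiom.
    """
    # A fixed word is a "root" when its licensor is not itself part of the idiom.
    roots = [word for word in fixed if licensor.get(word) not in fixed]
    # Exactly one root means every other fixed word hangs from a fixed head.
    return len(roots) == 1


# Attested patterns: open slots are dependents, never links in the chain.
pull_strings = {"pull": None, "strings": "pull", "yet more": "strings"}
show_the_door = {"show": None, "someone": "show", "door": "show"}
beat_around = {"beat": None, "around": "beat", "bush": "around"}

# Patterns predicted not to occur: an open head interrupts the chain.
open_verb = {"V": None, "wolf": "V", "gun": "V"}           # subject + object, open verb
open_noun = {"play": None, "x": "play", "devil's": "x"}     # *play the devil's x
open_prep = {"beat": None, "P": "beat", "bush": "P"}        # *beat P the bush

print(forms_chain(pull_strings, {"pull", "strings"}))    # attested
print(forms_chain(show_the_door, {"show", "door"}))      # attested, discontinuous
print(forms_chain(beat_around, {"beat", "around", "bush"}))
print(forms_chain(open_verb, {"wolf", "gun"}))           # predicted impossible
print(forms_chain(open_noun, {"play", "devil's"}))       # predicted impossible
print(forms_chain(open_prep, {"beat", "bush"}))          # predicted impossible
```

Under these assumptions, open modifiers and open arguments are harmless because they are licensed by a fixed head, whereas an open verb, noun, or preposition that itself licenses a fixed word breaks the chain.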

According to a selection-based approach to the syntax of idioms (Bruening, 2010, 2017), even though no syntactic locality constraint is assumed to regulate the syntax of idioms, there still exist syntactic constraints on idiomatic phrases. The first such constraint is imposed by the Principle of Idiomatic Interpretation, which—as previously postulated—imposes selectional relations between all items of an idiom. This principle, which rules out discontinuities in idiomatic phrases “where X selects Y and Y selects Z but only X and Z are part of the idiom” (Bruening, 2017, p. 189), predicts that an idiom such as *Someone has V-en an old shoe, with the main verb free, is impossible.

The second constraint that idioms are subject to is the Constraint on Idiomatic Interpretation, also previously formulated, which captures the fact that there are many verb-object idioms, with an open slot for the subject, but there are far fewer subject-verb idioms with an open slot for the object (e.g., a little bird has told me that . . .).

According to Bruening, selection takes place exclusively between lexical categories (V, N, A, Adv). However, against this approach one might consider not only those idioms that have functional elements as part of idioms, but also those idioms whose head is a functional category (e.g., D, Conj, P, etc.). These idioms abound in many natural languages. Consider, for example, the Catalan la mare del Tano lit. the mother of.the Tano [expression of surprise], i ara! lit. and now [expression of disapproval], de gom a gom lit. from gom to gom ‘full’ (Espinal, 2004a); in spite of the fact that neither the definite article, nor the conjunction, nor the weak preposition are lexical categories, these functional categories together with their selected arguments (the noun mare ‘mother’, the adverb ara ‘now’, and the word gom, which does not exist outside that idiomatic expression) are part of the idioms, which leads us to conclude that functional categories should not be excluded from possible idioms.

Additional restrictions on possible idioms affect (i) the number of subordinate clauses, which must be restricted to one (Catalan no tenir on caure mort lit. not have where fall dead ‘poor’, no saber quin dia menja pa lit. not know what day eats bread ‘ignorant’), and (ii) the obligatory presence of a dative clitic (Catalan aixafar-li la guitarra (a algú) lit. crash.cldat the guitar to somebody ‘disrupt’).

Let us finally consider asymmetries in idioms built with ditransitive verbs (Bruening, 2010; Larson, 2017). Bruening observes that idioms of the form V + someone + NP (e.g., give (someone) the boot) are common, but there are none of the form V + NP + someone (*throw the wolves someone), the reason being that the second object is what is selected by the V, while the first object (the experiencer/goal) is selected by an Applicative head. “If an idiom includes the first object, it necessarily includes Appl selecting that object and V, but then by [the Constraint on Idiomatic Interpretation] the argument of V must be included as well” (Bruening, 2017, p. 190).
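The reasoning behind this asymmetry can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions: the Head class, the constraint_violations function, and the flat encoding of selection (Appl selects the first object and V, and V selects the second object) are hypothetical simplifications of the structures Bruening assumes, and determiners are left out.

```python
from dataclasses import dataclass, field

LEXICAL = {"V", "N", "A", "Adv"}  # categories to which the constraint applies


@dataclass
class Head:
    name: str
    category: str
    selects: list = field(default_factory=list)  # names of the heads it selects


def constraint_violations(heads, idiom):
    """List (X, Y, missing) triples where X selects a lexical Y, both belong
    to the idiom, and some selected argument of Y is left outside it."""
    by_name = {h.name: h for h in heads}
    found = []
    for x in heads:
        if x.name not in idiom:
            continue
        for y_name in x.selects:
            y = by_name[y_name]
            if y.name in idiom and y.category in LEXICAL:
                missing = [arg for arg in y.selects if arg not in idiom]
                if missing:
                    found.append((x.name, y.name, missing))
    return found


# give (someone) the boot: the idiom is V plus the second object.
give_boot = [
    Head("Appl", "Appl", ["someone", "give"]),
    Head("give", "V", ["boot"]),
    Head("someone", "N"),
    Head("boot", "N"),
]

# *throw the wolves someone: the idiom would be V plus the first object,
# which drags in Appl and leaves the argument selected by V open.
throw_wolves = [
    Head("Appl", "Appl", ["wolves", "throw"]),
    Head("throw", "V", ["someone"]),
    Head("wolves", "N"),
    Head("someone", "N"),
]

print(constraint_violations(give_boot, {"give", "boot"}))
print(constraint_violations(throw_wolves, {"Appl", "throw", "wolves"}))
```

On this encoding the first idiom yields no violation, while the second is flagged because Appl and V are both interpreted idiomatically and the argument selected by V remains outside the idiom.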

In turn, Larson (2017) argues that in English there are no dative idioms at all, of either the oblique or the double object variety. According to him structures of the sort give + someone + NP are not idioms because they appear to be fully compositional: give is always interpreted as conveying caused possession, and the figurative meaning is linked exclusively to the contribution of the final NP and is not a property of the construction as a whole. Similarly, expressions such as show + someone + the door should not be considered idioms either, because the meaning of the verb is basically one of caused motion. “Putative double object idioms are in fact not idioms but collocations (. . .). Putative PP dative idioms either aren’t datives (rather, they are caused motion constructions) or aren’t ditransitives” (Larson, 2017, p. 424).

2.4 Idioms as Constructions

The previous discussion leads to some theoretical implications (Harwood et al., 2016): (i) idioms in general and verbal idioms in particular are larger than simply the verb and its arguments, as standardly assumed, and (ii) the size of verbal idioms is subject to cross-linguistic variation, sometimes including vP internal subjects (e.g., the shit hit the fan), sometimes including adjuncts (e.g., take the bull by the horns).

It has been claimed that idioms are a type of construction (Nunberg et al., 1994, p. 507; O’Grady, 1998, p. 290). “Constructions may be idiomatic in the sense that a large construction may specify a semantics (and/or pragmatics) that is distinct from what might be calculated from the associated semantics of the set of smaller constructions that could be used to build the same morphosyntactic object” (Fillmore et al., 1988, p. 501). Constructions, therefore, are form-meaning pairs that associate syntactic structures with semantic and pragmatic meanings not mediated by the pieces they are composed of.10

According to Jackendoff (2002, 2008) constructional idioms (constructions with a (partially) non-compositional meaning, of which not all terminal elements are fixed) can either have canonical syntax (e.g., Jim joked his way out of the meeting, That fool of a politician; Booij, 2002) or idiosyncratic non-canonical syntax (e.g., let alone, the more . . . the less, student after student; Fillmore et al., 1988). Still, what is more relevant from Jackendoff’s Parallel Architecture approach (Jackendoff, 1997a, 2007, a.o.) is that it can account for items that most characteristically do not have phonology, morphosyntax, or semantics, as in (7):


It can also account for items that show special links (interface connections) between phonological structures, syntactic structures, and conceptual structures. Thus, Jackendoff distinguishes among:


Finally, within a Parallel Architecture approach for a VP idiom like kick the bucket Jackendoff proposes a link, specified by means of subindex 4 between the three representations in (9).11


Overall, the idea is that idiomatic phrases are associated with syntactic structures, but their meaning is not composed from parts of it. Rather, they are associated with clusters of information that include conventionalized meaning and specific pragmatic meanings that can be represented by means of conceptual structures.

3. The Meaning of Idioms

Following Nunberg et al. (1994), three semantic properties can be distinguished when dealing with the interpretation of idioms: conventionality, compositionality, and transparency, which are discussed in subsections 3.1, 3.2, and 3.3.

3.1 Conventionality

According to their definition of conventionality, expressions can be defined as conventional when their meaning or use cannot be predicted, or at least entirely predicted, on the basis of a knowledge of the independent conventions that determine the use of their constituents when they appear in isolation from one another. So the conventionality of idioms, if understood in this narrow sense, refers to the discrepancy between the figurative or idiomatic reading and the predicted literal meaning of the expression. For example, idiomatic expressions like kick the bucket or spill the beans are both considered to be conventional because their meaning (‘to die suddenly’ and ‘divulge secret information’, respectively) cannot be predicted in the relevant sense previously defined.

To a certain extent conventionality is a semantic property that idioms can be said to share with so-called “collocations” (see Sinclair, 1991; Torner & Bernal, 2017, i.a.). Typically, collocations have been said to be expressions that can be interpreted, unlike idioms, more or less correctly out of context, but cannot be produced correctly if the conventional expression is not already known to the speech community: for example, compare AE thumb tack and BE drawing pin (see Croft & Cruse, 2004, pp. 249−250). To put it in the words of Fillmore et al. (1988), collocations are “encoding idioms,” that is, idioms that are interpretable by the standard rules of interpretation but are conventional for these particular expressions with this meaning. In contrast, “decoding idioms” like kick the bucket are the ones that cannot be decoded by the hearer. Any decoding idiom is then an encoding idiom: if the hearer cannot figure out what it means, then she is also not going to be able to guess that it is a conventional way to express that meaning in the language.

The distinction between idioms and collocations has been said to be a gradual one. For example, compare the reductive definition of collocations provided by Nunberg et al. (1994, pp. 493–494): “when we encounter a fixed expression that is missing several of the relevant properties—say one that involves no figuration, lacks a proverbial character, and has no strong association with popular speech—we become increasingly reluctant to call it an idiom. Examples might be collocations like tax and spend, resist temptation, or right to life.”

3.2 Compositionality

Compositionality refers to the degree to which the phrasal meaning, once known, can be analyzed in terms of how it is distributed among the individual parts of the expression. It is precisely the semantic property of compositionality that allows Nunberg et al. (1994) to make an important dual distinction between non-compositional idioms (aka “Idiomatic Phrases”; IPs) and compositional idiomatic expressions (aka “Idiomatically Combining Expressions”; ICEs). For example, the meaning of a prototypical idiom like kick the bucket is not distributed among the parts of the idiom: that is, the expression as a whole is mapped onto the meaning of the idiom. In contrast, the meaning of an idiomatic expression like spill the beans is distributed among its parts: the individual parts of the literal expression can be mapped onto individual parts of the figurative/idiomatic meaning (e.g., spill: ‘divulge’ // the beans: ‘the information/a secret/ . . .’). Importantly, Nunberg et al. (1994, p. 499) only intend the weaker claim that speakers are capable of recognizing the compositionality of an idiom like spill the beans “after the fact, having first divined its meaning on the basis of contextual cues” (emphasis ours). In short, what these authors have done is dissociate conventionality from non-compositionality. According to them, all idioms are conventional per definition but most of them are semantically compositional (although they do not provide any statistical information to show that the lexicon of English contains more ICEs than IPs).

Moreover, there are some authors (Mendívil, 2009; Mateu, in press) who prefer avoiding the contradictory label of “compositional idioms” and use the alternative label of “idiomatic collocations.” Indeed, a positive consequence of using this terminology is that it allows one to maintain the standard claim that the meaning of idioms is non-compositional, whereas the meaning of collocations is compositional. Accordingly, there appear to be two types of idiomatic expressions: proper idioms and idiomatic collocations. See also Larson (2017) for the relevance of this distinction in the realm of dative idioms.

The semantic property of (non-)compositionality has been shown to be related to syntactic (in-)flexibility (Gibbs & Nayak, 1989; Nunberg et al., 1994, i.a.): accordingly, those idioms that are semantically non-compositional are predicted to be syntactically inflexible, whereas idiomatic collocations (aka idiomatically combining expressions; ICEs) are expected to have a more flexible syntax. As argued by Nunberg et al. (1994, pp. 500–503), grammatical operations like the ones provided by modification, quantification, topicalization, ellipsis, and anaphora, among other things, provide important evidence that the pieces of compositional idiomatic expressions have identifiable meanings that interact semantically with each other: for example, pull strings can be safely claimed to be an ICE from the following tests, which show that parts of this idiomatic expression should be assigned independent meanings (e.g., pull: ‘exploit’ // strings: ‘personal connections’), contributing to the interpretation of the whole: compare Pat got the job by pulling strings that weren’t available to anyone else // We could pull yet more strings // Those strings, he wouldn’t pull for you // Kim’s family pulled some strings on her behalf, but they weren’t enough to get her the job (examples taken from Nunberg et al., 1994).12

3.3 Transparency

The third relevant semantic property is transparency/opacity, which should not be confused with compositionality/non-compositionality, respectively. An idiomatic expression can be claimed to be conceptually transparent when there is a metaphorical motivation for the meaning it involves. For example, consider the following examples that have to do with anger (Gibbs, 1995, 2007): blow your stack, flip your lid, hit the ceiling, get hot under the collar, and get steamed up, among others. Following Gibbs, these related idiomatic expressions can be said to be conceptually transparent because they can be claimed to be motivated by the conceptual metaphor ANGER IS HEATED FLUID IN A CONTAINER.

The existence of so-called families of idioms like the previous one goes against the traditional claims that idioms are dead metaphors and that their form-meaning mappings are arbitrary. Idioms are not isolated multi-word constructions stored in our mental lexicon but form networks thanks to the intervention of conceptual metaphors that motivate their related meanings (i.e., conceptual metaphors make their meanings [more] transparent).

Importantly, Lakoff (1987), Gibbs and O’Brien (1990), and Gibbs (1995, 2007, i.a.) argue that conceptual metaphors should not be conceived of as mere generalizations of linguistic meaning. For example, one could say that the mentioned idioms refer to the idea of getting very angry not because of conceptual metaphor but because the words stack, lid, ceiling, hot, and steam also have abstract meanings that convey the idea of anger. Against this possible objection Gibbs and his colleagues argue that one way of uncovering metaphorical knowledge in idiomaticity is through a detailed analysis of speakers’ conceptual images for idioms. Through different psycholinguistic analyses, which are reviewed in Gibbs (1995, 2007), this cognitive scientist has shown that, when imagining anger idioms, the subjects of his experiments know that pressure causes the action; that one has little control over the pressure; that its violent release is unintended; and that once the release has taken place (i.e., once the stack has been blown, the lid flipped, or the ceiling hit), it is difficult to reverse the action. Alternative lexical theories are said to be unable to explain the presence of these specific conceptual inference patterns, which leads Gibbs and his colleagues to conclude that the motivation for these particular folk conceptions comes from two conceptual metaphors (ANGER IS PRESSURIZED HEAT and THE MIND IS A CONTAINER) and that the mapping of information from a source domain (e.g., heated fluid in a container) to a target domain (e.g., the anger emotion) defines our conceptualization of anger and motivates the idiomatic expressions used to talk about anger. It is also worth noting that the first metaphor can be applied not only to idioms like the previous ones but also motivates the meaning of non-idiomatic expressions such as I exploded with anger.

Similarly, Espinal and Mateu (2010) claim that the meaning of Catalan idioms like treure el fetge per la boca ‘to expel the liver through the mouth’ (cf. Engl. to work one’s guts out); petar-se el cul ‘to explode the butt’ (cf. Engl. to laugh one’s butt off); sortir-li els ulls de les òrbites ‘(the eyes) to leave the orbits’ (cf. Engl. to cry one’s eyes out), and more can be said to be transparent by positing the underlying conceptual metaphor AN EXTREME INTENSITY IS AN EXCESSIVE DETACHMENT OF A BODY PART. Their claim is that the meaning of these idioms can be shown to be transparent to the extent that it is motivated by the relevant conceptual metaphor. This notwithstanding, it is important to realize that their meaning is not compositional: for example, the direct object el cul ‘the butt’ (cf. Ens vam petar el cul; We laughed our butts off) is not referential and the meaning of the idiom (i.e., ‘to laugh a lot’) is not distributed onto the individual parts of the expression (e.g., butts-off: ‘a lot’). Importantly, the relevance and recurrence of body parts in many idiomatic constructions has been claimed to reflect the embodied nature of human thought (see Johnson, 1987; Gibbs, 2006, i.a.).

3.4 Semantic Compositionality Versus Conceptual Transparency

After having reviewed the three main semantic properties of idiomatic expressions (namely, conventionality, non-/compositionality, and transparency/opacity), it is worth noting that some formal linguists have taken pains to show that semantic compositionality and conceptual transparency must not be mixed when dealing with the analyzability of idioms. Indeed, it is important to point out that, although some correlation is expected to be found between these two properties (e.g., clearly transparent idioms tend to be compositional and, conversely, opaque idiomatic expressions tend to be non-compositional), it is relatively easy to find transparent idioms that are non-compositional.

For example, evidence for the distinction between semantic compositionality and conceptual transparency comes from the fact that there are idioms like hit the ceiling ‘to get very angry’, which have been claimed to be transparent (i.e., metaphorically motivated) but are not compositional (e.g., cf. #The ceiling was hit by Joe; notice that this passive example is only well-formed on the literal reading).13

Incidentally, a piece of evidence for this distinction between conceptual transparency and semantic compositionality also comes from the different effects of lexical versus syntactic flexibility of idiomatic expressions.

On the one hand, lexical flexibility can be related to conceptual transparency but not to semantic compositionality: for example, compare transparent ICEs such as throw one’s {hat/cap} into the ring with transparent IPs such as hit {the ceiling/the roof} (NB: of course, such lexical flexibility is not at all general, owing to the typically lexicalized nature of idiomatic expressions: e.g., cf. ICEs spill {the beans/#the peas} and IPs kick {the bucket/#the pail}).14

On the other hand, syntactic flexibility can only be found in semantically compositional idiomatic expressions (ICEs). Conceptual meaning, be it transparent or not, is not relevant for syntactic flexibility. Rather, it is only semantic meaning that is relevant to syntactic processes. Following this thread, Mateu and Espinal (2007) argue that, when dealing with the meaning of idiomatic expressions, two different kinds of meanings can be distinguished: syntactically transparent meaning (i.e., the one relevant for semantic compositionality) and syntactically non-transparent meaning (e.g., the one relevant for conceptual/metaphorical motivation).

3.5 Aspectuality and Referentiality in Idioms

In relation to the important claim of Nunberg et al. (1994) that “Idiomatic phrases” (IPs) are non-compositional but “Idiomatically combining expressions” (ICEs) are compositional, it has been noted that the former do not necessarily preserve the aspectual class associated with the literal meaning, whereas the latter do preserve it (see Glasbey, 2003, 2007; Espinal & Mateu, 2010; contra McGinnis, 2002). For example, consider the figurative meanings of idioms such as to work one’s guts out ‘to work a lot’ and to paint the town red ‘to have a very enjoyable time’, which are non-compositional (e.g., notice that the direct object in these examples is not referential). It is interesting to point out that the aspectual reading of these idioms is different from the literal one: that is, the aspectual reading associated with the idiomatic meaning is atelic (cf. John worked his guts out {for hours/*in ten minutes}; They painted the town red {for hours/*in two hours}). In contrast, and as expected, the aspectual reading that corresponds to the literal resultative constructions is telic: for example, compare John worked his debts off in ten months; They painted the wall red in two hours. Glasbey (2003, 2007) claims that the activity reading of the eventuality in these idioms has to do with the absence of a gradual patient (in the sense of Krifka, 1992), which is indeed present in their literal resultative-like counterparts. For an alternative account, see Espinal and Mateu (2010), who claim that the telic-to-atelic event type-shifting involved in idioms like work one’s guts out could have a conceptual explanation related to the unbounded nature of the notion of intensity involved in the figurative meaning.

Interestingly, McGinnis (2005) points out that the atelicity of these idioms could be due to their having a non-referential object, but she finally discards this proposal by noting that the direct object of kick the bucket is not referential but its idiomatic aspectual reading is still telic. However, one could rebut this observation by noting that kick the bucket does not involve a gradual patient in the literal reading either (as shown by Levin, 1993; and Levin & Rappaport Hovav, 2005, i.a., impact verbs like kick or hit behave differently from change of state and incremental theme verbs), whereby the qualification of McGinnis (2005) could be claimed to lose ground. That is to say, the generalization could be explored that those idioms that have a gradual patient in the literal reading become atelic when the object DP is non-referential.

Finally, in connection with these related issues (compositional aspectuality and referentiality of the object), Espinal (2001, 2004b, 2009) has argued that the object nominals and object clitics of certain idioms in Romance are property-denoting expressions rather than entity-denoting expressions. For example, in her analysis of some verb plus object idioms, this author argues that the nominal expression in object position of unergative argument structures (e.g., Cat. fer via, lit. “to make way”, i.e., ‘to hurry up’) denotes a gradable property and conveys event modification over a degree scale: for example, compare fer {més/molta/força/massa} via, lit. “to make {more/much/a lot/too much} way,” that is, ‘to hurry up (to a certain degree)’. These objects, because of their denotation, have been claimed to be semantically incorporated into the verbal predicate. That is to say, Espinal’s proposal is that, at some level of representation, namely at the syntax–semantics interface, these object nouns denote properties and form complex predicate units with the target verb (cf. Farkas & De Swart, 2003; Dayal, 2011, i.a., for semantic incorporation).

3.6 Thematic Constraints in the Formation of Idioms

To conclude this section on the meaning of idioms, it will be important to make some remarks on thematic role constraints mentioned in the literature. Due to its major theoretical relevance, one of the most relevant constraints has even been transformed into a hypothesis: that is, the so-called No Agent Idioms hypothesis (e.g., see Harley & Stone, 2013). It is often noted that there are few idioms, if any, that can be claimed to include agents. However, some apparent counterexamples can be found, such as A little bird told me that . . . In any case it should be noted that the agent in this example does not form a non-compositional IP but a compositional ICE: indeed, the subject must have a referential status (cf. ‘And I replied to him/her that . . .’), unlike the direct object of non-compositional idioms like kick the bucket, where this nominal phrase does lack referentiality.

Furthermore, some authors have claimed that there is an important thematic distinction that separates the object of non-compositional idioms (IPs) like kick the bucket ‘to die’ from the object of compositional idiomatic expressions (ICEs) like bury the hatchet ‘to reconcile disagreement’: the former is said to lack a thematic role (the bucket is a non-meaningful object), whereas the latter does have one (i.e., the one that would correspond to “disagreement”). For example, see Jackendoff (1997a, p. 169): “In other words, the hatchet is linked to bury via its theta-role; but the bucket has to be linked to kick syntactically because it has no theta-role. Hence the hatchet is movable and the bucket is not.”

However, if thematic roles are conceived in terms of conceptual structures, there appears to be a non-trivial problem when comparing conceptually opaque IPs like kick the bucket (where the bucket can be said to be non-meaningful; cf. also kick off, which in slang also means ‘die’) with conceptually transparent IPs like saw logs or hit the ceiling (where it is less clear that the object lacks a conceptual thematic role). A similar qualm could be claimed to hold for transparent idioms like to laugh one’s head off (see the relevant metaphor: AN EXTREME INTENSITY IS AN EXCESSIVE DETACHMENT OF A BODY PART). Assuming that there is a displacement involved in our conceptualization of the source domain of this idiom (one’s head could be claimed to be the displaced object, i.e., the Theme, in this domain), it is indeed dubious that the meaning of this idiom can be reduced, as Jackendoff claims, to ‘laugh a lot’. Assuming the dual distinction of meaning pointed out between (syntactically relevant) semantic structure and (syntactically irrelevant) conceptual structure (a distinction that Jackendoff does not accept; see Jackendoff, 1983, 2002), the relevant conclusion seems to be that the fact that the object lacks referentiality in semantic structure does not entail that it has no thematic role in conceptual structure: for example, although the object the bucket in kick the bucket lacks both referentiality and theta-role, it can be claimed that the object logs in saw logs, or the ceiling in to hit the ceiling, lacks referentiality (which is why these idioms can be classified as IPs) but does have a conceptual thematic role (at least in the conceptually structured source domain).15

4. Applied and Experimental Studies

This section deals with some of the most relevant orientations in the study of idioms from lexicographic and computational approaches, as well as from psycholinguistic and neurolinguistic perspectives.

4.1 Lexicographic Studies

Many words have a phraseological tendency (the idiom principle; Sinclair, 1991) in addition to a terminological tendency (the open-choice principle). According to the idiom principle, “a language user has available to him a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments” (Sinclair, 1991, p. 110). According to the open-choice principle, practically each position in a clause offers a choice.16

The existence of a phraseological tendency in language has inspired an extensive number of studies concerned with different aspects of ‘chunking’, but the basic question that linguists and lexicographers aim at answering is how idioms should be recorded in monolingual and bilingual dictionaries. The first problem raised at the time of answering this question is that meaning is context-dependent, which means that the meaning of an idiom is highly dependent on strict co-occurrences, for which corpus analyses can provide highly relevant information and measure the statistical significance of existing collocations.
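
As an illustration of how corpus analyses can quantify such co-occurrences, the following minimal sketch computes pointwise mutual information (PMI) for adjacent word pairs. PMI is only one of several association measures used in corpus linguistics and is chosen here as an assumption for illustration; it is not taken from the studies cited in this section. The function name `pmi_scores`, the `min_count` threshold, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Pointwise mutual information for adjacent word pairs (bigrams).

    Illustrative sketch only: real studies use large corpora and often
    prefer other measures (e.g., log-likelihood or t-score).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus, invented for illustration only.
corpus = (
    "he will kick the bucket soon . she did not kick the bucket . "
    "the old bucket sat near the door . he saw the door"
).split()
scores = pmi_scores(corpus)
```

A positive score indicates that the two words co-occur more often than their individual frequencies would predict. A high score by itself does not show that a pair is an idiom, since fully compositional collocations can score just as highly.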

Additionally, given a significantly frequent co-occurring group of words, a second problem has to do with its form and, more specifically, with the identification of the word-lemma under whose entry the idiom will be listed in a dictionary. Is it the first lexical item or rather the last one? Or perhaps the identification of this word-lemma does not depend on word order but rather on structural criteria; that is, in the case of a VP idiom, the head is the V, and in the case of an NP idiom, the head is the N. To illustrate this claim, consider the verb blow in English. This verb is associated in the online Cambridge Dictionary with the idiomatic expression blow a fuse ‘become very angry’, while the noun fuse is associated with the idiomatic expression have a short fuse ‘get angry very easily’. Notice that in the latter expression, even though it has the form of a VP, the verbal head is a light verb with little semantic content and the predicate relies on the complement object noun. This is probably why the idiom have a short fuse is a subentry of the N, the head of the predicate.17

The contrast just exemplified illustrates divergent decisions made in lexicography regarding the codification of idioms and how they should be registered in monolingual dictionaries. Additional differences may appear regarding the sort of information (definition, category, example, semantic relations with other idioms, etc.) that each entry provides.18 Interestingly, however, a quick look at the Collins Cobuild English Language Dictionary, a monolingual dictionary for learners of English, reveals a different solution, given that both blow a fuse and have a short fuse are subentries under the noun fuse. The conclusion is that there is no unique, and perhaps no optimal, way of registering and accessing idioms in lexicography. Different solutions seem to be the response to diverse lexicographic needs and analyses of what the syntactic or semantic head of an idiom is.

Lexicographic research has addressed the question of accessibility of phraseological information in dictionaries. Some studies of dictionary use and dictionary making have investigated whether their potential users would be able to find different sorts of idioms and collocations depending on the specific goal each type of dictionary is designed for (Lew, 2011, 2015).

Overall, current research shows that the study of idioms and collocations (Fellbaum, 2011) has been very productive in corpus-based linguistic and lexicographic studies interested in describing what constitutes a multi-word expression in different natural languages (Stubbs, 2001; Fellbaum, 2016). Other areas in which lexicographic research is crucial are computational linguistics (see section 4.2), machine translation (Volk, 1998), contrastive idiom analyses (Dobrovol'skij, 2000), corpus-based analyses (Moon, 1998; Fellbaum, 2007; Stubbs, 2007), dictionary-making (Mel’cuk, 1995, 2012; Moon, 2015; i.a.), and foreign-language learning (Hausmann, 2004).

4.2 Computational Studies

Researchers working on aspects of computational models of idiomaticity or “computational phraseology” (Heid, 2007, 2008) typically speak of “multi-word expressions” (MWEs) in their studies of Natural Language Processing (NLP). Sag, Baldwin, Bond, Copestake, and Flickinger (2002), in a work that has become a classic, call MWEs a “pain in the neck for NLP,” and they describe some of the phenomena that contribute to this designation. Idioms are of course classified under a group of MWEs. Indeed, one of the main issues that computational approaches to idioms have to deal with is how to automatically tell them apart from compositional constructions (e.g., see Fazly & Stevenson, 2006).
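
One cue that could in principle help to separate idioms from compositional combinations is lexical fixedness, that is, resistance to substituting a near-synonym (cf. kick {the bucket/#the pail}). The following minimal sketch illustrates the idea under stated assumptions; it is not a reimplementation of any system cited here. The function `lexical_fixedness`, the list of substitute nouns, and all counts are hypothetical and invented for illustration.

```python
def lexical_fixedness(pair_counts, verb, noun, noun_variants):
    """Share of occurrences that use the canonical noun rather than a substitute.

    Values near 1.0 mean the verb almost never combines with the
    near-synonyms; values well below 1.0 mean substitution is common.
    """
    target = pair_counts.get((verb, noun), 0)
    variants = sum(pair_counts.get((verb, v), 0) for v in noun_variants)
    total = target + variants
    return target / total if total else 0.0


# Invented verb-noun counts; a real study would extract them from a parsed corpus.
counts = {
    ("kick", "bucket"): 40,
    ("kick", "pail"): 1,
    ("kick", "ball"): 50,
    ("kick", "football"): 30,
}
idiom_like = lexical_fixedness(counts, "kick", "bucket", ["pail"])
literal_like = lexical_fixedness(counts, "kick", "ball", ["football"])
```

With these toy counts, kick the bucket comes out as more fixed than kick the ball. Fixedness is at best one signal among several, because some idioms tolerate lexical variation (e.g., hit {the ceiling/the roof}) and some literal combinations are highly conventional.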

Since the turn of the 21st century, there has been an increasing interest in the computational analyses of MWEs. A good example is the LinGO project (Copestake et al., 2004), which has developed a format for representing MWEs (Copestake et al., 2002). Similarly, large NLP dictionaries, such as FrameNet (Ruppenhofer, Baker, & Fillmore, 2002), include descriptions of, and representation formats for, MWEs. For the LinGO lexicon, see Villavicencio, Copestake, Waldron, and Lambeau (2004).

Knowing whether an expression receives a literal meaning or an idiomatic meaning is important for NLP applications that require some sort of semantic interpretation. It is clear that some applications that would benefit from knowing this distinction are machine translation, (multilingual) information retrieval, and the like. For example, drawing on previous works on these applied fields, Villada-Moirón and Tiedemann (2006) explore to what extent word-alignment in parallel corpora can be used to distinguish idiomatic multi-word expressions from more transparent multi-word expressions and fully productive expressions.
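
To make the word-alignment idea more concrete, the following minimal sketch measures how varied the aligned translations of a word are across its occurrences, using Shannon entropy. This is offered as one possible way of quantifying translational diversity and is an assumption for illustration, not a description of the cited authors' implementation. The function `translation_entropy` and the aligned Spanish tokens (including the `NULL` label for unaligned occurrences) are hypothetical.

```python
import math
from collections import Counter


def translation_entropy(aligned_targets):
    """Shannon entropy (in bits) of the target words aligned to one source word.

    0.0 means the word is always aligned to the same translation;
    higher values mean the alignments are spread over many targets.
    """
    counts = Counter(aligned_targets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Invented alignments for English "bucket" in a hypothetical English-Spanish corpus.
literal_uses = ["cubo", "cubo", "cubo", "cubo"]   # literal contexts
idiom_uses = ["pata", "NULL", "morir", "palmar"]  # inside "kick the bucket"

literal_entropy = translation_entropy(literal_uses)
idiom_entropy = translation_entropy(idiom_uses)
```

In this toy setting the component word has a stable translation in literal contexts and scattered alignments inside the idiom. Whether such a contrast holds reliably in real parallel corpora is an empirical question.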

4.3 Psycholinguistic and Neurolinguistic Studies

Since the 1970s a lot of experimental research on the comprehension of idioms from psycholinguistic and neurolinguistic perspectives has been carried out. In this section some of the most important experimental approaches to idioms that deal with processing (4.3.1), language acquisition (4.3.2), and disordered language (4.3.3) are reviewed.

4.3.1 Processing Studies

Three traditional processing models of idioms have been proposed in the literature: literal-first (serial) processing (Bobrow & Bell, 1973), literal and idiomatic (parallel) processing (Swinney & Cutler, 1979), and direct access of idiomatic meaning without any analysis of the literal meaning (Gibbs, 1980, 1986).

According to the first model, idioms are “big” lexical items stored in memory and, when encountering them, the first interpretation given by the processor is a literal one. After rejecting this literal reading as inappropriate, the idiomatic one is retrieved from the lexicon. The second model also claims that idioms are non-compositional (in this approach idioms are also interpreted as big/long words, with literal computation playing no direct role in the processing of idiomatic expressions) but differs in arguing for parallel processing: that is, the access to the idiom representation and the computation of the literal meaning proceed in parallel. Importantly, idioms can be accessed directly in the mental lexicon without need for additional computational analysis. Swinney and Cutler (1979) termed this model the Lexical Representation Hypothesis. Finally, the third model argues that the idiomatic meaning is accessed directly without computing the literal one unless there is sufficient contextual reason to do so. This model is known as the Direct Access Hypothesis.

In contrast to these three previous classical accounts, where literal processing and idiomatic processing are regarded as independent, there are some subsequent models that argue otherwise: for example, compare the Configuration Hypothesis (Cacciari & Tabossi, 1988) and the Hybrid Representation Hypothesis (Cutting & Bock, 1997; Sprenger, Levelt, & Kempen, 2006), both of which propose that literal computation has priority over access to idiomatic meaning.

According to the first hypothesis, the idiom’s potential literal meaning is activated until the comprehender is faced with sufficient cues (cf. the “idiomatic key”) to recognize the expression as being idiomatic (see also Titone & Connine, 1999, who argue that idiomatic meanings are both directly retrieved and literally analyzed during comprehension). Incidentally, the relevance of literal processing put forward by the Configuration Hypothesis can be said to fit well with those findings that indicate that many idioms have internal structure (Gibbs & Nayak, 1989; Nunberg et al., 1994).

Similarly, the Hybrid Representation Hypothesis also argues for a distributed representation and a primacy of literal processing: in this model, idiomatic expressions are represented as phrasal frames in a lexical-conceptual level of the lexicon. Like words, idioms are associated directly with conceptual content. Like structures, access is claimed to be mediated via the literal structural components of the expression. See also Sprenger et al. (2006) for an extension or revised version of the hybrid model, where idiomatic representations are instantiated as superlemmas, which serve as a representation of the syntactic properties of the idiom. A superlemma is stored and processed as a whole, but because it contains syntactic and semantic information of various kinds, it can link to other parts of the lexicon and grammar.

It is also worth noting that the findings of the Hybrid Representation Hypothesis have also been said to be compatible with the so-called dual process model of language competence (Van Lancker Sidtis, 2011), which separates formulaic and novel language as two disparate modes of language competence. But see Gibbs and Colston (2007, p. 834), among others, for a critique of such a strict separation. For years Gibbs and his colleagues have claimed that there is no relevant distinction to be drawn between the interpretation of formulaic language and the interpretation of novel expressions: importantly, conceptual metaphors have been claimed to be relevant not only for the interpretation of formulaic speech (e.g., idioms) but also for the interpretation of non-idiomatic expressions. See Gibbs (1995, 2007), and Gibbs and Colston (2007) for a summary of experimental research on the role of conceptual metaphors in idiom comprehension.

Finally, it is worth noting that Peterson, Burgess, Dell, and Eberhard (2001) is an important reference for evidence for dissociation between syntactic and semantic processing during idiom comprehension.

4.3.2 Acquisition Studies

In her study of how children learn idiomatic expressions Levorato (1993) argues that learning to use idioms is not just learning to associate the expressions with their corresponding meanings (i.e., idioms are not simply learned as “big words”; cf. supra), but rather it is a process that requires the development of “figurative competence” (cf. also Gibbs, 1994; Gibbs & Colston, 2007). By using different experimental evidence the author shows that this competence is achieved in different gradual stages or levels during which the ability to understand and produce idioms grows in parallel with the child’s increasing mastery of linguistic and pragmatic abilities.

In contrast, Van Lancker Sidtis (2011, p. 262) claims, in the context of her dual process model of language competence, that “formulaic expressions, under specialized circumstances not yet understood, ascend quickly and suddenly into a native speaker’s language competence.” This author puts forward some evidence for a specialized form of knowledge acquisition that may be operative for formulaic expressions.

4.3.3 Neurolinguistic Studies

As pointed out by Van Lancker Sidtis (2011), it is important to examine the naturalistic speech of subjects with left or right hemisphere damage due to stroke to determine the effect of localized damage on the use of formulaic expressions. For example, Van Lancker Sidtis and Postman (2006) argue that novel and formulaic language are affected differently by different types of brain damage: left hemisphere damage leads to selective impairment of novel language and relative preservation of formulaic language, while right hemisphere and/or subcortical damage lead to selective impairment of formulaic language, sparing novel language.

Another group of studies has suggested that right brain damaged patients’ difficulties in idiom comprehension, as demonstrated by a sentence-to-picture matching task, may be due to deficits in their visuospatial abilities (see Papagno, Curti, Rizzo, Crippa, & Colombo, 2006). Moreover, Papagno, Oliveri, and Lauro (2002) conclude that the neural correlates involved in opaque idiom comprehension apparently do not differ from those involved in literal sentence comprehension. Opaque idiom interpretation is shown to require left temporal activity. All in all, their data do not support the right hemisphere hypothesis of idiomatic language processing (contra Kempler, van Lancker, Marchman, & Bates, 1999) and suggest that it is not advisable to treat idioms having different characteristics as a single class. Importantly, as stressed by Papagno et al. (2002), different kinds of idioms follow different interpretation strategies and, consequently, have different anatomical correlates. Finally, see Lauro, Tettamanti, Cappa, and Papagno (2008) for some findings that point to the fact that prefrontal cortex is crucial for idiom comprehension.

Future research in this field should continue to investigate which brain regions are associated with idiom processing, in order to clarify the ongoing debate on hemispheric specialization in figurative language comprehension (e.g., see Kasparian, 2013, for a recent review that critically synthesizes the literature on hemispheric differences in idiom and metaphor comprehension).

In this section some of the most relevant applied and experimental studies of idioms have been reviewed. Special attention has been paid to basic questions that are often addressed in lexicographic studies (how idioms should be recorded in monolingual and bilingual dictionaries), in computational studies (how idioms, as a class of “multi-word expressions”, can be automatically identified), in processing studies (how idioms are accessed in the mental lexicon, and to what extent literal processing and idiomatic processing are related), in acquisition studies (how children learn idiomatic expressions), and in neurolinguistic studies (which brain regions are associated with idiom processing).


We would like to acknowledge the support of the following grants: Spanish MINECO (FFI2017-82547-P and FFI2017-87140-C4-1-P) and Generalitat de Catalunya (2017SGR634). The first author also acknowledges an ICREA Academia award. We thank the reviewers and editors of the Oxford Research Encyclopedia of Linguistics for their comments and suggestions.

Critical Analysis of Scholarship

  • Cacciari, C., & Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and Language, 27, 668–683.

In this classical experimental work on idiom comprehension, the authors put forward the so-called Configuration Hypothesis, whereby subjects are shown to activate the literal meanings of the words in a phrase and to recognize the idiomatic meaning of a polysemous expression only when they have processed enough input to recognize an idiom-specific configuration of lexemes or have encountered an idiomatic “key” lexeme. Their hypothesis can be compared with two earlier alternative proposals: the so-called Lexical Representation Hypothesis (Swinney & Cutler, 1979), whereby the literal and idiomatic interpretations of a given ambiguous expression are processed in parallel, and the so-called Direct Access Hypothesis (Gibbs, 1980, 1986), whereby the idiomatic meaning is accessed without any analysis of the literal meaning.

  • Fillmore, C. J., Kay, P., & O’Connor, M. C. (1988). Regularity and idiomaticity in grammatical constructions: The case of let alone. Language, 64, 501–538.

Through a case study of the construction let alone, the authors argue for a model of grammar in which idiomaticity is not treated as a problematic phenomenon for linguistic theory but, on the contrary, becomes the basis for a new model of grammatical organization. This article is considered one of the foundational works of the grammatical model known as Construction Grammar, where constructions are not to be seen as mere epiphenomena (as in Chomsky’s works) but rather as true theoretical entities that may specify not only syntactic but also lexical, semantic, and pragmatic information.

  • Fraser, B. (1970). Idioms within a transformational grammar. Foundations of Language, 6(1), 22–42.

This is the first relevant study on the syntax of idioms developed within the early generative transformational grammar framework (Chomsky, 1965). It focuses on phrasal idioms and postulates that an idiom is a constituent or a series of constituents for which the semantic interpretation is not a compositional function of the parts it is composed of. The paper is mainly devoted to the study of the syntactic restrictions that idioms show and introduces a Frozenness Hierarchy of syntactic processes according to which they can be characterized. Any idiom marked as belonging to one level in this hierarchy should be automatically marked as belonging to any lower level.

  • Gibbs, R. W. (2007). Idioms and formulaic language. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook of cognitive linguistics (pp. 697–725). Oxford, U.K.: Oxford University Press.

The author provides a comprehensive review of the cognitive linguistic work on idiomaticity and the related research from cognitive psycholinguistics. He argues that idioms are not “dead metaphors” and that their figurative meanings are instead motivated by people’s preexisting metaphorical understanding of many basic concepts. It is shown that many elements of idioms and, more generally, of formulaic language are not peripheral aspects of language but are closely tied to more productive grammatical patterns and schemes of human thought. The author’s main goal is to show that the study of idioms turns out to be an ideal place to understand the rich, flexible nature of natural language and human thought.

  • Jackendoff, R. (1997). The architecture of the language faculty. Cambridge, MA: MIT Press.

Of particular interest for the present purposes is chapter 7, which is devoted to showing the central role that idioms and other fixed expressions have in Jackendoff’s architecture of the language faculty. He analyzes idioms as “phrasal lexical items” that involve abstract triplets of phonological, syntactic, and semantic structures. According to him, the lexicon, rather than occurring as a separate representational module, is an important interface component linking together and licensing structures across the submodules of the linguistic system.

  • Mel'čuk, I. (1995). Phrasemes in language and phraseology in linguistics. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 167–232). Hillsdale, NJ: Lawrence Erlbaum Associates.

This paper introduces a theory of phraseology and a classification of phrasemes in accordance with the framework of Meaning-Text Theory (put forward by the author) and the Explanatory Combinatorial Dictionary. The paper proposes that: (i) idioms are considered fixed (frozen) expressions of all possible kinds, called phrasemes; (ii) idioms are considered exclusively from the viewpoint of production rather than understanding; (iii) idioms are considered strictly statically; and (iv) idioms are considered with respect to their lexicographic treatment. This study has had a great influence on lexicography and, in particular, has motivated a large body of empirical research on collocations.

  • Nunberg, G., Sag, I., & Wasow, T. (1994). Idioms. Language, 70(3), 491–538.

A classical reference for the study of idiomaticity. The authors distinguish two basic types of idioms: “idiomatically combining expressions” (e.g., spill the beans, pull strings), whose meanings, while conventional, are distributed among their individual parts, versus “idiomatic phrases” (e.g., kick the bucket, saw logs), which do not distribute their meanings among their components. The authors claim that most syntactic arguments put forward by generative treatments of idioms, where conventionality is typically confused with non-compositionality, are flawed. In contrast, the basic properties that explain the relevant asymmetries are shown to be semantic and metaphorical.

  • O’Grady, W. (1998). The syntax of idioms. Natural Language & Linguistic Theory, 16(2), 279–312.

The author argues that the syntactic formation of idioms is subject to the so-called Continuity Constraint, a grammatical principle that defines their general architecture in terms of a chain of head-to-head relations. The paper contains relevant discussion of the syntactic constraints on what an (im)possible idiom is.

Links to Digital Materials


  • Abeillé, A. (1995). The flexibility of French idioms: A representation with lexicalized Tree Adjoining Grammar. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 15–42). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Abeillé, A., Borsley, R. D., & Espinal, M. T. (2006). The syntax of comparative correlatives in French and Spanish. In S. Müller (Ed.), Proceedings of the 13th International Conference on Head-Driven Phrase Structure Grammar (pp. 6–26). Stanford, CA: CSLI Publications.
  • Bobrow, S. A., & Bell, S. M. (1973). On catching on to idiomatic expressions. Memory & Cognition, 1(3), 343–346.
  • Booij, G. (2002). Constructional idioms, morphology, and the Dutch lexicon. Journal of Germanic Linguistics, 14(4), 301–329.
  • Borsley, R. D. (2004). An approach to English comparative correlatives. In S. Müller (Ed.), Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar (pp. 70–92). Stanford, CA: CSLI Publications.
  • Bruening, B. (2010). Ditransitive asymmetries and a theory of idiom formation. Linguistic Inquiry, 41, 519–562.
  • Bruening, B. (2017). Syntactic constraints on idioms (do not include locality). In C. Halpert, H. Kotek, & C. van Urk (Eds.), A pesky set: Papers for David Pesetsky (pp. 183–192). Cambridge, MA: MIT Working Papers in Linguistics.
  • Burger, H., Buhofer, A., & Sialm, A. (1982). Handbuch der Phraseologie. Berlin, Germany: De Gruyter.
  • Burger, H., Dobrovol’skij, D., Kühn, P., & Norrick, N. R. (Eds.). (2007). Phraseologie: Ein internationales Handbuch der zeitgenössischen Forschung (2 vols.). Handbücher zur Sprach- und Kommunikationswissenschaft. Berlin, Germany: De Gruyter.
  • Cacciari, C., & Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and Language, 27(6), 668–683.
  • Cacciari, C., & Tabossi, P. (Eds.). (1993). Idioms: Processing, structure and interpretation. Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Chafe, W. (1968). Idiomaticity as an anomaly in the Chomskyan paradigm. Foundations of Language, 4, 109–127.
  • Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: The MIT Press.
  • Chomsky, N. (1980). Rules and representations. New York, NY: Columbia University Press.
  • Chomsky, N. (1981). Lectures on government and binding: The Pisa lectures. Dordrecht, The Netherlands: Foris.
  • Copestake, A., Lambeau, F., Villavicencio, A., Bond, F., Baldwin, T., Sag, I. A., & Flickinger, D. (2002). Multiword expressions: Linguistic precision and reusability. In Proceedings of the Linguistic Resources and Evaluation Conference 2002. Las Palmas de Gran Canaria, Spain.
  • Copestake, A., Lambeau, F., Waldron, B., Bond, F., Flickinger, D., & Oepen, S. (2004). A lexicon module for a grammar development environment. In Proceedings of the Linguistic Resources and Evaluation Conference 2004 (pp. 1111–1114). Lisboa, Portugal.
  • Croft, W., & Cruse, A. (2004). Cognitive linguistics. Cambridge, U.K.: Cambridge University Press.
  • Cutting, J. C., & Bock, K. (1997). That’s the way the cookie bounces: Syntactic and semantic components of experimentally elicited idiom blends. Memory & Cognition, 25(1), 57–71.
  • Dayal, V. (2011). Hindi pseudo-incorporation. Natural Language & Linguistic Theory, 29(1), 123–167.
  • Dikken, M. den. (2005). Comparative correlatives comparatively. Linguistic Inquiry, 36, 497–532.
  • Dobrovol'skij, D. (2000). Contrastive idiom analysis: Russian and German idioms in theory and in the bilingual dictionary. International Journal of Lexicography, 13(3), 169–186.
  • Erman, B., & Warren, B. (2000). The idiom principle and the open choice principle. Text, 20(1), 29–62.
  • Espinal, M. T. (2001). Property denoting objects in idiomatic constructions. In Y. D’Hulst, J. Rooryck, & J. Schroten (Eds.), Romance languages and linguistic theory 1999 (pp. 117–141). Amsterdam, The Netherlands: John Benjamins.
  • Espinal, M. T. (2004a). Diccionari de sinònims de frases fetes. Barcelona, València, Spain: Servei de Publicacions de la Universitat Autònoma de Barcelona, Publicacions de la Universitat de València, Publicacions de l’Abadia de Montserrat.
  • Espinal, M. T. (2004b). Lexicalization of light verb structures and the semantics of nominals. Catalan Journal of Linguistics, 3, 15–43.
  • Espinal, M. T. (2005). A conceptual dictionary of Catalan idioms. International Journal of Lexicography, 18(4), 509–540.
  • Espinal, M. T. (2009). Clitic incorporation and abstract semantic objects in idiomatic constructions. Linguistics, 47(6), 1221–1271.
  • Espinal, M. T., & Mateu, J. (2010). On classes of idioms and their interpretation. Journal of Pragmatics, 42(5), 1397–1411. doi:10.1016/j.pragma.2009.09.016
  • Evans, V., & Green, M. (2006). Cognitive linguistics: An introduction. Edinburgh, U.K.: Edinburgh University Press.
  • Everaert, M. (2010). The lexical encoding of idioms. In M. Rappaport-Hovav, E. Doron, & I. Sichel (Eds.), Lexical semantics, syntax and event structure (pp. 76–98). Oxford, U.K.: Oxford University Press.
  • Farkas, D. F., & de Swart, H. (2003). The semantics of incorporation. Stanford, CA: CSLI Publications.
  • Fazly, A., & Stevenson, S. (2006). Automatically constructing a lexicon of verb phrase idiomatic combinations. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL’06). Trento, Italy.
  • Fellbaum, C. (1993). The determiner in English idioms. In C. Cacciari & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 271–296). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Fellbaum, C. (2011). Idioms and collocations. In C. Maienborn, K. von Heusinger, & P. Portner (Eds.), Semantics. An international handbook of natural language meaning (pp. 441–455). Berlin, Germany: De Gruyter.
  • Fellbaum, C. (2015). Syntax and grammar of idioms and collocations. In T. Kiss & A. Alexiadou (Eds.), Handbook of syntax (pp. 777–802). Berlin, Germany: De Gruyter.
  • Fellbaum, C. (2016). Treatment of multi-word units. In P. Durkin (Ed.), The Oxford handbook of lexicography (pp. 411–424). Oxford, U.K.: Oxford University Press.
  • Fellbaum, C. (Ed.). (2007). Idioms and collocations: Corpus-based linguistic and lexicographic studies. London, U.K.: Continuum.
  • Fillmore, C. J., Kay, P., & O’Connor, M. C. (1988). Regularity and idiomaticity in grammatical constructions: The case of let alone. Language, 64(3), 501–538.
  • Foolen, A. (1997). The expressive function of language: Towards a cognitive semantic approach. In S. Niemeier & R. Dirven (Eds.), The language of emotions: Conceptualization, expression and theoretical foundation (pp. 15–32). Amsterdam, The Netherlands: John Benjamins.
  • Fraser, B. (1970). Idioms within a transformational grammar. Foundations of Language, 6, 22–42.
  • Gazdar, G., Klein, E., Pullum, G., & Sag, I. (1985). Generalized phrase structure grammar. Cambridge, MA: Harvard University Press.
  • Gibbs, R. W. (1980). Spilling the beans on understanding and memory for idioms in conversation. Memory & Cognition, 8, 149–156.
  • Gibbs, R. W. (1986). Skating on thin ice: Literal meaning and understanding idioms in conversation. Discourse Processes, 9, 17–30.
  • Gibbs, R. W. (1993). Why idioms are not dead metaphors. In C. Cacciari & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 57–77). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Gibbs, R. W. (1994). The poetics of mind: Figurative thought, language, and understanding. Cambridge, U.K.: Cambridge University Press.
  • Gibbs, R. W., Jr. (1995). Idiomaticity and human cognition. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 97–132). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Gibbs, R. W. (2006). Embodiment and cognitive science. New York, NY: Cambridge University Press.
  • Gibbs, R. W. (2007). Idioms and formulaic language. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook of cognitive linguistics (pp. 697–725). Oxford, U.K.: Oxford University Press.
  • Gibbs, R. W., & Colston, H. L. (2007). Psycholinguistic aspects of phraseology: American tradition. In H. Burger, D. Dobrovol’skij, P. Kühn, & N. R. Norrick (Eds.), Phraseologie: Ein internationales Handbuch der zeitgenössischen Forschung (pp. 819–836). Berlin, Germany: Walter de Gruyter.
  • Gibbs, R. W., & Nayak, N. (1989). Psycholinguistic studies on the syntactic behavior of idioms. Cognitive Psychology, 21, 100–138.
  • Gibbs, R. W., & O’Brien, J. (1990). Idioms and mental imagery: The metaphorical motivation for idiomatic meaning. Cognition, 36, 35–68.
  • Glasbey, S. (2003). Let’s paint the town red for a few hours: Composition of aspect in idioms. In A. M. Wellington (Ed.), Proceedings of the ACL Workshop: The Lexicon and Figurative Language (pp. 42–48). Sapporo, Japan: ACL Publications.
  • Glasbey, S. (2007). Aspectual composition in idioms. In L. de Saussure, J. Moeschler, & G. Puskás (Eds.), Recent advances in the syntax and semantics of tense, aspect and modality (pp. 71–87). Berlin, Germany: Mouton de Gruyter.
  • Glucksberg, S. (1993). Idiom meanings and allusional content. In C. Cacciari & P. Tabossi (Eds.), Idioms: Processing, structure, and interpretation (pp. 3–26). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Goldberg, A. E. (1995). Constructions: A construction grammar approach to argument structure. Chicago, IL: University of Chicago Press.
  • Hamblin, J. L., & Gibbs, R. W. (1999). Why you can’t kick the bucket as you slowly die: Verbs in idiom comprehension. Journal of Psycholinguistic Research, 28, 25–39.
  • Hanks, P. (2010). Terminology, phraseology and lexicography. In A. Dykstra & T. Schoonheim (Eds.), Proceedings of Euralex 2010 (pp. 1299–1308). Leeuwarden/Ljouwert, The Netherlands: Fryske Akademy – Afûk.
  • Harley, H., & Stone, M. S. (2013). The “No Agent Idioms” hypothesis. In R. Folli, C. Sevdali, & R. Truswell (Eds.), Syntax and its limits (pp. 251–276). Oxford, U.K.: Oxford University Press.
  • Harwood, W., Hladnik, M., Leufkens, S., Temmerman, T., Corver, N., & van Craenenbroeck, J. (2016). Idioms: Phasehood and projection.
  • Hausmann, F. (2004). Was sind eigentlich Kollokationen? In K. Steyer (Ed.), Wortverbindungen–mehr oder weniger fest (pp. 309–334). Berlin, Germany: De Gruyter.
  • Heid, U. (2007). Computational linguistic aspects of phraseology II. In H. Burger, D. Dobrovol’skij, P. Kühn, & N. R. Norrick (Eds.), Phraseologie: Ein internationales Handbuch der zeitgenössischen Forschung (pp. 1036–1044). Berlin, Germany: Walter de Gruyter.
  • Heid, U. (2008). Computational phraseology: An overview. In S. Granger & F. Meunier (Eds.), Phraseology. An interdisciplinary perspective (pp. 337–360). Amsterdam, The Netherlands: John Benjamins.
  • Hoeksema, J. (2010). De localiteit van idiomen. TABU, 38(1–4), 121–132.
  • Horn, G. M. (2003). Idioms, metaphors and syntactic mobility. Journal of Linguistics, 39, 245–273.
  • Horvath, J., & Siloni, T. (2016). Idioms: The type-sensitive storage model (Manuscript). Tel Aviv University.
  • Horvath, J., & Siloni, T. (2017). Idioms and “constructions”: Implications for the architecture of grammar. Paper presented at the Syntax of Idioms Workshop, Utrecht: Utrecht University.
  • Jackendoff, R. S. (1983). Semantics and cognition. Cambridge, MA: MIT Press.
  • Jackendoff, R. S. (1995). The boundaries of the lexicon. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 133–165). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Jackendoff, R. S. (1997a). The architecture of the language faculty. Cambridge, MA: MIT Press.
  • Jackendoff, R. S. (1997b). Twistin’ the night away. Language, 73(3), 534–559.
  • Jackendoff, R. S. (2002). Foundations of language: Brain, meaning, grammar, evolution. Oxford, U.K.: Oxford University Press.
  • Jackendoff, R. S. (2007). A parallel architecture perspective on language processing. Brain Research, 1146, 2–22.
  • Jackendoff, R. S. (2008). Construction after construction and its theoretical challenges. Language, 84(1), 8–28.
  • Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason. Chicago, IL: University of Chicago Press.
  • Kasparian, K. (2013). Hemispheric differences in figurative language processing: Contributions of neuroimaging methods and challenges in reconciling current empirical findings. Journal of Neurolinguistics, 26, 1–21.
  • Katz, J., & Postal, P. (1963). The semantic interpretation of idioms and sentences containing them. MIT Research Laboratory of Electronics Quarterly Progress Report, 70, 275–282.
  • Kay, P., & Sag, I. (2014). A lexical theory of phrasal idioms (Manuscript). UC Berkeley & Stanford University.
  • Kempler, D., van Lancker, D., Marchman, V., & Bates, E. (1999). Idiom comprehension in children and adults with unilateral brain damage. Developmental Neuropsychology, 15, 327–349.
  • Koopman, H., & Sportiche, D. (1991). The position of subjects. Lingua, 85, 211–258.
  • Krifka, M. (1992). Thematic relations as links between nominal reference and temporal constitution. In I. Sag & A. Szabolcsi (Eds.), Lexical matters (pp. 29–53). CSLI Lecture Notes. Chicago, IL: University of Chicago Press.
  • Lakoff, G. (1987). Women, fire, and dangerous things. Chicago, IL: University of Chicago Press.
  • Lakoff, G. (1990). The invariance hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics, 1, 39–74.
  • Lakoff, G. (1993). The contemporary theory of metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 202–251). Cambridge, U.K.: Cambridge University Press.
  • Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge to Western thought. New York, NY: Basic Books.
  • Larson, R. K. (1988). On the double object construction. Linguistic Inquiry, 19, 335–391.
  • Larson, R. K. (2017). On “dative idioms” in English. Linguistic Inquiry, 48(3), 389–426. doi:10.1162/ling_a_00248.
  • Lauro, L. J. R., Tettamanti, M., Cappa, S. F., & Papagno, C. (2008). Idiom comprehension: A prefrontal task? Cerebral Cortex, 18, 162–170. doi:10.1093/cercor/bhm042
  • Levin, B. (1993). English verb classes and alternations: A preliminary investigation. Chicago, IL: University of Chicago Press.
  • Levin, B., & Rappaport-Hovav, M. (2005). Argument realization. Cambridge, U.K.: Cambridge University Press.
  • Levorato, M. C. (1993). The acquisition of idioms and the development of figurative competence. In C. Cacciari & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 101–128). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Lew, R. (2011). Studies in dictionary use: Recent developments. International Journal of Lexicography, 24(1), 1–4.
  • Lew, R. (2015). Dictionaries and their users. In P. Hanks & G. M. de Schryver (Eds.), International handbook of modern lexis and lexicography (pp. 1–9). Berlin, Germany: Springer.
  • Lichte, T., & Kallmeyer, L. (2016). Same syntax, different semantics: A compositional approach to idiomaticity in multi-word expressions. In C. Piñón (Ed.), Empirical issues in syntax and semantics 11 (pp. 111–140).
  • Marantz, A. (1984). On the nature of grammatical relations. Cambridge, MA: MIT Press.
  • Marantz, A. (1997). No escape from syntax: Don’t try morphological analysis in the privacy of your own lexicon. In A. Dimitriadis, L. Siegel, et al. (Eds.), University of Pennsylvania Working Papers in Linguistics, Proceedings of the Annual Penn Linguistics Colloquium (Vol. 4, pp. 201–225). Philadelphia: Penn Linguistics Club, University of Pennsylvania.
  • Markantonatou, S., & Sailer, M. (Eds.). (2018). Multiword expressions: Insights from a multi-lingual perspective. Berlin, Germany: Language Science Press.
  • Mateu, J. (in press). Lexicalized syntax: Phraseology. In J. A. Argenter & J. Lüdtke (Eds.), Manual of Catalan linguistics. Berlin, Germany: De Gruyter.
  • Mateu, J., & Espinal, M. T. (2007). Argument structure and compositionality in idiomatic constructions. Linguistic Review, 24(1), 33–59.
  • McGinnis, M. (2002). On the systematic aspect of idioms. Linguistic Inquiry, 33(4), 665–672. doi:10.1162/ling.2002.33.4.665
  • Mel'čuk, I. (1995). Phrasemes in language and phraseology in linguistics. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 167–232). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Mel'čuk, I. (2012). Phraseology in the language, in the dictionary, and in the computer. Yearbook of Phraseology, 3, 31–56.
  • Mendívil, J. L. (2009). Palabras con estructura externa. In E. de Miguel (Ed.), Panorama de la lexicología (pp. 83–113). Barcelona, Spain: Ariel.
  • Moon, R. (1998). Fixed expressions and idioms in English: A corpus-based approach. Oxford, U.K.: Clarendon Press.
  • Moon, R. (2015). Idioms: A view from the web. International Journal of Lexicography, 28(3), 318–337. doi:10.1093/ijl/ecv023
  • Mueller, R. A., & Gibbs, R. W. (1987). Processing idioms with multiple meanings. Journal of Psycholinguistic Research, 16, 63–81.
  • Nenonen, M., Niemi, J., & Laine, M. (2002). Representation and processing of idioms: Evidence from aphasia. Journal of Neurolinguistics, 15, 43–58.
  • Nunberg, G., Sag, I., & Wasow, T. (1994). Idioms. Language, 70(3), 491–538.
  • O’Grady, W. (1998). The syntax of idioms. Natural Language & Linguistic Theory, 16(2), 279–312. doi:10.1023/A:1005932710202
  • Osborne, T., Putnam, M., & Groß, T. (2012). Catenae: Introducing a novel unit of syntactic analysis. Syntax, 15(4), 354–396. doi:10.1111/j.1467-9612.2012.00172.x
  • Papagno, C., Curti, R., Rizzo, S., Crippa, F., & Colombo, M. R. (2006). Is the right hemisphere involved in idiom comprehension? A neuropsychological study. Neuropsychology, 20, 598–606. doi:10.1037/0894-4105.20.5.598
  • Papagno, C., Oliveri, M., & Lauro, L. (2002). Neural correlates of idiom comprehension. Cortex, 38, 895–898.
  • Paquot, M. (2015). Lexicography and phraseology. In D. Biber & R. Reppen (Eds.), The Cambridge handbook of corpus linguistics (pp. 460–477). Cambridge, U.K.: Cambridge University Press.
  • Peterson, R. R., Burgess, C., Dell, G. S., & Eberhard, K. M. (2001). Dissociation between syntactic and semantic processing during idiom comprehension. Journal of Experimental Psychology: Learning Memory and Cognition, 27(5), 1223–1237.
  • Pulman, S. (1993). The recognition and interpretation of idioms. In C. Cacciari & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 249–270). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Richter, F., & Sailer, M. (2014). Idiome mit phraseologisierten Teilsätzen: Eine Fallstudie zur Formalisierung von Konstruktionen im Rahmen der HPSG. In A. Lasch & A. Ziem (Eds.), Grammatik als Netzwerk von Konstruktionen (pp. 291–312). Berlin, Germany: De Gruyter.
  • Ruppenhofer, J., Baker, C. F., & Fillmore, C. J. (2002). Collocational information in the FrameNet database. In A. Braasch & C. Povlsen (Eds.), Proceedings of the Euralex International Congress 2002 (pp. 359–369). Copenhagen, Denmark: Center for Sprogteknologi.
  • Sag, I. A., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002). Multiword expressions: A pain in the neck for NLP. In A. Gelbukh (Ed.), Computational linguistics and intelligent text processing. CICLing 2002. Lecture Notes in Computer Science 2276. Berlin, Germany: Springer. doi:10.1007/3-540-45715-1_1
  • Sailer, M. (2013). Idioms and phraseology. Oxford Bibliographies.
  • Schenk, A. (1995). The syntactic behavior of idioms. In M. Everaert, E. J. van der Linden, A. Schenk, & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 253–271). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Sinclair, J. (1991). Corpus, concordance, collocation. Oxford, U.K.: Oxford University Press.
  • Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition. Oxford, U.K.: Blackwell.
  • Sportiche, D. (2005). Division of labor between merge and move: Strict locality of selection and apparent reconstruction paradoxes (Manuscript). Los Angeles: UCLA.
  • Sprenger, S. A., Levelt, W. J. M., & Kempen, G. (2006). Lexical access during the production of idiomatic phrases. Journal of Memory and Language, 54(2), 161–184. doi:10.1016/j.jml.2005.11.001
  • Stubbs, M. (2001). Words and phrases. Hoboken, NJ: Wiley.
  • Stubbs, M. (2007). On texts, corpora and models of language. In M. Hoey, M. Mahlberg, M. Stubbs, & W. Teubert (Eds.), Text, discourse and corpora (pp. 127–161). London, U.K.: Continuum.
  • Svenonius, P. (2005). Extending the extension condition to discontinuous idioms. In P. Pica, J. Rooryck, & J. van Craenenbroeck (Eds.), Linguistic variation yearbook 2005 (pp. 227–263). Amsterdam, The Netherlands: John Benjamins.
  • Swinney, D. A., & Cutler, A. (1979). The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior, 18(5), 523–534. doi:10.1016/S0022-5371(79)90284-6
  • Titone, D. A., & Connine, C. M. (1999). On the compositional and noncompositional nature of idiomatic expressions. Journal of Pragmatics, 31(12), 1655–1674. doi:10.1016/S0378-2166(99)00008-9
  • Torner, S., & Bernal, E. (Eds.). (2017). Collocations and other lexical combinations in Spanish. New York, NY: Routledge.
  • Van Lancker Sidtis, D. (2011). Formulaic expressions in mind and brain: Empirical studies and a dual-processing model of language competence. In J. Guendouzi, F. Loncke, & M. J. Williams (Eds.), The handbook of psycholinguistic and cognitive processes: Perspectives in communication disorders (pp. 247–272). New York, NY: Psychology Press.
  • Van Lancker Sidtis, D., & Postman, W. A. (2006). Formulaic expressions in spontaneous speech of left- and right-hemisphere-damaged subjects. Aphasiology, 20, 411–426. doi:10.1080/02687030500538148
  • Vega-Moreno, R. (2003). Representing and processing idioms.
  • Villada Moirón, B., & Tiedemann, J. (2006). Identifying idiomatic expressions using automatic word-alignment. In Proceedings of the Workshop on Multiwords in a Multilingual Context, 11th Conference of the European Chapter of the Association for Computational Linguistics (pp. 33–40). Trento, Italy: Association for Computational Linguistics.
  • Villavicencio, A., Copestake, A., Waldron, B., & Lambeau, F. (2004). Lexical encoding of MWEs. In T. Tanaka, A. Villavicencio, F. Bond, & A. Korhonen (Eds.), Second ACL Workshop on Multiword Expressions: Integrating Processing (pp. 80–87). Morristown, NJ: Association for Computational Linguistics.
  • Volk, M. (1998). The automatic translation of idioms. Machine translation vs. translation memory systems. In N. Weber (Ed.), Machine translation: Theory, applications, and evaluation. An assessment of the state of the art (pp. 167–192). St. Augustin: Gardez Verlag.
  • Weinreich, U. (1969). Problems in the analysis of idioms. In J. Puhvel (Ed.), Substance and structure of language (pp. 23–81). Berkeley: University of California Press.


  • 1. See also Mendívil (2009) and Mateu (in press) for the distinction between proper idioms and idiomatic collocations.

  • 2. See Chafe (1968) for an early characterization of the idiom problem for generative grammar.

  • 3. In order to account for the recalcitrance of idioms to undergo particular syntactic operations, two alternative classical approaches are found. One such approach consists in marking each idiom for the operations that may or may not follow (Weinreich, 1969). A different approach consists in postulating a frozenness hierarchy and marking each idiom as belonging to one of the levels of this hierarchy (Fraser, 1970).

    It should also be taken into account that an operation such as topicalization is applicable not only to meaningful expressions but also to expressions that may have an independent status at the level of information structure representation. The higher the frozenness of an idiom, the lower the possibility of topicalizing one of its parts, as illustrated in the following examples: *Some logs, he sawed during his sleep; *The bucket seems to have been kicked versus Those strings, he wouldn’t pull for you; Advantage seems to have been taken of Pat.

  • 4. From a terminological point of view, in the Romance literature this distinction is often reflected by a distinction made between idiomatic phrases (frases hechas in Spanish) and locutions (locuciones adverbiales, preposicionales, conjuntivas in Spanish).

  • 5. See Espinal (2004a, 2005) for a defense of this multicategorial approach to the study of idioms with reference to Catalan.

  • 6. But see Hoeksema (2010), Bruening (2010, 2017), and Richter and Sailer (2014) for the claim that there are idioms that include entire clauses.

  • 7. Such a vP-internal-subject hypothesis (Larson, 1988; Koopman & Sportiche, 1991) implies that idioms that include the subject exclude Tense, and they also exclude the agent. See section 3 on the meaning of idioms.

  • 8. See also Osborne, Putman, and Groß (2012) for the idea that not all idioms are stored as constituents. These authors hypothesize that all idioms are stored as catenae, where a catena is defined in a dependency-based grammar as a word or a combination of words that is continuous with respect to dominance.

    See Kay and Sag (2014) for a different catena/selection approach in the phraseological literature.

  • 9. For an alternative approach to idioms based on the notion of L(exical)-Selection, see Everaert (2010).

  • 10. For the classical notion of construction, see Goldberg (1995, p. 4): C is a construction if C is a form-meaning pair <Fi, Si> such that some aspect of Fi or some aspect of Si is not strictly predictable from C’s component parts or from other previously established constructions.

    A constructional idiom is a specific mapping of syntactic and conceptual structure that is not determined by the head item (Jackendoff, 1997a).

  • 11. For an alternative analysis, employing the framework of Lexicalized Tree Adjoining Grammar, where idiomaticity is not subject to syntactic ambiguity but emerges from a special unification with a semantic interpretation, see Abeillé (1995) and Lichte and Kallmeyer (2016).

  • 12. See Abeillé (1995) and Schenk (1995), who argue that the generalization from Nunberg et al. (1994) might be dependent on properties of these operations in English. Nunberg et al. themselves provide German counterexamples to the correlation of IP-hood and syntactic fixedness.

  • 13. Similarly, as noted by Nunberg et al. (1994, p. 496), “some idioms are transparent without being idiomatic combinations. It is pretty obvious why the expression saw logs is used to mean ‘to sleep’, given the resemblance between the sounds produced by the two activities. There is, however, no decomposition of the activity of sleeping into elements that correspond to the meanings of the parts of the expression, so saw logs does not qualify as an idiomatically combining expression.”

  • 14. As noted by Gibbs (2007), even the typical so-called “frozen” idiom kick the bucket seems somewhat analyzable in that it refers to sudden, and not prolonged, death, primarily due to the influence of kick. Furthermore, kick the bucket can be used in various other forms such as kick it, kick off, or kick it off (cf. also the bucket list). For more discussion of the apparent frozenness of the idiom kick the bucket, see also Glucksberg (1993), Marantz (1997), Hamblin and Gibbs (1999), McGinnis (2002), Gibbs (2007), and Espinal and Mateu (2010).

  • 15. See also Horn (2003) for a review of Jackendoff’s account of idioms (1997a).

  • 16. For a review of these two principles see Erman and Warren (2000), Hanks (2010), and Paquot (2015).

  • 17. A similar decision is made in the online Oxford Dictionary.

  • 18. A detailed analysis of Catalan lexicography regarding the idiom caure-li la cara de vergonya (a algú) lit. fall.cldat the face of shame to somebody ‘be ashamed’ is given in Espinal (2005).