date: 27 June 2022

Morphological and Syntactical Variation and Change in Catalan

  • Gemma Rigau, Department of Catalan Philology, Universitat Autònoma de Barcelona
  • Manuel Pérez Saldanya, Department of Catalan Philology, University of Valencia


Catalan is a Romance language closely related to Gallo-Romance languages. However, contact with Spanish since the 15th century has led it to adopt various linguistic features that are closer to those seen in Ibero-Romance languages. Catalan exhibits five broad dialects: Central, Northern, and Balearic, which pertain to the Eastern dialect block, and Northwestern and Valencian, which make up the Western.

This article deals with the most salient morphosyntactic properties of Catalan and covers diachronic and diatopic variations. It also offers information about diastratic or sociolinguistic variations, namely standard and non-standard variations. Among the most characteristic morphosyntactic features are the following:

1. Catalan is the only Romance language that exhibits a periphrastic past tense expressed by means of the verb anar ‘go’ + infinitive (Ahir vas cantar ‘Yesterday you sang’). This periphrastic past coexists with a simple past (Ahir cantares ‘Yesterday you sang’). However, Catalan does not have a periphrastic future built with the movement verb ‘go’.

2. Demonstratives show a two-term system in most Catalan dialects: aquí ‘here’ (proximal) and allà or allí ‘there’ (distal); but in Valencian and some Northwestern dialects, there is a three-term system. In contrast with other languages that have a two-term system, Catalan uses the proximal demonstrative to express proximity either to the speaker or to the addressee (Aquí on jo soc ‘Here where I am’, Aquí on tu ets ‘There where you are’).

3. Catalan has a complex system of clitic pronouns (or weak object pronouns) which may vary in form according to whether they attach to the verb proclitically or enclitically; e.g., the singular masculine accusative clitic can have two syllabic forms (el and lo) and an asyllabic one (l’ or ’l): El saludo ‘I am greeting him’, Puc saludar-lo ‘I can greet him’, L’havies saludat ‘You had greeted him’, Saluda’l ‘Greet him’.

4. Existential constructions may contain the predicate haver-hi ‘there be’, consisting of the locative clitic hi and the verb haver ‘have’ (Hi ha tres estudiants ‘There are three students’), the copulative verb ser ‘be’ (Tres estudiants ja són aquí ‘Three students are already here’), or other verbs whose behavior can be close to that of an unaccusative verb when preceded by the clitic hi (Aquí hi treballen forners ‘There are some bakers working here’).

5. The negative polarity adverb no ‘not’ may be reinforced by the adverbs pas or cap in some dialects and can co-occur with negative polarity items (ningú ‘anybody/nobody’, res ‘anything/nothing’, mai ‘never’, etc.). Negative polarity items exhibit negative agreement (No hi ha mai ningú ‘Nobody is ever here’), but they may express positive meaning in some non-declarative syntactic contexts (Si mai vens, truca’m ‘If you ever come, call me’).

6. Other distinguishing items are the interrogative and confirmative particles, the pronominal forms of address, and the personal articles.


1. Catalan Dialects and the Codification in Catalan

Catalan has five major dialects, which are usually classified into two groups essentially on the basis of phonological differences (see Figure 1). The Eastern group comprises Central, Northern, and Balearic Catalan, in addition to Alguerese, whereas the Western one comprises Northwestern Catalan and Valencian (Veny, 1983; Veny & Massanell, 2015). Central, Northern, and Northwestern are historical dialects, derived directly from the evolution of the Vulgar Latin spoken in Old Catalonia, a territory that straddled the eastern end of the Pyrenees. Conversely, Valencian, Balearic, and Alguerese are dialects that developed as Catalan spread south and east by conquest during the territorial expansion of the Crown of Aragon in the 13th and 14th centuries.

Most Catalan varieties have a vocalic system of seven vowels in stressed syllables ([i], [e], [ɛ], [a], [ɔ], [o], and [u]). However, in a weak syllable, the vowels are reduced to three ([i], [ə], and [u]) in most of the Eastern Catalan dialects and to five ([i], [e], [a], [o], and [u]) in Western Catalan.

Figure 1. Main Catalan dialects.

Like other Gallo-Romance languages and Gallo-Italic dialects, Catalan historically lost all Latin final unstressed vowels other than a (manum > mà ‘hand’, viridem > verd ‘green’, altum > alt ‘high’, but dominam > dona ‘wife’, altam > alta ‘high’) and this had important consequences for various features of the nominal and verbal inflections. In the nominal inflection, for instance, the dropping of the final unstressed vowels caused adjectives like verd ‘green’, which were initially uninflected for gender, to be formally identified with gender-inflected masculine ones like alt ‘high’. As a result, they frequently developed a feminine form by analogy to that of the gender-inflected adjectives, as in verda, analogous to alta (Casanova, 1983).

As for verbal inflection, the loss of final unstressed vowels had a significant impact on particular points in the conjugation paradigm, such as the first-person singular of the present indicative (Gulsoy, 1993, pp. 421−480; Wheeler, 2007, pp. 139−152) and present subjunctive (Gulsoy, 1993, pp. 377−419; Pérez Saldanya, 1998, pp. 147−167) and this now constitutes a major area of dialectal divergence. In the first-person singular, the old forms historically either lost their ending altogether (cantō > cant ‘I sing’) or took a supporting vowel in contexts where it was needed for reasons of syllabification (parabolō > parle ‘I talk’, operō > obre ‘I act’). In contemporary usage, the supporting vowel -e has been generalized in Valencian for all verbs of the first conjugation (cante as in parle and obre) but has been dropped in Balearic (parl and obr as in cant). In the other dialects, this -e has been replaced by -o in Central and Northwestern Catalan (canto) and by ‑i in Northern Catalan (canti) and then generalized to all conjugations. All of these variants are considered acceptable, but the forms with ‑o are used by the largest number of speakers. Similar changes have taken place in the subjunctive present.
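The dialectal distribution of first-person singular endings just described can be restated as a small lookup. The following Python sketch is only illustrative: the dialect labels and the function name are hypothetical, it covers first-conjugation stems such as cant-, parl-, and obr- only, and it ignores any orthographic adjustments that particular stems may require.

```python
# Present indicative, first-person singular, first conjugation.
# Endings follow the paragraph above; the dictionary keys are
# illustrative labels, not an established coding scheme.
FIRST_SG_ENDING = {
    "valencian": "e",      # supporting vowel -e generalized (cante)
    "balearic": "",        # ending dropped (cant, parl, obr)
    "central": "o",        # -e replaced by -o (canto)
    "northwestern": "o",   # -e replaced by -o (canto)
    "northern": "i",       # -e replaced by -i (canti)
}


def first_sg_present(stem: str, dialect: str) -> str:
    """Attach the dialect's ending to a first-conjugation stem such as 'cant'."""
    return stem + FIRST_SG_ENDING[dialect]


if __name__ == "__main__":
    for dialect in FIRST_SG_ENDING:
        print(dialect, first_sg_present("cant", dialect))
```

Keeping the endings in a single table makes the contrast visible at a glance: only Valencian and Balearic continue one of the two old patterns, whereas the remaining dialects share the innovation of a new vowel.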

In other respects, Valencian shows specific features which often coincide with those of Spanish and are due to the more intense contact that this dialect has experienced historically with Aragonese and Spanish. Prominent examples of this influence are (a) the distinction between three degrees of space deixis (i.e., este ‘this’, eixe ‘that’, and aquell ‘that’), as against the binary distinction in most Catalan dialects (i.e., aquest ‘this’ and aquell ‘that’), as discussed in Section 3.5; (b) the use of the -ra ending for the imperfect past tense of the subjunctive, derived from the Latin pluperfect indicative (e.g., cantāveram > cantara), as against the use of -s forms in the other dialects of Catalan, derived from the Latin pluperfect subjunctive (e.g., cantāvissem > cantassen, and cantessin in Central Catalan); (c) the loss of vitality of the so-called adverbial pronouns hi (in all its functions) and en (in all its functions except the partitive one, as in De cotxes en tinc tres [of cars gen = have.1sg three] ‘I have three cars’); and (d) the distinction between the causal preposition per ‘by’ and the benefactive per a ‘for’ (colloquially pronounced [pa]), in contrast to the use of per for both functions in most Catalan dialects.

Northern Catalan also shows specific features, which in this case coincide with those of Occitan or French. Examples are (a) the differentiation between the third-person singular possessive pronoun seu ‘his/hers/its’ (and its inflected forms) and the third-person plural possessive pronoun llur ‘their’ (and its inflected forms), as in Old Catalan and opposed to the use of seu for both functions in Standard Catalan, and (b) the colloquial use of the particle pas without the adverb no ‘not’ in negative sentences (Ella vindrà pas ‘She won’t come’), as opposed to the use of pas only with an emphatic value in combination with no, as seen in Central Catalan (see Section 2.5).

The codification of present-day Catalan was carried out in the early years of the 20th century by the Institute of Catalan Studies (Institut d’Estudis Catalans [IEC]), particularly through the work of the linguist Pompeu Fabra (see article on “Catalan” in this encyclopedia, forthcoming), who authored a series of key reference works, most notably his Diccionari ortogràfic (Fabra, 1917), Gramàtica catalana (Fabra, 1918), and Diccionari general de la llengua catalana (Fabra, 1932). These were extended much later by the Institute’s own Diccionari de la llengua catalana (IEC, 1995) and Gramàtica de la llengua catalana (IEC, 2016). The basic principle informing Fabra’s codification was the ‘purifying’ of Catalan by ridding it whenever possible of Spanish influence and also any elements that were seen to be excessively dialect-bound while favoring those forms that were valid for all dialects (see Ginebra & Solà, 2008; Lamuela & Murgades, 1984). In many instances, the forms typical of Central Catalan were adopted (e.g., most features of the verbal conjugation) because of the political, demographic, and cultural weight of the territory where this dialect is spoken. However, in other cases, the forms adopted came from other geographical areas, particularly when they more closely reflected those of Old Catalan, such as the orthography of the weak vowels or the distinction between the prepositions per ‘by’ and per a ‘for’.

Within Catalonia proper, the prescriptive proposals put forth by the IEC were quickly accepted. This was largely due to the considerable personal prestige accruing to Pompeu Fabra because of his publications. In the Catalan-speaking territories outside of Catalonia itself, like Valencia and the Balearic Islands, the adoption of a standard took place later, in the 1930s. In these territories, the claim has occasionally been made, most notably in Valencia, that the local variety is not actually Catalan but a different language. However, in part with the aim of resolving the social tensions aroused by the debate over linguistic identity, the Valencian Parliament set up the Valencian Academy of Language (Acadèmia Valenciana de la Llengua [AVL]) in 1998. This body explicitly recognized the unity of the Catalan language in 2005 (AVL, 2005) and has published a series of reference works that accept the bulk of Fabra’s norms but expand them to include some features that are particular to Valencian (AVL, 2006, 2015).

Owing to historical and sociolinguistic reasons, Catalan speakers also speak another language. Depending on the geographical area, they are also speakers of Spanish, French, or Italian; in some cases, their knowledge of these languages can be greater than that of Catalan, especially in urban areas. In addition, within the language community, there are internal administrative divisions, which means, for example, that only in some administrations is Catalan recognized as an official language and language of instruction in the education system. According to the Informe sobre l’estat de la llengua 2015 (Report on the State of the Language 2015) in Pradilla and Sorolla (2016), the Catalan sociolinguistic situation shows quite different regional dynamics. Andorra and Catalonia are the regions where the sociolinguistic indicators, such as an individual’s first language, language of identification, or intergenerational language transmission, are most favorable. Some of the sociolinguistic indicators are less favorable in the Balearic Islands and in the Valencian Country and are adverse in Northern Catalonia and l’Alguer. In la Franja (Aragon Strip), Catalan is present mainly in local colloquial communication. On the sociolinguistic situation and the social and functional variation in Catalan, see Pradilla (2008, 2020).

The sociolinguistic situation to which we have just referred, along with some factors related to the linguistic attitudes of the speakers, can explain the preference for certain forms in some cases of morphosyntactic variation. The most local or non-standard variants rooted in colloquial speech usually prevail among speakers who are from rural areas and have little or no schooling in Catalan, who have received the language through family transmission, and who value its solidarity dimension. On the other hand, the supralocal and standard variants usually prevail among speakers who were educated in Catalan or learned it as a second language and who value the standard language because knowing it makes it easier for them to find work or improve their social position. This is a heterogeneous set of variables that does not necessarily occur at the same time or to the same degree but can be used to mark general trends. On the incidence of sociolinguistic variables in morphosyntactic variation, see Casanova Solanes (2016) and Hawkey (2019) on lexical and morphosyntactic variation in Northern Catalonia communities, Montoya (2000) on language attrition and shrinkage and the interruption of intergenerational language transmission in Alacant, and Argenter (2020) on Catalan in contact with other languages and codeswitching.

2. Romance Phenomena

In this section, we will analyze some of the phenomena that Catalan shares with other Romance languages, albeit with its own specific features.

2.1 Existential Constructions

Catalan has two verbs for existential or presentational constructions: the verb haver-hi, which combines haver ‘to have’ with the incorporated locative clitic hi, and the verb ser (or ésser) ‘to be’ (see article on “Existential and Locative Constructions in the Romance Languages” in this encyclopedia). Both verbs select two arguments, a theme and a stative location, as shown in (1). Sentences with haver-hi are impersonal sentences, so the verb is conjugated in the third person, which is the morphological expression of the lack of person feature in languages like Romance (Benveniste, 1946/1966). By contrast, sentences with ser are personal constructions, and the verb agrees with the nominal element.


The locative clitic hi is obligatory when the verb is haver. This clitic, similarly to other Romance clitic pronouns, can act as a resumptive pronoun of a locative PP or AdvP in a peripheral or topic position (e.g., en el títol in (1a)). A construction such as (2) without the clitic is ungrammatical. In fact, in Catalan, the verb haver without hi is not a main verb but an auxiliary verb, as in Jo he estudiat ‘I have studied’.


The verb haver-hi can appear with a definite NP, even a proper noun, in both formal and colloquial language, as shown in (3a and b).


However, when the verb is ser ‘to be’, the locative argument can be a PP, as in (1b), the locative clitic pronoun hi, as in (4a), or an AdvP, as in (4b).


The subject in presentational sentences with ser must refer to a specific entity of the universe of the discourse (Rigau, 1988). Sentences in (5a and b) are grammatical and they are semantically equivalent if the NP is definite (la veïna ‘the neighbor’), but if the subject of the verb ser is unspecific or non-referential (una veïna ‘a neighbor’), the sentence becomes ungrammatical. However, the sentence is grammatical when the indefinite NP una veïna receives the reading of a partitive nominal construction and refers to one element of a known set, ‘one of the neighbors’, as in (5c).


The complementary distribution of ser and haver-hi in existential (or presentational) sentences can be observed in (6) (Ramos, 2002, pp. 2000−2005; Rigau, 1997).


The ungrammaticality of és (6a) and ha (6c) contrasts, respectively, with grammatical sentences where the NP is right-dislocated (6b) and where the NP receives a contrastive focus reading (6d).

The locative clitic hi in haver-hi sentences is known as a ‘locative subject’. In other words, hi blocks the presence of an NP or a pronoun in nominative case and the sentence becomes an impersonal sentence. Constructions with a nominative pronoun such as Hi ha ell (loc = has he) or Hi has tu (loc = have you) are ungrammatical in Catalan, except in the colloquial variety spoken on the island of Menorca. In general, if the NP is a nominative pronoun, the verb must be ser, as in Hi era jo ‘I was here/there’ and Jo hi era ‘I was here/there’. However, when the theme argument is a bare NP, haver-hi must be used, as in No hi ha vi ‘There is no wine’, *Hi és vi (loc = is wine). These facts show that the case received by the NP in haver-hi sentences is not the nominative. Nevertheless, in some Catalan dialects, such as Central Catalan, the verb haver-hi can agree in number (but not in person) with the NP, as shown in (7).


By contrast, in Northwestern and Balearic Catalan, agreement is not manifested between a plural NP and the verb (Solà, 1994), as in (8). In these dialects, the presence of the clitic hi blocks the agreement of the theme argument in person and number with the verb. This lack of agreement is regarded in the IEC’s prescriptive grammar as the standard usage (IEC, 2016, pp. 850−851).


In Northern Catalan and the dialect spoken in l’Alguer (Sardinia), haver-hi selects an indefinite NP and no agreement is manifested, as in Hi ha hòmens (loc = has men, ‘There are some men’). Although it parallels these dialects in other respects, Valencian haver-hi agrees in number with the theme argument. However, the verb used in colloquial Valencian (in contrast with other Catalan varieties) when the NP is definite is not the verb ser but the quasi-copulative verb estar (El cotxe està ací ‘The car is here’). For a detailed analysis and more dialectal information, see Rigau (1991, 1997).

The existential pattern is also sometimes visible in sentences with unaccusative verbs like arribar ‘to arrive’ or entrar ‘to enter’ and also with unergative verbs like parlar ‘to speak’ when they appear with the locative clitic subject hi, as in (9). Torrego (1989) has shown the significant role of a locative element in the shifting of an unergative verb into an unaccusative verb. Interestingly, in Northwestern Catalan, these sentences do not manifest number agreement (IEC, 2016, pp. 850−851; Rigau, 1997; Solà, 1987, 1994).


Note that the sentences in (9) are synonymous with those in (10), which are sentences with the stative verb haver-hi.


Some Catalan deontic verbs expressing need or obligation—such as caldre ‘to be necessary’, urgir ‘to be urgent’, faltar ‘to be lacking or missing’, or fer falta ‘to be necessary’—are also impersonal verbs. Their structure is parallel to existential sentences with the difference that deontic verbs accept a dative clitic (11a) as well as the locative clitic subject hi (11b). As expected, in Northwestern Catalan, these verbs do not manifest agreement with the nominal element (Rigau, 1999).


The nominal element cannot be either a nominative pronoun or an accusative clitic pronoun: *Hi calc jo (loc = am_necessary I), *Ens la cal (to_us her is_necessary). This contrasts with the use of analogous verbs in some other Romance languages, such as Sardinian kérrere ‘to be necessary’ (Jones, 1993).

To express temporal meaning, Catalan (like other Romance languages) uses the verb fer ‘to make’, which appears without a clitic pronoun. These temporal constructions are impersonal in the sense that the verb does not agree with the measure phrase, as illustrated in (12).


According to Henry (1968), two kinds of temporal constructions can be distinguished in Romance: temporal circumstantial constructions, with a non-complex measure phrase (12a), and temporal presentational constructions, where the main verb of the sentence is the temporal fer and the measure phrase is complex (12b and c). Temporal circumstantial constructions, which appear in a peripheral position, act as frame adverbials (Parsons, 1990, p. 209), which are temporal phrases that set a context within which the rest of the sentence must be interpreted. Temporal presentational constructions are autonomous sentences and show a fixed order. On the properties of temporal existential constructions in Catalan, see IEC (2016, pp. 1199–1202) and Rigau (2001).

2.2 Forms of Address

Catalan has three different second-person pronouns: the familiar form tu and the respectful forms vós and vostè (IEC, 2016, pp. 195−196). The pronouns tu and vostè have the plural forms vosaltres and vostès, respectively, but vós has no parallel plural form. As forms of address, tu, vós, and vostè are second-person singular pronouns, but they show different agreement with the verb, as exemplified in (13).


The pronoun tu is used with relatives, friends, or social peers in general and agrees with the verb in second-person singular (13a). The pronoun vós comes from the Latin second-person plural pronoun vōs and as a result agrees with the verb in the second-person plural (13b). Initially, it was used both as a second-person plural form and as a respectful singular form. However, at the end of the 13th century, vosaltres (from vós ‘you’ + altres ‘others’) became generalized as a plural form, so that vós was confined to its respectful singular meaning.

The pronoun vostè is a late form, parallel to the Spanish usted. It comes from the contraction of the nominal phrase vostra mercè ‘your mercy’ and therefore agrees with the verb in the third-person singular (13c). Traditionally, the courtesy value of vostè was higher than that of vós, which suggested friendly politeness as opposed to the highly formal vostè. Currently, however, vós has lost vitality, especially in urban areas.

2.3 Mood

In terms of mood opposition, the indicative is the unmarked mood with respect to the subjunctive. As a result, syntactically the indicative can appear in either simple sentences or subordinate clauses (14a) whereas the subjunctive appears above all in subordinate clauses (14b).


Moreover, semantically the indicative is generally used in assertive contexts in which particular information is affirmed (14a), whereas the subjunctive is associated with non-assertive contexts (14b) and, more specifically, with values related to the unreal, possibility, doubt, and so on; will, desire, need, and so on; positive and negative emotions; and given or presupposed information (Hualde, 1992, pp. 316−323; IEC, 2016, pp. 933−942; Pérez Saldanya, 1988; Quer, 2002; Wheeler et al., 1999, pp. 373−392).

In the majority of subordinate clauses, only one mood is possible. There are cases, however, in which two moods are possible and this alternation is usually associated with differences in meaning. For example, verbs of saying select an indicative clause if they introduce an assertion (15a) and a subjunctive clause if they introduce a request or order (15b).


Verbs of saying, opinion, and judgment usually select the subjunctive when negated (16a), but they also accept the indicative if negation does not affect the subordinate clause (16b), either because the speaker considers the content to be true or because a prior assertion is being reproduced.


As in other Romance languages, relative clauses may also appear in either mood, depending on whether the NP in which the relative clause is inserted can be interpreted as specific or non-specific. With an indicative like that in (17a), the NP is specific and denotes a particular entity in the discourse world. In this case, for example, the speaker has in mind a particular speaker of Chinese. But with a subjunctive like that in (17b), the NP is not specific and does not denote any particular entity. The context here would be, for example, that the speaker needs to find any person who can interpret Chinese.


Despite what has been described here, there are subordinate clauses in which the use of the mood depends not on the assertive or non-assertive nature of the clause but rather on the subordinator that introduces them. For example, in conditionals introduced by the conjunction si ‘if’, the present indicative is used in open conditionals, meaning conditionals that express possible situations in the present temporal sphere (18a). In remote conditionals, either the imperfect subjunctive (18b) or the imperfect indicative (18c) can be used to express false or possibly false situations in the same temporal sphere. Of these two options, the imperfect subjunctive has been the more commonly used over the last five decades, whereas the imperfect indicative is generally perceived as archaic or literary. Finally, the pluperfect subjunctive is possible only in the case of past counterfactual conditionals (18d).


2.4 Future and Conditional

Unlike other Romance languages, Catalan has no periphrastic future formed with a verb of movement analogous to the ‘be going to’ structure in English. This difference is probably due to the presence in Catalan of a verbal periphrasis that is similarly constructed using anar ‘to go’ and the infinitive but that expresses past, not future (19) (see Section 3.4 below).


The fact that Catalan’s past periphrasis had already been formed by the end of the Middle Ages undoubtedly prevented the emergence of a parallel future periphrasis, which was formed later in the other Romance languages. For this reason, in Catalan the future formed from Late Latin’s ‘infinitive + habeo’ periphrasis (e.g., cantare habeo ‘I must sing’ > cantaré ‘I will sing’) has remained stable and this tense has mood usages that are more restricted than in other Romance languages (IEC, 2016, pp. 920−923; Pérez Saldanya, 1998, pp. 292−293). Associated with a meaning of subsequence, the future can express mood values related to courtesy (20a), obviousness (20b), or opposition and reproach (20c).


In Valencian and to a lesser degree in other dialects, the future may also express the idea that something is inferred to hold at the present moment (21a). This inferential use has been documented in Medieval Catalan (Martines, 2015) but is usually disapproved of in contemporary prescriptive grammars, which recommend instead the periphrastic formulation deure ‘must’ + infinitive (21b).


In Northwestern dialects, the future is also used as a form of imperative (22a), and, as happens with the imperative form (22b), clitic pronouns appear in enclitic position (Veny, 1983, p. 135). This, however, is a colloquial use that is now in retreat.


The conditional can be used with a past future temporal value or with a mood value of unreality in the present temporal sphere. The former meaning appears especially in content clauses in the context of reported speech (23a), and the latter in the apodosis of so-called unreal conditional sentences (23b) or sentences that imply a condition without explicitly stating it (23c).


The conditional has other modal uses in the present sphere. It can be used to attenuate the expression of a wish or order (24a) or with an evidential value to indicate that certain information is attributed to another person or an unsubstantiated claim (24b). The latter is not very common and is restricted to journalistic language and formal registers.


As in other Romance languages, the conditional comes from the Late Latin periphrasis infinitive + habeō, where habeō is in the imperfect indicative form (cantāre habebam ‘I had to sing’ > cantaria). Old Catalan had another conditional tense derived from the Latin pluperfect indicative form (e.g., cantāveram ‘I had sung’ > cantara), which currently is used only in stereotyped constructions, such as Fora bo/interessant que… ‘It would be good/interesting that ...’. In Valencian, however, cantara has also been used for the imperfect past tense of the subjunctive since the 17th century (see Section 1).

2.5 Negation

Negation is generally expressed with the adverb no. This adverb is placed before the verb or before the verb preceded by proclitic pronouns (25a). Moreover, it can appear in correlation with the negative particle pas, which adds an emphatic meaning to the negation (Espinal, 2002, pp. 2748−2752), as in the declarative sentence (25b), in which a prior assumption, either implicit or explicit, is denied with a certain adversative nuance.


The adverb no is also used in constituent negation, followed optionally by pas (26a), and in response to a yes–no question (26b).


The particle pas was used with an emphatic meaning in all dialects of Old Catalan, but there are now geographical differences: this use persists in Central and Northwestern Catalan but not in Valencian or Balearic. Under the influence of French, in the Northern dialect it has become the general form of negation and colloquially is used alone, without the adverb no: Ells vindran pas (they will_come neg) ‘They won’t come’. In Central Pyrenean dialects, the emphatic negative marker is the postverbal particle cap from Latin capu(m) ‘end’, as in No vindré cap (no will_come neg) ‘There’s no way I’m coming’ (Llop Naya, 2016, p. 33).

As in other Romance languages, the adverb no can establish a relationship of negative concordance with other negative elements, known as negative polarity items (NPIs). These may be negative quantifiers (27a), the adverb tampoc (27b), or phrases coordinated with the negative conjunction ni (27c). NPIs often appear positioned after the verb, and in this case the adverb no is mandatory (27a−c). Less frequently, they may appear before the verb, and in such cases the adverb no is optional (27d), although in formal language the usual preference is to keep it (IEC, 2016, pp. 1304−1306), in accordance with Old Catalan usage (Pérez Saldanya, 2004).


Elided negation is currently the most common solution in contexts where the verb is not made explicit, such as partial responses (28a). Nevertheless, no continues to be used to a certain extent with the quantifiers res and gens, and it is obligatory with gaire (28b).


Negative quantifiers can also be used with a non-specific meaning equivalent to the English ‘any’ and ‘ever’ in sentences that are not negative (Espinal, 2007, pp. 51−52; IEC, 2016, pp. 1306−1309; Wheeler et al., 1999, p. 477). This value appears in yes–no interrogative sentences (29a), the protasis of conditional sentences (29b), content clauses dependent on verbs that indicate doubt, fear, opposition, and so on (29c), or the coda of unequal comparisons (29d).


The adverb no can also be used with an expletive (or non-negative) value in contexts in which a negative state of affairs is expected to be considered (Espinal, 2007, pp. 51−52). These are contexts with predicates expressing fear (30a) or with subordinate clauses headed by abans ‘before’ (30b) (Espinal, 2002, pp. 2776−2779; IEC, 2016, pp. 1313−1314; Pérez Saldanya & Torrent-Lenzen, 2006).


3. Specific Phenomena

3.1 The Complementizer De

Catalan shows a complementizer for infinitive clauses functioning as a postverbal subject or direct object: the preposition de ‘of’ (Alsina, 2002, pp. 2396–2397; S. Bonet, 2002, pp. 2376−2378; IEC, 2016, pp. 1009−1011; Villalba, 2002, pp. 2268–2270). This complementizer is optional when the infinitive clause is the postverbal subject of some psychological verbs such as agradar ‘to like’, costar ‘to be hard (for someone)’, and molestar ‘to bother’ (31a); deontic verbs like tocar ‘to be somebody’s turn’ and bastar ‘to be enough’ (31b); and some copulative sentences (31c).


When it is introducing a clause functioning as a direct object, the complementizer de is (a) obligatory with the verbs mirar, provar, or assajar (all of them meaning ‘to try’), pregar ‘to request’, and dir ‘to propose’ (32a); (b) mainly unacceptable with the quasimodal verb voler ‘to want’, perception verbs sentir ‘to hear’ and veure ‘to see’, and causative verbs fer ‘to do/make’ and deixar ‘to let’ (32b); and (c) optional with the majority of transitive verbs that select an infinitive clause, such as lamentar ‘to regret’, esperar ‘to hope’, prometre ‘to promise’, permetre ‘to allow’, aconseguir ‘to manage to’, and ordenar ‘to order’ (32c).


The presence of the preposition de in (32a and c) does not prevent the infinitive clause from being pronominalized by the accusative clitic ho ‘it’, as shown in (33).


Outside of the instances where it is strictly obligatory, the use of the complementizer de is characteristic of formal language.
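
The distribution of de before object infinitive clauses described in this section can be summarized as a small lookup table. The Python sketch below is only illustrative: the names DE_WITH_OBJECT_INFINITIVE and de_status are hypothetical, and the table contains only the verbs named in the text, not an exhaustive lexicon.

```python
# Status of the complementizer "de" before an infinitive clause used as a
# direct object. Only the verbs named in the text are listed (assumption:
# any other verb is simply reported as "unlisted").
DE_WITH_OBJECT_INFINITIVE = {
    # "dir" belongs here only in the sense 'to propose'.
    "obligatory": ["mirar", "provar", "assajar", "pregar", "dir"],
    "mainly unacceptable": ["voler", "sentir", "veure", "fer", "deixar"],
    "optional": ["lamentar", "esperar", "prometre", "permetre",
                 "aconseguir", "ordenar"],
}


def de_status(verb: str) -> str:
    """Return the status of "de" for a listed verb, or "unlisted"."""
    for status, verbs in DE_WITH_OBJECT_INFINITIVE.items():
        if verb in verbs:
            return status
    return "unlisted"
```

For instance, de_status("provar") gives "obligatory", whereas de_status("esperar") gives "optional", the class in which de is characteristic of formal language.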

3.2 Personal Articles

Proper nouns of person, both given names and surnames, can be introduced by the so-called personal article (en, na, and n’) or by the parallel forms of the definite article (el, la, and l’) (Brucart, 2002, pp. 1477−1479; IEC, 2016, pp. 581−582; Wheeler et al., 1999, pp. 67−68). As shown in Table 1, in Balearic, personal articles are used in all phonological contexts. However, in the majority of dialects spoken in Catalonia, definite articles overwhelmingly predominate, although the masculine personal article en occurs in some areas. Conversely, in most of the Valencian-speaking territory, articles are generally not used at all with names.

Table 1. The Use of Articles with Proper Nouns of Person

| | Balearic | Catalonian varieties |
| --- | --- | --- |
| Masculine starting with a consonant | en Pere | en/el Pere |
| Masculine starting with a vowel | | |
| Feminine starting with a consonant | na Maria | la Maria |
| Feminine starting with a vowel | | |

Nevertheless, proper nouns generally appear without any sort of article in academic and formal texts, including news reports: Ramon Llull, Pompeu Fabra, Montserrat Caballé. Moreover, neither definite nor personal articles are used in vocative forms, as in Maria, vine cap aquí (‘Mary, come here’).
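
The dialectal distribution just described can be sketched as a small selection function. This is a simplified illustration, not a full orthographic model: the function name article_for_name and its parameters are hypothetical, elision (n', l') is applied before any vowel letter, and names beginning with h as well as other exceptions to elision are ignored. For Catalonia only the predominant definite article is returned, although en also occurs in some areas.

```python
VOWELS = "aeiouàèéíòóúïü"


def article_for_name(name, gender, variety, formal=False, vocative=False):
    """Return a personal name with the article described in the text.

    gender:  "m" or "f"
    variety: "balearic", "catalonia", or "valencian"
    Assumption: elision applies before every vowel-initial name.
    """
    # No article in formal/academic texts, in vocatives, or in most Valencian.
    if formal or vocative or variety == "valencian":
        return name
    before_vowel = name[0].lower() in VOWELS
    if variety == "balearic":
        # Personal article in all phonological contexts.
        if before_vowel:
            return "n'" + name
        return ("en " if gender == "m" else "na ") + name
    if variety == "catalonia":
        # Definite article predominates in most dialects of Catalonia.
        if before_vowel:
            return "l'" + name
        return ("el " if gender == "m" else "la ") + name
    raise ValueError(f"unknown variety: {variety}")
```

With these assumptions, article_for_name("Pere", "m", "balearic") yields "en Pere", and the vocative call article_for_name("Maria", "f", "catalonia", vocative=True) yields the bare name "Maria".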

The personal articles were initially used as honorific title forms equivalent to the Spanish don-doña, Portuguese dom-dona, or Italian don-donna. Although all of these honorific titles come from the Latin nouns domine ‘lord’ and domina ‘lady’ in the vocative case, in Catalan they underwent a more severe process of cliticization and semantic erosion (Casanova, 2003, pp. 209−210). The original courtesy value of en and na weakened during the 14th century; this was probably due to the emergence of new forms of address, such as mossèn ‘my Lord’ (restricted to priests, i.e., ‘Father’), misser ‘my Sire’, monsenyor ‘my Lord’, doctor ‘Doctor’, mestre ‘Master’, and senyora ‘Lady’, with which they occasionally appeared in combination: mossèn en Berenguer and senyora na Francina. As a consequence of this semantic weakening, by the 15th century the en and na forms had become mere articles, which could later be replaced by the respective definite articles in some dialects. Their disappearance from Valencian is probably due to the stigmatization of their use starting in the 15th century under the influence of the Castilian language and humanist ideas (Casanova, 2003, p. 215ff.).

3.3 The Neuter Article

The definite article agrees in gender and number with the noun that it modifies, as in el nen ‘the boy/child’, la nena ‘the girl’, els nens ‘the boys/children’, and les nenes ‘the girls’. The form of the singular masculine article (el or l’, the latter before a vowel) can also modify elements other than nouns, such as a relative clause (34a), certain adjectives (34b), and certain adverbs in superlative constructions (34c). In this usage, the phrase headed by the article works as an NP and denotes the abstract or concrete entity that exhibits the property expressed by the modified element.


In place of the article el, in colloquial language the neuter article lo is used, and it can also appear followed by a PP with de or the possessive, as in lo que dius (‘what you’re saying’), lo important (‘the important thing’), lo mateix (‘the same’), lo de sempre (‘as usual’), and lo meu (‘what is mine’). However, this article is not accepted in formal language since it is considered a calque from Spanish (Fabra, 1919−1947, pp. 268−270, 350−351, 765−766; IEC, 2016, pp. 587−593), as Old Catalan did not have a neuter article that was different from the masculine singular.

Until the 15th century, the functions of this neuter article were performed by the neuter demonstrative ço (ço que dius ‘what you’re saying’ and ço d’en Pere ‘that which is owned by Peter’) or by the masculine article, which originally had the form lo or l’ (l’important ‘the important thing’, lo mateix ‘the same’, and al pus prest que porets ‘as soon as you can’). However, ço lost vitality by the end of the Middle Ages, leaving only the masculine article (Martines, 2010). In the 18th century, the masculine article took the form el (or l’), leaving the way open for the form lo to be reinterpreted as a neuter article. As has been pointed out, for some authors, this functional split was due to the influence of Spanish, a language that has distinguished el (masculine article) and lo (neuter article) since its beginnings, but for others it is a Catalan internal development encouraged by the existence of other neuter nominal elements, specifically the clitic pronoun ho ‘it’ and the demonstrative pronouns açò ‘this’, això ‘that’, and allò ‘that’ (Casanova, 2001).

3.4 The Go-Past

Catalan has two past perfective tenses: the simple past, derived from the Latin perfect (e.g., parabolāvit ‘(s)he spoke / has spoken’ > parlà), and the periphrastic past, which is formed with the verb anar ‘to go’ followed by an infinitive (e.g., va parlar ‘(s)he spoke’). The two tenses are synonymous and are used with the value of prehodiernal perfective past (35).


In spoken language, the periphrastic past has displaced the simple past in most domains of usage. However, the simple past is still used orally in the dialect of the city of Valencia and its surrounding areas and, with less vitality, in the dialects of Elx (a Valencian-speaking city) and the islands of Mallorca and Ibiza. Moreover, it continues to be widely used in literature, sometimes in stylistic alternation with the periphrastic form, except in the first person, for which the periphrastic is favored.

As was the case for the go-future in English and different Romance languages (see Section 2.3), Catalan’s periphrastic past was formed from the grammaticalization of purposive directional constructions involving the infinitive—that is, from constructions which indicate that the subject is moving with the intention of carrying out the action designated by the infinitive. Initially, this periphrasis had an aspectual meaning of completeness that emerged in past narrative contexts with the verb anar in the historical present or simple past, as in example (36), which mimics the forms of Old Catalan (Pérez Saldanya & Hualde, 2003, pp. 51−55).


In a first stage of grammaticalization, the inferential meaning of completeness was set and the construction was reanalyzed as an aspectual periphrasis used to lend dynamism and liveliness to the narration of the past event. This use is also documented in French and Occitan and to a lesser extent in Spanish (Colón, 1978). In these languages, however, the periphrasis progressively lost vitality; in Catalan, it experienced a process of increasing grammaticalization and ultimately became a perfective past periphrasis, probably as early as the mid-14th century, at least in the case of Central Catalan.

In this grammaticalization process, the auxiliary anar ‘to go’ became fixed in the present form, but with a paradigmatic levelling, since the suppletive forms for the first- and second-person plural (anam and anats) were replaced by forms with va- (vam and vau, respectively). Moreover, in different dialects, the morph -re- was later adopted in the same persons as those in which it appears in the simple past (the second-person singular and the plural persons) and, with more restrictions, in the first-person singular (see Table 2).

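Table 2. Forms of the Present Indicative of Anar and the Past Auxiliary

| Person | Present indicative of anar ‘to go’ | Past auxiliary (I) | Past auxiliary (II) |
| --- | --- | --- | --- |
| 1SG | vaig | vaig | vàreig |
| 2SG | vas | vas | vares |
| 3SG | va | va | va |
| 1PL | anam (> anem) | vam | vàrem |
| 2PL | anats (> aneu) | vau | vàreu |
| 3PL | van | van | varen |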

As indicated, in Catalan the periphrastic past behaves as a prehodiernal past (Ahir vaig treballar molt ‘Yesterday I worked a lot’) and is clearly different from the perfect, which (among other values) expresses hodiernal past (Avui he treballat molt ‘Today I worked a lot’). Under French contact pressure, some Northern Catalan speakers have removed this temporal distinction and use the latter form with both values: Ahir/Avui he treballat molt ‘Yesterday/Today I worked a lot’. (On past tense and auxiliary selection, see Gómez Duran, 2016, pp. 129–132; Hawkey, 2019, pp. 22–23.)
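
The formation of the periphrastic past (auxiliary plus infinitive) discussed in this section can be illustrated with a minimal Python sketch. The names PAST_AUX_I, PAST_AUX_II, and periphrastic_past are hypothetical. The forms vam and vau are those cited in the text; the remaining auxiliary forms, including the variants with the morph -re-, are the usual forms given in reference grammars and are supplied here as an assumption.

```python
PERSONS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Series I: present-based auxiliary with levelled va- forms in 1PL and 2PL.
PAST_AUX_I = dict(zip(PERSONS, ["vaig", "vas", "va", "vam", "vau", "van"]))

# Series II (assumed forms): the morph -re- in 2SG, the plural persons and,
# with more restrictions, 1SG; 3SG is unchanged.
PAST_AUX_II = dict(zip(PERSONS,
                       ["vàreig", "vares", "va", "vàrem", "vàreu", "varen"]))


def periphrastic_past(infinitive, person, series="I"):
    """Build the go-past: past auxiliary followed by the bare infinitive."""
    aux = PAST_AUX_I if series == "I" else PAST_AUX_II
    return f"{aux[person]} {infinitive}"
```

Under these assumptions, periphrastic_past("parlar", "3SG") returns "va parlar" and periphrastic_past("treballar", "1SG") returns "vaig treballar", matching the examples cited above.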

3.5 Spatial Deixis

The majority of Catalan dialects show a two-term space deixis represented by the demonstrative adverbs aquí ‘here’ and allà (or allí) ‘there’, the demonstrative determiners aquest ‘this’ and aquell ‘that’ (and their inflected forms), and the demonstrative pronouns això ‘this’ and allò ‘that’. Aquí, aquest, and això may refer to either the speaker’s location (37a) or the addressee’s location (37b), whereas allà (or allí), aquell, and allò express a location away from both the speaker and the addressee (37c).


Nonetheless, like Old Catalan, Valencian shows three-term space deixis (Pérez Saldanya & Rigau, 2011). In Valencian, the adverb ací (or aquí) ‘here’, the determiner este (or the more high-register variant aquest) ‘this’, and the pronoun açò ‘this’ express proximity to the speaker; the adverb ahí ‘there’ (from the Spanish equivalent) and the determiner eixe (or the more high-register variant aqueix) ‘that’ express medial proximity or proximity to the addressee; and the adverb allà (or allí) ‘there’, the determiner aquell ‘that’, and the pronoun allò ‘that’ express distance from both speaker and addressee (Badia, 1994, pp. 498−500, 702−705; Brucart, 2002, pp. 1438−1445; IEC, 2016, pp. 601−603, 799).
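
The two deictic systems can be compared side by side in a short Python sketch. The names TWO_TERM, THREE_TERM, and demonstrative are hypothetical, only one variant per cell is listed (e.g., allà but not allí, este but not aquest), and the medial pronoun això in the three-term system is not given in the text: it is supplied here as an assumption.

```python
# Two-term system: the proximal series covers both speaker and addressee.
TWO_TERM = {
    "proximal": {"adverb": "aquí", "determiner": "aquest", "pronoun": "això"},
    "distal": {"adverb": "allà", "determiner": "aquell", "pronoun": "allò"},
}

# Three-term (Valencian) system.
THREE_TERM = {
    "speaker": {"adverb": "ací", "determiner": "este", "pronoun": "açò"},
    # Assumption: the medial pronoun "això" is not stated in the text.
    "addressee": {"adverb": "ahí", "determiner": "eixe", "pronoun": "això"},
    "distal": {"adverb": "allà", "determiner": "aquell", "pronoun": "allò"},
}


def demonstrative(system, location, category):
    """Pick a demonstrative form.

    system:   "two-term" or "three-term"
    location: "speaker", "addressee", or "elsewhere"
    category: "adverb", "determiner", or "pronoun"
    """
    if system == "two-term":
        zone = "distal" if location == "elsewhere" else "proximal"
        return TWO_TERM[zone][category]
    zone = "distal" if location == "elsewhere" else location
    return THREE_TERM[zone][category]
```

The key contrast is the addressee's location: demonstrative("two-term", "addressee", "adverb") returns "aquí", whereas demonstrative("three-term", "addressee", "adverb") returns "ahí".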

Belonging to one deictic system or another implies certain differences in the use of the verbs of motion anar ‘to go’ and venir ‘to come’ (Rigau, 1976). In the dialects with two-term space deixis, the verb anar is used when the movement is toward a place that is neither the location of the speaker nor that of the addressee. As a consequence, the place must be explicitly expressed by a locative complement, whose minimal expression is the locative clitic pronoun hi: Anàvem a Tarragona ‘We went to Tarragona’ ➔ Hi anàvem ‘We went there’/ *Anàvem. In contrast, when the movement is toward the location of the speaker or the addressee or both, the verb is venir, and the clitic hi is not needed: Vindràs a Tarragona? ‘Will you come to Tarragona?’, where Tarragona is interpreted as the place of the speaker. Thus, in Demà no vindran ‘Tomorrow they will not come’, the movement of the subject will be toward the place of the speaker or the addressee or both. In other words, with this verb, the destination complement may be covert.

In the dialects with three-term space deixis, the verb venir is used when the destination is the location of the speaker. Otherwise, the verb selected is anar: Vine a ma casa ‘Come to my house’, Aniré a ta/sa casa ‘I will go to your/their house’. Moreover, the locative clitic hi has lost vitality in these dialects and the locative complement can be omitted: Aniré ‘I will go’.
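
The choice between anar and venir in the two systems can be stated as a pair of small functions. This is a sketch of the generalizations given above, with hypothetical names (motion_verb, locative_required); it abstracts away from any further pragmatic factors.

```python
def motion_verb(system, destination):
    """Select the motion verb.

    system:      "two-term" or "three-term"
    destination: "speaker", "addressee", or "elsewhere"
    """
    if system == "two-term":
        # venir covers movement toward the speaker or the addressee.
        return "venir" if destination in ("speaker", "addressee") else "anar"
    if system == "three-term":
        # venir is restricted to movement toward the speaker.
        return "venir" if destination == "speaker" else "anar"
    raise ValueError(f"unknown system: {system}")


def locative_required(system, destination):
    """True if the destination must be overt (minimally the clitic hi)."""
    return system == "two-term" and motion_verb(system, destination) == "anar"
```

Movement toward the addressee is where the systems diverge: motion_verb("two-term", "addressee") returns "venir", while motion_verb("three-term", "addressee") returns "anar", and only in the two-term dialects does anar require an overt locative.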

3.6 Clitic Pronouns: Some Clusters with a Dative Clitic

Catalan has a complex system of clitic pronouns (or weak object pronouns) which may vary in form according to the point of contact with the verb, proclitically (El saludo ‘I greet him’, L’havies saludat ‘You had greeted him’) or enclitically (Vaig saludar-lo ‘I greeted him’, Saluda’l ‘Greet him’), and according to the inflection in person, number, and in some cases gender (em 1SG, et 2SG, el 3SG.M, la 3SG.F, ens 1PL, us 2PL, els 3PL.M, etc.). Clitics also vary in function: la ‘her’, li ‘to her/him’, and so on. The clitic system includes the so-called pronominal adverbs en and hi. See IEC (2016, pp. 189−218, 669−712) and Wheeler et al. (1999, pp. 166−217). On the clitic es in passive and impersonal constructions, see Bartra (2002, pp. 2149–2165).

To express possession, Catalan may use the genitive clitic en (En conec l’autor ‘I know its author’), the dative clitic (Li rento la cara ‘I wash his/her face’), or the zero anaphora (Tancaré els ulls ‘I will close my eyes’) in addition to the so-called possessive pronoun (el nostre fill ‘our son’).

Interestingly, some clitic clusters show different solutions depending on the dialect. Such is the case, for example, with the combination of third-person accusative pronouns (el, la, els, and les) with the third-person dative pronoun (li and els). In Old Catalan, accusative pronouns preceded dative pronouns, and in combinations involving third-person accusative pronouns the singular dative li soon adopted the dissimilated form hi, which coincides with the locative clitic: Ell la li dona > Ell la hi dona ‘He gives it to her’. This combination is preserved in modern Standard Catalan. Conversely, in Valencian, the loss of the locative hi in the 18th century involved the loss of the dative allomorph hi and the reintroduction of li, albeit now with the modern dative-first order: Ell li la dona ‘He gives it to her’ (Casanova, 1989). Both combinations are considered acceptable in the Standard language.

Dialects with the singular dative clitic hi use the colloquial sequence els hi (‘to them’) instead of the Standard form els (‘to them’) for the third-person plural dative clitic. Thus, for example, the Standard Catalan sentence Els agrada la paella ‘They like paella’ contrasts with the colloquial Els hi agrada la paella. Interestingly, when the plural dative clitic combines with the partitive clitic en (or n’) ‘of it’, the partitive appears between the two forms of the colloquial plural dative (els + en + hi): Ella els n’hi dona ‘She gives them some of it’. On Catalan clitic combinations, see E. Bonet (2002), Fabra (1956, pp. 60−63), and IEC (2016, pp. 201−207).
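
The ordering patterns described in this section can be sketched as follows. The function names and the labels "accusative-first" and "dative-first" are hypothetical, and the functions return the clitics as a list in surface order, deliberately leaving out orthographic contraction and apostrophes (so the sequence written els n'hi is represented as els, en, hi).

```python
THIRD_PERSON_ACCUSATIVES = ("el", "la", "els", "les")


def acc_dat_cluster(accusative, pattern):
    """Combine a third-person accusative with the singular dative.

    pattern: "accusative-first" (dative li realized as hi) or
             "dative-first" (the Valencian order with li).
    """
    if accusative not in THIRD_PERSON_ACCUSATIVES:
        raise ValueError(f"not a third-person accusative: {accusative}")
    if pattern == "accusative-first":
        return [accusative, "hi"]
    if pattern == "dative-first":
        return ["li", accusative]
    raise ValueError(f"unknown pattern: {pattern}")


def colloquial_plural_dative(with_partitive=False):
    """Colloquial third-person plural dative in dialects with dative hi."""
    # The partitive en sits between the two parts of the dative (els ... hi).
    return ["els", "en", "hi"] if with_partitive else ["els", "hi"]
```

Thus acc_dat_cluster("la", "accusative-first") returns ["la", "hi"] (Ell la hi dona), whereas acc_dat_cluster("la", "dative-first") returns ["li", "la"] (Ell li la dona).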

3.7 Differential Object Marking

The direct object is generally constructed without any preposition. As in Spanish or Romanian, however, in certain contexts it is marked with the preposition a ‘to’, which is obligatory with the indirect object. This differential marking generally appears if the direct object is highly animate (or as animate as the subject) and in contexts in which the constituents are not located in their canonical position (IEC, 2016, pp. 730−736; Sancho Cremades, 2002, pp. 1737−1738). The preposition a is compulsory when the direct object is reduplicated by a clitic pronoun (38a), when the subject occurs after the verb or the verb is dropped (38b), and generally with the relative pronoun qui ‘who/whom’ (38c).


Conversely, it is optional with pronouns designating people (39a) and when the direct object does not occur in its canonical position, as in dislocations (39b) or in focalizations (39c).


Colloquially, it also appears with human definite NPs and personal names (e.g., el president ‘the president’ and la Maria ‘Mary’), but this use is not accepted by the prescriptive grammar. In Old Catalan, differential object marking appeared mainly with personal pronouns; but at the end of the Middle Ages, it was also found with proper nouns (both person and divinity names) and human definite NPs (Pineda i Cirera, 2021). This usage has been continually increasing, most likely under Spanish influence.
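
The conditions on differential object marking listed in this section can be condensed into a decision function. This is a sketch only: the name a_marking and its boolean parameters are hypothetical, and the precedence among conditions (obligatory contexts are checked before optional ones) is an assumption rather than something stated in the text.

```python
def a_marking(clitic_doubled=False, postverbal_subject=False,
              verb_dropped=False, relative_qui=False,
              personal_pronoun=False, dislocated=False, focalized=False,
              human_definite_or_name=False):
    """Classify the use of the preposition "a" before a direct object."""
    # Obligatory contexts (checked first; this ordering is an assumption).
    if clitic_doubled or postverbal_subject or verb_dropped or relative_qui:
        return "obligatory"
    # Optional contexts: pronouns for people, non-canonical position.
    if personal_pronoun or dislocated or focalized:
        return "optional"
    # Colloquial use not accepted by the prescriptive grammar.
    if human_definite_or_name:
        return "colloquial only"
    return "unmarked"
```

For example, a_marking(clitic_doubled=True) returns "obligatory", while a_marking(human_definite_or_name=True) returns "colloquial only".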

3.8 Modal Particles

Catalan dialects are rich in particles expressing modality. Confirmation-seeking questions are built with the confirmative particles eh?, no?, oi?, veritat?, and fa? The particle eh? belongs to all dialects, whereas the particle oi? belongs to Central Catalan, veritat? ‘truth’ to Valencian, and fa? to Northern Catalan. The negative particle no? ‘not’ is found in all dialects when it appears at the end of the sentence. At the beginning of the sentence, it is typical of Catalan spoken in the Tarragona area. When the particles introduce the question, they are followed by the conjunction que ‘that’, as shown in (40a). When they are at the right periphery, the conjunction is not present and the particle forms an intonation unit, as in (40b). See Cuenca and Castellà (1995), IEC (2016, pp. 107−108, 1253−1254), Prieto and Rigau (2007), and Rigau (1998).


In Central Catalan, Northwestern Catalan, and Balearic, neutral (or non-presuppositional) yes–no questions (41a) can be headed by the conjunction que ‘that’ (41b).


In very formal pragmatic situations, such as the question posed by the officiant in a wedding ceremony (42), the que would not be appropriate (Payrató, 2002, pp. 1203–1204; Prieto & Rigau, 2007).


In Valencian and Northern Catalan, a question headed by que cannot be neutral. For example, in these dialects, question (41b) would be interpreted as a counter-expectational question showing surprise. Such questions express the lack of agreement between the speaker’s own expectations and the discourse context (IEC, 2016, pp. 111−112, 1248−1268; Prieto & Rigau, 2007, 2011). For other modality particles that express mirativity, such as the dialectal particle pla, see Rigau (2012).
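
The dialectal readings of yes–no questions headed by que can be summarized in a short sketch. The function name que_question_reading and the dialect labels are hypothetical, and the effect of a very formal situation is modelled only for the dialects in which the neutral reading exists, which is an assumption.

```python
NEUTRAL_QUE_DIALECTS = {"central", "northwestern", "balearic"}
NON_NEUTRAL_QUE_DIALECTS = {"valencian", "northern"}


def que_question_reading(dialect, very_formal=False):
    """Return the reading of a yes-no question headed by "que"."""
    if dialect in NEUTRAL_QUE_DIALECTS:
        # Neutral reading, but "que" is avoided in very formal situations.
        return "inappropriate" if very_formal else "neutral"
    if dialect in NON_NEUTRAL_QUE_DIALECTS:
        # Cannot be neutral: signals surprise against expectations.
        return "counter-expectational"
    raise ValueError(f"unknown dialect: {dialect}")
```

So que_question_reading("central") returns "neutral", whereas que_question_reading("valencian") returns "counter-expectational".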


