Article

English in the U.S. South contains a wide range of variation, encompassing ethnic, social class, and subregional variations all under the umbrella term Southern English. Although it has been a socially distinct variety since at least the mid-19th century, many of the modern features it is nationally known for developed only after 1875. Lexical variation has long distinguished the U.S. South, but new vocabulary has replaced the old, and subregional variation in the U.S. South is no longer important for lexical variation. Social class still plays an important role in grammatical variation, but the rise of compulsory education limited previously wider ranges of dialect features. Despite traditional scholarship’s primary focus on lexical and grammatical language variation in the U.S. South, phonological variation has been the main area of scholarship since the 1990s. Within phonological variation, the production of vowels, the most socially salient features of the U.S. South, has been a heavily studied realm of scholarship. Prosodic, consonant, and perception studies have been on the rise and have provided numerous insights into this highly diverse dialect region.

Article

The use of a sociolinguistic approach in the comparative study of word formation is a quite modern phenomenon. The lack of any continuous documentation for many of the nonstandard Romance varieties means that such analyses are still partial. However, they are undoubtedly of great interest from a comparative point of view. In short, while all the Romance varieties are connected through genetic affinity, contact phenomena have instead caused significant divergences related to status in the realm of word formation. What was the cause and how did this happen? In particular, the lack of an intense and continuous contact with the Greek-Latin cultural superstrate prevented the creation of new formation rules for words of learned origin in the minor Romance varieties and dialects (e.g., Corsican, Occitan, Friulian, Sardinian). This lack of interconnection with the Greek-Latin lexical stock has distanced the minor Romance varieties from the standard Romance languages (e.g., French, Italian, Spanish) and has, moreover, brought the latter closer to the learned levels of the main European non-Romance languages.

Article

Allomorphy and syncretism are both deviations from the one-to-one relationship between form and meaning inside the linguistic sign as postulated by Saussure as well as from the ideal of inflectional morphology as stipulated in the canonical approach by Corbett. Instances of both phenomena are well documented in all Romance languages. In inflection, allomorphy refers to the use of more than one root/stem in the paradigm of a single lexeme or to the existence of more than one inflectional affix for the same function. Syncretism describes the existence of identical forms with different functions in one and the same paradigm. Verbs exhibiting stem allomorphy are traditionally called irregular, a label that describes the existence of unexpected and, sometimes, unpredictable forms from a learner’s perspective. Extreme forms of allomorphy are called suppletion, for which traditional accounts require two or more etymologically unrelated roots/stems to coexist within the paradigm of a single lexeme. Allomorphy often originates in sound change affecting only stems in a certain phonological environment. When the phonological conditioning of the stem allomorph disappears, which is frequently the case, its distribution within the paradigm may become purely morphological, thus constituting a morphome in the sense of Aronoff. Recurrent patterns of syncretism may also be considered morphomes. Whereas syncretism was quite rare in Latin verb morphology, Romance languages feature it to much greater, if different, degrees. In extreme cases, syncretism patterns become paradigm-structuring in many Gallo-Romance varieties, as is the case in the verb morphology of standard French, where almost all forms are syncretic with at least one other.

Article

Indo-Aryan languages have the longest documented historical record, with the earliest attested texts going back to 1900 BCE. Old Indo-Aryan (Vedic, Sanskrit) had an inflectional case-marking system where nominatives functioned as subjects. Objects could be realized via several different case markers (depending on semantic and structural factors), but not the nominative. This inflectional system was lost over the course of several centuries during Middle Indo-Aryan, resulting in just a nominative–oblique inflectional distinction. The New Indo-Aryan languages innovated case markers and developed new case-marking systems. As in Old Indo-Aryan, case is systematically used to express semantic differences via differential object marking constructions. However, unlike in Old Indo-Aryan, many of the New Indo-Aryan languages are ergative and all allow for non-nominative subjects, most prominently for experiencer subjects. Objects, on the other hand, can now also be unmarked (nominative), usually participating in differential object marking. The case-marking patterns within New Indo-Aryan and across time have given rise to a number of debates and analyses. The most prominent of these include issues of case alignment and language change, the distribution of ergative vs. accusative vs. nominative case, and discussions of markedness and differential case marking.

Article

Steffen Heidinger

The notion of valency describes the property of verbs to open argument positions in a sentence (e.g., the verb eat opens two argument positions, filled in the sentence John ate the cake by the subject John and the direct object the cake). Depending on the number of arguments, a verb is avalent (no argument), monovalent (one argument), bivalent (two arguments), or trivalent (three arguments). In Romance languages, verbs are often labile (i.e., they occur in more than one valency pattern without any formal change on the verb). For example, the (European and Brazilian) Portuguese verb adoecer ‘get sick’/‘make sick’ can be used both as a monovalent and a bivalent verb (O bebê adoeceu ‘The baby got sick’ vs. O tempo frio adoeceu o bebê ‘The cold weather made the baby sick’). However, labile verbs are not equally important in all Romance languages. Taking the causative–anticausative alternation as an example, labile verbs are used more frequently in the encoding of the alternation in Portuguese and Italian than in Catalan and Spanish (the latter languages frequently resort to an encoding with a reflexively marked anticausative verb, e.g., Spanish romperse ‘break’). Romance languages possess various formal means to signal that a given constituent is an argument: word order, flagging the argument (by means of morphological case and, more importantly, prepositional marking), and indexing the argument on the verb (by means of morphological agreement or clitic pronouns). Again, Romance languages show variation with respect to the use of these formal means. For example, prepositional marking is much more frequent than morphological case marking on nouns (the latter being only found in Romanian).

Article

A computational learner needs three things: data to learn from, a class of representations to acquire, and a way to get from one to the other. Language acquisition is a very particular learning setting that can be defined in terms of the input (the child’s early linguistic experience) and the output (a grammar capable of generating a language very similar to the input). The input is infamously impoverished. As it relates to morphology, the vast majority of potential forms are never attested in the input, and those that are attested follow an extremely skewed frequency distribution. Learners nevertheless manage to acquire most details of their native morphologies after only a few years of input. That said, acquisition is neither instantaneous nor error-free. Children do make mistakes, and they do so in predictable ways which provide insights into their grammars and learning processes. The most elucidating computational model of morphology learning from the perspective of a linguist is one that learns morphology like a child does, that is, on child-like input and along a child-like developmental path. This article focuses on clarifying those aspects of morphology acquisition that should go into such an elucidating computational model. Section 1 describes the input with a focus on child-directed speech corpora and input sparsity. Section 2 discusses representations with a focus on productivity, developmental paths, and formal learnability. Section 3 surveys the range of learning tasks that guide research in computational linguistics and NLP with special focus on how they relate to the acquisition setting. The conclusion in Section 4 presents a summary of morphology acquisition as a learning problem with Table 4 highlighting the key takeaways of this article.

Article

This contribution analyses morphologically autonomous structures within the context of the Romance languages, the family of languages which, along with Latin, have most served as an evidence base for these structures. Autonomous morphological structures are defined as an abstract representation of paradigmatic cells which form a cohesive group and reliably share exponents with each other; the forms which realize them are thus to a large extent interpredictable. In this contribution, I restrict my discussion to the most canonical type of these structures and those which have sparked the most controversy in the linguistic literature. I analyze this controversy and suggest that it is due to (a) their overlap in meaning with the term morphome, a concept which embodies an empirical claim about all morphology, and (b) the controversy surrounding what morphology actually is and the basic units of morphological analysis and storage. I make a distinction between abstractive and constructive models of morphology and suggest that historical tendencies within the latter encourage scholars to view morphologically autonomous structures either as not synchronically relevant or as phonologically or semantically derivable due to their theoretical assumptions about the nature of language and the mental storage of words. These assumptions constitute the horizons of intelligibility of such models regarding the functioning of language and its governing principles, including outdated ideas of the capacity of mental storage. Unfortunately, however, the different theories furnish scholars with an expansive array of devices through which they can seemingly explain away the synchronic generalizations of the data while relegating the most recalcitrant data to the domain of memorized forms which are not relevant to the grammar. I present evidence in favor of the psychological reality of morphologically autonomous structures in diachrony and I argue that synchronically, these structures are necessary to explain the distribution of the data and capture the fact that speakers do not memorize every inflectional form of a paradigm but rely on patterns of predictability and implicational relationships between forms. It is my suggestion that morphologically autonomous structures encourage a revaluation of the basic units of memorization and the structure of the lexicon in accordance with abstractive theories of morphology.

Article

Catalan  

Francisco Ordóñez

Catalan is a “medium-sized” Romance language spoken by over 10 million speakers, spread over four nation states: Northeastern Spain, Andorra, Southern France, and the city of l’Alguer (Alghero) in Sardinia, Italy. Catalan is divided into two primary dialectal divisions, each with further subvarieties: Western Catalan (Western Catalonia, Eastern Aragon, and Valencian Community) and Eastern Catalan (center and east of Catalonia, Balearic Islands, Rosselló, and l’Alguer). Catalan descends from Vulgar Latin. Catalan expanded during medieval times as one of the primary vernacular languages of the Kingdom of Aragon. It largely retained its role in government and society until the War of Spanish Succession in 1714, and it has been minoritized ever since. Catalan was finally standardized at the beginning of the 20th century, although later, during the Franco dictatorship, it was banned in public spaces. The situation changed with the new Spanish Constitution promulgated in 1978, when Catalan was declared co-official with Spanish in Catalonia, the Valencian Community, and the Balearic Islands. The Latin vowel system evolved in Catalan into a system of seven stressed vowels. As in most other Iberian Romance languages, there is a general process of spirantization or lenition of voiced stops. Catalan has a two-gender grammatical system and, as in other Western Romance languages, plurals end in -s; Catalan has a personal article and Balearic Catalan has a two-determiner system for common nouns. Finally, past perfective actions are indicated by a compound tense consisting of the auxiliary verb anar ‘to go’ in present tense plus the infinitive. Catalan is a minoritized language everywhere it is spoken, except in the microstate of Andorra, and it is endangered in France and l’Alguer. The revival of Catalan in the post-dictatorship era is connected with a movement called linguistic normalization. The idea of normalization refers to the aim of returning Catalan to “normal” use at both the official and the everyday level, like any official language.

Article

Words are the backbone of language activity. An average 20-year-old native speaker of English will have a vocabulary of about 42,000 words. These words are connected with one another within the larger network of lexical knowledge that is termed the mental lexicon. The metaphor of a mental lexicon has played a central role in the development of theories of language and mind and has provided an intellectual meeting ground for psychologists, neurolinguists, and psycholinguists. Research on the mental lexicon has shown that lexical knowledge is not static. New words are acquired throughout the life span, creating very large increases in the richness of connectivity within the lexical system and changing the system as a whole. Because most people in the world speak more than one language, the default mental lexicon may be a multilingual one. Such a mental lexicon differs substantially from a lexicon of an individual language and would lead to the creation of new integrated lexical systems due to the pressure on the system to organize and access lexical knowledge in a homogenous manner. The mental lexicon contains both word knowledge and morphological knowledge. There is also evidence that it contains multiword strings such as idioms and lexical bundles. This speaks in support of a nonrestrictive “big tent” view of units of representation within the mental lexicon. Changes in research on lexical representations in language processing have emphasized lexical action and the role of learning. Although the metaphor of words as distinct representations within a lexical store has served to advance knowledge, it is more likely that words are best seen as networks of activity that are formed and affected by experience and learning throughout the life span.

Article

Antonio Fábregas and Rafael Marín

The term nominalization refers to a specific type of category-changing morphological operation that produces nouns from other lexical categories, most productively verbs and adjectives. By extension, it is also used to refer to the resulting derived nouns. In Romance languages, nominalization generally involves addition of a suffix to the base (cf. Italian generoso ‘generous’ > generos-ità ‘generosity’), and such suffixes are called nominalizers. However, there are also cases of nouns built from other categories without any overt nominalizer (cf. Spanish inútil ‘useless’ > inútil ‘useless person’); descriptively, this process is called conversion, and it is debatable whether it should also be treated as a nominalization or whether a different kind of morphological operation is involved here. Nominalizations can be divided into several classes depending on a variety of semantic and syntactic factors, such as the type of entities that they denote or the ability to introduce arguments. The main nominalization classes are (a) complex event nominalizations, which come from verbs, can combine with some temporal and aspectual modifiers, and have the ability to introduce at least an internal argument; (b) state nominalizations, which denote states associated with the verbs that serve as their bases; (c) participant nominalizations, which denote different types of arguments of the base, such as agents, resulting objects, locations or recipients; and (d) quality nominalizations, coming from adjectives and more restrictively from verbs, which denote a set of properties related to their base. Different classes of predicates select for different nominalization types, and there is a debate surrounding which tests capture the nuances of this taxonomy most completely. Nominalizers impose different types of restrictions on their bases: aspectual restrictions (individual-level vs. stage-level, (a)telicity, dynamicity, etc.), argument structure restrictions (agent vs. nonagent, different types of internal arguments), morphological restrictions (for instance, selecting only verbs that belong to a particular conjugation class), and finally conceptual restrictions (for instance, showing a strong preference for bases belonging to a particular conceptual domain). In Romance languages, nominalizations sometimes alternate with other word classes, most significantly infinitives (see article on “Infinitival Clauses in the Romance Languages” in this encyclopedia). Infinitival constructions in Romance can display a mixture of verbal and nominal properties, or be totally recategorized as nouns, and in both cases they can compete with prototypical nominalizations. Less generally, participles (see article on “Participial Relative Clauses” in this encyclopedia), gerunds and supines can also display nominalization properties in some Romance varieties.