History of the French Lexicon  

Olivier Bertrand

The French lexicon of the early 21st century has two main sources: approximately 87% of the vocabulary comes from Latin and 13% from many other languages. The French lexicon was created and developed roughly from the 7th–8th century CE up to the early 21st century. There are three ways to create vocabulary: inheritance from popular Latin, borrowing from other languages (as well as from written Latin), and internal creation (through semantic extension, derivation, and compounding).


Etymology and the Lexical Core of Germanic  

Robert Mailhammer

Etymologies are statements about the origin and history of linguistic items (words and structures). Typically, an etymology gives information about the historical period of a language in which a word or a structure was created and what kinds of processes were involved, as well as about its subsequent history. Usually, etymologies involve the reconstruction of parts or all of an item’s history including the original formation. A reconstruction is a hypothesis about the form and meaning of an ancestral form and the changes it has undergone to yield the oldest attested form. This hypothesis is based on language-internal data and data from related languages as well as our knowledge about language change. The use of comparative data is key for determining and reconstructing the ancestral form of a linguistic item. One important property of reconstructions, and hence of etymologies, is that they are probabilistic; that is, they are hypotheses that are more or less likely to be correct. Etymologies of high quality have a high level of reliability or confidence, whereas etymologies of low quality are generally only weakly supported. There is a range of factors influencing the quality of an etymology, and it is important to make clear how well-supported etymologies are when considering the etymological situation of the whole or a part of the vocabulary of a language. Two pivotal factors are the degree to which sound correspondences and related changes are regular and the strength of the correspondence pattern in terms of correspondence sets and equations. There is a significant body of etymological research on Germanic. This work can be broadly categorized into studies that etymologize words in a given daughter language and studies that take a more comparative approach. The focus of the literature has been on finding connections within the Indo-European family and explaining Germanic and its lexicon in terms of their development from Proto-Indo-European.
Nonetheless, it is well known that the Germanic lexicon contains loans from other Indo-European languages, especially from Celtic and Latin, such as PGmc. *tūna- ‘fence’ (e.g., OHG zūn ‘fence’) borrowed from Proto-Celtic *dūno ‘fort, rampart’. It is also common knowledge that a substantial part of the Germanic vocabulary is of unclear origin. The exact amount of non-etymologized vocabulary in the Germanic lexicon is unknown, but existing quantitative data suggest that the standard figure of one third quoted in the literature is too low. However, mainstream literature has not systematically investigated Germanic words of unknown origin with the aim of finding contact etymologies that satisfy the standard requirements of contact linguistics. Since the second half of the 20th century, non-Indo-European elements in the Germanic lexicon have received more attention. The majority of hypotheses involve substratum languages. By contrast, one key observation based on what is known about outcomes of language contact, supported by well-studied cases, is that it is quite likely that some of these non-etymologized words were borrowed from non-Indo-European languages, and it is also likely that at least some of these words are from a superstratum rather than a substratum. Relevant lexical items belong to semantic domains such as warfare, the legal system, and administration, for example, PGmc. *fulka- ‘division’ (of an army), *sibjō ‘family, clan’, *aþal-/*aþil-/*aþil- ‘nobility, noble’. Moreover, non-etymologized words relating to superior cultural innovations, for example, terms for coins (PGmc. *skellingaz/*skillinaz ‘shilling’ and PGmc. *pan(n)(d)ing ‘penny’) and agricultural innovations (PGmc. *plōg- ‘(wheel) plough’) also fit better with superstratum influence than with substratum influence.
Furthermore, it is also important to highlight that words of unknown origin form part of the lexical core of Germanic, for example, *erþō ‘earth’, *handuz ‘hand’, *stainaz ‘stone’, *drinkanan ‘drink’. Whatever the origin of the hitherto non-etymologized words in the PGmc. lexicon, it is to be expected that a sizable part of them are of non-Indo-European origin. Given the significant implications for the cultural history of the people who spoke Proto-Germanic and their contemporaries, it seems well worth investigating the extra-Indo-European connections of Proto-Germanic in spite of the challenges.


Etymology in Romance  

Éva Buchi and Steven N. Dworkin

Etymology is the only linguistic subdiscipline that is uniquely historical in its study of the relevant linguistic data and one of the oldest fields in Romance linguistics. The concept of etymology as practiced by Romanists has changed over the last 100 years. At the outset, Romance etymologists took as their brief the search for and identification of individual word origins. Starting in the early 20th century, various specialists began to view etymology as the preparation of the complete history of all facets of the evolution over time and space of the words or lexical families being studied. Identification of the underlying base was only the first step in the process. From this perspective, etymology constitutes an essential element of diachronic lexicology, which covers all formal, semantic, and syntactic facets of a word’s evolution, including, if appropriate, the circumstances leading to its demise and replacement.


History of the Occitan and Gascon Lexicon  

Hélène Carles and Martin Glessgen

The process of differentiation of the Occitan and the Gascon lexicon began under the Roman Empire, increasing from the 8th century onward, and was further accentuated during the course of the second millennium. Both the dialects and the written varieties of Medieval Occitan and Gascon were highly developed and remained pluricentric. The mechanisms of lexical innovation engendered by the development of the various textual traditions, as well as by intertextuality, caused the vocabulary to develop considerably between the 12th and the 15th centuries. From the 16th to the 19th centuries, the process of elaboration of written culture began to grind to a halt, although the two languages continued to be spoken throughout the territory. The traditional vocabulary continued to diversify, parallel to the development of regional literature and the constitution of significant lexical inventories. Thus, at the start of the contemporary period, the dialectal varieties of Occitan and Gascon had reached a pinnacle of diversification, but use of the spoken variety diminished throughout the 20th century, despite the powerful revival movements of the 19th and 20th centuries. Future research should intensify its efforts in the field of lexicological analysis with the object of emphasizing the richness of dialectal varieties and the expressivity of contemporary literature.


History of the Sardinian Lexicon  

Ignazio Putzu

Ever since the fundamental studies carried out by the great German Romanist Max Leopold Wagner (1880–1962), the acknowledged founder of scientific research on Sardinian, the lexicon has been, and still is, one of the most investigated and best-known areas of the Sardinian language. Several substrate components stand out in the Sardinian lexicon around a fundamental layer which has a clear Latin lexical background. The so-called Paleo-Sardinian layer is particularly intriguing. This is a conventional label for the linguistic varieties spoken in the prehistoric and protohistoric ages in Sardinia. Indeed, the relatively large number of words (toponyms in particular) which can be traced back to this substrate clearly distinguishes the Sardinian lexicon within the panorama of the Romance languages. As for the other Pre-Latin substrata, the Phoenician-Punic presence mainly (although not exclusively) affected southern and western Sardinia, where we find the highest concentration of Phoenician-Punic loanwords. On the other hand, recent studies have shown that the Latinization of Sardinia was more complex than once thought. In particular, the alleged archaic nature of some features of Sardinian has been questioned. Moreover, research carried out in recent decades has underlined the importance of the Greek Byzantine superstrate, which has actually left far more evident lexical traces than previously thought. Finally, from the late Middle Ages onward, the contributions from the early Italian, Catalan, and Spanish superstrates, as well as from modern and contemporary Italian, have substantially reshaped the modern-day profile of the Sardinian lexicon. In these cases too, more recent research has shown a deeper impact of these components on the Sardinian lexicon, especially as regards the influence of Italian.


Dalmatian (Vegliote)  

Martin Maiden

Dalmatian is an extinct group of Romance varieties spoken on the eastern Adriatic seaboard, best known from its Vegliote variety, spoken on the island of Krk (also called Veglia). Vegliote is principally represented by the linguistic testimony of its last speaker, Tuone Udaina, who died at the end of the 19th century. By the time Udaina’s Vegliote could be explored by linguists (principally by Matteo Bartoli), it seems that he had not actively spoken the language for decades, and his linguistic testimony is imperfect, in that it is influenced, for example, by the Venetan dialect that he habitually spoke. Nonetheless, his Vegliote reveals various distinctive and recurrent linguistic traits, notably in the domain of phonology (for example, pervasive and complex patterns of vowel diphthongization) and morphology (notably a general collapse of the common Romance inflexional system of tense and mood morphology, but also an unusual type of synthetic future form).


Romance in Contact with Albanian  

Walter Breu

Albanian has been documented in historical texts only since the 16th century. In contrast, it has been in continuous contact with languages of the Latin phylum since the first encounters of Romans and Proto-Albanians in the 2nd century BCE. Given the late documentation of Albanian, the different layers of matter borrowings from Latin and its daughter languages are relevant for the reconstruction of Proto-Albanian phonology and its development through the centuries. Latinisms also play a role in the discussion about the original home of the Albanians. From the very beginning, Latin influence seems to have been all-embracing with respect to the lexical domain, including word formation and lexical calquing. This is true not only for Latin itself but also for later Romance, especially for Italian historical varieties, less so for now-extinct Balkan-Romance vernaculars like Dalmatian, and it is doubtful for Romanian, whose similarities with Albanian were strongly overestimated in the past. Many Latin-based words in Albanian have the character of indirect Latinisms, as they go back to originally Latin borrowings via Ancient (and Medieval) Greek, and there is also the problem of learned borrowings from Medieval Latin. As for other Romance languages, only French has to be considered as the source of fairly recent borrowings, often hardly distinguishable from Italian ones, due to analogical integration processes. In spite of 19th-century claims in this respect, Latin (and Romance) grammatical influence on Albanian is (next to) zero. In Italo-Albanian varieties that have developed all over southern Italy since the late Middle Ages, based on a succession of immigration waves, Italian influence has been especially strong, not only with respect to the lexical domain but also by interfering in some parts of grammar.


(High) German  

Simon Pickl

(High) German is both a group of closely related West Germanic varieties and a standardized language derived from this group that comprises a wide range of dialects and colloquial varieties in addition to its standardized form. The two terms have related, and to an extent overlapping, but distinct meanings: German refers to a Standard Average European language spoken predominantly in Central Europe by some 96 million speakers and by minority speech communities around the globe. High German has a double meaning: On the one hand, it is another term for Standard German. On the other hand, it refers to the High German linguistic group within West Germanic, the linguistic basis for the German language. As such, it is defined by the High German consonant shift, a sound change that affected Germanic obstruents and set High German apart from its immediate neighbors within (West) Germanic, that is, Low German and Low Franconian. The High German consonant shift around the 7th century, together with the onset of written transmission in the 8th century, marks the beginning of the history of (High) German. Traditional dialects perpetuate patterns of areal variation that arose in the wake of this sound change. Standard German developed out of High German written varieties, especially based on East Central German, through processes of leveling, koineization, metalinguistic reasoning, and codification. During that process, the emergent supra-regional norm superseded Low German in northern Germany and Upper German regional norms in the south, as well as influencing spoken registers, but (Standard) German remains a pluricentric and pluriareal language. Today, colloquial, regional varieties that combine features of Standard German and traditional dialects dominate oral language use, and in social media the written language, too, is developing new colloquial forms that build on standard orthography as well as on regional, informal forms of spoken language usage.