Etymology and the Lexical Core of Germanic  

Robert Mailhammer

Etymologies are statements about the origin and history of linguistic items (words and structures). Typically, an etymology states in which historical period of a language a word or structure was created and what kinds of processes were involved, and it traces the item’s subsequent history. Usually, etymologies involve the reconstruction of part or all of an item’s history, including the original formation. A reconstruction is a hypothesis about the form and meaning of an ancestral form and the changes it has undergone to yield the oldest attested form. This hypothesis is based on language-internal data, data from related languages, and our knowledge of language change. The use of comparative data is key for determining and reconstructing the ancestral form of a linguistic item. One important property of reconstructions, and hence of etymologies, is that they are probabilistic; that is, they are hypotheses that are more or less likely to be correct. Etymologies of high quality have a high level of reliability or confidence, whereas etymologies of low quality are generally only weakly supported. A range of factors influences the quality of an etymology, and it is important to make clear how well supported etymologies are when considering the etymological situation of the whole or a part of the vocabulary of a language. Two pivotal factors are the degree to which sound correspondences and related changes are regular and the strength of the correspondence pattern in terms of correspondence sets and equations. There is a significant body of etymological research on Germanic. This work can be broadly categorized into studies that etymologize words in a given daughter language and studies that take a more comparative approach. The focus of the literature has been on finding connections within the Indo-European family and explaining Germanic and its lexicon in terms of their development from Proto-Indo-European.
Nonetheless, it is well known that the Germanic lexicon contains loans from other Indo-European languages, especially from Celtic and Latin, such as PGmc. *tūna- ‘fence’ (e.g., OHG zūn ‘fence’), borrowed from Proto-Celtic *dūno ‘fort, rampart’. It is also common knowledge that a substantial part of the Germanic vocabulary is of unclear origin. The exact proportion of non-etymologized vocabulary in the Germanic lexicon is unknown, but existing quantitative data suggest that the standard figure of one third quoted in the literature is too low. However, the mainstream literature has not systematically investigated Germanic words of unknown origin with the aim of finding contact etymologies that satisfy the standard requirements of contact linguistics. Since the second half of the 20th century, non-Indo-European elements in the Germanic lexicon have received more attention. The majority of hypotheses involve substratum languages. By contrast, one key observation, based on what is known about outcomes of language contact and supported by well-studied cases, is that some of these non-etymologized words were quite likely borrowed from non-Indo-European languages, and that at least some of them are likely to come from a superstratum rather than a substratum. Relevant lexical items belong to semantic domains such as warfare, the legal system, and administration, for example, PGmc. *fulka- ‘division’ (of an army), *sibjō ‘family, clan’, *aþal-/*aþil- ‘nobility, noble’. Moreover, non-etymologized words relating to superior cultural innovations, for example, terms for coins (PGmc. *skellingaz/*skillingaz ‘shilling’ and PGmc. *pan(n)(d)ing ‘penny’) and agricultural innovations (PGmc. *plōg- ‘(wheel) plough’), also fit better with superstratum influence than with substratum influence.
Furthermore, it is important to highlight that words of unknown origin form part of the lexical core of Germanic, for example, *erþō ‘earth’, *handuz ‘hand’, *stainaz ‘stone’, *drinkanan ‘drink’. Whatever the precise origin of the hitherto non-etymologized words in the PGmc. lexicon, it is to be expected that a sizable part of them is of non-Indo-European origin. Given the significant implications for the cultural history of the people who spoke Proto-Germanic and their contemporaries, it seems well worth investigating the extra-Indo-European connections of Proto-Germanic in spite of the challenges.