PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, AFRICAN HISTORY. (c) Oxford University Press USA, 2019. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 23 April 2019

Historical Linguistics: Words and Things

Summary and Keywords

Comparative historical linguistics is an approach comprising a set of methods that historians trained in linguistics employ to reconstruct histories for periods in which written documentation is absent or scant. The use of comparative historical linguistics has helped to push against the notion, popularized in the 19th century, that people living in oral societies had to be deemed prehistorical. The approach is premised on the idea that the rich histories of the words comprising their languages hold troves of knowledge that historians can access and use to write narratives. Core steps of comparative historical linguistics are explained so that readers understand how researchers use modern-day spoken languages to work backward in time to reconstruct the histories of words that name the material items, ideas, and concepts that mattered to speakers of languages prior to the 21st century. The methods’ benefits are discussed, and their limitations highlighted.

Keywords: comparative historical linguistics, proto-language, cognate, etymology, language family, language phyla, phonology, epistemology, lexicostatistics

The curiosity that people have to understand events of historical significance in times past seems innate. So, perhaps, is the inclination to leave records to posterity on matters important in their lifetimes. Of course, the ability to access or to bestow telltale signs of history via material artifacts has limitations that depend on the technologies and media available to, valued by, and known to people during their lifetimes. Since the 16th century, as history as an academic discipline and as a profession has changed, moves to set acceptable parameters regarding valid sources and methods, as well as views about source and method reliability, have unfolded.1 While it was understood that material artifacts, environments, and human interactions each inform historical contexts and contribute to framing and reconstructing historical accounts, a prevailing view developed that written texts comprise the surest and most trustworthy sources for writing histories. Written records, though held to intense scrutiny to discern potential bias or credibility shortcomings, became the favored evidentiary sources for historical methods.

Comparative historical linguists would not deny that written records offer an ideal starting point for historical inquiry. After all, what historian does not appreciate a rich archive of meeting minutes, diaries, newspapers, letters, legal documents, government treatises, and the like to launch a research project? However, the discipline’s preference for written sources as the preeminent historical evidence set up the field in a way that swayed thinking about oral societies and the feasibility of recovering their pasts using credible historical methods. Their histories were deemed unfortunately unreachable and unknowable because it was thought that histories without written text would amount to mere conjecture. As a corollary, the privileging of written texts in historical methods encouraged binary framing of history and societies. By the mid-19th century, the phrase prehistoric societies was in use, relegating to that category those societies that had not left written records to posterity. Implicit in the dichotomy was that there were also historic societies, though not explicitly phrased as such, that had histories retrievable through available written records. Within that binary thinking there developed in tandem a perspective that prehistoric societies did not have much to offer a developing, enlightened modern world. During the 18th and 19th centuries, this idea had a role in emboldening social Darwinist, racist, and colonial agendas on the African continent.2 While some certainly viewed Africa from this perspective, accepting the idea that oral societies’ histories were unimportant, an assumption that was often anchored in, at best, biased, and at worst, racist thinking, others did not accept this as an end point.

Given the scarcity of written records in Africa for eras prior to European encroachment, a process that accelerated on the periphery of the African continent in the 15th century and increased through the long 19th century, historians of pre-1800 history curated a pioneering methodology. Drawing on the methods of historical linguists who researched the history of languages, comparative historical linguists, who sometimes self-identify as historian-linguists, looked to use the history of language to create a scaffolding for accessing the history of people through the words that comprised those languages. The use of comparative historical linguistic approaches in the study of Africa’s past flourished in the 20th century. Building on the work of historical linguists who had classified African languages into four language phyla (often synonymously referred to as language families), anthropologist-historian Jan Vansina and historian Christopher Ehret employed comparative historical linguistic methods toward reconstructing the histories of early African societies. They forged the application of the method in African history that helped to answer the question: can modern-day spoken languages in Africa be used as reliable texts for reconstructing histories through comparative historical linguistic methods? This question has been answered with a decisive “yes.”

Historians of early African history have continued to push the boundaries of methodology, seeking to access history through the practice of using modern-day spoken languages as source material for reconstructing the histories of oral-based and other societies. They specialize in writing the histories of orature-based societies where rich traditions had long been the way to communicate, educate, and pass forward their histories imbued in vocabulary, epics, song, proverbs, and more. Across decades and sometimes working together, archaeologists, anthropologists, historians, and linguists continue to research and narrate what they conclude about Africa’s deep histories. As subfields like ethnoarchaeology, comparative oral traditions, genetics, paleontology, and more develop, they continue to push the field and the state of understanding, continuously asking the question: how far back in time can we rely on the method to generate dependable conclusions?3

Africa’s Language Families

More than two thousand languages and dialects are spoken in Africa.4 Linguists have classified the majority of these languages as belonging to four clusters, termed phyla or families. This means that each of the more than two thousand languages is related to one of four ancestral proto-languages that were spoken many millennia ago. A proto-language is hypothesized to have been the earliest expression of a language that a community of speakers spoke and from which later related languages are descended. Among linguists and other scholars who rely on linguistic classifications in their research, there is broad agreement on the historic homelands and dating for the four major language families in Africa. The languages belonging to the Niger–Congo family date to about 10,000 bce, with origins in the area of the Niger River delta. Afrasian homeland areas run adjacent to the Red Sea, with origins set at approximately 13,000 bce. Nilo-Saharan languages, dated to 11,000 bce, likely had their homeland in the middle stretches of the Nile River, near the confluence of the Blue and White Nile Rivers. And the languages belonging to the Khoisan family had beginnings about 16,000 bce in the region between Lake Tanganyika and the Indian Ocean.5 Across the vast regions of the continent and long ago, speakers of the earliest languages gradually migrated and settled into new areas to establish new homes, and over millennia the number of languages descended from theirs grew to more than two thousand. Yet each of those more than two thousand languages has a history that links back to the original speakers in their ancient homeland areas. It is those histories, embedded in vocabularies and language structures, that linguists and comparative historical linguists use to reconstruct the history of languages and the worlds of the speakers of those languages across time.

The histories of the proposed geographic homelands for the original proto-languages that diverged over time and great distance are the focus of interest for comparative historical linguists. In the context of Africa, the historian-linguist appreciates the way that languages change as speakers of those languages move into new areas. The migration flows of speakers quite naturally spread their languages, ideas, and ways of living as they pushed into new lands.

The Contribution of Comparative Historical Linguistics

Scholars who are trained to reconstruct history using comparative historical linguistic methodologies are often committed to writing the history of precolonial eras in Africa. And their training in the comparative historical method makes it possible to push against the presumption that the histories of oral societies are irretrievable. Across the 20th and 21st centuries, the comparative historical linguistic method has continued to provide a path toward historical recovery, and it has proven a valued method as interdisciplinary and multidisciplinary methods continue to grow.6

Primary source texts are the evidence historians use to reconstruct narratives about the times and places in which those texts were produced, persisted in use, and sometimes remain relevant in the present. Texts can comprise written documents and material artifacts like film, tools, architecture, inscriptions, and more. For comparative historical linguists, written texts are valuable when they are available, but the primary texts they utilize as a source are the words that people speak or reliable records of the words prior generations once spoke; that is, the words comprising modern-day languages are considered primary source texts. For comparative historical linguists, the vocabulary that comprises a language can be used as a portal to the past. The premise is that each word has a history and that the history of each word can suggest things about the people and societies that found those words important enough to use, keep, incorporate, or develop in their corpus of vocabulary. The history of words is accessed through steps taken using comparative historical linguistics.

Historians who use comparative historical linguistic methods focus on what language and word histories—including the ways words and their meanings change over time—can reveal to us about the past and present. However, reaching a place where all of this can be discerned involves a number of steps. One step focuses on etymology. Etymology is the way that the origins of words and the meanings of those words are surmised. Reconstructing the etymology of a word helps to get at the approximate age of the word so that one can estimate when a word came into use, and hence became relevant in a speaker’s vocabulary. Comparative historical linguistics offers a way to understand how societies of speakers came to have particular words within their lexicon. The possible reasons are examined through three questions researchers ask about words: (1) Was a word present in an ancestral spoken language from which the language under study inherited it across generations into their present-day vocabulary? If the answer is yes, then the word is considered to be an inherited word. (2) Did the word derive from a unique context that called for the invention of a new word to name a new item or idea or environment important to speakers at the time it became part of their vocabulary? If the answer is yes, then the word is considered an innovated word. (3) Was a word borrowed from a distinct language into the vocabulary of the language being researched? If the answer is yes, then the word is considered borrowed. Such words are often referred to as loan words or borrowed words. Whether words are inherited, innovated, or borrowed, they are a rich source of historical information about the choices, circumstances, and interactions important to the people who spoke them.

The histories of individual words are used to build sets of vocabularies that can provide nuanced insights into aspects of the conceptual and material knowledge of speakers of those languages. Two brief examples here help to elucidate this point. Unless people are very familiar with a particular technology, say, for instance, the making of pottery, there should be no reason to expect speakers of a language to have a full corpus of words in their vocabularies for the implements, steps, and techniques used in the production of pottery. For folks who only use pots, one would expect they might have words simply to name only types of pots in their vocabularies. Similarly, in languages of people who specialize in deep-sea fishing off the Indian Ocean coasts of East Africa, we would expect those languages to have complex vocabulary related to all facets of that practice and product, whereas one would not expect such vocabularies of speakers of languages who live at an elevation of 3000 meters on the hills of Mount Kilimanjaro. An illuminating example involving the history of cattle-keeping can be found in David L. Schoenbrun’s 1993 article, “We Are What We Eat: Ancient Agriculture between the Great Lakes.”7

To get to a place where comparative historical linguists can make claims about the etymology of a word or set of words, a framework structure for the genetic relationships among the languages must be established first. One way that such a framework can be created is through a close analysis of particular words that modern-day speakers of the languages under investigation use, or through the analysis of available, reliable word lists needed for this initial step.

Language Relationships and Time-Depth

To employ historical linguistics methods in reconstructing histories, historians must either establish the genetic relationships among the languages being studied, or they must have access to reliable studies that have already done this.8 As the application of comparative historical linguistics continues to increase in the fields of both linguistics and history, there is increased access to reliable reconstructions of language relationships. Typically, however, historians relying on comparative historical linguistic methods have taken on the work of establishing language relatedness and time-depth themselves for the subsets of languages that they research. This process entails a number of steps. A first step involves discerning the regular sound change processes, the domain of phonology, that related languages have experienced since they diverged from an ancestral proto-language.

Phonology rests on examining the way that sounds in a language have changed over time. When a new language is born, it happens as part of a divergence process that commonly includes a separation from speakers of the original parent language. When that separation occurs geographically and over a period of centuries, daughter languages might develop new, unique ways of pronouncing words. Daughter languages are those languages that are born from a directly shared proto-language parent. The way those distinct pronunciations come to form part of an emerging dialect, and later a fully developed language, happens with regularity. Thus, languages follow phonological rules; the way words are pronounced in a language is not random, although to native speakers those regular pronunciations feel quite natural, and speakers commonly do not consciously realize that they are following phonological rules in their speech. But when researchers compare modern-day words spoken in distinct related languages, they can detect the regular sound changes that occurred. One well-known example compares the English words father, foot, and fish with the corresponding Latin forms pater, ped, and pisc. The words in each pair are cognates, meaning that the root word was present in an ancestral proto-language from which English and Latin inherited it far back in time. When comparing the three words in both languages, the sound correspondence that is perhaps most obvious is the initial sound of each: /f/ in English and /p/ in Latin. In their common ancestral language, which is called proto-Indo-European, the original sound in word-initial position was /*p/, and the descendant language that became Latin retained the word-initial /p/ in the words it inherited from proto-Indo-European. 
English is a distant but related language to Latin, and long ago, in the line of descent leading from proto-Indo-European to English, there occurred a sound shift in which initial /p/ changed to /f/. In other words, proto-Indo-European /*p/ regularly shifted to /f/ in English. Using linguistic notation, it is stated that proto-Indo-European /*p/ > /f/ in English, while proto-Indo-European /*p/ > /p/ in Latin.9 Once a comparative historical linguist has established all of the phonological rules for the languages under study, those rules become a set of guidelines for testing whether a word in a present-day spoken language was inherited from a parent language, which is indicated when the word follows the regular rules, or whether it was invented, which can be hypothesized when it follows the regular rules but is not in use in a sister or neighboring language. However, if a word is in use but appears to break regular rules and is not in evidence among closely related or even sister languages, those that share a common parent language, then it can be suggestive that the word was inherited from the parent language but might not be in use in all of the modern-day descended languages. This is possible because sometimes speakers of a language let words become obsolete because they are no longer of use to them or because speakers have innovated or borrowed a word that better expresses their idea. In contrast, if a word appears in a related language but its phonology does not show regular sound change correspondence, that is, it does not follow the established rules for that language, then it can be postulated to have been borrowed into that language, either directly from a related or sister language or from speakers of non-related languages with whom they interacted. These kinds of queries and hypotheses are reflective of the comparative approach to drawing together bits of history within and between languages.
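The search for regular correspondences described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure any particular researcher follows: it uses only the three word pairs from the example, it treats the first letter of each spelling as a stand-in for the first sound, and the function names and the `min_count` threshold are assumptions made for the illustration.

```python
from collections import Counter

# Cognate pairs from the example in the text: (English, Latin form).
COGNATE_PAIRS = [
    ("father", "pater"),
    ("foot", "ped"),
    ("fish", "pisc"),
]


def initial_correspondences(pairs):
    """Count how often each pairing of word-initial sounds occurs.

    Simplification: the first letter stands in for the first sound,
    which only works for spellings as transparent as these.
    """
    return Counter((a[0], b[0]) for a, b in pairs)


def regular_correspondences(pairs, min_count=2):
    """Keep only correspondences that recur at least `min_count` times.

    `min_count` is an illustrative threshold, not a standard value.
    """
    counts = initial_correspondences(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_count}


def fits_pattern(english, latin, rules):
    """Check whether a candidate pair shows a known initial correspondence."""
    return (english[0], latin[0]) in rules


rules = regular_correspondences(COGNATE_PAIRS)
print(rules)  # {('f', 'p'): 3}
```

A pair that fails `fits_pattern`, such as one where English has initial /p/ against Latin /p/, would be a candidate for closer inspection as a possible borrowing. Real analyses work from phonetic transcriptions and compare every sound position, not just the first.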

The first goal of recognizing the sound change rules among languages is to open the way to the second step in the method, that is, to determine the prospective relatedness of the languages under investigation. Once regular sound changes are known, step two may involve developing a 100-word list that will be used to determine the rates of shared cognate retention among the related languages being studied.10 This step is called lexicostatistics. To do this, comparative historical linguists have developed lists of words that are familiar to most societies. It is theorized that 100-word list words are more resistant to change than random cultural vocabulary. For example, 100-word lists assume that such words as leg, moon, head, and water are found in all languages, meaning that researchers assume speakers will have words to name those things. What is more, the method assumes that when a speaker of one language interacts regularly with a speaker of a distinct language, neither speaker would be easily inclined to give up a 100-word list word in exchange for the other’s word on the list. However, the use of the 100-word list in determining language relatedness by way of rates of shared cognation also assumes that speakers of related languages will, over long stretches of time, let go of some 100-word list words for other words, just not as easily or at the same rate as they might borrow cultural words from other languages to name items or ideas as variable as foods, for instance.
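The counting behind lexicostatistics can be sketched as follows. The example assumes that cognate judgments have already been made and recorded as cognate-set IDs, so that two words count as cognate when they carry the same ID for the same meaning. The language labels, the IDs, and the four-item excerpt are hypothetical. A real comparison would use the full 100-word list.

```python
from itertools import combinations

# Hypothetical coding: for each language, each meaning maps to the ID of the
# cognate set its word belongs to (None means no usable data was collected).
CODED = {
    "Lang A": {"leg": 1, "moon": 1, "head": 1, "water": 1},
    "Lang B": {"leg": 1, "moon": 1, "head": 2, "water": 1},
    "Lang C": {"leg": 2, "moon": 1, "head": 2, "water": None},
}


def shared_cognate_percentage(a, b):
    """Percentage of comparable meanings whose words fall in the same cognate set."""
    comparable = [m for m in a if a[m] is not None and b.get(m) is not None]
    if not comparable:
        raise ValueError("no comparable meanings")
    shared = sum(a[m] == b[m] for m in comparable)
    return 100.0 * shared / len(comparable)


def pairwise_percentages(coded):
    """Shared-cognate percentage for every pair of languages."""
    return {
        (x, y): shared_cognate_percentage(coded[x], coded[y])
        for x, y in combinations(sorted(coded), 2)
    }


for pair, pct in pairwise_percentages(CODED).items():
    print(pair, round(pct, 1))
```

Meanings with missing data are skipped, not counted as mismatches. That is a design choice of this sketch, and it changes the denominator for the affected pair.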

It is the 100-word list that comparative historical linguists may use to compare percentages of shared cognates between hypothesized related languages. A debated but reliable method called glottochronology relies on the percentage of shared cognates between hypothesized related languages and permits researchers to establish a hypothesized time-depth, that is, how much time separates speakers from a common distant ancestor. The controversial yet still widely relied upon numbers reflected in Table 1 are used to hypothesize the number of years before the present when two languages diverged from a proto-language ancestor. For example, if two modern-day spoken languages share 55 percent of their 100-word list core vocabulary, it can be proposed that their common ancestor existed approximately two thousand years before the present. Similarly, if two languages share only 7 percent of their 100-word list core vocabulary, their common ancestral language might have been spoken approximately nine thousand years before the present. The controversies that have surrounded the use of glottochronology have largely centered on the precision with which one can attach years to the percentages of shared cognates. Though comparative historical linguists are aware of such debates, they are not using the rates of shared cognation to make absolutist claims about the exact number of years that separate modern languages from their common ancestor; rather, they use the rate of shared cognation to suggest relative time-depth relatedness for the languages being researched. Broadly, glottochronology is useful to the historian because it creates a means of establishing a relative stratigraphy across time. This is a step analogous to what archaeologists do to establish dates for sites in which they find multiple layers of artifacts within the earth. 
Radio-carbon dating helps them validate, to certain degrees of certainty, that what lies on the lower levels is indeed older than the items closer to the surface. Neither glottochronology nor radio-carbon dating is a precise measure, yet each aids in establishing relative relationships across time.
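The two worked figures above can be approximated with the classic constant-rate formula for glottochronology, t = ln(c) / (2 ln(r)), where c is the shared cognate fraction and r is the fraction of the list a single language retains per millennium. The sketch below assumes r = 0.86, the figure conventionally cited for the 100-word list. Neither the formula nor that constant is given in this article, and the median values in Table 1 need not follow the formula exactly.

```python
import math

# Assumption: constant-rate model with about 86 percent of the 100-word list
# retained per language per millennium. This constant is not from the article.
RETENTION_PER_MILLENNIUM = 0.86


def years_since_divergence(shared_fraction, r=RETENTION_PER_MILLENNIUM):
    """Estimate years since two languages split from their shared cognate fraction."""
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    # Both languages change independently, hence the factor of 2.
    millennia = math.log(shared_fraction) / (2 * math.log(r))
    return 1000 * millennia


for pct in (55, 7):
    years = years_since_divergence(pct / 100)
    print(f"{pct} percent shared: roughly {round(years, -2):.0f} years")
```

Under these assumptions, 55 percent comes out close to two thousand years and 7 percent a little under nine thousand, in the same range as the figures quoted in the text. The output is best read as a rough ordering of splits, in keeping with the caution above about attaching exact years to percentages.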

Table 1. Glottochronology-Derived Dating Estimates

Rough Median Dating in Years Before Present (bp) | Median Common Retention Percentage Rates Between Related Languages

Source: Data from Christopher Ehret, “Testing the Expectations of Glottochronology against the Correlations of Language and Archaeology in Africa,” in Time Depth in Historical Linguistics, vol. 2, ed. A. McMahon, C. Renfrew, and L. Trask (Cambridge, UK: McDonald Institute for Archaeological Research, 2000), 395.

Once a glottochronology-based timeline for related languages’ rates of shared cognation has been established, the historian can build a dendrogram to communicate the relatedness of languages in a way that captures the stratigraphy of their divergence over time. Dendrograms are similar to what one might call a family tree. Each node on a tree is the point at which new branches or, in the case of languages, new daughter languages are born. For illustrative purposes, look at the dendrogram in Figure 1. It covers ten languages spoken on the coast of Tanzania, a subset known as the Ruvu languages, which in turn comprise only ten of the more than 450 languages of the vast Bantu language family within the Niger–Congo phylum. The dendrogram suggests time-depth by percentages of shared 100-word list cognates and shows postulated branches of relatedness among modern-day Ruvu languages, represented by solid lines.11

Figure 1. Proto-Ruvu divergence and internal group contact. From Societies, Religion, and History: Central East Tanzanians and the World They Created, c. 200 bce to 1800 ce, Columbia University Press, 2008.

Drawing from this example, one can surmise a number of suggestions present in the dendrogram illustration. Referring to the glottochronology table, the time-depth for the existence of the proto-Ruvu ancestral language sits at a mean of 65.5 percent, suggesting that sometime during the 6th century, about one thousand five hundred years before the 21st century, proto-Ruvu began a process of divergence into its daughter languages.12 Moreover, the legend in the upper-right corner of the figure suggests that there were likely points of increased interaction that are reflected in cultural vocabulary, which is not included in the 100-word list. Discerning those points of increased interaction is achievable because of the comparative method, which rests on the process of establishing the rules that frame regular sound changes using 100-word lists, determining cognates, and using glottochronology and dendrogram illustrations. When words are identified as borrowed, researchers see this attested in irregular sound correspondences. In other words, the words do not map on to the phonological system in the way expected of an inherited or innovated word.
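To make the idea of a dendrogram built from shared cognate percentages concrete, the sketch below groups languages by repeatedly joining the two clusters with the highest average percentage (average linkage). The article does not say which grouping procedure produced Figure 1, so the procedure here is an assumption chosen for simplicity. The four language labels and all percentages are hypothetical.

```python
def build_dendrogram(languages, similarity):
    """Agglomerative clustering on shared-cognate percentages.

    Repeatedly merges the two clusters with the highest average
    similarity. Returns a nested tuple (left, right, percent_at_node);
    leaves are language names.
    """
    def leaves(tree):
        if isinstance(tree, str):
            return [tree]
        return leaves(tree[0]) + leaves(tree[1])

    def average(t1, t2):
        pairs = [(a, b) for a in leaves(t1) for b in leaves(t2)]
        return sum(similarity[frozenset(p)] for p in pairs) / len(pairs)

    clusters = list(languages)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = average(clusters[i], clusters[j])
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        merged = (clusters[i], clusters[j], round(score, 1))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical shared-cognate percentages for four unnamed languages.
SIMILARITY = {
    frozenset(("A", "B")): 85.0,
    frozenset(("C", "D")): 80.0,
    frozenset(("A", "C")): 60.0,
    frozenset(("A", "D")): 58.0,
    frozenset(("B", "C")): 61.0,
    frozenset(("B", "D")): 59.0,
}

tree = build_dendrogram(["A", "B", "C", "D"], SIMILARITY)
print(tree)  # (('A', 'B', 85.0), ('C', 'D', 80.0), 59.5)
```

Each inner tuple is a node: A and B join at 85 percent, C and D at 80 percent, and the two pairs join at 59.5 percent, which would mark the oldest split. A tree like this captures only divergence. Contact between branches, such as the internal group contact indicated in Figure 1, has to be added from the evidence of borrowed vocabulary.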

Once the relatedness among languages is established and the dated stratigraphy and stages of divergence across time based on rates of shared cognation are outlined, it is then possible to move on to proposing a mapping of their histories across geographical settings. The standard way this is done is by applying “Occam’s razor” or the “law of parsimony” to sketch the simplest migration mapping possible to propose the location of the proto-language from which the languages under investigation descended. One begins by plotting the distribution of the modern-day spoken languages on a map and then works in reverse chronological order to hypothesize the most likely location of the proto-language shared by sister languages. Then one continues backward from there until one reaches the likely homeland for the proto-language from which the modern-day descended languages diverged to fill in across the landscape. In the case of the Ruvu languages, the proto-Ruvu homeland was likely in the area between the Ruvu and Wami Rivers near the Indian Ocean coast.13

Why the Study of Words?

Using lists of 100-word core vocabularies to understand the regular phonological changes is one way to establish the sound change rules, after which it is possible to postulate whether words were borrowed from another language, inherited with the same meaning from a parent language, or innovated out of need by the speakers at a point in time. Once that is done, examining the etymologies, or origins, of words specific to cultural vocabulary becomes a rich source of synchronic and diachronic knowledge that can inform the historian about the people who spoke them. This step in establishing etymologies is most effective when a rich database of words pertaining to the topic or subjects under investigation is available, whether collected by the researcher or available through a published source. Word data for languages in Africa must be gathered from a scattered collection of sources; unfortunately, there is no comprehensive database available.

The words humans speak can be viewed as a portal into their experiences and thoughts, which might have relationships or connections to tangible things or exist solely as an abstract notion. In studying words, researchers recognize that the words people used for things and ideas in their time and place were meaningful to them.14 And in studying the change and innovation of words across time, it is possible to learn how words retain meaning, shift meaning, are exchanged for new words, or even become obsolete in one language whereas in a sister or neighboring language, the word might persist because of its usefulness. The use of words does not exist in isolation; it is the relationship among sets of words that adds dimension to our ability to understand the dense and conscious and unconscious relationships and functions that words serve among speakers of a language. When examining a group of implements for cooking foods, for example, researchers might learn the names for the complete set of implements and their particular functions. For instance, some are for grinding, stirring, mixing, boiling, or frying. And related to these words, by digging in, researchers might find out about the technologies used to create the implements, the specialists who created them, where they were sold and purchased, and where they were used (e.g., outside or inside of a structure or far or near from a home). Researchers may be curious about the products that were cooked, and who cooked them. The possibilities are limitless.

Semantic Fields

Words can be revealing of worlds within worlds. In any given society, people are involved in innumerable actions that can be tied to ideas and material items. The work of comparative historical linguists tries to uncover bits and pieces of past worlds that are left in the words used among present-day speakers. While the history of words in isolation may be interesting, the work of researchers is always strengthened when a context for words and actions can be fleshed out with nuance and detail. One way that researchers get at that context and detail is by defining their research projects around sets of questions for which sets of words related to the subjects can be collected. Sets of words that are related to a subject or narrow topic within a subject open up the opportunity to build a corpus of vocabulary that is termed a “semantic field.” The more relevant words whose histories a researcher can reconstruct, the more comprehensive the resulting representation of the past. Similar to the cooking implement example (see the section “Why the Study of Words?”), comparative historical linguists can use words of interest to test whether semantic fields are relevant to speakers of a language in which they are interested. As they continue their research with speakers of a language, they might find that words that represent ideas or tangible things, which were at first on their own agenda for inquiry, have no relevance to the speakers. Semantic fields are thus typically refined during fieldwork. The brilliance of working with semantic fields is that they become repositories of epistemology that permit the historian to glimpse the societal priorities and concerns of speakers who lived in the past and who are living in the present. The sets of words comprise conceptual fields related to subjects, providing synchronic and potentially diachronic understandings that can emerge through comparative and long-range studies. 
For instance, understandings about the composition of economic structures, philosophies, families, institutions of governance, and ways of transmitting education across generations can be the focus.

To bring robust semantic fields to life, researchers need to spend hours in the field learning directly from modern-day language speakers who have knowledge about the subjects of interest. Additionally, researchers can prepare, augment, and interrogate their research and that of others using published or newly acquired ethnographic accounts, oral traditions, proverbs, songs, and other documents.

In the historiography on early African history that has relied on comparative historical linguistics as a primary method, many of the semantic fields that have been probed have concerned subjects and historical processes that are believed to be tied to agriculture and farming, pottery and iron-making production, the domestication of animals, and concepts related to family structures, lineage affiliations and inheritance, as well as important life stages when people acquired appropriate knowledge and entered into new roles and responsibilities within their societies. Importantly, semantic fields can link conceptually and tangentially to form sets that cover multiple possibilities of application or function, but they are not always obvious in their usage to outsiders. The differences between tying and knotting, for instance, likely mean specific things to a seamstress, others to a cub scout, and something wholly distinct for a person who makes their livelihood by fishing. Similarly, in studying the idea or concept of knotting or tying within the context of finding connections of meaning, in a particular language knots might function as a metaphor for describing relationships that bind or tie people together. In this way, a verb meaning “to knot or tie” might well have relevance in one’s understanding of how families or groups understand their relatedness and perhaps responsibility to the unit. Fleshing out the worlds in which people lived and conceptualized through the development of semantic fields takes time, patience, as well as close analyses and comparisons of data.15

Once groups of words related to ideas or ways of doing things are identified, researchers are at a good starting point to begin writing narratives that incorporate the histories of those words. There are different ways to do this. Some might take an approach where the etymology of each word is laid out for the reader so that they have a firm understanding of the ancientness or age of the words and have a framework for understanding the way the word’s meaning and use might have shifted over time. Then they might launch into the narrative history. Another approach would be to write the narrative while incorporating an explanation of etymology. An example of this touches on the histories of early Niger–Congo eras of more than ten thousand years ago, as well as developments in later Bantu times. From their pasts, words that are still in use across numerous descendant language communities are used to develop narratives about a core epistemology related to ideas of theism and health that are ancient, yet evolving over time. These include root words for “god” (-amb-), “spirit or ancestor” (-dim-), and “to become fitting, straight, right” (-dung-).16 An example of how writing such a narrative might look is presented here. Niger–Congo-speaking communities made their homes in the Niger Delta regions of West Africa about 10,000 bce. Using the reconstruction of word histories, researchers have established some core ideas that comprised Niger–Congo and descendant speakers’ epistemology. Epistemology refers to the way people understand their knowledge and world, and it is key to establishing the assumptions that they had about the world. For instance, one idea that Niger–Congo speakers adhered to as early as twelve thousand years ago was that a force, something that might be called a creator or god, was responsible for making the world. That idea is captured in the word Nyambe, which is built on the verb root -amb-, whose meaning is “to begin.” Since Niger–Congo times and through the generations of people who came after them, the idea of a single creator has resonated and persisted. Sometimes the root continues to be used for god, but other times speakers of languages descended from Niger–Congo kept the core idea of a monotheistic god, yet they innovated a new word for it. This is what happened in the last millennium bce when southern Kaskazi peoples innovated the word mulungu, derived from a proto-Bantu verb root meaning “to become fitting, straight or right,” to name the idea of a creator. Across more than ten thousand years of history, from Niger–Congo times in western Africa to southern Kaskazi eras in eastern Africa, the epistemological idea persisted that a single monotheistic force, a creator or god, was responsible for making the world even though the word that named it was changed. Interestingly, the idea of a creator god was conceptualized as distant, not one that people would propitiate or honor in day-to-day living. An examination of Bantu history provides evidence suggestive of the way that their Niger–Congo forebears may have understood relationships with ethereal forces. By 3500 bce, early Bantu-speaking people understood that ancestor or spirit forces, identified by the ancient root -dim-, had the ability to influence living people’s lives in positive or negative ways. They also recognized that those forces responded favorably to being remembered through propitiation ceremonies. This core idea has endured across more than five thousand five hundred years of Bantu history and thus offers a glimpse into the epistemological reasons that Bantu-descended people preserved their diverse practices of venerating ancestors. This brief narrative demonstrates the way historians might write narrative using the history of words.

History and Context

Just as words do not exist in a vacuum, neither does research, nor should it. Correlations among conclusions arrived at through distinct methods of inquiry are invaluable in any field of research. While care must be taken to avoid incongruent comparisons, there are opportunities to test hypotheses from one methodological approach with those arrived at by other methods for similar places and times. Thus, the idea of using a multimethod approach when historical linguistics is a primary method is not only appealing, but common. The integrity of any finding in any field is only strengthened when researchers using distinct methods arrive at conclusions that corroborate one another’s findings. Since the mid-20th century, and increasingly so in the 21st with advancements made in genetics research, findings from archaeology, anthropology, botany, and paleontology have aligned with and further informed conclusions reached using historical linguistics. For instance, compelling studies exist on human genetic history, particularly those that earlier centered on biological female mitochondrial DNA (mtDNA) and those that more recently have focused on Y chromosome research associated with biological males. These studies have added information to, and complicated, the migration patterns formerly proposed using linguistic methods.17 In May 2017, an article that appeared in Science added to understandings about the interactions of Bantu-speaking people with others on the African continent. Those findings added to migration histories that earlier had been proposed through linguistic analyses. The genetics research further contributed to knowledge about the diversity of Bantu-descended people who were forcibly removed to the Americas in the transatlantic enslavement era.18 Advances along these lines provide an avenue to address limitations that come from working with limited datasets, a hindrance that continues to be addressed through researchers’ ability to link independent datasets for more comprehensive analyses.19

There is little doubt that with more than two thousand languages and dialects spoken today on the continent of Africa, there is much more work to be done and there is room for an army of scholars to enter the field. In addition, there is an urgency with regard to the study of languages, as the number of languages that are endangered and dying is increasing. Because of the state of advancement in genetics and botany, the growing number of approaches to undertaking satellite and aerial archaeological research, and other methods, there are plentiful opportunities for comparative historical linguistics to contribute rich insights into the meanings of those things that are uncovered.20

Discussion of the Literature

Linguists’ research in the 19th century toward classifying Africa’s languages into phyla or families set the framework for the corpus of scholarship that comparative historical linguists took up as a primary method for reconstructing early African history, especially since the mid-20th century. Across this period, there have ensued healthy debates of two sorts. One thread has centered on the ways that historians and linguists have doubted, often for differing reasons, the use of linguistic methods for reconstructing history. A second thread has centered on the conclusions drawn about early history using comparative historical linguistics.

Along the first thread, historians who have turned a cautious eye to the use of linguistics are typically reserved because of their lack of familiarity with the data and the methods. Learning how to do the work of a comparative historical linguist usually requires adding a subfield specialization to one’s training. That training, while salient, clear, and meaningful to those who go on to employ it in their research, does not in itself address the skepticism of historians who rely on written documentation. The challenge here is that teaching just what comparative historical linguistics is and does takes time, which researchers do not usually have with their colleagues. In part, this has created a challenge for training more historians who use the methods. As of the early 21st century, existing researchers have been largely trained by those who pioneered the field of comparative historical linguistics, anthropologist-historian Jan Vansina and historian Christopher Ehret, or by their students. Related to this are the inside debates among linguists and historians who use linguistics regarding the limitations and shortcomings of methods that use 100-word lists, glottochronology, dendrograms, and related methods to hypothesize the relatedness, divergence, and time-depth of languages. Acknowledgment of the imperfect science has been made, but for those scholars who have trained in the method, the course of action has been to continue applying and refining techniques as they are employed, along with having an eye on the corollary research that is being done in other disciplines and against which independent hypotheses can be juxtaposed and analyzed. Among the relatively small group of scholars using comparative historical linguistics, the notion of not using the method would amount to walking away from a discipline that has seemed to accept the idea that there is no way to reconstruct the histories of oral societies. At this point in the history of the discipline, the conclusions reached using comparative historical linguistics have been too fruitful to consider that option.

The second broad thread concerns historians and linguists who inspired early, healthy debates about the migration pathways by which speakers of languages diverged as they gradually settled across the vast African landscape. A core debate is tied to the Bantu language family, a subset of Niger–Congo, with an eye on whether there is evidence to support an early or late split of early Bantu-speaking migration into east and west streams, which laid the foundation for the more than four-millennia period of language divergence and population migration and settlement across much of sub-Saharan Africa. Select seminal early works on this point include Christopher Ehret’s 1972 article, “Bantu Origins and History: Critique and Interpretation,” followed by “Patterns of Bantu and Central Sudanic Settlement in Central and Southern Africa (c. 1000 bc–500 ad).” Later, Vansina’s two articles, “Bantu in the Crystal Ball, I,” published in 1979, and “Bantu in the Crystal Ball, II,” in 1980, as well as his “Western Bantu Expansion” article in 1984, added new perspectives. This debate was carried forward into the 1990s and beyond in Vansina’s Paths in the Rainforests and in Ehret’s “Bantu History: Re-envisioning the Evidence of Language.” While those debates remain salient, research continues.21

Researchers in the early 21st century have continued to focus their work on the classification of subsets of languages, with lesser focus on the migration stream debates, tending instead to turn toward reconstructing narratives centered on the interactions among and shifting sociopolitical cultures of speakers of distinct languages. Among these are the works of David L. Schoenbrun and Rhiannon Stephens, which focused on the African Great Lakes region and Uganda, respectively, and that of Kairn Klieman (The Pygmies Were Our Compass), which added to the seminal narrative that Jan Vansina set forth in Paths in the Rainforests.22 Some significant examinations have centered on the question of lineage and gender in Christine Saidi and Rhonda M. Gonzales’s research on women, gender, and authority in east-central and central-east Africa. Based on comparative vocabulary, there have been robust debates centered on whether the earliest Bantu speakers were patrilineal or matrilineal. The ongoing debates have leaned in the direction of matrilineality being the first social structure by which formation of the family, belonging, and inheritance were recognized in the earliest Bantu-speaking communities (c. 3500 bce).23

An added challenge for comparative historical linguistic methods has centered on the biases of both past and current researchers in framing and conducting research. Working on past times, places, and cultures with limited bias requires reflexivity, so that one’s episteme and related assumptions about the world are not imposed on words, ideas, and things for which they have no relevance. For instance, a proto-Bantu root word meaning “infertile person,” *-gumba, is attested in widespread Bantu languages even today. In the majority of instances, it is gender neutral in application to biological males and females. However, in collected published dictionaries, its meaning was usually glossed as “infertile woman,” or “barren woman,” leaving the reader with the incomplete impression that biological men were not thought infertile along similar lines. This sort of assumption, though perhaps seemingly negligible in the larger scheme of things, means that when using available language data that one has not collected oneself, the reliability of the data and the meanings ascribed to them must be scrutinized. In the case of infertility, recorders’ biases may have been informed by the idea that biological women are infertile while men are sterile, a distinction not attested in Bantu-derived language evidence. When sources are dictionaries, for instance, which oftentimes were the outcome of projects undertaken by missionaries who wanted to translate their texts into local languages, as well as use their texts as aids in teaching locals about proper cultural ways, one must remain vigilant toward the biases in translation and meaning that intentionally or inadvertently occurred. Likewise, researchers must be cognizant of placing assumptions on words that are framed by their non-native language-speaking episteme onto the history of peoples who may or may not have shared such a worldview. This reflexive practice benefits all researchers, but it is especially important in not telescoping one’s present understanding and assumptions onto the distant past.

Primary Sources

Sources available to people interested in exploring the use of the comparative historical linguistic method include published dictionaries. Although published dictionaries are not comprehensive or evenly available across languages, what is available can serve as a starting point for creating vocabulary lists. Online, Ethnologue: Languages of the World is an ever-growing searchable tool that maintains profiles for the world’s languages. It is updated regularly with statistics, tools, and guides that can be formatted in various ways. Additionally, there are some online databases that have been created as a starting point for Bantu languages. One is the Bantu Lexical Reconstructions 3 database (BLR 3), wherein approximately 10,000 proposed proto-Bantu reconstructions are available in a searchable interface. BLR 3 builds on the mid-20th-century work of A. E. Meeussen, who wrote Bantu Lexical Reconstructions, which was posthumously published in 1980. BLR 3 was built as an online tool for researchers and linguists interested in the Bantu languages. The site explains that it “is not a finished product, it is continuously being updated by its present editors (Yvonne Bastin and Thilo C. Schadeberg).” An extensive bibliography of the source material used in building the database is provided on the website.24 Additionally, the Comparative Bantu Online Dictionary (CBOLD) is described as follows on its site: “The CBOLD project was started in 1994 by Larry Hyman and John Lowe to produce in Berkeley a lexicographic database to support and enhance the theoretical, descriptive, and historical linguistic study of the languages in the important Bantu family.”25 By and large there is a great deal of work still needed to create reliable and sharable datasets for Africa’s more than two thousand languages.

Further Reading

Blench, Roger. Archaeology, Language, and the African Past. Lanham, MD: AltaMira Press, 2006.

Bostoen, Koen. “Historical Linguistics.” In Field Manual for African Archaeology, edited by Alexandre Livingstone Smith, Els Cornelissen, Olivier P. Gosselain, and Scott MacEachern, 257–260. Tervuren, Belgium: Royal Museum for Central Africa, 2017.

Ehret, Christopher. A Historical-Comparative Reconstruction of Nilo-Saharan. Cologne: Rüdiger Köppe, 2001.

Ehret, Christopher. “Bantu Expansions: Re-Envisioning a Central Problem of Early African History.” International Journal of African Historical Studies 34, no. 1 (2001): 5–41.

Ehret, Christopher, S. O. Y. Keita, Paul Newman, and Peter Bellwood. “The Origins of Afroasiatic.” Science 306, no. 5702 (2004): 1680–1681.

Ehret, Christopher. History and the Testimony of Language. Berkeley, CA: University of California Press, 2010.

Fields-Black, Edda L. “Untangling the Many Roots of West African Mangrove Rice Farming: Rice Technology in the Rio Nunez Region, Earliest Times to c. 1800.” Journal of African History 49, no. 1 (2008): 1–21.

Fourshey, Catherine Cymone, Rhonda M. Gonzales, and Christine Saidi. Bantu Africa: 3500 bce to Present. New York: Oxford University Press, 2017.

Gonzales, Rhonda. Societies, Religion, and History: Central-East Tanzanians and the World They Created, c. 200 BCE to 1800 CE. New York: Columbia University Press, 2009.

Greenberg, Joseph H. “Linguistic Evidence Regarding Bantu Origins.” Journal of African History 13, no. 2 (1972): 189–216.

Greenberg, Joseph H. Review of Linguistic Diversity in Space and Time, by Johanna Nichols. Current Anthropology 34, no. 4 (August 1993): 503–505.

Guthrie, Malcolm. Comparative Bantu: An Introduction to the Comparative Linguistics and Prehistory of the Bantu Languages. Upper Saddle River, NJ: Gregg Press, 1971.

Heine, Bernd, and Derek Nurse, eds. African Languages: An Introduction. Cambridge, UK: Cambridge University Press, 2000.

Klieman, Kairn A. The Pygmies Were Our Compass: Bantu and Batwa in the History of West Central Africa, Early Times to c. 1900 ce. Portsmouth, NH: Heinemann, 2003.

Luna, Kathryn M. de. Collecting Food, Cultivating People: Subsistence and Society in Central Africa. New Haven, CT: Yale University Press, 2016.

Nurse, Derek. “The Contributions of Linguistics to the Study of History in Africa.” Journal of African History 38, no. 3 (1997): 359–391.

Schoenbrun, David L. A Green Place, a Good Place: Agrarian Change and Social Identity in the Great Lakes Region to the 15th Century. Portsmouth, NH: Heinemann, 1998.

Saidi, Christine. Women’s Authority and Society in Early East-Central Africa. Rochester, NY: University of Rochester Press, 2010.

Simons, Gary F., and Charles D. Fennig, eds. Ethnologue: Languages of the World. 21st ed. Dallas: SIL International, 2018.

Stephens, Rhiannon. A History of African Motherhood: The Case of Uganda, 700–1900. Cambridge, UK: Cambridge University Press, 2013.

Vansina, Jan M. Paths in the Rainforests: Toward a History of Political Tradition in Equatorial Africa. Madison: University of Wisconsin Press, 1990.


(1.) Maarten Couttenier, “No Documents, No History,” Museum History Journal 3, no. 2 (2013): 130, 123–148.

(2.) Harry Hamilton Johnston, A Survey of the Ethnography of Africa, and the Former Racial and Tribal Migrations in that Continent (London: Royal Anthropological Institute of Great Britain and Ireland, 1913).

(3.) Derek Nurse, “The Contributions of Linguistics to the Study of History in Africa,” Journal of African History 38, no. 3 (1997): 359–391.

(5.) Christopher Ehret, The Civilizations of Africa: A History to 1800 (Charlottesville: University of Virginia Press, 2002), 37, Map 4.

(7.) David L. Schoenbrun, “We Are What We Eat: Ancient Agriculture between the Great Lakes,” Journal of African History 34, no. 1 (1993): 1–31.

(8.) Derek Nurse, “The Contributions of Linguistics to the Study of History in Africa,” Journal of African History 38, no. 3 (1997): 359–361.

(10.) There have been word lists comprised of more than 100 words; 100-word list is used only as an example.

(11.) Gonzales, Societies, illustrations, Figure 1.

(12.) Gonzales, chap. 1, para. 38.

(13.) The link provided here leads to a slideshow where one can see an illustration of the suggested process; Gonzales, Societies, chap. 1, para. 38.

(14.) David Henige, “Inscriptions Are Texts Too,” History in Africa 32 (2005): 185, 188, 191.

(17.) Loredana Castrì, Sergio Tofanelli, Paolo Garagnani, Carla Bini, Xenia Fosella, Susi Pelotti, Giorgio Paoli, Davide Pettener, and Donata Luiselli, “MtDNA Variability in Two Bantu-Speaking Populations (Shona and Hutu) from Eastern Africa: Implications for Peopling and Migration Patterns in Sub-Saharan Africa,” American Journal of Physical Anthropology 140, no. 2 (October 2009): 302–311; Eva K. F. Chan, Rae-Anne Hardie, Desiree C. Petersen, Karen Beeson, Riana M. S. Bornman, Andrew B. Smith, and Vanessa M. Hayes, “Revised Timeline and Distribution of the Earliest Diverged Human Maternal Lineages in Southern Africa,” PLoS One 10, no. 3 (March 2015): e0121223; Margarida Coelho, Fernando Sequeira, Donata Luiselli, Sandra Beleza, and Jorge Rocha, “On the Edge of Bantu Expansions: MtDNA, Y Chromosome and Lactase Persistence Genetic Variation in Southwestern Angola,” BMC Evolutionary Biology 9 (January 2009): 1–18; Verónica Gomes, Maria Pala, Antonio Salas, Vanesa Álvarez-Iglesias, António Amorim, Alberto Gómez-Carballa, Ángel Carracedo, et al., “Mosaic Maternal Ancestry in the Great Lakes Region of East Africa,” Human Genetics 134, no. 9 (September 2015): 1013–1027; and Doug Jones and Bojka Milicic, eds., Kinship, Language, and Prehistory: Per Hage and the Renaissance in Kinship Studies (Salt Lake City: University of Utah Press, 2011).

(18.) Etienne Patin, Marie Lopez, Rebecca Grollemund, Paul Verdu, Christine Harmant, Helene Quach, Guillaume Laval, et al., “Dispersals and Genetic Adaptation of Bantu-Speaking Populations in Africa and North America,” Science 356, no. 6337 (May 5, 2017): 543–546; and Daine J. Rowold, David Perez-Benedico, Oliver Stojkovic, Ralph Garcia-Bertrand, and Rene J. Herrera, “On the Bantu Expansion,” Gene 593, no. 1 (November 15, 2016): 48–57.

(19.) Koen Bostoen, “Pots, Words and the Bantu Problem: On Lexical Reconstruction and Early African History,” Journal of African History 48, no. 2 (2007): 130.

(20.) Sean Reid, “Satellite Remote Sensing of Archaeological Vegetation Signatures in Coastal West Africa,” African Archaeological Review 33, no. 2 (2016): 163–182; Isabelle C. Winder, “Landscape Structures and Human Evolutionary Ecology: Space, Scale and Environmental Patterning in Africa,” Internet Archaeology 38 (March 2015); and Ceri Ashley, “Towards a Socialised Archaeology of Ceramics in Great Lakes Africa,” African Archaeological Review 27, no. 2 (2010): 135–163.

(21.) Christopher Ehret, “Bantu Origins and History: Critique and Interpretation,” Transafrican Journal of History 2, no. 1 (1972): 1–9; Christopher Ehret, “Patterns of Bantu and Central Sudanic Settlement in Central and Southern Africa (c. 1000 bc–500 ad),” Transafrican Journal of History 3, no. 1/2 (1973): 1–71; Jan Vansina, “Bantu in the Crystal Ball, I,” History in Africa 6 (1979): 287–333; Jan Vansina, “Bantu in the Crystal Ball, II,” History in Africa 7 (1980): 293–325; and Jan Vansina, “Western Bantu Expansion,” Journal of African History 25, no. 2 (1984): 129–145.

(22.) Klieman, The Pygmies Were Our Compass; Schoenbrun, A Green Place; Fields-Black, Deep Roots; and Stephens, A History of African Motherhood.

(24.) A. E. Meeussen, “Bantu Lexical Reconstructions 3” (Tervuren, Belgium: Royal Museum for Central Africa).

(25.) Larry Hyman and John Lowe, “Comparative Bantu Online Dictionary” (CBOLD), 1994.