Afroasiatic languages are the fourth largest linguistic phylum, spoken by some 350 million people in North, West, Central, and East Africa, in the Middle East, and in scattered communities in Europe, the United States, and the Caucasus. Some Afroasiatic languages, such as Arabic, Hausa, Amharic, Somali, and Oromo, are spoken by millions of people, while others are threatened with extinction. As of the early 21st century, the phylum is composed of six families: Egyptian (extinct), Semitic, Cushitic, Omotic, Berber, and Chadic. There are some typological features shared by all families, particularly in the domain of phonology. The languages are also typologically quite distinct with respect to syntax and the functions encoded in their grammatical systems.
Some Afroasiatic languages, such as Egyptian, Akkadian, Phoenician, Hebrew, Arabic, and Ge’ez, have a long written tradition, but for many languages no writing system has yet been proposed or adopted. The Old Semitic writing system gave rise to the modern alphabets used in thousands of unrelated contemporary languages. Two Semitic languages, Hebrew (with some Aramaic) and Arabic, were used to write the Old Testament and the Koran, the holy books of Judaism and Islam.
Alan Reed Libert
Artificial languages—languages which have been consciously designed—have been created for more than 900 years, although the number of them has increased considerably in recent decades, and by the early 21st century the total figure probably was in the thousands. There have been several goals behind their creation; the traditional one (which applies to some of the best-known artificial languages, including Esperanto) is to make international communication easier. Some other well-known artificial languages, such as Klingon, have been designed in connection with works of fiction. Still others are simply personal projects.
A traditional way of classifying artificial languages involves the extent to which they make use of material from natural languages. Those artificial languages which are created mainly by taking material from one or more natural languages are called a posteriori languages (which again include well-known languages such as Esperanto), while those which do not use natural languages as sources are a priori languages (although many a posteriori languages have a limited amount of a priori material, and some a priori languages have a small number of a posteriori components). Between these two extremes are the mixed languages, which have large amounts of both a priori and a posteriori material. Artificial languages can also be classified typologically (as natural languages are) and by how and how much they have been used.
Many linguists seem to be biased against research on artificial languages, although some major linguists of the past have been interested in them.
Creole languages have a curious status in linguistics, and at the same time they often have very low prestige in the societies in which they are spoken. These two facts may be related, in part because they circle around notions such as “derived from” or “simplified” instead of “original.” Rather than simply taking the notion of “creole” as a given and trying to account for its properties and origin, this essay tries to explore the ways scholars have dealt with creoles. This involves, in particular, trying to see whether we can define “creoles” as a meaningful class of languages. There is a canonical list of languages that most specialists would not hesitate to call creoles, but the boundaries of the list and the criteria for being listed are vague. It also becomes difficult to distinguish sharply between pidgins and creoles, and likewise the boundaries between some languages claimed to be creoles and their lexifiers are rather vague.
Several possible criteria to distinguish creoles will be discussed. Simply defining them as languages of which we know the point of birth may be a necessary, but not sufficient, criterion. Displacement is also an important criterion, necessary but not sufficient. Mixture is often characteristic of creoles, but not crucial, it is argued. Essential in any case is substantial restructuring of some lexifier language, which may take the form of morphosyntactic simplification, but it is dangerous to assume that simplification always has the same outcome. The combination of these criteria (time of genesis, displacement, mixture, restructuring) contributes to the status of a language as creole, but “creole” is far from a unified notion. There turn out to be several types of creoles, as well as a range of creole-like languages, and they differ in how these criteria combine.
Thus the proposal is made here to stop looking at creoles as a separate class and instead to treat them as special cases of a general phenomenon: the way languages emerge and are used determines their properties to a considerable extent. This calls for a new, socially informed typology of languages, which will involve all kinds of different types of languages, including pidgins and creoles.
William F. Hanks
Deictic expressions, like English ‘this, that, here, and there’, occur in all known human languages. They are typically used to individuate objects in the immediate context in which they are uttered, by pointing at them so as to direct attention to them. The object, or demonstratum, is singled out as a focus, and a successful act of deictic reference is one that results in the Speaker (Spr) and Addressee (Adr) attending to the same referential object. Thus,
(1) A: Oh, there’s that guy again (pointing)
    B: Oh yeah, now I see him (fixing gaze on the guy)
(2) A: I’ll have that one over there (pointing to a dessert on a tray)
    B: This? (touching pastry with tongs)
    A: yeah, that looks great
    B: Here ya’ go (handing pastry to customer)
In an exchange like (1), A’s utterance spotlights the individual guy, directing B’s attention to him, and B’s response (both verbal and ocular) displays that he has recognized him. In (2), A’s utterance individuates one pastry among several, B’s response makes sure he’s attending to the right one, A reconfirms, and B completes by presenting the pastry to him. If we compare the two examples, it is clear that the underscored deictics can pick out or present individuals without describing them. In a similar way, “I, you, he/she, we, now, (back) then,” and their analogues are all used to pick out individuals (persons, objects, or time frames), apparently without describing them. As a corollary of this semantic paucity, individual deictics vary extremely widely in the kinds of object they may properly denote: ‘here’ can denote anything from the tip of your nose to planet Earth, and ‘this’ can denote anything from a pastry to an upcoming day (this Tuesday). Under the same circumstances, ‘this’ and ‘that’ can refer appropriately to the same object, depending upon who is speaking, as in (2). How can forms that are so abstract and variable over contexts be so specific and rigid in a given context? On what parameters do deictics and deictic systems in human languages vary, and how do they relate to grammar and semantics more generally?
Chris Rogers and Lyle Campbell
The reduction of the world’s linguistic diversity has accelerated over the last century and correlates with a loss of knowledge, collective and individual identity, and social value. Often a language is pushed out of use before scholars and language communities have a chance to document or preserve this linguistic heritage. Many are concerned about this loss, believing it to be one of the most serious issues facing humanity today. To address the issues concomitant with an endangered language, we must know how to define “endangerment,” how different situations of endangerment can be compared, and how each language fits into the cultural practices of individuals. The discussion about endangered languages focuses on addressing the needs, causes, and consequences of this loss.
Concern over endangered languages is not just an academic catch phrase. It involves real people and communities struggling with real social, political, and economic issues. To understand the causes and consequences of language endangerment for these individuals and communities requires a multifaceted perspective on the place of each language in the lives of its users. The loss of a language affects not only the world’s linguistic diversity but also an individual’s social identity and a community’s sense of itself and its history.
Young-mee Yu Cho
Due to a number of unusual and interesting properties, Korean phonetics and phonology have been generating productive discussion within modern linguistic theories, starting from structuralism, moving to classical generative grammar, and more recently to post-generative frameworks of Autosegmental Theory, Government Phonology, Optimality Theory, and others. In addition, it has been discovered that a description of important issues of phonology cannot be properly made without referring to the interface between phonetics and phonology on the one hand, and phonology and morpho-syntax on the other. Some phonological issues from Standard Korean are still under debate and will likely be of value in helping to elucidate universal phonological properties with regard to phonation contrast, vowel and consonant inventories, consonantal markedness, and the motivation for prosodic organization in the lexicon.
The expression “language of the economy and business” refers to an extremely heterogeneous linguistic reality. For some, it denotes all text and talk produced by economic agents in the pursuit of economic activity; for others, it denotes the language used to write or talk about the economy or business, that is, the language of the economic sciences and the media. Both the economy and business contain a myriad of subdomains, each with its own linguistic peculiarities. Language use also differs quite substantially between the shop floor and academic articles dealing with it. Last but not least, language is itself a highly articulate entity, composed of sounds, words, concepts, etc., which are taken care of by a considerable number of linguistic disciplines and theories. As a consequence, this research landscape offers a very varied picture.
The state of research is also highly diverse as far as the Romance languages are concerned. The bulk of relevant publications concerns French, followed at a certain distance by Spanish and Italian, while Romanian, Catalan, and Portuguese look like poor relations. As far as the dialects are concerned, only those of some Italian cities that held a central position in medieval trade, like Venice, Florence, or Genoa, have given rise to relevant studies. As far as the metalanguage used in research is concerned, the most striking feature is the overwhelming preponderance of German and the almost complete absence of English. The insignificant role of English must probably be attributed to the fact that the study of foreign business languages in the Anglo-Saxon countries is close to nonexistent. Why study foreign business languages if one’s own language is the lingua franca of today’s business world? Scholars from the Romance countries, of course, generally write in their mother tongue, but linguistic publications concerning the economic and business domain are relatively scarce there. The heterogeneity of the metalanguages used certainly hinders the constitution of a close-knit research community.
Aidan Pine and Mark Turin
The world is home to an extraordinary level of linguistic diversity, with roughly 7,000 languages currently spoken and signed. Yet this diversity is highly unstable and is being rapidly eroded through a series of complex and interrelated processes that result in or lead to language loss. The combination of monolingualism and networks of global trade languages that are increasingly technologized has led to over half of the world’s population speaking one of only 13 languages. Such linguistic homogenization leaves in its wake a linguistic landscape that is increasingly endangered.
A wide range of factors contribute to language loss and attrition. While some—such as natural disasters—are unique to particular language communities and specific geographical regions, many have similar origins and are common across endangered language communities around the globe. The harmful legacy of colonization and the enduring impact of disenfranchising policies relating to Indigenous and minority languages are at the heart of language attrition from New Zealand to Hawai’i, and from Canada to Nepal.
Language loss does not occur in isolation, nor is it inevitable or in any way “natural.” The process also has wide-ranging social and economic repercussions for the language communities in question. Language is so heavily intertwined with cultural knowledge and political identity that speech forms often serve as meaningful indicators of a community’s vitality and social well-being. More than ever before, there are vigorous and collaborative efforts underway to reverse the trend of language loss and to reclaim and revitalize endangered languages. Such approaches vary significantly, from making use of digital technologies in order to engage individual and younger learners to community-oriented language nests and immersion programs. Drawing on diverse techniques and communities, the question of measuring the success of language revitalization programs has driven research forward in the areas of statistical assessments of linguistic diversity, endangerment, and vulnerability. Current efforts are re-evaluating the established triad of documentation-conservation-revitalization in favor of more unified, holistic, and community-led approaches.
Victor A. Friedman
The Balkan languages were the first group of languages whose similarities were explained in modern linguistic terms as a result of language contact rather than as a result of descent from a common ancestor. Nikolai Trubetzkoy coined the term Sprachbund ‘linguistic league’ (as opposed to Sprachfamilie ‘language family’) to describe this relationship. Balkan linguistics, as both a subset of and precursor to contact linguistics, is, at its base, an historical linguistic discipline. It seeks to explain similarities among the relevant languages as the result of diffusion rather than of either transmission or of putative universal, typological properties of human language (which latter assumes parallel developments whose causation is ahistorical, i.e., unconnected with either contact or ancestry). The relevant languages are, with the exception of Turkic, all part of the Indo-European language family, but they belong to five distinct groups that are known to have been separated for a significant length of time (presumably millennia). Moreover, for four out of five Indo-European groups as well as for Turkic, there exists documentation that goes back more than a millennium, and in some cases several millennia. The Balkan languages are thus the oldest example of a well-documented and still living Sprachbund.
The primary questions that Balkan linguistics seeks to answer are these: What are the results of language contact in the Balkan languages, and how did they come about? The Balkan languages are traditionally defined as Albanian, Modern Greek, Balkan Romance (Romanian, Aromanian, and Meglenoromanian), and Balkan Slavic (Bulgarian, Macedonian, and the southernmost dialects of the former Serbo-Croatian). In recent decades, it has been recognized that the relevant dialects of Romani, Judezmo, and Turkish and Gagauz also participate in at least some of the convergent processes that are taken as definitive of the Balkan linguistic league. While the language family is defined by regular sound correspondences, which in turn help define shared morphology and a core lexicon, the Balkan linguistic league is defined principally by shared morphosyntactic developments and a shared lexicon of borrowings often called “cultural.” In the Balkan linguistic league, phonological developments are sometimes shared among different languages at the dialectal level, but there are no such features that characterize the Balkan languages as a group. Just as in the language family not every diagnostic item is represented in every branch, so, too, in the Balkan linguistic league not every feature is equally represented in all languages and dialects.
Among the most characteristic morphosyntactic features are the following: (1) replacement of infinitives by analytic subjunctives, (2) the use of a particle derived from etymological ‘want’ to mark the future, (3) replacement of synthetic gradation of adjectives with analytic constructions, (4) replacement of conditionals by anterior futures, (5) resumptive clitic pronouns for certain direct and indirect objects, (6) various simplifications in the declensional system, (7) postposed definite articles (for Balkan Slavic, Balkan Romance, and Albanian), (8) grammaticalized evidentials (Balkan Slavic, Albanian, Turkic, and to some extent Balkan Romance and Romani). While some of these convergences began in the ancient or medieval periods, the Balkan linguistic league took its definitive modern shape during the centuries of the Ottoman Empire (14th to early 20th centuries).
William R. Leben
About 7,000 languages are spoken around the world today. The actual number depends on where the line is drawn between language and dialect—an arbitrary decision, because languages are always in flux. But specialists applying a reasonably uniform criterion across the globe count well over 2,000 languages in Asia and Africa, while Europe has just shy of 300. In between are the Pacific region, with over 1,300 languages, and the Americas, with just over 1,000. Languages spoken natively by over a million speakers number around 250, but the vast majority have very few speakers. Something like half are thought likely to disappear over the next few decades, as speakers of endangered languages turn to more widely spoken ones.
The languages of the world are grouped into perhaps 430 language families, based on their origin, as determined by comparing similarities among languages and deducing how they evolved from earlier ones. As with languages, there’s quite a lot of disagreement about the number of language families, reflecting our meager knowledge of many present-day languages and even sparser knowledge of their history. The figure 430 comes from Glottolog.org, which actually lists them all. While the world’s language families may well go back to a smaller number of original languages, even to a single mother tongue, scholars disagree on how far back current methods permit us to trace the history of languages.
While it is normal for languages to borrow from other languages, occasionally a totally new language is created by mixing elements of two distinct languages to such a degree that we would not want to identify one of the source languages as the mother tongue. This is what led to the development of Media Lengua, a language of Ecuador formed through contact among speakers of Spanish and speakers of Quechua. In this language, practically all the word stems are from Spanish, while all of the endings are from Quechua. Just a handful of languages have come into being in this way, but less extreme forms of language mixture have resulted in over a hundred pidgins and creoles currently spoken in many parts of the world. Most arose during Europe’s colonial era, when European colonists used their language to communicate with local inhabitants, who in turn blended vocabulary from the European language with grammar largely from their native language.
Also among the languages of the world are about 300 sign languages used mainly in communicating among and with the deaf. The structure of sign languages typically has little historical connection to the structure of nearby spoken languages.
Some languages have been constructed expressly, often by a single individual, to meet communication demands among speakers with no common language. Esperanto, designed to serve as a universal language and used as a second language by some two million, according to some estimates, is the prime example, but it is only one among several hundred would-be international auxiliary languages.
This essay surveys the languages of the world continent by continent, ending with descriptions of sign languages and of pidgins and creoles. A set of references grouped by section appears at the very end. The main source for data on language classification, numbers of languages, and speakers is the 19th edition of Ethnologue (see Resources), except where a different source is cited.