This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
French-Based Creole Languages (FBCLs) may be characterized as a group by one historical and two linguistic properties. Their shared historical feature is that they arose between the 16th and 19th centuries as vehicular (hence oral) languages in French colonies, through language contact between oral varieties of French spoken by the colonists, and typologically and genetically diverse languages spoken by imported slaves (or by imported workers or the local people in the case of Tayo, which emerged in the 19th century after the abolition of slavery and whose status as an FBCL is controversial). The linguistic features characterizing FBCLs are (1) that their lexicon is derived from French while their grammar (phonology and morphosyntax) is both reminiscent of, and different from, that of known varieties of spoken, nonstandard, dialectal French; and (2) that they stand as first languages (L1s), namely, they are acquired by children through the natural process of language acquisition and are used for all-purpose communication, as opposed to pidgins, a type of contact language used only as vehicular L2s for specific-interaction purposes (e.g., trade).
FBCLs thus defined currently include on the American continent: Gwiyané/Guyanais (in French Guiana) and Karipuna Creole (Brazil, near the French Guiana border); Lwizyané/Louisianais (on the decrease), in Louisiana, USA; in the Caribbean: Ayisyen/Haitian (in the independent Republic of Haiti); Senlisyen/Saint-Lucian (in the state of Sainte-Lucie), and the creoles spoken in the French-controlled territories of Martinique, Guadeloupe, Dominique, Saint-Barthélémy, and the northern part of Saint-Martin; in the Indian Ocean, off the shores of Eastern Africa: Morisyen/Mauritian (in Mauritius), Seselwa/Seychellois (in the Seychelles), Rodrigé/Rodriguais (in the Rodrigues islands, controlled by Mauritius), Réyinyoné/Réunionnais (in the island of Réunion, a French-controlled territory); and in Southern New Caledonia: Tayo.
Beyond the shared defining features proposed above, there is much variation among FBCLs with respect to the places, periods, and historical conditions of their emergence; the relevant contact languages involved in their development; and their resulting grammatical properties.
While in phonology Middle Indo-Aryan (MIA) dialects preserved the phonological system of Old Indo-Aryan (OIA) virtually intact, their morphosyntax underwent far-reaching changes, which altered fundamentally the synthetic morphology of earlier Prākrits in the direction of the analytic typology of New Indo-Aryan (NIA). Speaking holistically, the “accusative alignment” of OIA (Vedic Sanskrit) was restructured as an “ergative alignment” in Western IA languages, and it is precisely during the Late MIA period (ca. 5th–12th centuries
(a) We shall start with the restructuring of the nominal case system in terms of the reduction of the number of cases from seven to four. This phonologically motivated process resulted ultimately in the rise of the binary distinction of the “absolutive” versus “oblique” case at the end of the MIA period. (b) The crucial role of animacy in the restructuring of the pronominal system and the rise of the “double-oblique” system in Ardha-Māgadhī and Western Apabhramśa will be explicated. (c) In the verbal system we witness complete remodeling of the aspectual system as a consequence of the loss of earlier synthetic forms expressing the perfective (Aorist) and “retrospective” (Perfect) aspect. Early Prākrits (Pāli) preserved their sigmatic Aorists (and the sigmatic Future) until late MIA centuries, while on the Iranian side the loss of the “sigmatic” aorist was accelerated in Middle Persian by the “weakening” of s > h > Ø. (d) The development and the establishment of “ergative alignment” at the end of the MIA period will be presented as a consequence of the above typological changes: the rise of the “absolutive” vs. “oblique” case system; the loss of the finite morphology of the perfective and retrospective aspect; and the recreation of the aspectual contrast of perfectivity by means of quasinominal (participial) forms. (e) Concurrently with the development toward analyticity in grammatical aspect, we witness the evolution of lexical aspect (Aktionsart) ushering in the florescence of “serial” verbs in New Indo-Aryan.
On the whole, a contingency view of alignment considers the increase in ergativity as a by-product of the restoration of the OIA aspectual triad: Imperfective–Perfective–Perfect (in morphological terms Present–Aorist–Perfect). The NIA Perfective and Perfect are aligned ergatively, while their finite OIA ancestors (Aorist and Perfect) were aligned accusatively. Detailed linguistic analysis of Middle Indo-Aryan texts offers us a unique opportunity for a deeper comprehension of the formative period of the NIA state of affairs.
Ans van Kemenade
The status of English in the early 21st century makes it hard to imagine that the language started out as an assortment of North Sea Germanic dialects spoken in parts of England only by immigrants from the continent. Itself soon under threat, first from the language(s) spoken by Viking invaders, then from French as spoken by the Norman conquerors, English continued to thrive as an essentially West-Germanic language that did, however, undergo some profound changes resulting from contact with Scandinavian and French. A further decisive period of change is the late Middle Ages, which started a tremendous societal scale-up that triggered pervasive multilingualism. These repeated layers of contact between different populations, first locally, then nationally, followed by standardization and 18th-century codification, metamorphosed English into a language closely related to, yet quite distinct from, its closest relatives Dutch and German in nearly all language domains, not least in word order, grammar, and pronunciation.
Hokan is a linguistic stock or phylum based on a series of hypotheses about deeper genetic relationships among languages that extend geographically from Northern California to Nicaragua. Following the general effort to genetically link the vast number of Native American languages and to reduce them to a few superstocks, Dixon and Kroeber first proposed the Hokan stock in 1913, to include several California indigenous languages: Karuk, Chimariko, Shastan, Palaihnihan (Atsugewi and Achumawi), Pomoan, Yana, and later Esselen and Yuman. The name Hokan stems from the Atsugewi word for “two”: hoqi. While the first proposals by Dixon and Kroeber rested on very limited cognate sets comprising only five words, later assessments by Sapir included hundreds of putative cognate sets and analyses of Hokan morphosyntax. By 1925, Sapir further included Washo, Salinan, Seri, Chumashan, Tequistlatecan, and Subtiaba-Tlapanec as the Southern Hokan branch into the stock.
Throughout the 20th century, scholars sought additional evidence for the stock as more and refined data on the languages became available. A number of languages were added, and earlier proposals were abandoned. A new surge in work on individual California indigenous languages in the 1950s and 1960s prompted a string of studies conducting binary comparisons. This renewed interest inspired a series of Hokan conferences held until the 1990s. A more recent comprehensive assessment of the entire stock was undertaken by Kaufman in 1988. Applying rigorous analysis and only implicating those languages for which he encountered substantial evidence, Kaufman proposes sixteen classificatory units for Hokan clustered geographically. Kaufman’s Hokan stock also includes Coahuilteco and Comecrudan in Mexico and Jicaque in Nicaragua.
Although Hokan was widely studied in the 20th century, and many scholars presented what they thought to be supporting evidence, it is far from being an established genetic unit. In fact, many scholars today treat it with considerable skepticism. One major challenge, as with any phylum-level affiliation, is its time depth. Proto-Hokan is thought to be at least as old as Proto-Indo-European. Moreover, many of the languages were spoken in geographically contiguous areas, with speakers being multilingual and in close contact for an extended period of time, as is the case in Northern California. This suggests considerable language contact effects and complicates the distinction between true cognates and ancient borrowings. Many of the languages involved further show similarities in grammatical structure as a result of language contact.
Hokan languages stretch across California, Nevada, South Texas, various parts of Mexico, Honduras, and Nicaragua and display notable structural differences. Phonologically, the languages show great variation including small and large phoneme inventories and different phonological processes. Typologically, they are equally diverse, but many are considered polysynthetic to varying degrees. Morphosyntactic and grammatical similarities are evident especially among languages spoken in Northern California. These resemblances include sets of lexical affixes with similar meanings and affinities in core argument patterns.
The Iroquoian languages are spoken today in New York State, Ontario, Quebec, Wisconsin, North Carolina, and Oklahoma. The languages share a relatively small segment inventory, a challenging accentual system, polysynthetic morphology, a complex system of pronominal affixes, an unusual kinship terminology, and a syntax that functions almost exclusively to combine the meaning of two expressions. Some of the languages have been documented since contact with Europeans in the 16th century. There exists substantial scholarly linguistic work on most of the languages, and solid teaching materials continue to be developed.
The rigor and intensity of investigation on Japanese in modern linguistics has been particularly noteworthy over the past 50 years. Not only has the elucidation of the similarities to and differences from other languages properly placed Japanese on the typological map, but Japanese has served as a critical testing area for a wide variety of theoretical approaches.
Within the sub-fields of Japanese phonetics and phonology, there has been much focus on the role of the mora. The mora constitutes an important timing unit that has broad implications for analysis of the phonetic and phonological system of Japanese. Relatedly, Japanese possesses a pitch-accent system, which places Japanese in a typologically distinct group arguably different from stress languages, like English, and tone languages, like Chinese. A further area of intense investigation is that of loanword phonology, illuminating the way in which segmental and suprasegmental adaptations are processed and at the same time revealing the fundamental nature of the sound system intrinsic to Japanese.
In morphology, a major focus has been on compounds, which are ubiquitously found in Japanese. Their detailed description has spurred in-depth discussion regarding morphophonological (e.g., Rendaku—sequential voicing) and morphosyntactic (e.g., argument structure) phenomena that have crucial consequences for morphological theory. Rendaku is governed by layers of constraints that range from segmental and prosodic phonology to structural properties of compounds, and serves as a representative example in demonstrating the intricate interaction of the different grammatical aspects of the language. In syntax, the scrambling phenomenon, allowing for the relatively flexible permutation of constituents, has been argued to instantiate a movement operation and has been instrumental in arguing for a configurational approach to Japanese. Japanese passives and causatives, which are formed through agglutinative morphology, each exhibit different types: direct vs. indirect passives and lexical vs. syntactic causatives. Their syntactic and semantic properties have posed challenges to and motivations for a variety of approaches to these well-studied constructions in the world’s languages.
Taken together, the empirical analyses of Japanese and their theoretical and conceptual implications have made a tremendous contribution to linguistic research.
The Kiowa-Tanoan family is a small group of Native American languages of the Plains and pueblo Southwest. It comprises Kiowa, of the eponymous Plains tribe, and the pueblo-based Tanoan languages, Jemez (Towa), Tewa, and Northern and Southern Tiwa. These free-word-order languages display a number of typologically unusual characteristics that have rightly attracted attention within a range of subdisciplines and theories.
One word of Taos (my construction based on Kontak and Kunkel’s work) illustrates. In tóm-múlu-wia ‘I gave him/her a drum,’ the verb wia ‘gave’ obligatorily incorporates its object, múlu ‘drum.’ The agreement prefix tóm encodes not only object number, but identities of agent and recipient as first and third singular, respectively, and this all in a single syllable. Moreover, the object number here is not singular, but “inverse”: singular for some nouns, plural for others (tóm-músi-wia only has the plural object reading ‘I gave him/her cats’).
This article presents a comparative overview of the three areas just illustrated: from morphosemantics, inverse marking and noun class; from morphosyntax, super-rich fusional agreement; and from syntax, incorporation. The second of these also touches on aspects of morphophonology, the family’s three-tone system and its unusually heavy grammatical burden, and on further syntax, obligatory passives. Together, these provide a wide window on the grammatical wealth of this fascinating family.
Young-mee Yu Cho
Due to a number of unusual and interesting properties, Korean phonetics and phonology have been generating productive discussion within modern linguistic theories, starting from structuralism, moving to classical generative grammar, and more recently to post-generative frameworks of Autosegmental Theory, Government Phonology, Optimality Theory, and others. In addition, it has been discovered that a description of important issues of phonology cannot be properly made without referring to the interface between phonetics and phonology on the one hand, and phonology and morpho-syntax on the other. Some phonological issues from Standard Korean are still under debate and will likely be of value in helping to elucidate universal phonological properties with regard to phonation contrast, vowel and consonant inventories, consonantal markedness, and the motivation for prosodic organization in the lexicon.
James Hye Suk Yoon
The syntax of Korean is characterized by several signature properties. One signature property is head-finality. Word order variations and restrictions obey head-finality. Korean also possesses wh in-situ as well as internally headed relative clauses, as is typical of a head-final language. Another major signature property is dependent-marking. Korean has systematic case-marking on nominal dependents and very little, if any, head-marking. Case-marking and related issues, such as multiple case constructions, case alternations, case stacking, case-marker ellipsis, and case-marking on adjuncts, are front and center properties of Korean syntax as viewed from the dependent-marking perspective. Research on these aspects of Korean has contributed to the theoretical understanding of case and grammatical relations in linguistic theory. Korean is also characterized by agglutinative morphosyntax. Many issues in Korean syntax straddle the morphology-syntax boundary. Korean morphosyntax constitutes a fertile testing ground for ongoing debates about the relationship between morphology and syntax in domains such as coordination, deverbal nominalizations (mixed category constructions), copula, and other denominal constructions. Head-finality and agglutinative morphosyntax intersect in domains such as complex/serial verb and auxiliary verb constructions. Negation, which is a type of auxiliary verb construction, and the related phenomena of negative polarity licensing, offer important evidence for crosslinguistic understanding of these phenomena. Finally, there is an aspect of Korean syntax that reflects areal contact. Lexical and grammatical borrowing, topic prominence, pervasive occurrence of null arguments and ellipsis, as well as a complex system of anaphoric expressions, resulted from sustained contact with neighboring Sino-Tibetan languages.
Kra-Dai, also known as Tai–Kadai, Daic, and Kadai, is a family of diverse languages found in southern China, northeast India, and Southeast Asia. The number of these languages is estimated to be close to a hundred, with approximately 100 million speakers all over the world. As the name itself suggests, Kra-Dai is made up of two major groups, Kra and Dai. The former refers to a number of lesser-known languages, some of which have only a few hundred fluent speakers or even fewer. The latter (also known as Tai, or Kam-Tai) is well established, and comprises the best-known members of the family, Thai and Lao, the national languages of Thailand and Laos respectively, whose speakers account for over half of the Kra-Dai population.
The ultimate genetic affiliation of Kra-Dai remains controversial, although a consensus among western scholars holds that it belongs under Austronesian. The majority of Kra-Dai languages have no writing systems of their own, particularly Kra. Languages with writing systems include Thai, Lao, Sipsongpanna Dai, and Tai Lue. These use Indic-based scripts. Others use Chinese character-based scripts, such as the Zhuang and Kam-Sui in southern China and surrounding regions. The government introduced Romanized scripts in the 1950s for the Zhuang and the Kam-Sui languages. Almost every group within Kra-Dai has a rich oral history tradition.
The languages are typically tonal, isolating, and analytic, lacking in inflectional morphology, with no distinction for number and gender. A significant number of basic vocabulary items are monosyllabic, but bisyllabic and multisyllabic compounds also abound. There are morphological processes in which etymologically related words manifest themselves in groups through tonal, initial, or vowel alternations. Reduplication is a salient word formation mechanism. In syntax, the Kra-Dai languages can be said to have basic SVO word order. They possess a rich system of noun classifiers. Other features include verb serialization without overt marking to indicate grammatical relations. A number of lexical items (mostly verbs) may function as grammatical morphemes in syntactic operations. Temporal and aspectual meanings are expressed through tense-aspect markers typically derived from verbs, while mood and modality are conveyed via a rich array of discourse particles.
Traditional Chinese linguistics grew out of two distinct interests in language: the philosophical reflection on things and their names, and the practical concern for literacy education and the correct understanding of classical works. The former is most typically found in the teachings of such pre-Qin masters as Confucius, Mozi, and Gongsun Long, who lived between the 6th and 3rd centuries
The picture just presented, in which Chinese philosophy and philology are combined to form a seemingly autonomous tradition, is complicated, however, by the fact that the Indic linguistic tradition started to influence the Chinese in the 2nd century
Chinese, with its linguistic tradition, had a profound impact in ancient East Asia. Not only did traditional studies of Japanese, Tangut, and other languages show significant Chinese influence, under which not the least achievement was the invention of the earliest writing systems for these languages, but many scholars from Japan and Korea actually took an active part in the study of Chinese as well, so that the Chinese linguistic tradition would itself be incomplete without the materials and findings these non-Chinese scholars have contributed. On the other hand, some of these scholars, most notably Motoori Norinaga and Fujitani Nariakira in Japan, were able to free themselves from the character-centered Chinese routine and develop rather original linguistic theories.
Judith T. Irvine
In the indigenous sociolinguistic systems of West Africa, an important way of expressing—and creating—social hierarchy in interaction is through intermediaries: third parties, through whom messages are relayed. The forms of mediation vary by region, by the scale of the social hierarchy, and by the ways hierarchy is locally understood. In larger-scale systems where hierarchy is elaborate, the interacting parties include a high-status person, a mediator who ranks lower, and a third person or group—perhaps another dignitary, but potentially anyone. In smaller-scale, more egalitarian societies, the (putative) interactants could include an authoritative spirit represented by a mask, the mask’s bearer, a “translator,” and an audience. In all these systems, mediated interactions may also involve distinctive registers or vocalizations. Meanwhile, the interactional structure and its characteristic ways of speaking offer tropes and resources for expressing politeness in everyday talk.
In the traditions connected with precolonial kingdoms and empires, professional praise orators deliver eulogistic performances for their higher-status patrons. This role is understood as transmission—transmitting a message from the past, or from a group, or from another dignitary—more than as creating a composition from whole cloth. The transmitter amplifies and embellishes the message; he or she does not originate it. In addition to their formal public performances, these orators serve as interpreters and intermediaries between their patrons and their patrons’ visitors. Speech to the patron is relayed through the interpreter, even if the original speaker and the patron are in the same room. Social hierarchy is thus expressed as interactional distance.
In the Sahel, these social hierarchies involve a division of labor, including communicative labor, in a complex system of ranked castes and orders. The praise orators, as professional experts in the arts of language and communication, are a separate, low-ranking category (known by the French term griot). Some features of griot performance style, and the contrasting—sometimes even disfluent—verbal conduct of high-ranking aristocrats, carry over into speech registers used by persons of any social category in situations evoking hierarchy (petitioning, for example). In indigenous state systems further south, professional orators are not a separate caste, and chiefs are also supposed to have verbal skills, although still using intermediaries. Special honorific registers, such as the esoteric Akan “palace speech,” are used in the chief’s court. Some politeness forms in everyday Akan usage today echo these practices.
An example of a small-scale society is the Bedik (Senegal-Guinea border), among whom masked dancers serve as the visible and auditory representation of spirit beings. The mask spirits, whose speech and conduct contrast with their bearers’ ordinary behavior, require “translators” to relay their messages to addressees. This too is mediated communication, involving a multi-party interactional structure as well as distinctive vocalizations.
Linguistic repertoires in the Sahel have long included Arabic, and Islamic learning is another source of high status, coexisting with other traditional sources and sharing some interactional patterns. The European conquest brought European languages to the top of West African linguistic hierarchies, which have remained largely in place since independence.
As might be expected from the difficulty of traversing it, the Sahara Desert has been a fairly effective barrier to direct contact between its two edges; trans-Saharan language contact is limited to the borrowing of non-core vocabulary, minimal from south to north and mostly mediated by education from north to south. Its own inhabitants, however, are necessarily accustomed to travelling desert spaces, and contact between languages within the Sahara has often accordingly had a much greater impact. Several peripheral Arabic varieties of the Sahara retain morphology as well as vocabulary from the languages spoken by their speakers’ ancestors, in particular Berber in the southwest and Beja in the southeast; the same is true of at least one Saharan Hausa variety. The Berber languages of the northern Sahara have in turn been deeply affected by centuries of bilingualism in Arabic, borrowing core vocabulary and some aspects of morphology and syntax. The Northern Songhay languages of the central Sahara have been even more profoundly affected by a history of multilingualism and language shift involving Tuareg, Songhay, Arabic, and other Berber languages, much of which remains to be unraveled. These languages have borrowed so extensively that they retain barely a few hundred core words of Songhay vocabulary; those loans have not only introduced new morphology but in some cases replaced old morphology entirely. In the southeast, the spread of Arabic westward from the Nile Valley has created a spectrum of varieties with varying degrees of local influence; the Saharan ones remain almost entirely undescribed. Much work remains to be done throughout the region, not only on identifying and analyzing contact effects but even simply on describing the languages its inhabitants speak.
The expression language of the economy and business refers to an extremely heterogeneous linguistic reality. For some, it denotes all text and talk produced by economic agents in the pursuit of economic activity, for others the language used to write or talk about the economy or business, that is, the language of the economic sciences and the media. Both the economy and business contain a myriad of subdomains, each with its own linguistic peculiarities. Language use also differs quite substantially between the shop floor and academic articles dealing with it. Last but not least, language is itself a highly articulate entity, composed of sounds, words, concepts, etc., which are taken care of by a considerable number of linguistic disciplines and theories. As a consequence, this research landscape offers a very varied picture.
The state of research is also highly diverse as far as the Romance languages are concerned. The bulk of relevant publications concerns French, followed at a certain distance by Spanish and Italian, while Romanian, Catalan, and Portuguese look like poor relations. As far as the dialects are concerned, only those of some Italian cities that held a central position in medieval trade, like Venice, Florence, or Genoa, have given rise to relevant studies. As far as the metalanguage used in research is concerned, the most striking feature is the overwhelming preponderance of German and the almost complete absence of English. The insignificant role of English must probably be attributed to the fact that the study of foreign business languages in the Anglo-Saxon countries is close to nonexistent. Why study foreign business languages if one’s own language is the lingua franca of today’s business world? Scholars from the Romance countries, of course, generally write in their mother tongue, but linguistic publications concerning the economic and business domain are relatively scarce there. The heterogeneity of the metalanguages used certainly hinders the constitution of a close-knit research community.
Aidan Pine and Mark Turin
The world is home to an extraordinary level of linguistic diversity, with roughly 7,000 languages currently spoken and signed. Yet this diversity is highly unstable and is being rapidly eroded through a series of complex and interrelated processes that result in or lead to language loss. The combination of monolingualism and networks of global trade languages that are increasingly technologized has led to over half of the world’s population speaking one of only 13 languages. Such linguistic homogenization leaves in its wake a linguistic landscape that is increasingly endangered.
A wide range of factors contribute to language loss and attrition. While some—such as natural disasters—are unique to particular language communities and specific geographical regions, many have similar origins and are common across endangered language communities around the globe. The harmful legacy of colonization and the enduring impact of disenfranchising policies relating to Indigenous and minority languages are at the heart of language attrition from New Zealand to Hawai’i, and from Canada to Nepal.
Language loss does not occur in isolation, nor is it inevitable or in any way “natural.” The process also has wide-ranging social and economic repercussions for the language communities in question. Language is so heavily intertwined with cultural knowledge and political identity that speech forms often serve as meaningful indicators of a community’s vitality and social well-being. More than ever before, there are vigorous and collaborative efforts underway to reverse the trend of language loss and to reclaim and revitalize endangered languages. Such approaches vary significantly, from making use of digital technologies in order to engage individual and younger learners to community-oriented language nests and immersion programs. Drawing on diverse techniques and communities, the question of measuring the success of language revitalization programs has driven research forward in the areas of statistical assessments of linguistic diversity, endangerment, and vulnerability. Current efforts are re-evaluating the established triad of documentation-conservation-revitalization in favor of more unified, holistic, and community-led approaches.
Victor A. Friedman
The Balkan languages were the first group of languages whose similarities were explained in modern linguistic terms as a result of language contact rather than as a result of descent from a common ancestor. Nikolai Trubetzkoy coined the term Sprachbund ‘linguistic league’ (as opposed to Sprachfamilie ‘language family’) to describe this relationship. Balkan linguistics, as both a subset of and precursor to contact linguistics, is, at its base, an historical linguistic discipline. It seeks to explain similarities among the relevant languages as the result of diffusion rather than of either transmission or of putative universal, typological properties of human language (which latter assumes parallel developments whose causation is ahistorical, i.e., unconnected with either contact or ancestry). The relevant languages are, with the exception of Turkic, all part of the Indo-European language family, but they belong to five distinct groups that are known to have been separated for a significant length of time (presumably millennia). Moreover, for four out of five Indo-European groups as well as for Turkic, there exists documentation that goes back more than a millennium, and in some cases several millennia. The Balkan languages are thus the oldest example of a well-documented and still living Sprachbund.
The primary questions that Balkan linguistics seeks to answer are these: What are the results of language contact in the Balkan languages, and how did they come about? The Balkan languages are traditionally defined as Albanian, Modern Greek, Balkan Romance (Romanian, Aromanian, and Meglenoromanian), and Balkan Slavic (Bulgarian, Macedonian, and the southernmost dialects of the former Serbo-Croatian). In recent decades, it has been recognized that the relevant dialects of Romani, Judezmo, and Turkish and Gagauz also participate in at least some of the convergent processes that are taken as definitive of the Balkan linguistic league. While the language family is defined by regular sound correspondences, which in turn help define shared morphology and a core lexicon, the Balkan linguistic league is defined principally by shared morphosyntactic developments and a shared lexicon of borrowings often called “cultural.” In the Balkan linguistic league, phonological developments are sometimes shared among different languages at the dialectal level, but there are no such features that characterize the Balkan languages as a group. Just as in the language family not every diagnostic item is represented in every branch, so, too, in the Balkan linguistic league not every feature is equally represented in all languages and dialects.
Among the most characteristic morphosyntactic features are the following: (1) replacement of infinitives by analytic subjunctives, (2) the use of a particle derived from etymological ‘want’ to mark the future, (3) replacement of synthetic gradation of adjectives with analytic constructions, (4) replacement of conditionals by anterior futures, (5) resumptive clitic pronouns for certain direct and indirect objects, (6) various simplifications in the declensional system, (7) postposed definite articles (in Balkan Slavic, Balkan Romance, and Albanian), and (8) grammaticalized evidentials (in Balkan Slavic, Albanian, Turkic, and to some extent Balkan Romance and Romani). While some of these convergences began in the ancient or medieval periods, the Balkan linguistic league took its definitive modern shape during the centuries of the Ottoman Empire (14th to early 20th centuries).
William R. Leben
About 7,000 languages are spoken around the world today. The actual number depends on where the line is drawn between language and dialect, an arbitrary decision, because languages are always in flux. But specialists applying a reasonably uniform criterion across the globe count well over 2,000 languages each in Asia and in Africa, while Europe has just shy of 300. In between are the Pacific region, with over 1,300 languages, and the Americas, with just over 1,000. Languages spoken natively by over a million speakers number around 250, but the vast majority have very few speakers. Something like half of the world's languages are thought likely to disappear over the next few decades, as speakers of endangered languages turn to more widely spoken ones.
The languages of the world are grouped into perhaps 430 language families, based on their origin, as determined by comparing similarities among languages and deducing how they evolved from earlier ones. As with languages, there’s quite a lot of disagreement about the number of language families, reflecting our meager knowledge of many present-day languages and even sparser knowledge of their history. The figure 430 comes from Glottolog.org, which actually lists them all. While the world’s language families may well go back to a smaller number of original languages, even to a single mother tongue, scholars disagree on how far back current methods permit us to trace the history of languages.
While it is normal for languages to borrow from other languages, occasionally a totally new language is created by mixing elements of two distinct languages to such a degree that we would not want to identify one of the source languages as the mother tongue. This is what led to the development of Media Lengua, a language of Ecuador formed through contact among speakers of Spanish and speakers of Quechua. In this language, practically all the word stems are from Spanish, while all of the endings are from Quechua. Just a handful of languages have come into being in this way, but less extreme forms of language mixture have resulted in over a hundred pidgins and creoles currently spoken in many parts of the world. Most arose during Europe’s colonial era, when European colonists used their language to communicate with local inhabitants, who in turn blended vocabulary from the European language with grammar largely from their native language.
Also among the languages of the world are about 300 sign languages used mainly in communicating among and with the deaf. The structure of sign languages typically has little historical connection to the structure of nearby spoken languages.
Some languages have been constructed expressly, often by a single individual, to meet communication demands among speakers with no common language. Esperanto, designed to serve as a universal language and used as a second language by some two million, according to some estimates, is the prime example, but it is only one among several hundred would-be international auxiliary languages.
This essay surveys the languages of the world continent by continent, ending with descriptions of sign languages and of pidgins and creoles. A set of references grouped by section appears at the very end. The main source for data on language classification, numbers of languages, and speakers is the 19th edition of Ethnologue (see Resources), except where a different source is cited.
Mande is a mid-range language family in Western Sub-Saharan Africa that includes 60 to 75 languages spoken by 30 to 40 million people. According to glottochronological data, its genetic depth is between 5,000 and 5,500 years. The Proto-Mande homeland can presumably be located in the western part of the southern Sahara. Lexical data suggest that the Mande family belongs to the Niger-Congo macrofamily, but some scholars doubt this, mainly because of the lack of morphological cognates.
The first division of Mande is binary, into Western and Southeastern branches. The Western branch is further subdivided into nine groups: Manding, Mokole, Vai-Kono, Jogo-Jeri, Southwestern, Susu-Jalonke, Samogho, Soninke-Bozo, and Bobo. The Southeastern branch consists of Southern and Eastern groups. The biggest Mande languages, Bambara, Maninka, Mandinka, and Jula, belong to the Manding group.
Practically all Mande languages are tonal (two to five level tones), and the tones fulfil both lexical and grammatical functions. The typical syllable structure is CV; in many languages the type CVN is also attested, while CVC is rare (Soninke, Bisa). The metrical foot is a relevant unit for many Mande languages.
The typical basic word order in a verbal clause is Subject—Auxiliary—Direct Object—Verb—Oblique. Omission of a subject is possible in some Southern and Southwestern languages, where subject pronouns have merged with auxiliaries into Personal Pronominal Markers; otherwise an overt subject is obligatory.
Inflectional morphology is nearly absent in some languages and mainly innovative in others. Noun classes and grammatical genders are lacking. In most languages, there is only one plural marker (sometimes two); agreement in number is usually missing. Morphological case is most often absent, although it is attested in some pronominal systems; noun declension is emerging in Dan (Southern Mande). In Southern, Southwestern, and Eastern groups and Bobo, there are multiple series of personal pronouns expressing case, communicative status, and often verbal categories as well (aspect, mode, polarity).
Verbal lability (mainly P-lability) is highly productive in many Mande languages, including typologically rare passive lability.
Derivational morphology is relatively rich, only suffixal for nouns, but either suffixal or prefixal for verbs. In many languages, preverbs are still separable. Reduplication is productive in many languages for pluriactionality and intensity, sometimes for nominal plurality. Word compounding is highly productive.
The structure of the noun phrase is N2 + N1 + Adj + Det (N1 is the head noun, N2 is the dependent noun). In most Mande languages, alienable and inalienable nouns are formally distinguished; the former are connected to the possessor by auxiliary words, and in some languages, they require a special possessive series of personal pronouns.
Nominative-accusative alignment is predominant; in the Southwestern group, split semantic or ergative alignments are attested.
For relativization, varieties of the correlative strategy are mostly used.
At the beginning of the 21st century, Roman-based alphabets are used for nearly all languages of the family. Arabic-based writing systems (Ajami) are of limited use for Mandinka, Jula, Susu, and Mogofin. An original syllabic script has existed for Vai since the 1820s; since the 1950s, an original alphabet, N’ko, has been widely used for Manding languages.
Nora C. England
Mayan languages are spoken by over 5 million people in Guatemala, Mexico, Belize, and Honduras. There are around 30 different languages today, ranging in size from fairly large (about a million speakers) to very small (fewer than 30 speakers). All Mayan languages are endangered, given that at least some children in some communities are not learning the language, and two languages have disappeared since European contact. The Maya developed the most elaborate and most widely attested writing system in the Americas (starting about 300 BC).
The sounds of Mayan languages consist of a voiceless stop and affricate series with corresponding glottalized stops (either implosive or ejective) and affricates, a glottal stop, voiceless fricatives (including, in some languages, h inherited from Proto-Maya), two to three nasals, three to four approximants, and a five-vowel system with contrasting vowel length (or tense/lax distinctions) in most languages. Several languages have developed contrastive tone.
The major word classes in Mayan languages include nouns, verbs, adjectives, positionals, and affect words. The difference between transitive verbs and intransitive verbs is rigidly maintained in most languages. The two classes usually, though not always, use the same aspect markers. Intransitive verbs only indicate their subjects, while transitive verbs indicate both subjects and objects. Some languages have a set of status suffixes that differs between the two classes. Positionals are a root class whose most characteristic word form is a non-verbal predicate. Affect words indicate impressions of sounds, movements, and activities. Nouns have a number of different subclasses defined by their characteristics when possessed or by the structure of compounds. Adjectives comprise a small class of roots (under 50) and many forms derived from verbs and positionals.
Predicate types are transitive, intransitive, and non-verbal. Non-verbal predicates are based on nouns, adjectives, positionals, numbers, demonstratives, and existential and locative particles. They are distinct from verbs in that they do not take the usual verbal aspect markers. Mayan languages are head marking and verb initial; most have flexible VOA order, but some have rigid VAO order. They are morphologically ergative and also have at least some rules that show syntactic ergativity. The most common of these is a constraint on the extraction of subjects of transitive verbs (ergative) for focus and/or interrogation, negation, or relativization. In addition, some languages make a distinction between agentive and non-agentive intransitive verbs. Some can also be shown to use obviation and inverse as important organizing principles. Voice categories include passive, antipassive, and agent focus, as well as an applicative with several different functions.
Cynthia L. Allen
Middle English is the name given to the English of the period from approximately 1100 to approximately 1450. This period is marked by substantial developments in all areas of English grammar. It is also the period of English when different dialects are the most fully attested in the texts. At the beginning of the Middle English period, the sociolinguistic status of English was low due to the Norman Invasion, and although religious texts of Old English composition continued to be copied and updated, few original compositions are extant. By the end of the period, English had regained its status as the language of government, law, and literature generally.
Although some notable changes to the phonemic inventory of consonants date from the Middle English period, the most dramatic phonological developments of the period involve vowels. The reduction of the vowels of unstressed syllables, one of the changes that marks the beginning of the Middle English period, is a phonological change with major morphological effects, as it greatly reduced the number of distinctive inflectional forms. Constituent order replaced case marking as the primary means of signaling grammatical relations. By the end of the Middle English period, subject-verb-object order had become established as the norm.
The lexicon of English was transformed in this period by an enormous influx of French words. The role of derivational morphology declined as its functions were to some extent replaced by the adoption of French words. Most Scandinavian loans in English first appear in the texts of this period. The Scandinavian loans are typically everyday words, while the words adopted from French are more often in areas of government, law, and higher culture, reflecting the nature of the contact between English speakers and the speakers of these languages.
The density of the Scandinavian population in the northern part of England is generally held to be responsible for the earlier appearance of changes in the north than in the south. The replacement of the third person plural personal pronoun hie by the Scandinavian they is an example of a development which is apparent only in the north early in Middle English but became general in English by the end of this period.
An important phonological development of later Middle English is the beginning of the Great Vowel Shift, which affected long vowels through a series of successive changes. It was implemented differently in different dialects, the north-south divide being the most evident.
Early Middle English is a language that cannot be understood by Modern English readers without special study, while the language of the late Middle English period, especially that coming from the London area, can be understood with the heavy use of explanatory notes.