William R. Leben
About 7,000 languages are spoken around the world today. The actual number depends on where the line is drawn between language and dialect, an arbitrary decision because languages are always in flux. But specialists applying a reasonably uniform criterion across the globe count well over 2,000 languages each in Asia and Africa, while Europe has just shy of 300. In between are the Pacific region, with over 1,300 languages, and the Americas, with just over 1,000. Languages spoken natively by over a million speakers number around 250, but the vast majority have very few speakers. Something like half are thought likely to disappear over the next few decades, as speakers of endangered languages turn to more widely spoken ones.
The languages of the world are grouped into perhaps 430 language families, based on their origin, as determined by comparing similarities among languages and deducing how they evolved from earlier ones. As with languages, there’s quite a lot of disagreement about the number of language families, reflecting our meager knowledge of many present-day languages and even sparser knowledge of their history. The figure 430 comes from Glottolog.org, which actually lists them all. While the world’s language families may well go back to a smaller number of original languages, even to a single mother tongue, scholars disagree on how far back current methods permit us to trace the history of languages.
While it is normal for languages to borrow from other languages, occasionally a totally new language is created by mixing elements of two distinct languages to such a degree that we would not want to identify one of the source languages as the mother tongue. This is what led to the development of Media Lengua, a language of Ecuador formed through contact among speakers of Spanish and speakers of Quechua. In this language, practically all the word stems are from Spanish, while all of the endings are from Quechua. Just a handful of languages have come into being in this way, but less extreme forms of language mixture have resulted in over a hundred pidgins and creoles currently spoken in many parts of the world. Most arose during Europe’s colonial era, when European colonists used their language to communicate with local inhabitants, who in turn blended vocabulary from the European language with grammar largely from their native language.
Also among the languages of the world are about 300 sign languages used mainly in communicating among and with the deaf. The structure of sign languages typically has little historical connection to the structure of nearby spoken languages.
Some languages have been constructed expressly, often by a single individual, to meet communication demands among speakers with no common language. Esperanto, designed to serve as a universal language and used as a second language by some two million, according to some estimates, is the prime example, but it is only one among several hundred would-be international auxiliary languages.
This essay surveys the languages of the world continent by continent, ending with descriptions of sign languages and of pidgins and creoles. A set of references grouped by section appears at the very end. The main source for data on language classification, numbers of languages, and speakers is the 19th edition of Ethnologue (see Resources), except where a different source is cited.
Phonological learnability deals with the formal properties of phonological languages and grammars, which are combined with algorithms that attempt to learn the language-specific aspects of those grammars. The classical learning task can be outlined as follows: Beginning at a predetermined initial state, the learner is exposed to positive evidence of legal strings and structures from the target language, and its goal is to reach a predetermined end state, where the grammar will produce or accept all and only the target language’s strings and structures. In addition, a phonological learner must also acquire a set of language-specific representations for morphemes, words and so on—and in many cases, the grammar and the representations must be acquired at the same time.
Phonological learnability research seeks to determine how the architecture of the grammar, and the workings of an associated learning algorithm, influence success in completing this learning task, i.e., in reaching the end-state grammar. One basic question is about convergence: Is the learning algorithm guaranteed to converge on an end-state grammar, or will it never stabilize? Is there a class of initial states, or a kind of learning data (evidence), which can prevent a learner from converging? Next is the question of success: Assuming the algorithm will reach an end state, will it match the target? In particular, will the learner ever acquire a grammar that deems grammatical a superset of the target language’s legal outputs? How can the learner avoid such superset end-state traps? Are learning biases advantageous or even crucial to success?
In assessing phonological learnability, the analyst also has many differences between potential learning algorithms to consider. At the core of any algorithm is its update rule, meaning its method(s) of changing the current grammar on the basis of evidence. Other key aspects of an algorithm include how it is triggered to learn, how it processes and/or stores the errors that it makes, and how it responds to noise or variability in the learning data. Ultimately, the choice of algorithm is also tied to the type of phonological grammar being learned, i.e., whether the generalizations being learned are couched within rules, features, parameters, constraints, rankings, and/or weightings.
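As a rough illustration of what an update rule and a convergence check can look like, the following is a minimal sketch of an error-driven learner over weighted constraints, using a perceptron-style update of the kind associated with Harmonic Grammar. It is one possible algorithm among many, not one the summary above singles out. The two constraints (NoVoicedCoda, Ident(voice)), the toy final-devoicing data, the learning rate, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# A candidate is (surface form, violation counts), one count per constraint.
# Hypothetical constraint order: [NoVoicedCoda, Ident(voice)].
Candidate = Tuple[str, List[int]]

# Toy candidate sets for two underlying forms (illustrative, not from a real corpus).
CANDIDATES: Dict[str, List[Candidate]] = {
    "bad": [("bad", [1, 0]), ("bat", [0, 1])],
    "bat": [("bat", [0, 0]), ("bad", [1, 1])],
}

# Positive evidence from a final-devoicing target language: (underlying, observed).
DATA: List[Tuple[str, str]] = [("bad", "bat"), ("bat", "bat")]


def harmony(weights: List[float], violations: List[int]) -> float:
    """Harmony is the negative weighted sum of violations."""
    return -sum(w * v for w, v in zip(weights, violations))


def best_candidate(weights: List[float], candidates: List[Candidate]) -> Candidate:
    """The grammar's output is the candidate with the highest harmony."""
    return max(candidates, key=lambda c: harmony(weights, c[1]))


def learn(data, candidates, weights, rate=0.5, max_epochs=100):
    """Error-driven learning: weights change only when the learner's output is wrong.

    Returns the final weights and the number of epochs that contained errors.
    """
    weights = list(weights)
    for epoch in range(max_epochs):
        errors = 0
        for underlying, observed in data:
            cands = candidates[underlying]
            predicted_form, predicted_viol = best_candidate(weights, cands)
            if predicted_form == observed:
                continue  # no error, so nothing triggers learning
            errors += 1
            observed_viol = dict(cands)[observed]
            # Update rule: raise weights of constraints the wrong output violates,
            # lower weights of constraints the observed form violates (floor at 0).
            weights = [
                max(0.0, w + rate * (p - o))
                for w, p, o in zip(weights, predicted_viol, observed_viol)
            ]
        if errors == 0:
            return weights, epoch  # a full pass without errors: converged
    return weights, max_epochs


if __name__ == "__main__":
    # Initial state biased toward faithfulness, so the learner starts out wrong.
    final_weights, epochs = learn(DATA, CANDIDATES, [0.0, 1.0])
    print(final_weights, epochs)
```

In this toy setting the learner stops once a full pass over the data produces no errors. Whether such a stopping point is guaranteed, and whether it matches the target rather than a superset language, are exactly the convergence and success questions raised above.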
Eve V. Clark
The words and word-parts children acquire at different stages offer insights into how the mental lexicon might be organized. Children first identify ‘words,’ recurring sequences of sounds, in the speech stream, attach some meaning to them, and, later, analyze such words further into parts, namely stems and affixes. These are the elements they store in memory in order to recognize them on subsequent occasions. They also serve as target models when children try to produce those words themselves. When they coin words, they make use of bare stems, combine certain stems with each other, and sometimes add affixes as well. The options they choose depend on how much they need to add to coin a new word, which familiar elements they can draw on, and how productive that option is in the language. Children’s uses of stems and affixes in coining new words also reveal that they must be relying on one representation in comprehension and a different representation in production. For comprehension, they need to store information about the acoustic properties of a word, taking into account different occasions, different speakers, and different dialects, not to mention second-language speakers. For production, they need to work out which articulatory plan to follow in order to reproduce the target word. And they take time to get their production of a word aligned with the representation they have stored for comprehension. In fact, there is a general asymmetry here, with comprehension being ahead of production for children, and also being far more extensive than production, for both children and adults. Finally, as children add more words to their repertoires, they organize and reorganize their vocabulary into semantic domains. In doing this, they make use of pragmatic directions from adults that help them link related words through a variety of semantic relations.
The term lexicalization describes the addition of new open-class elements to a repository of holistically processed linguistic units. At the basis of lexicalization are word-formation processes such as affixation, compounding, or borrowing, which are a necessary precondition for lexicalization. Still, lexicalization goes beyond word formation in important respects. First, lexicalization also involves multi-word expressions and set phrases; second, it includes a range of processes that follow the coinage of a new element. These processes conjointly lead to holistic processing, that is, the cognitive treatment of a linguistic element as a unified whole. Holistic processing contrasts with analytic processing, which is the cognitive treatment of a linguistic unit as a complex whole that is composed of several parts. Lexicalization is usefully contrasted with grammaticalization, that is, the emergence of new linguistic units that fulfill grammatical functions. Finally, lexicalization is also a concept that lends itself to the study of cross-linguistic differences in the types of meaning that are lexicalized in specific domains such as, for example, motion.
The central goal of the Lexical Semantic Framework (LSF) is to characterize the meaning of simple lexemes and affixes and to show how these meanings can be integrated in the creation of complex words. LSF offers a systematic treatment of issues that figure prominently in the study of word formation, such as the polysemy question, the multiple-affix question, the zero-derivation question, and the form and meaning mismatches question.
LSF has its source in a confluence of research approaches that follow a decompositional approach to meaning and, thus, defines simple lexemes and affixes by way of a systematic representation that is achieved via a constrained formal language that enforces consistency of annotation. Lexical-semantic representations in LSF consist of two parts: the Semantic/Grammatical Skeleton and the Semantic/Pragmatic Body (henceforth ‘skeleton’ and ‘body’ respectively). The skeleton is comprised of features that are of relevance to the syntax. These features act as functions and may take arguments. Functions and arguments of a skeleton are hierarchically arranged. The body encodes all those aspects of meaning that are perceptual, cultural, and encyclopedic.
Features in LSF are used in (a) a cross-categorial, (b) an equipollent, and (c) a privative way. This means that they are used to account for the distinction between the major ontological categories, may have a binary (i.e., positive or negative) value, and may or may not form part of the skeleton of a given lexeme. In order to account for the fact that several distinct parts integrate into a single referential unit that projects its arguments to the syntax, LSF makes use of the Principle of Co-indexation. Co-indexation is a device needed in order to tie together the arguments that come with different parts of a complex word to yield only those arguments that are syntactically active.
LSF has an important impact on the study of the morphology-lexical semantics interface and provides a unitary theory of meaning in word formation.
Lexical semantics is the study of word meaning. Descriptively speaking, the main topics studied within lexical semantics involve either the internal semantic structure of words, or the semantic relations that occur within the vocabulary. Within the first set, major phenomena include polysemy (in contrast with vagueness), metonymy, metaphor, and prototypicality. Within the second set, dominant topics include lexical fields, lexical relations, conceptual metaphor and metonymy, and frames. Theoretically speaking, the main approaches that have succeeded each other in the history of lexical semantics are prestructuralist historical semantics, structuralist semantics, and cognitive semantics. These theoretical frameworks differ as to whether they take a system-oriented rather than a usage-oriented approach to word-meaning research but, at the same time, in the historical development of the discipline, they have each contributed significantly to the descriptive and conceptual apparatus of lexical semantics.
Elizabeth Lanza and Hirut Woldemariam
The linguistic landscape (henceforth LL) has proven to be a fruitful approach for investigating various societal dimensions of written language use in the public sphere. First introduced in the context of bilingual Canada as a gauge for measuring ethnolinguistic vitality, in the 21st century it is the focus of a thriving field of inquiry with its own conference series, an increasing number of publications, and an international journal dedicated exclusively to investigating language and other semiotic resources used in the public arena. The scholarship in this domain has centered on European and North American geographical sites; however, an increasingly voluminous share of studies addresses the LL of sites across the world through both books and articles. African contributions have added an important dimension to this knowledge base as southern multilingualisms bring into question the very concept of language in that speakers and writers draw on their rich linguistic repertoires, avoiding any compartmentalization or separation of what is traditionally conceived of as languages. The LL of Ethiopia has contributed to this growing base of empirical studies in the exploration of language policy issues, identity constructions, language contact, and the sociolinguistics of globalization. A new language policy of ethnic federalism was introduced to the country in the 1990s following a civil war and through a new constitution. This policy was set to recognize the various ethnolinguistic groups in the country and the official use of ethnic/regional languages to satisfy local political and educational needs. Through this, languages previously unwritten required a script in order for speakers to communicate in them in written texts. And many regions have chosen the Latin script above the Ethiopic script. Nonetheless, some languages remain invisible in the public sphere. These events create an exciting laboratory for studying the LL. 
Given the change of language policy since the late 20th century and the fast-growing economy of Ethiopia (one of the poorest countries on the continent), the manifest and increasingly visible display of languages in the LL provides an excellent lens for studying various sociolinguistic phenomena.
Indian linguistic thought begins around the 8th–6th centuries
The greater part of documented thought is related to Sanskrit (Ancient Indo-Aryan). Very early, the oral transmission of sacred texts—the Vedas, composed in Vedic Sanskrit—made it necessary to develop techniques based on a subtle analysis of language. The Vedas also—but presumably later—gave birth to bodies of knowledge dealing with language, which are traditionally called Vedāṅgas: phonetics (śikṣā), metrics (chandas), grammar (vyākaraṇa), and semantic explanation (nirvacana, nirukta). Later on, Vedic exegesis (mīmāṃsā), new dialectics (navya-nyāya), lexicography, and poetics (alaṃkāra) also contributed to linguistic thought.
Though languages other than Sanskrit were described in premodern India, the grammatical description of Sanskrit—given in Sanskrit—dominated and influenced them more or less strongly. Sanskrit grammar (vyākaraṇa) has a long history marked by several major steps (Padapāṭha versions of Vedic texts, Aṣṭādhyāyī of Pāṇini, Mahābhāṣya of Patañjali, Bhartṛhari’s works, Siddhāntakaumudī of Bhaṭṭoji Dīkṣita, Nāgeśa’s works), and the main topics it addresses (minimal meaning-bearer units, classes of words, relation between word and meaning/referent, the primary meaning/referent of nouns) are still central issues for contemporary linguistics.
The linguistic study of literature addresses the ways in which language is differently organized in verbal art (literature): form is added to language, altered, attenuated, and differently grouped. These different kinds of organization are normatively subject to limits, some derived from limits on general linguistic form or language-specific linguistic form. However, linguistic form can in principle be altered in any way at all, for example, in avant-garde texts or to produce artificial languages for literature; this possibility raises the general question of whether some organizations of literary language are cognitively transparent and others are cognitively opaque.
Of the various added forms, the most extensively studied has been metrical form, which requires the words of the text to be grouped into lines. Metrical form combines a non-linguistic counting system with a rhythmic system that adapts the rhythmic systems of ordinary phonology; most accounts of meter have focused on the rhythms as these are of greater linguistic interest than counting (which plays no significant role in language in general). The metrical line may have a special status, as a cognitively privileged level of grouping, possibly because it is fitted to working memory. Rhyme and alliteration are two common kinds of added form; most linguistic interest has been in what counts as “similarity of sound” between two words, whether at a surface or underlying level. Rhyming and alliterating words are distributed relative to the grouping into lines and other constituents. The other major kind of added form is parallelism, where two sections of text are structurally similar, usually in syntax and vocabulary. The various added forms may allow for variation (e.g., every line in an English sonnet can be in a different rhythmic variation of iambic pentameter), and can be intermittently present; there is no clear equivalent to ‘grammaticality’ in literary linguistic form. This may be because literary linguistic form holds as a presumption about a text, derived by inference, rather than as a constitutive structural device.
All literary texts have a discourse structure, which includes division into various types of group or constituent, including the division of a narrative into episodes, exploiting verbal cues of episodic boundaries. Narratives also require the tracking of referents such as people and objects across the discourse, which draws on the study of pronominals. Literary texts may also have a distinctive vocabulary, borrowing or inventing words to an unusual degree, and engaging in various kinds of wordplay.
Literary texts have ‘style’ and ‘markedness’, ways in which the language varies in noticeable ways but without coding a different linguistic semantics. These stylistic variations are sometimes treated as having determinate interpretations, but there are also approaches to stylistic variations in literature that treat them as having a non-determinate relation to meaning. Literature cannot have a different semantics or pragmatics from ordinary language, but meaning can be ‘difficult’ in literature in ways not characteristic of much ordinary language (but in common with ritual speech and other ways of speaking).
A major mode of linguistic investigation involves corpora, over which statistical analyses are undertaken. This has a relation to the question of whether our literary-linguistic knowledge has a probabilistic basis, a question that ties the study of language to questions of expectation in aesthetics (e.g., music) more generally. Literature exists in various modalities—writing, oral literature, and signed literature—and linguistic approaches to literature have been sensitive to this, as well as to the special questions about how texts are set to music in songs.
Phenomena involving the displacement of syntactic units are widespread in human languages. The term displacement refers here to a dependency relation whereby a given syntactic constituent is interpreted simultaneously in two different positions. Only one position is pronounced, in general the hierarchically higher one in the syntactic structure. Consider a wh-question like (1) in English:
(1) Whom did you give the book to <whom>
The phrase containing the interrogative wh-word is located at the beginning of the clause, and this guarantees that the clause is interpreted as a question about this phrase; at the same time, whom is interpreted as part of the argument structure of the verb give (the copy, in angle brackets). In current terms, inspired by minimalist developments in generative syntax, the phrase whom is first merged as (one of) the complement(s) of give (External Merge) and then re-merged (Internal Merge, i.e., movement) in the appropriate position in the left periphery of the clause. This peripheral area of the clause hosts operator-type constituents, including interrogative ones (yielding the relevant interpretation for sentence (1): for which x, you gave the book to x). Scope-discourse phenomena, such as the raising of a question as in (1) or the focalization of one constituent as in TO JOHN I gave the book (not to Mary), have the effect that an argument of the verb is fronted in the left periphery of the clause rather than filling its clause-internal complement position, whence the term displacement. Displacement can be to a position relatively close to the one of first merge (the copy), or else it can be to a position farther away. In the latter case, the relevant dependency becomes more long-distance than in (1), as in (2)a and even more so (2)b:
(2) a. Whom did Mary expect [that you would give the book to <whom>]
    b. Whom do you think [that Mary expected [that you would give the book to <whom>]]
Some 50 years of investigation into locality in formal generative syntax have shown that, despite its potentially very distant realization, syntactic displacement is in fact a local process. The audible position in which a moved constituent is pronounced and the position of its copy inside the clause can be far from each other. However, the long-distance dependency is split into steps through iterated applications of short movements, so that any dependency holding between two occurrences of the same constituent is in fact very local. Furthermore, there are syntactic domains that resist movement out of them, traditionally referred to as islands. Locality is a core concept of syntactic computations. Syntactic locality requires that syntactic computations apply within small domains (cyclic domains), possibly in the iterated way just mentioned (successive cyclicity), currently reconceived in terms of Phase theory. Furthermore, in the Relativized Minimality tradition, syntactic locality requires that, given X . . . Z . . . Y, the dependency between the relevant constituent in its target position X and its first merge position Y should not be interrupted by any constituent Z which is similar to X in relevant formal features and thus intervenes, blocking the relation between X and Y. Intervention locality has also been shown to allow for an explicit characterization of aspects of children’s linguistic development in their capacity to compute complex object dependencies (also relevant in different impaired populations).
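The intervention configuration X . . . Z . . . Y can be made concrete with a small sketch. The code below is a deliberately simplified illustration, not a faithful implementation of Relativized Minimality: it uses linear order as a stand-in for hierarchical structure, treats "similar in relevant formal features" as plain feature overlap, and uses made-up feature labels ("wh", "D"). The second example sentence (a wh-island) and all names in the code are assumptions introduced for the example.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence


class Constituent(NamedTuple):
    label: str
    features: FrozenSet[str]  # hypothetical formal features, e.g. {"wh"}


def find_intervener(
    chain: Sequence[Constituent], x: int, y: int, relevant: FrozenSet[str]
) -> Optional[Constituent]:
    """Return the first Z strictly between X (index x) and Y (index y) that
    shares a relevant feature with X, or None if the dependency is unblocked.

    Simplification: positions are linear indices, not c-command relations.
    """
    if not 0 <= x < y < len(chain):
        raise ValueError("expected 0 <= x < y < len(chain)")
    x_relevant = chain[x].features & relevant
    for z in chain[x + 1 : y]:
        if z.features & x_relevant:
            return z
    return None


WH = frozenset({"wh"})

# Sentence (1): Whom did you give the book to <whom>
SIMPLE = [
    Constituent("whom", frozenset({"wh"})),
    Constituent("you", frozenset({"D"})),
    Constituent("<whom>", frozenset({"wh"})),
]

# Hypothetical wh-island: What do you wonder [who bought <what>]
ISLAND = [
    Constituent("what", frozenset({"wh"})),
    Constituent("you", frozenset({"D"})),
    Constituent("who", frozenset({"wh"})),
    Constituent("<what>", frozenset({"wh"})),
]

if __name__ == "__main__":
    print(find_intervener(SIMPLE, 0, 2, WH))  # no wh-element in between
    print(find_intervener(ISLAND, 0, 3, WH))  # "who" shares the wh feature
```

Under these assumptions the subject you does not intervene in (1) because it carries no feature of the relevant class, whereas an intervening wh-phrase does.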
Konstantin Pozdniakov, Guillaume Segerer, and Valentin Vydrin
The Atlantic family includes 40 to 50 languages spoken in the coastal countries of West Africa, from southern Mauritania to Liberia; the Fula language of the Fulbe people is dispersed over Sahelian Africa up to Sudan and Eritrea. The Proto-Mande (second half of the 3rd millennium
In the study of Mande and Atlantic language contacts, the major interest is represented by lexical borrowings that can be subdivided into recent (2nd millennium
Among the recent borrowings, those from Mande to Atlantic languages are more numerous. The most visible layers are the following:
– from Soninke to Fula; these loans are quite numerous and date back mostly to the period of the mighty Wagadu/Ghana medieval polity (before the 12th to 13th centuries); the dispersion of Fulbe over West Africa took place afterward;
– from Soninke to Sereer; these loans are much scarcer and go back to the period of coexistence of the ancestors of Soninke and Sereer in southern Mauritania or the lower Senegal, before the Sereers moved further to the south;
– from Mandinka to numerous Atlantic languages of the Southern Senegambia, since the end of the 1st millennium
– from Maninka to Atlantic languages of Guinea (especially those of the Tenda and Jaad groups, but also to the Futa-Jallon Fula);
– from Kakabe to Pular, since the 18th century, when Kakabe (and probably other varieties of the Mokole group) served as substrata for the dominant Pular language;
– from Susu (and probably Jalonke) to Atlantic languages of Maritime Guinea: Baga Fore, Baga Pukur (Mboteni and Binari), Nalu, Basari, but also to the Futa-Jallon Fula.
The main groups of Atlantic loans into Mande are the following:
– Fula loans in Kakabe constitute up to 30% of the vocabulary of the language (with the exception of the southeastern dialects, much less influenced by Fula);
– there are numerous Fula loans in Soninke dating back to the same period of coexistence of the ancestors of Fulbe and Soninke in Takrūr and Futa-Toro;
– much less numerous Sereer loans in Soninke, most probably dating back to the same period as Sereer > Soninke borrowings;
– borrowings from Wolof to Soninke, but also to Bambara and Mandinka, dating back mainly to the colonial or postcolonial periods;
– Mandinka words from the substrata of minor Atlantic languages of Senegambia.
Cases of chain borrowing (e.g., Soninke > Fula > Kakabe) are attested.
Ancient borrowings are often difficult to distinguish from the common Niger-Congo stock, and it is not evident, in many cases, in what direction the borrowing occurred.
In the phonology and morphosyntax, several important features of Soninke may be due to the Fula or Fula-Sereer influence: the 5-vowel (instead of 7-vowel) system, initial consonant alternation, presence of geminated consonants. There are instances of borrowing of derivational suffixes from Fula to Soninke and from Soninke to Fula. In Kakabe, massive Fula loans have resulted in borrowing of implosive consonants ɓ, ɗ, ƴ and in the emergence of geminated consonants. In the northwestern dialect of Kakabe, a suffix of passive voice has been borrowed from Fula.
Mande is a mid-range language family in Western Sub-Saharan Africa that includes 60 to 75 languages spoken by 30 to 40 million people. According to the glottochronological data, its genetic depth is between 5,000 and 5,500 years. The Proto-Mande homeland can presumably be located in the western part of the southern Sahara. Lexical data suggest that the Mande family belongs to the Niger-Congo macrofamily, but some scholars doubt it, mainly because of the lack of morphological cognates.
The first division of Mande is binary, into Western and Southeastern branches. Further on, the Western branch is subdivided into nine groups: Manding, Mokole, Vai-Kono, Jogo-Jeri, Southwestern, Susu-Jalonke, Samogho, Soninke-Bozo, and Bobo. The Southeastern branch consists of Southern and Eastern groups. The biggest Mande languages, Bambara, Maninka, Mandinka, and Jula, belong to the Manding group.
Practically all Mande languages are tonal (two to five level tones), and the tones fulfil both lexical and grammatical functions. The typical syllable structure is CV; in many languages the type CVN is also attested, while CVC is rare (Soninke, Bisa). The metrical foot is a relevant unit for many Mande languages.
The typical basic word order in a verbal clause is Subject—Auxiliary—Direct Object—Verb—Oblique. Omission of a subject is possible in some Southern and Southwestern languages, where subject pronouns have merged with auxiliaries into Personal Pronominal Markers; otherwise an overt subject is obligatory.
Inflectional morphology is almost absent in some languages and largely innovative in others. Noun classes and grammatical genders are lacking. In most languages, there is only one plural marker (sometimes two); agreement in number is usually missing. Morphological case is most often absent, although it is attested in some pronominal systems; noun declension is emerging in Dan (Southern Mande). In Southern, Southwestern, and Eastern groups and Bobo, there are multiple series of personal pronouns expressing case, communicative status, and often verbal categories as well (aspect, mode, polarity).
Verbal lability (mainly P-lability) is highly productive in many Mande languages, including typologically rare passive lability.
Derivational morphology is relatively rich, only suffixal for nouns, but either suffixal or prefixal for verbs. In many languages, preverbs are still separable. Reduplication is productive in many languages for pluriactionality and intensity, sometimes for nominal plurality. Word compounding is highly productive.
The structure of the noun phrase is N2 + N1 + Adj + Det (N1 is the head noun, N2 is the dependent noun). In most Mande languages, alienable and inalienable nouns are formally distinguished; the former are connected to the possessor by auxiliary words, and in some languages, they require a special possessive series of personal pronouns.
Nominative-accusative alignment is predominant; in the Southwestern group, split semantic or ergative alignments are attested.
For relativization, varieties of correlative strategy are mostly used.
At the beginning of the 21st century, Roman-based alphabets are used for nearly all languages of the family. Arabic-based writing systems (Ajami) are of limited use for Mandinka, Jula, Susu, and Mogofin. An original syllabic writing has existed since the 1820s for Vai; since the 1950s, an original alphabet, N’ko, has been widely used for Manding languages.
Nora C. England
Mayan languages are spoken by over 5 million people in Guatemala, Mexico, Belize, and Honduras. There are around 30 different languages today, ranging in size from fairly large (about a million speakers) to very small (fewer than 30 speakers). All Mayan languages are endangered given that at least some children in some communities are not learning the language, and two languages have disappeared since European contact. Mayas developed the most elaborate and most widely attested writing system in the Americas (starting about 300 BC).
The sounds of Mayan languages consist of a voiceless stop and affricate series with corresponding glottalized stops (either implosive or ejective) and affricates, glottal stop, voiceless fricatives (including h in some of them inherited from Proto-Maya), two to three nasals, three to four approximants, and a five-vowel system with contrasting vowel length (or tense/lax distinctions) in most languages. Several languages have developed contrastive tone.
The major word classes in Mayan languages include nouns, verbs, adjectives, positionals, and affect words. The difference between transitive verbs and intransitive verbs is rigidly maintained in most languages. They usually use the same aspect markers (but not always). Intransitive verbs only indicate their subjects while transitive verbs indicate both subjects and objects. Some languages have a set of status suffixes which is different for the two classes. Positionals are a root class whose most characteristic word form is a non-verbal predicate. Affect words indicate impressions of sounds, movements, and activities. Nouns have a number of different subclasses defined on the basis of characteristics when possessed, or the structure of compounds. Adjectives are formed from a small class of roots (under 50) and many derived forms from verbs and positionals.
Predicate types are transitive, intransitive, and non-verbal. Non-verbal predicates are based on nouns, adjectives, positionals, numbers, demonstratives, and existential and locative particles. They are distinct from verbs in that they do not take the usual verbal aspect markers. Mayan languages are head marking and verb initial; most have VOA flexible order but some have VAO rigid order. They are morphologically ergative and also have at least some rules that show syntactic ergativity. The most common of these is a constraint on the extraction of subjects of transitive verbs (ergative) for focus and/or interrogation, negation, or relativization. In addition, some languages make a distinction between agentive and non-agentive intransitive verbs. Some also can be shown to use obviation and inverse as important organizing principles. Voice categories include passive, antipassive and agent focus, and an applicative with several different functions.
Laura A. Michaelis
Meanings are assembled in various ways in a construction-based grammar, and this array can be represented as a continuum of idiomaticity, a gradient of lexical fixity. Constructional meanings are the meanings to be discovered at every point along the idiomaticity continuum. At the leftmost, or ‘fixed,’ extreme of this continuum are frozen idioms, like the salt of the earth and in the know. The set of frozen idioms includes those with idiosyncratic syntactic properties, like the fixed expression by and large (an exceptional pattern of coordination in which a preposition and adjective are conjoined). Other frozen idioms, like the unexceptionable modified noun red herring, feature syntax found elsewhere. At the rightmost, or ‘open’ end of this continuum are fully productive patterns, including the rule that licenses the string Kim blinked, known as the Subject-Predicate construction. Between these two poles are (a) lexically fixed idiomatic expressions, verb-headed and otherwise, with regular inflection, such as chew/chews/chewed the fat; (b) flexible expressions with invariant lexical fillers, including phrasal idioms like spill the beans and the Correlative Conditional, such as the more, the merrier; and (c) specialized syntactic patterns without lexical fillers, like the Conjunctive Conditional (e.g., One more remark like that and you’re out of here). Construction Grammar represents this range of expressions in a uniform way: whether phrasal or lexical, all are modeled as feature structures that specify phonological and morphological structure, meaning, use conditions, and relevant syntactic information (including syntactic category and combinatoric potential).
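The uniform representation described above can be pictured as a small data structure. The following is a minimal sketch only, not the formalism used in the Construction Grammar literature: the class name `Construction`, its field names, and the glosses in the examples are all hypothetical labels chosen to mirror the properties listed in the summary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Construction:
    """Toy feature structure; every field name here is an illustrative assumption."""
    form: str                 # phonological/morphological shape; open slots in capitals
    meaning: str              # informal gloss of the conventional meaning
    use_conditions: str       # informal note on conditions of use
    category: str             # syntactic category of the whole expression
    combines_with: List[str] = field(default_factory=list)  # combinatoric potential
    fixed_words: List[str] = field(default_factory=list)    # invariant lexical fillers

    def is_lexically_open(self) -> bool:
        # A pattern with no invariant words sits at the 'open' end of the continuum.
        return not self.fixed_words


# A frozen idiom: every word is fixed.
by_and_large = Construction(
    form="by and large",
    meaning="in general",
    use_conditions="hedges a generalization",
    category="adverbial",
    fixed_words=["by", "and", "large"],
)

# A fully productive pattern: no fixed words, two open slots.
subject_predicate = Construction(
    form="SUBJECT PREDICATE",
    meaning="predication of the subject",
    use_conditions="none beyond those of its parts",
    category="clause",
    combines_with=["subject", "predicate"],
)
```

Both a frozen idiom and a productive rule are instances of the same type here, which is the point of the uniform treatment; only the amount of lexically fixed material differs.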
Matthew K. Gordon
Metrical structure refers to the phonological representations capturing the prominence relationships between syllables, usually manifested phonetically as differences in levels of stress. There is considerable diversity in the range of stress systems found cross-linguistically, although attested patterns represent a small subset of those that are logically possible. Stress systems may be broadly divided into two groups, based on whether they are sensitive to the internal structure, or weight, of syllables or not, with further subdivisions based on the number of stresses per word and the location of those stresses. An ongoing debate in metrical stress theory concerns the role of constituency in characterizing stress patterns. Certain approaches capture stress directly in terms of a metrical grid in which more prominent syllables are associated with a greater number of grid marks than less prominent syllables. Others assume the foot as a constituent, where theories differ in the inventory of feet they assume. Support for foot-based theories of stress comes from segmental alternations that are explicable with reference to the foot but do not readily emerge in an apodal framework. Computational tools, increasingly, are being incorporated in the evaluation of phonological theories, including metrical stress theories. Computer-generated factorial typologies provide a rigorous means for determining the fit between the empirical coverage afforded by metrical theories and the typology of attested stress systems. Computational simulations also enable assessment of the learnability of metrical representations within different theories.
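The grid-based view mentioned above can be illustrated with a short sketch. It assumes a hypothetical weight-insensitive system, not drawn from any particular language, in which stress falls on every odd-numbered syllable counting from the left and the leftmost stress is primary. The function names `build_grid` and `render` are illustrative.

```python
def build_grid(n_syllables):
    """Return the number of grid marks per syllable for a hypothetical system:
    stress on syllables 1, 3, 5, ... (from the left), primary stress on syllable 1."""
    marks = []
    for i in range(n_syllables):
        count = 1          # level 1: every syllable receives one mark
        if i % 2 == 0:
            count = 2      # level 2: stressed syllables receive a second mark
        if i == 0:
            count = 3      # level 3: the primary stress receives a third mark
        marks.append(count)
    return marks


def render(marks):
    """Draw the grid as rows of 'x' (mark) and '.' (no mark), highest level first."""
    if not marks:
        return ""
    rows = []
    for level in range(max(marks), 0, -1):
        rows.append(" ".join("x" if m >= level else "." for m in marks))
    return "\n".join(rows)


# A five-syllable word: columns are syllables, taller columns are more prominent.
print(render(build_grid(5)))
```

In the printed grid, the first column is three marks high, the third and fifth are two marks high, and the second and fourth have a single mark, so relative prominence is read directly off column height without reference to feet.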
Cynthia L. Allen
Middle English is the name given to the English of the period from approximately 1100 to approximately 1450. This period is marked by substantial developments in all areas of English grammar. It is also the period of English when different dialects are the most fully attested in the texts. At the beginning of the Middle English period, the sociolinguistic status of English was low due to the Norman Invasion, and although religious texts of Old English composition continued to be copied and updated, few original compositions are extant. By the end of the period, English had regained its status as the language of government, law, and literature generally.
Although some notable changes to the phonemic inventory of consonants date from the Middle English period, the most dramatic phonological developments of the period involve vowels. The reduction of the vowels of unstressed syllables, one of the changes that marks the beginning of the Middle English period, is a phonological change with far-reaching morphological effects, as it substantially reduced the number of distinctive inflectional forms. Constituent order replaced case marking as the primary means of signaling grammatical relations. By the end of the Middle English period, subject-verb-object order had become established as the norm.
The lexicon of English was transformed in this period by an enormous influx of French words. The role of derivational morphology declined as its functions were to some extent replaced by the adoption of French words. Most Scandinavian loans in English first appear in the texts of this period. The Scandinavian loans are typically everyday words, while the words adopted from French are more often in areas of government, law, and higher culture, reflecting the nature of the contact between English speakers and the speakers of these languages.
The density of the Scandinavian population in the northern part of England is generally held to be responsible for the earlier appearance of changes in the north than in the south. The replacement of the third person plural personal pronoun hie by the Scandinavian they is an example of a development which is apparent only in the north early in Middle English but became general in English by the end of this period.
An important phonological development of later Middle English is the beginning of the Great Vowel Shift, which affected long vowels through a series of successive changes. It was implemented differently in different dialects, the north-south divide being the most evident.
Early Middle English is a language that cannot be understood by Modern English readers without special study, while the language of the late Middle English period, especially that coming from the London area, can be understood with the heavy use of explanatory notes.
Missionary dictionaries are printed books or manuscripts compiled by missionaries in which words are listed systematically, each followed by words with the same meaning in another language. These dictionaries were mainly written as tools for language teaching and learning in a missionary-colonial setting, although quite a few also have a more encyclopedic character, containing invaluable information on non-Western cultures from all continents. In this article, several types of dictionaries are analyzed: bilingual-monodirectional, bilingual-bidirectional, and multilingual. Most examples are taken from an illustrative selected corpus of missionary dictionaries describing non-Western languages during the colonial period, with particular focus on the function of these dictionaries in a missionary context, the users, macrostructure, organizational principles, and the typology of the microstructure and markedness in lemmatization.
Missionary grammars are printed books or manuscripts compiled by missionaries in which a particular language is described. These grammars were mainly written as pedagogical tools for language teaching and learning in a missionary-colonial setting, although quite a few also have a more normative character. Missionary grammars usually contain an opening section, a prologue, in which the author sets out the objectives of his work. The first part is usually a short introduction to phonology and orthography, followed by the largest section, which is devoted to morphology, arranged according to the traditional division of the parts of speech. The final section is sometimes devoted to syntax, but the topics included can vary considerably. Sometimes word lists are appended, containing body parts, measures, counting, manners of speaking, or rhetorical figures. The data presented in the grammars are mainly based on an oral corpus, although in some cases high registers from prestigious texts are used to illustrate the eloquence or elegance of the language under study. These grammars are modeled on the traditional Greco-Latin framework and often contain invaluable information regarding language typologies, semantics, and pragmatics. In the New World, Asia, and elsewhere, missionaries had to find an adequate methodology in order to describe typological features they had never seen before. They adapted European models to new linguistic realities and created original works which deserve our attention within the discipline of the history of linguistics alongside contemporary pedagogical works written in Europe. This article concentrates on sources written in Spanish, Portuguese, and Latin during the colonial period, since these sources outnumber the missionary grammars produced in other languages.
Mixed languages are a rare category of contact language which has gone from being an oddity of contact linguistics to the subject of media excitement, at least for one mixed language—Light Warlpiri. They show considerable diversity in structure, social function, and historical origins; nonetheless, they all emerged in situations of bilingualism where a common language is already present. In this respect, they do not serve a communicative function, but rather are markers of an in-group identity. Mixed languages provide a unique opportunity to study the often observable birth, life, and death of languages both in terms of the sociohistorical context of language genesis and the structural evolution of language.
Computational models of human sentence comprehension help researchers reason about how grammar might actually be used in the understanding process. Taking a cognitivist approach, this article relates computational psycholinguistics to neighboring fields (such as linguistics), surveys important precedents, and catalogs open problems.