Type theory is a regime for classifying objects (including events) into categories called types. It was originally designed to overcome problems in the foundations of mathematics arising from Russell’s paradox. It has made an immense contribution to the study of logic and computer science and has also played a central role in formal semantics for natural languages since the initial work of Richard Montague building on the typed λ-calculus. More recently, type theories following in the tradition created by Per Martin-Löf have presented an important alternative to Montague’s type theory for semantic analysis. These more modern type theories yield a rich collection of types, which take on the role of representing semantic content rather than simply structuring the universe in order to avoid paradoxes.
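As a rough illustration of the kind of type system Montague built on, the following sketch (an invented minimal example, not drawn from this article) encodes the basic semantic types e (entities) and t (truth values) in Python’s type notation and treats one- and two-place predicates as functions between them.

```python
from typing import Callable

# Montague-style semantic types, sketched in Python type annotations.
# This is an illustrative assumption, not the article's own formalism:
#   e           -> entities (here: strings naming individuals)
#   t           -> truth values (bool)
#   <e, t>      -> one-place predicates: functions from entities to truth values
#   <e, <e, t>> -> two-place predicates (e.g., transitive verbs), curried

Entity = str
Pred1 = Callable[[Entity], bool]      # type <e, t>
Pred2 = Callable[[Entity], Pred1]     # type <e, <e, t>>

sleeps: Pred1 = lambda x: x in {"kim", "sandy"}
likes: Pred2 = lambda y: (lambda x: (x, y) in {("kim", "sandy")})

print(sleeps("kim"))           # True: 'Kim sleeps' holds in this toy model
print(likes("sandy")("kim"))   # True: 'Kim likes Sandy'
```

Martin-Löf-style type theories go further than this two-type sketch: types themselves (e.g., a type of running events) carry semantic content rather than merely classifying expressions.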
Article
Yvan Rose
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on each side of the general divide between generative vs. functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty. In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.
Article
Anne Storch
Even though the concept of multilingualism is well established in linguistics, it is problematic, especially in light of the actual ways in which repertoires are composed and used. The term “multilingualism” bears in itself the notion of several clearly discernible languages and suggests that, regardless of the sociolinguistic setting, language ideologies, social history, and context, a multilingual individual will be able to separate the various codes that constitute his or her communicative repertoire and use them deliberately, in a reflected way. Such a perspective on language is not helpful for understanding any sociolinguistic setting and linguistic practice that is not a European one and that does not correlate with the ideologies and practices of a standardized, national language. This applies to the majority of people living on the planet and to most people who speak African languages. These speakers differ from the ideological concept of the “Western monolingual,” as they employ diverse practices and linguistic features on a daily basis and do so in a very flexible way. Which linguistic features a person uses depends on factors such as socialization, placement, and personal interests, desires, and preferences, all of which are likely to change several times during a person’s life. Therefore, communicative repertoires are never stable, neither in their composition nor in the ways they are ideologically framed and evaluated. A more productive perspective on the phenomenon of complex communicative repertoires puts the concept of languaging at its center: languaging refers to communicative practices that operate dynamically across different practices and (multimodal) linguistic features. Individual speakers thereby perceive and evaluate ways of speaking according to the social meaning, emotional investment, and identity-constituting functions they can attribute to them. The fact that linguistic reflexivity for African speakers might almost always involve the negotiation of the self in a (post)colonial world invites a critical evaluation, based on approaches such as Southern Theory, of established concepts of “language” and “multilingualism”: languaging is also a postcolonial experience, and this experience often translates into how speakers single out specific ways of speaking as “more prestigious” or “more developed” than others. The inclusion of African metalinguistics and indigenous knowledge is consequently an important task for linguists studying communicative repertoires in Africa or its diaspora.
Article
Carola Trips
Morphological change refers to change(s) in the structure of words. Since morphology is interrelated with phonology, syntax, and semantics, changes affecting the structure and properties of words should be seen as changes at the respective interfaces of grammar.
On a more abstract level, this point relates to linguistic theory. Looking at the history of morphological theory, mainly from a generative perspective, it becomes evident that, despite a number of papers that have contributed to a better understanding of the role of morphology in grammar from both a synchronic and a diachronic point of view, it is still seen as a “Cinderella subject” today. There is thus still a need for further research in this area.
Generally, the field of diachronic morphology has been dealing with the identification of the main types of change, their mechanisms, as well as the causes of morphological change, the latter of which are traditionally categorized as internal and external change. Some authors take a more general view and state that the locus of change can be seen in the transmission of grammar from one generation to the next (abductive change). Concerning the main types of change, we can say that many of them occur at the interfaces with morphology: changes at the phonology–morphology interface like i-mutation, changes at the syntax–morphology interface like the rise of inflectional morphology, and changes at the semantics–morphology interface like the rise of derivational suffixes. Examples from the history of English (which in this article are sometimes complemented with examples from German and the Romance languages) illustrate that changes sometimes indeed cross component boundaries, at least once (in the course of its history, the linking-s in German has even become a prosodic phenomenon). Apart from these interface phenomena, it is common lore to assume morphology-internal changes, analogy being the most prominent example.
A phenomenon regularly discussed in the context of morphological change is grammaticalization. Some authors have posed the question of whether such special types of change really exist or whether they are, after all, general processes of change that should be modeled in a general theory of linguistic change. Apart from this pressing question, further aspects that need to be addressed in the future are the modularity of grammar and the place of morphology.
Article
Ur Shlonsky and Giuliano Bocci
Syntactic cartography emerged in the 1990s as a result of the growing consensus in the field about the central role played by functional elements and by morphosyntactic features in syntax. The declared aim of this research direction is to draw maps of the structures of syntactic constituents, characterize their functional structure, and study the array and hierarchy of syntactically relevant features. Syntactic cartography has made significant empirical discoveries, and its methodology has been very influential in research in comparative syntax and morphosyntax. A central theme in current cartographic research concerns the source of the emerging featural/structural hierarchies. The idea that the functional hierarchy is not a primitive of Universal Grammar but derives from other principles does not undermine the scientific relevance of the study of the cartographic structures. On the contrary, the cartographic research aims at providing empirical evidence that may help answer these questions about the source of the hierarchy and shed light on how the computational principles and requirements of the interface with sound and meaning interact.
Article
Masahiko Minami
Empirical and theoretical research on language has recently experienced a period of extensive growth. Unfortunately, however, in the case of the Japanese language, far fewer studies—particularly those written in English—have been presented on adult second language (L2) learners and bilingual children. As the field develops, it is increasingly important to integrate theoretical concepts and empirical research findings in second language acquisition (SLA) of Japanese, so that the concepts and research can eventually be applied to educational practice. This article attempts to: (a) address at least some of the gaps currently existing in the literature, (b) deal with important topics to the extent possible, and (c) discuss various problems with regard to adult learners of Japanese as an L2 and English–Japanese bilingual children. Specifically, the article first examines the characteristics of the Japanese language. Tracing the history of SLA studies, this article then deliberately touches on a wide spectrum of domains of linguistic knowledge (e.g., phonology and phonetics, morphology, lexicon, semantics, syntax, discourse), contexts of language use (e.g., interactive conversation, narrative), research orientations (e.g., formal linguistics, psycholinguistics, social psychology, sociolinguistics), and age groups (e.g., children, adults). Finally, by connecting past SLA research findings in English with recent concerns in the SLA of Japanese, focusing on the past 10 years and including work in corpus linguistics, this article provides the reader with an overview of the field of Japanese linguistics and its critical issues.
Article
D. H. Whalen
The Motor Theory of Speech Perception is a proposed explanation of the fundamental relationship between the way speech is produced and the way it is perceived. Associated primarily with the work of Liberman and colleagues, it posited the active participation of the motor system in the perception of speech. Early versions of the theory contained elements that later proved untenable, such as the expectation that the neural commands to the muscles (as seen in electromyography) would be more invariant than the acoustics. Support drawn from categorical perception (in which discrimination is quite poor within linguistic categories but excellent across boundaries) was called into question by studies showing means of improving within-category discrimination and finding similar results for nonspeech sounds and for animals perceiving speech. Evidence for motor involvement in perceptual processes nonetheless continued to accrue, and related motor theories have been proposed. Neurological and neuroimaging results have yielded a great deal of evidence consistent with variants of the theory, but they highlight the issue that there is no single “motor system,” and so different components appear in different contexts. Assigning the appropriate amount of effort to the various systems that interact to result in the perception of speech is an ongoing process, but it is clear that some of the systems will reflect the motor control of speech.
Article
Anne-Michelle Tessier
Phonological learnability deals with the formal properties of phonological languages and grammars, which are combined with algorithms that attempt to learn the language-specific aspects of those grammars. The classical learning task can be outlined as follows: Beginning at a predetermined initial state, the learner is exposed to positive evidence of legal strings and structures from the target language, and its goal is to reach a predetermined end state, where the grammar will produce or accept all and only the target language’s strings and structures. In addition, a phonological learner must also acquire a set of language-specific representations for morphemes, words and so on—and in many cases, the grammar and the representations must be acquired at the same time.
Phonological learnability research seeks to determine how the architecture of the grammar, and the workings of an associated learning algorithm, influence success in completing this learning task, i.e., in reaching the end-state grammar. One basic question is about convergence: Is the learning algorithm guaranteed to converge on an end-state grammar, or will it never stabilize? Is there a class of initial states, or a kind of learning data (evidence), which can prevent a learner from converging? Next is the question of success: Assuming the algorithm will reach an end state, will it match the target? In particular, will the learner ever acquire a grammar that deems grammatical a superset of the target language’s legal outputs? How can the learner avoid such superset end-state traps? Are learning biases advantageous or even crucial to success?
In assessing phonological learnability, the analyst also has many differences between potential learning algorithms to consider. At the core of any algorithm is its update rule, meaning its method(s) of changing the current grammar on the basis of evidence. Other key aspects of an algorithm include how it is triggered to learn, how it processes and/or stores the errors that it makes, and how it responds to noise or variability in the learning data. Ultimately, the choice of algorithm is also tied to the type of phonological grammar being learned, i.e., whether the generalizations being learned are couched within rules, features, parameters, constraints, rankings, and/or weightings.
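To make the notion of an update rule concrete, here is a minimal sketch of one possible error-driven learner over weighted constraints, loosely in the spirit of perceptron-style reweighting; the constraint names, violation profiles, and learning rate are invented for illustration and are not tied to any specific proposal discussed here.

```python
# Minimal sketch of an error-driven learner for a weighted-constraint grammar.
# Constraint names, weights, and data are toy values; a real phonological
# learner would operate over full candidate sets produced by GEN.

constraints = ["NoCoda", "Max"]          # markedness vs. faithfulness (toy)
weights = {"NoCoda": 2.0, "Max": 1.0}    # initial state favors the markedness constraint
rate = 0.5                               # learning rate

def cost(violations):
    """Weighted sum of violations; lower cost = preferred candidate."""
    return sum(weights[c] * v for c, v in violations.items())

def update(winner_viols, loser_viols):
    """On an error, shift weight away from constraints the observed winner
    violates and toward constraints the learner's wrong output violates."""
    for c in constraints:
        weights[c] += rate * (loser_viols[c] - winner_viols[c])
        weights[c] = max(weights[c], 0.0)   # keep weights non-negative

# One learning datum: the target language keeps codas, so /pat/ -> [pat].
winner = {"NoCoda": 1, "Max": 0}   # observed adult form [pat]
loser = {"NoCoda": 0, "Max": 1}    # learner's current (wrong) output [pa]

for step in range(10):
    if cost(loser) <= cost(winner):   # learner still prefers (or ties with) the wrong form
        update(winner, loser)
print(weights)   # Max now outweighs NoCoda, so codas survive
```

The choice of update rule (how much to shift, whether to stop at ties, how to handle noise) is exactly the kind of design decision described above.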
Article
Laura A. Michaelis
Meanings are assembled in various ways in a construction-based grammar, and this array can be represented as a continuum of idiomaticity, a gradient of lexical fixity. Constructional meanings are the meanings to be discovered at every point along the idiomaticity continuum. At the leftmost, or ‘fixed,’ extreme of this continuum are frozen idioms, like the salt of the earth and in the know. The set of frozen idioms includes those with idiosyncratic syntactic properties, like the fixed expression by and large (an exceptional pattern of coordination in which a preposition and adjective are conjoined). Other frozen idioms, like the unexceptionable modified noun red herring, feature syntax found elsewhere. At the rightmost, or ‘open’ end of this continuum are fully productive patterns, including the rule that licenses the string Kim blinked, known as the Subject-Predicate construction. Between these two poles are (a) lexically fixed idiomatic expressions, verb-headed and otherwise, with regular inflection, such as chew/chews/chewed the fat; (b) flexible expressions with invariant lexical fillers, including phrasal idioms like spill the beans and the Correlative Conditional, such as the more, the merrier; and (c) specialized syntactic patterns without lexical fillers, like the Conjunctive Conditional (e.g., One more remark like that and you’re out of here). Construction Grammar represents this range of expressions in a uniform way: whether phrasal or lexical, all are modeled as feature structures that specify phonological and morphological structure, meaning, use conditions, and relevant syntactic information (including syntactic category and combinatoric potential).
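As a schematic illustration of what it means to represent both a lexically filled idiom and a fully open pattern as feature structures in a uniform format, the sketch below uses plain Python dictionaries; the attribute names and values are invented stand-ins, not the attribute-value matrices of any particular Construction Grammar formalism.

```python
# Two constructions represented in the same (toy) feature-structure format.
# Attribute names (FORM, SYN, SEM, USE) are illustrative assumptions.

spill_the_beans = {
    "FORM": ["spill", "the", "beans"],          # fixed lexical fillers (verb inflects)
    "SYN": {"CAT": "VP", "VAL": ["NP[subj]"]},  # category and combinatoric potential
    "SEM": "reveal-secret(agent)",              # idiomatic, non-compositional meaning
    "USE": "informal register",                 # use conditions
}

subject_predicate = {
    "FORM": None,                               # fully open: no lexical fillers
    "SYN": {"CAT": "S", "DTRS": ["NP", "VP"]},  # licenses strings like 'Kim blinked'
    "SEM": "apply predicate to subject referent",
    "USE": None,
}

for cxn in (spill_the_beans, subject_predicate):
    print(cxn["SYN"]["CAT"], "->", cxn["SEM"])
```

The point of the uniform format is visible here: the idiom and the productive pattern differ only in how much of the FORM attribute is fixed, not in the kind of object that models them.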
Article
Nicholas Allott
Conversational implicatures (i) are implied by the speaker in making an utterance; (ii) are part of the content of the utterance, but (iii) do not contribute to direct (or explicit) utterance content; and (iv) are not encoded by the linguistic meaning of what has been uttered. In (1), Amelia asserts that she is on a diet, and implicates something different: that she is not having cake.
(1) Benjamin: Are you having some of this chocolate cake?
    Amelia: I’m on a diet.
Conversational implicatures are a subset of the implications of an utterance: namely those that are part of utterance content. Within the class of conversational implicatures, there are distinctions between particularized and generalized implicatures; implicated premises and implicated conclusions; and weak and strong implicatures.
An obvious question is how implicatures are possible: how can a speaker intentionally imply something that is not part of the linguistic meaning of the phrase she utters, and how can her addressee recover that utterance content? Working out what has been implicated is not a matter of deduction, but of inference to the best explanation. What is to be explained is why the speaker has uttered the words that she did, in the way and in the circumstances that she did.
Grice proposed that rational talk exchanges are cooperative and are therefore governed by a Cooperative Principle (CP) and conversational maxims: hearers can reasonably assume that rational speakers will attempt to cooperate and that rational cooperative speakers will try to make their contribution truthful, informative, relevant and clear, inter alia, and these expectations therefore guide the interpretation of utterances. On his view, since addressees can infer implicatures, speakers can take advantage of their ability, conveying implicatures by exploiting the maxims.
Grice’s theory aimed to show how implicatures could in principle arise. In contrast, work in linguistic pragmatics has attempted to model their actual derivation. Given the need for a cognitively tractable decision procedure, both the neo-Gricean school and work on communication in relevance theory propose a system with fewer principles than Grice’s. Neo-Gricean work attempts to reduce Grice’s array of maxims to just two (Horn) or three (Levinson), while Sperber and Wilson’s relevance theory rejects maxims and the CP and proposes that pragmatic inference hinges on a single communicative principle of relevance.
Conversational implicatures typically have a number of interesting properties, including calculability, cancelability, nondetachability, and indeterminacy. These properties can be used to investigate whether a putative implicature is correctly identified as such, although none of them provides a fail-safe test. A further test, embedding, has also been prominent in work on implicatures.
A number of phenomena that Grice treated as implicatures would now be treated by many as pragmatic enrichment contributing to the proposition expressed. But Grice’s postulation of implicatures was a crucial advance, both for its theoretical unification of apparently diverse types of utterance content and for the attention it drew to pragmatic inference and the division of labor between linguistic semantics and pragmatics in theorizing about verbal communication.
Article
Yarden Kedar
A fundamental question in epistemological philosophy is whether reason may be based on a priori knowledge—that is, knowledge that precedes and is independent of experience. In modern science, the concept of innateness has been associated with particular behaviors and types of knowledge, which supposedly have been present in the organism since birth (in fact, since fertilization)—prior to any sensory experience with the environment.
This line of investigation has been traditionally linked to two general types of qualities: the first consists of instinctive and inflexible reflexes, traits, and behaviors, which are apparent in survival, mating, and rearing activities. The other relates to language and cognition, with certain concepts, ideas, propositions, and particular ways of mental computation suggested to be part of one’s biological make-up. While both these types of innatism have a long history (e.g., in debates going back to Plato and Descartes), some bias appears to exist in favor of claims for inherent behavioral traits, which are typically accepted when satisfactory empirical evidence is provided. One famous example is Lorenz’s demonstration of imprinting, a natural phenomenon that obeys a predetermined mechanism and schedule (incubator-hatched goslings imprinted on Lorenz’s boots, the first moving object they encountered). Likewise, there seems to be little controversy in regard to predetermined ways of organizing sensory information, as is the case with the detection and classification of shapes and colors by the mind.
In contrast, the idea that certain types of abstract knowledge may be part of an organism’s biological endowment (i.e., not learned) is typically met with a greater sense of skepticism. The most influential and controversial claim for such innate knowledge in modern science is Chomsky’s nativist theory of Universal Grammar in language, which aims to define the extent to which human languages can vary; and the famous Argument from the Poverty of the Stimulus. The main Chomskyan hypothesis is that all human beings share a preprogrammed linguistic infrastructure consisting of a finite set of general principles, which can generate (through combination or transformation) an infinite number of (only) grammatical sentences. Thus, the innate grammatical system constrains and structures the acquisition and use of all natural languages.
Article
Bert Remijsen
When the phonological form of a morpheme—a unit of meaning that cannot be decomposed further into smaller units of meaning—involves a particular melodic pattern as part of its sound shape, this morpheme is specified for tone. In view of this definition, phrase- and utterance-level melodies—also known as intonation—are not to be interpreted as instances of tone. That is, whereas the question “Tomorrow?” may be uttered with a rising melody, this melody is not tone, because it is not a part of the lexical specification of the morpheme tomorrow. A language in which morphemes are specified for particular melodies is called a tone language. It is not the case that in a tone language every morpheme, content word, or syllable is specified for tone. Tonal specification can be highly restricted within the lexicon. Examples of such sparsely specified tone languages include Swedish, Japanese, and Ekagi (a language spoken in the Indonesian part of New Guinea); in these languages, only some syllables in some words are specified for tone. There are also tone languages where each and every syllable of each and every word has a specification. Vietnamese and Shilluk (a language spoken in South Sudan) illustrate this configuration. Tone languages also vary greatly in terms of the inventory of phonological tone forms. The smallest possible inventory contrasts one specification with the absence of specification. But there are also tone languages with eight or more distinctive tone categories. The physical (acoustic) realization of the tone categories is primarily fundamental frequency (F0), which is perceived as pitch. However, other phonetic correlates are often also involved, in particular voice quality. Tone plays a prominent role in the study of phonology because of its structural complexity. That is, in many languages, the way a tone surfaces is conditioned by factors such as the segmental composition of the morpheme, the tonal specifications of surrounding constituents, morphosyntax, and intonation. On top of this, tone is diachronically unstable. This means that, when a language has tone, we can expect to find considerable variation between dialects—more so than in other parts of the sound system.
Article
Marieke Woensdregt and Kenny Smith
Pragmatics is the branch of linguistics that deals with language use in context. It looks at the meaning linguistic utterances can have beyond their literal meaning (implicature), and also at presupposition and turn taking in conversation. Thus, pragmatics lies on the interface between language and social cognition.
From the point of view of both speaker and listener, doing pragmatics requires reasoning about the minds of others. For instance, a speaker has to think about what knowledge they share with the listener to choose what information to explicitly encode in their utterance and what to leave implicit. A listener has to make inferences about what the speaker meant based on the context, their knowledge about the speaker, and their knowledge of general conventions in language use. This ability to reason about the minds of others (usually referred to as “mindreading” or “theory of mind”) is a cognitive capacity that is uniquely developed in humans compared to other animals.
A central question is how pragmatics (and the underlying ability to make inferences about the minds of others) evolved. Biological evolution and cultural evolution are the two main processes that can lead to the development of a complex behavior over generations, and we can explore to what extent they account for what we know about pragmatics.
In biological evolution, changes happen as a result of natural selection on genetically transmitted traits. In cultural evolution, on the other hand, selection operates on skills that are transmitted through social learning. Many hypotheses have been put forward about the role that natural selection may have played in the evolution of social and communicative skills in humans (for example, as a result of changes in food sources, foraging strategy, or group size). The role of social learning and cumulative culture, however, has often been overlooked. This omission is particularly striking in the case of pragmatics, as language itself is a prime example of a culturally transmitted skill, and there is solid evidence that the pragmatic capacities that are so central to language use may themselves be partially shaped by social learning.
In light of empirical findings from comparative, developmental, and experimental research, we can consider the potential contributions of both biological and cultural evolutionary mechanisms to the evolution of pragmatics. The dynamics of these two types of evolutionary processes can also be explored using experiments and computational models.
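One widely used way of making such pragmatic inference computationally explicit is the Rational Speech Act (RSA) family of models; the sketch below is a standard toy example of that approach, not a model proposed in this article, and the lexicon and priors are invented values. It derives a scalar implicature by having a pragmatic listener reason about a speaker who reasons about a literal listener.

```python
import math

meanings = ["some-not-all", "all"]
utterances = ["some", "all"]
# Literal semantics: which utterances are true of which meanings (toy lexicon).
literal = {("some", "some-not-all"): 1, ("some", "all"): 1,
           ("all", "some-not-all"): 0, ("all", "all"): 1}
prior = {"some-not-all": 0.5, "all": 0.5}
alpha = 1.0   # speaker rationality parameter

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def L0(u):
    """Literal listener: condition the prior on literal truth."""
    return normalize({m: literal[(u, m)] * prior[m] for m in meanings})

def S1(m):
    """Pragmatic speaker: prefers utterances that are informative about m."""
    return normalize({u: (math.exp(alpha * math.log(L0(u)[m]))
                          if L0(u)[m] > 0 else 0.0) for u in utterances})

def L1(u):
    """Pragmatic listener: reasons about why the speaker chose u."""
    return normalize({m: S1(m)[u] * prior[m] for m in meanings})

print(L1("some"))   # 'some' is interpreted as probably 'some-not-all'
```

Running this gives the pragmatic listener roughly a 3-to-1 preference for the 'some-not-all' reading of 'some', the kind of enriched interpretation that mindreading-based accounts of pragmatics aim to explain.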
Article
Kodi Weatherholtz and T. Florian Jaeger
The seeming ease with which we usually understand each other belies the complexity of the processes that underlie speech perception. One of the biggest computational challenges is that different talkers realize the same speech categories (e.g., /p/) in physically different ways. We review the mixture of processes that enable robust speech understanding across talkers despite this lack of invariance. These processes range from automatic pre-speech adjustments of the distribution of energy over acoustic frequencies (normalization) to implicit statistical learning of talker-specific properties (adaptation, perceptual recalibration) to the generalization of these patterns across groups of talkers (e.g., gender differences).
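As one concrete stand-in for the normalization processes mentioned above, the sketch below applies per-talker z-scoring of formant values (Lobanov-style normalization); the formant numbers are invented toy values, and this is only one of several normalization schemes discussed in the literature.

```python
import statistics

def lobanov(formant_values):
    """Map a talker's raw formant measurements (Hz) onto z-scores so that
    vowels from talkers with different vocal-tract sizes become comparable."""
    mean = statistics.mean(formant_values)
    sd = statistics.stdev(formant_values)
    return [(f - mean) / sd for f in formant_values]

# Toy F1 values for the 'same' four vowels from two different talkers.
talker_a_f1 = [300, 500, 700, 850]
talker_b_f1 = [390, 650, 910, 1100]

print([round(z, 2) for z in lobanov(talker_a_f1)])
print([round(z, 2) for z in lobanov(talker_b_f1)])
# After normalization the two talkers' values line up closely, even though
# the raw frequencies differ substantially.
```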
Article
Birgit Alber and Sabine Arndt-Lappe
Work on the relationship between morphology and metrical structure has mainly addressed three questions:
1. How does morphological constituent structure map onto prosodic constituent structure, i.e., the structure that is responsible for metrical organization?
2. What are the reflexes of morphological relations between complex words and their bases in metrical structure?
3. How variable or categorical are metrical alternations?
The focus of the work addressing question 1 has been on establishing prosodic constituency, with supporting evidence from morphological constituency. Pertinent prosodic constituents are the prosodic (or phonological) word, the metrical foot, the syllable, and the mora (Selkirk, 1980). For example, the phonological behavior of certain affixes has been used to argue that they are word-internal prosodic words, which means that prosodic words may be recursive structures (e.g., Aronoff & Sridhar, 1987). Similarly, the shape of truncated words has been used as evidence for the shape of the metrical foot (cf., e.g., Alber & Arndt-Lappe, 2012).
Question 2 considers morphologically conditioned metrical alternations. Stress alternations have received particular attention. Affixation processes differ in whether or not they exhibit stress alternations. Affixes that trigger stress alternations are commonly referred to as 'stress-shifting' affixes, those that do not are referred to as 'stress-preserving' affixes. The fact that morphological categories differ in their stress behavior has figured prominently in theoretical debates about the phonology-morphology interface, in particular between accounts that assume a stratal architecture with interleaving phonology-morphology modules (such as lexical phonology, esp. Kiparsky, 1982, 1985) and those that assume that morphological categories come with their own phonologies (e.g., Inkelas, Orgun, & Zoll, 1997; Inkelas & Zoll, 2007; Orgun, 1996).
Question 3 looks at metrical variation and its relation to the processing of morphologically complex words. There is a growing body of recent empirical work showing that some metrical alternations seem variable (e.g., Collie, 2008; Dabouis, 2019). This means that different stress patterns occur within a single morphological category. Theoretical explanations of the phenomenon vary depending on the framework adopted. However, what unites pertinent research seems to be that the variation is codetermined by measures that are usually associated with lexical storage. These are semantic transparency, productivity, and measures of lexical frequency.
Article
Kazutaka Kurisu
This article discusses several important phonological issues concerning subtractive processes in morphology. First, it addresses the scope of subtractive processes that linguistic theories should be concerned with; many subtractive processes fall within the realm of grammatical theories. Subsequently, previous processual and affixal approaches to subtractive morphology and nonconcatenative allomorphy are reviewed. Then, theoretical restrictiveness is taken up. Proponents of the affixal view often claim that it is more restrictive than the processual view, but their argument is not convincing; we do not yet know enough to adjudicate questions of theoretical restrictiveness. Finally, earlier analyses of subtractive morphology in parallel and serial Optimality Theory are reviewed. Not enough has been accomplished in this respect, so no conclusive choice between parallelism and serialism is possible at present. Overall, too many matters remain unsettled to draw firm conclusions about the nature of subtractive processes in morphology.
Article
Dirk Geeraerts
Lexical semantics is the study of word meaning. Descriptively speaking, the main topics studied within lexical semantics involve either the internal semantic structure of words or the semantic relations that occur within the vocabulary. Within the first set, major phenomena include polysemy (in contrast with vagueness), metonymy, metaphor, and prototypicality. Within the second set, dominant topics include lexical fields, lexical relations, conceptual metaphor and metonymy, and frames. Theoretically speaking, the main theoretical approaches that have succeeded each other in the history of lexical semantics are prestructuralist historical semantics, structuralist semantics, and cognitive semantics. These theoretical frameworks differ as to whether they take a system-oriented or a usage-oriented approach to word-meaning research, but, at the same time, in the historical development of the discipline, they have each contributed significantly to the descriptive and conceptual apparatus of lexical semantics.
Article
Jordan Kodner
A computational learner needs three things: data to learn from, a class of representations to acquire, and a way to get from one to the other. Language acquisition is a very particular learning setting that can be defined in terms of the input (the child’s early linguistic experience) and the output (a grammar capable of generating a language very similar to the input). The input is infamously impoverished. As it relates to morphology, the vast majority of potential forms are never attested in the input, and those that are attested follow an extremely skewed frequency distribution. Learners nevertheless manage to acquire most details of their native morphologies after only a few years of input. That said, acquisition is neither instantaneous nor error-free. Children do make mistakes, and they do so in predictable ways which provide insights into their grammars and learning processes.
The most elucidating computational model of morphology learning, from the perspective of a linguist, is one that learns morphology like a child does, that is, on child-like input and along a child-like developmental path. This article focuses on clarifying those aspects of morphology acquisition that should go into such a model. Section 1 describes the input with a focus on child-directed speech corpora and input sparsity. Section 2 discusses representations with focuses on productivity, developmental paths, and formal learnability. Section 3 surveys the range of learning tasks that guide research in computational linguistics and NLP with special focus on how they relate to the acquisition setting. The conclusion in Section 4 presents a summary of morphology acquisition as a learning problem, with Table 4 highlighting the key takeaways of this article.
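The input sparsity described above can be illustrated with a toy simulation: sample word tokens from a Zipf-like frequency distribution over a hypothetical paradigm of 1,000 distinct inflected forms and count how many forms the learner would ever encounter. All numbers here are invented for illustration.

```python
import random

random.seed(1)
n_forms = 1000
weights = [1 / rank for rank in range(1, n_forms + 1)]   # Zipf: p(rank) proportional to 1/rank

def n_attested(n_tokens):
    """Number of distinct forms observed in a sample of n_tokens tokens."""
    sample = random.choices(range(n_forms), weights=weights, k=n_tokens)
    return len(set(sample))

for n_tokens in (1_000, 10_000, 100_000):
    print(n_tokens, "tokens ->", n_attested(n_tokens), "of", n_forms, "forms attested")
# Even large samples leave many possible forms unattested, and the attested
# forms are dominated by a handful of high-frequency types.
```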
Article
Arto Anttila
Language is a system that maps meanings to forms, but the mapping is not always one-to-one. Variation means that one meaning corresponds to multiple forms, for example faster ~ more fast. The choice is not uniquely determined by the rules of the language, but is made by the individual at the time of performance (speaking, writing). Such choices abound in human language. They are usually not just a matter of free will, but involve preferences that depend on the context, including the phonological context. Phonological variation is a situation where the choice among expressions is phonologically conditioned, sometimes statistically, sometimes categorically. In this overview, we take a look at three studies of variable vowel harmony in three languages (Finnish, Hungarian, and Tommo So) formulated in three frameworks (Partial Order Optimality Theory, Stochastic Optimality Theory, and Maximum Entropy Grammar). For example, both Finnish and Hungarian have Backness Harmony: vowels must be all [+back] or all [−back] within a single word, with the exception of neutral vowels that are compatible with either. Surprisingly, some stems allow both [+back] and [−back] suffixes in free variation, for example, analyysi-na ~ analyysi-nä ‘analysis-ess’ (Finnish) and arzén-nak ~ arzén-nek ‘arsenic-dat’ (Hungarian). Several questions arise. Is the variation random or in some way systematic? Where is the variation possible? Is it limited to specific lexical items? Is the choice predictable to some extent? Are the observed statistical patterns dictated by universal constraints or learned from the ambient data? The analyses illustrate the usefulness of recent advances in the technological infrastructure of linguistics, in particular the constantly improving computational tools.
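To give a concrete sense of one of the frameworks mentioned, the sketch below shows how a Maximum Entropy grammar turns weighted constraint violations into output probabilities; the constraint names, weights, and violation profiles are invented toy values, not the actual analysis of Finnish harmony.

```python
import math

# Toy Maximum Entropy grammar: each candidate's score is the weighted sum of
# its constraint violations, and probabilities come from a softmax over the
# negated scores. Names, weights, and violations are illustrative only.

weights = {"Agree[back]": 2.0, "Ident[back]": 1.5}

candidates = {
    "analyysi-na": {"Agree[back]": 1, "Ident[back]": 0},
    "analyysi-nä": {"Agree[back]": 0, "Ident[back]": 1},
}

def maxent_probs(cands, w):
    scores = {c: math.exp(-sum(w[k] * v for k, v in viols.items()))
              for c, viols in cands.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

print(maxent_probs(candidates, weights))
# Both suffix variants receive non-zero probability, mirroring the free
# variation reported for stems like 'analyysi'; learning amounts to fitting
# the weights to the observed frequencies.
```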
Article
Jacques Durand
Corpus Phonology is an approach to phonology that places corpora at the center of phonological research. Some practitioners of corpus phonology see corpora as the only object of investigation; others use corpora alongside other available techniques (for instance, intuitions, psycholinguistic and neurolinguistic experimentation, laboratory phonology, the study of the acquisition of phonology or of language pathology, etc.). Whatever version of corpus phonology one advocates, corpora have become part and parcel of the modern research environment, and their construction and exploitation have been modified by the multidisciplinary advances made within various fields. Indeed, for the study of spoken usage, the term ‘corpus’ should nowadays only be applied to bodies of data meeting certain technical requirements, even though corpora of spoken usage are by no means new and their origins coincide with the birth of recording techniques. It is therefore essential to understand what criteria must be met by a modern corpus (quality of recordings, diversity of speech situations, ethical guidelines, time-alignment with transcriptions and annotations, etc.) and what tools are available to researchers. Once these requirements are met, the way is open to varying and possibly conflicting uses of spoken corpora by phonological practitioners. A traditional stance in theoretical phonology sees the data as a degenerate version of a more abstract underlying system, but more and more researchers within various frameworks (e.g., usage-based approaches, exemplar models, stochastic Optimality Theory, sociophonetics) are constructing models that tightly bind phonological competence to language use, rely heavily on quantitative information, and attempt to account for intra-speaker and inter-speaker variation. This renders corpora essential to phonological research, not a mere adjunct to the phonological description of the languages of the world.
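As a minimal illustration of what time-alignment buys the researcher, the sketch below represents transcribed tokens with start and end times and runs a simple duration query over them; the data structure and values are invented, and real corpora rely on dedicated formats and tools (e.g., Praat TextGrids and their query software).

```python
# Toy time-aligned annotation: each word token lists its phones with start and
# end times (in seconds), so phonological queries can be checked against the
# recording. Values are invented for illustration.

tokens = [
    {"word": "petit", "phones": [("p", 0.10, 0.16), ("e", 0.16, 0.21),
                                 ("t", 0.21, 0.27), ("i", 0.27, 0.35)]},
    {"word": "ami",   "phones": [("a", 0.35, 0.43), ("m", 0.43, 0.50),
                                 ("i", 0.50, 0.61)]},
]

# Example query: all realizations of /i/ and their durations.
for tok in tokens:
    for phone, start, end in tok["phones"]:
        if phone == "i":
            print(tok["word"], phone, round(end - start, 2))
```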