A computational learner needs three things: data to learn from, a class of representations to acquire, and a way to get from one to the other. Language acquisition is a very particular learning setting that can be defined in terms of the input (the child’s early linguistic experience) and the output (a grammar capable of generating a language very similar to the input). The input is infamously impoverished. As it relates to morphology, the vast majority of potential forms are never attested in the input, and those that are attested follow an extremely skewed frequency distribution. Learners nevertheless manage to acquire most details of their native morphologies after only a few years of input. That said, acquisition is neither instantaneous nor error-free. Children do make mistakes, and they do so in predictable ways that provide insights into their grammars and learning processes.
From the perspective of a linguist, the most elucidating computational model of morphology learning is one that learns morphology like a child does, that is, on child-like input and along a child-like developmental path. This article focuses on clarifying those aspects of morphology acquisition that should go into such a model. Section 1 describes the input, with a focus on child-directed speech corpora and input sparsity. Section 2 discusses representations, with a focus on productivity, developmental paths, and formal learnability. Section 3 surveys the range of learning tasks that guide research in computational linguistics and NLP, with special attention to how they relate to the acquisition setting. The conclusion in Section 4 presents a summary of morphology acquisition as a learning problem, with Table 4 highlighting the key takeaways of this article.
Article
Computational Models of Morphological Learning
Jordan Kodner
Article
Computational Phonology
Jane Chandlee and Jeffrey Heinz
Computational phonology studies the nature of the computations necessary and sufficient for characterizing phonological knowledge. As a field it is informed by the theories of computation and phonology.
The computational nature of phonological knowledge is important because at a fundamental level it is about the psychological nature of memory as it pertains to phonological knowledge. Different types of phonological knowledge can be characterized as computational problems, and the solutions to these problems reveal their computational nature. In contrast to syntactic knowledge, there is clear evidence that phonological knowledge is computationally bounded to the so-called regular classes of sets and relations. These classes have multiple mathematical characterizations in terms of logic, automata, and algebra with significant implications for the nature of memory. In fact, there is evidence that phonological knowledge is bounded by particular subregular classes, with more restrictive logical, automata-theoretic, and algebraic characterizations, and thus by weaker models of memory.
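To give a concrete flavor of what a subregular characterization can look like, the following minimal sketch implements a strictly 2-local grammar: phonotactic knowledge reduced to a finite set of forbidden adjacent pairs. The segments and constraints are invented for illustration and are not drawn from the article.

```python
# A minimal sketch (illustrative, not from the article) of a strictly
# 2-local grammar, one of the subregular classes discussed above.
# The grammar is a finite set of forbidden adjacent pairs ("2-factors");
# a word is well-formed iff it contains none of them.

FORBIDDEN = {
    ("n", "b"),   # hypothetical ban on nasal + voiced-stop clusters
    ("a", "a"),   # hypothetical ban on adjacent identical vowels
    ("z", "#"),   # hypothetical ban on word-final [z] ("#" marks a word edge)
}

def well_formed(word: str) -> bool:
    """Check a word against the strictly 2-local grammar."""
    padded = "#" + word + "#"          # mark word edges
    bigrams = zip(padded, padded[1:])  # all adjacent pairs
    return all(pair not in FORBIDDEN for pair in bigrams)

print(well_formed("banda"))  # True
print(well_formed("kanba"))  # False: contains the forbidden pair ("n", "b")
print(well_formed("kaz"))    # False: ends in [z]
```

Because such a checker only ever needs to remember the immediately preceding segment, it illustrates why strictly local knowledge implies a weaker model of memory than arbitrary regular languages.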
Article
Computational Semantics
Katrin Erk
Computational semantics performs automatic meaning analysis of natural language. Research in computational semantics designs meaning representations and develops mechanisms for automatically assigning those representations and reasoning over them. Computational semantics is not a single monolithic task but consists of many subtasks, including word sense disambiguation, multi-word expression analysis, semantic role labeling, the construction of sentence semantic structure, coreference resolution, and the automatic induction of semantic information from data.
The development of manually constructed resources has been vastly important in driving the field forward. Examples include WordNet, PropBank, FrameNet, VerbNet, and TimeBank. These resources specify the linguistic structures to be targeted in automatic analysis, and they provide high-quality human-generated data that can be used to train machine learning systems. Supervised machine learning based on manually constructed resources is a widely used technique.
A second core strand has been the induction of lexical knowledge from text data. For example, words can be represented through the contexts in which they appear (called distributional vectors or embeddings), such that semantically similar words have similar representations. Or semantic relations between words can be inferred from patterns of words that link them. Wide-coverage semantic analysis always needs more data, both lexical knowledge and world knowledge, and automatic induction at least alleviates the problem.
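To make the idea of distributional vectors concrete, here is a minimal count-based sketch; the toy corpus, window size, and similarity measure are illustrative choices, not claims about any particular system.

```python
# A minimal count-based sketch of distributional word vectors:
# each word is represented by the counts of words occurring near it,
# and similarity is the cosine of the angle between two such vectors.
from collections import Counter, defaultdict
from math import sqrt

toy_corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the mouse ate the cheese",
    "the dog ate the bone",
]

window = 2                       # context window size (illustrative)
vectors = defaultdict(Counter)   # word -> Counter of context words

for sentence in toy_corpus:
    tokens = sentence.split()
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                vectors[w][tokens[j]] += 1

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Words appearing in similar contexts end up with similar vectors.
print(cosine(vectors["cat"], vectors["dog"]))     # relatively high
print(cosine(vectors["cat"], vectors["cheese"]))  # lower
```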
Compositionality is a third core theme: the systematic construction of structural meaning representations of larger expressions from the meaning representations of their parts. The representations typically use logics of varying expressivity, which makes them well suited to performing automatic inferences with theorem provers.
Manual specification and automatic acquisition of knowledge are closely intertwined. Manually created resources are automatically extended or merged. The automatic induction of semantic information is guided and constrained by manually specified information, which is much more reliable. And for restricted domains, the construction of logical representations is learned from data.
It is at the intersection of manual specification and machine learning that some of the current larger questions of computational semantics are located. For instance, should we build general-purpose semantic representations, or is lexical knowledge simply too domain-specific, and would we be better off learning task-specific representations every time? When performing inference, is it more beneficial to have the solid ground of a human-generated ontology, or is it better to reason directly with text snippets for more fine-grained and gradual inference? Do we obtain a better and deeper semantic analysis as we use better and deeper manually specified linguistic knowledge, or is the future in powerful learning paradigms that learn to carry out an entire task from natural language input and output alone, without pre-specified linguistic knowledge?
Article
Computer-Based Tools for Word and Paradigm Computational Morphology
Raphael Finkel
The Word and Paradigm approach to morphology associates lexemes with tables of surface forms for different morphosyntactic property sets. Researchers express their realizational theories, which show how to derive these surface forms, using formalisms such as Network Morphology and Paradigm Function Morphology. The tables of surface forms also lend themselves to a study of the implicative theories, which infer the realizations in some cells of the inflectional system from the realizations of other cells.
There is an art to building realizational theories. First, the theories should be correct, that is, they should generate the right surface forms. Second, they should be elegant, which is much harder to capture, but includes the desiderata of simplicity and expressiveness. Without software to test a realizational theory, it is easy to sacrifice correctness for elegance. Therefore, software that takes a realizational theory and generates surface forms is an essential part of any theorist’s toolbox.
Discovering implicative rules that connect the cells in an inflectional system is often quite difficult. Some rules are immediately apparent, but others can be subtle. Software that automatically analyzes an entire table of surface forms for many lexemes can help automate the discovery process.
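As a rough indication of what such automated analysis can look like, the sketch below scans a toy paradigm table for exceptionless implications of the form “if cell A ends in X, then cell B ends in Y”; the lexemes, cells, and suffix-based analysis are all invented for illustration.

```python
# A minimal sketch of implicative-rule discovery over a table of surface
# forms: for each ordered pair of cells, collect the attested
# (ending in A -> ending in B) pairings and report those without exception.
# The forms are invented, and real exponents are rarely just "the last two
# letters."
from collections import defaultdict
from itertools import permutations

paradigms = {
    "lexeme1": {"nom.sg": "lartas", "gen.sg": "lartos", "nom.pl": "lartai"},
    "lexeme2": {"nom.sg": "minkas", "gen.sg": "minkos", "nom.pl": "minkai"},
    "lexeme3": {"nom.sg": "seftis", "gen.sg": "seftes", "nom.pl": "seftys"},
}

def ending(form: str, n: int = 2) -> str:
    """Crude stand-in for a real exponent analysis: the last n letters."""
    return form[-n:]

cells = ["nom.sg", "gen.sg", "nom.pl"]
for cell_a, cell_b in permutations(cells, 2):
    mapping = defaultdict(set)
    for forms in paradigms.values():
        mapping[ending(forms[cell_a])].add(ending(forms[cell_b]))
    for end_a, ends_b in sorted(mapping.items()):
        if len(ends_b) == 1:  # exceptionless in this toy table
            print(f"{cell_a} in -{end_a}  =>  {cell_b} in -{next(iter(ends_b))}")
```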
Researchers can use Web-based computerized tools to test their realizational theories and to discover implicative rules.
Article
Discriminative Learning and the Lexicon: NDL and LDL
Yu-Ying Chuang and R. Harald Baayen
Naive discriminative learning (NDL) and linear discriminative learning (LDL) are simple computational algorithms for lexical learning and lexical processing. Both NDL and LDL assume that learning is discriminative, driven by prediction error, and that it is this error that calibrates the association strength between input and output representations. Both words’ forms and their meanings are represented by numeric vectors, and mappings between forms and meanings are set up. For comprehension, form vectors predict meaning vectors. For production, meaning vectors map onto form vectors. These mappings can be learned incrementally, approximating how children learn the words of their language. Alternatively, optimal mappings representing the end state of learning can be estimated. The NDL and LDL algorithms are incorporated in a computational theory of the mental lexicon, the ‘discriminative lexicon’. The model shows good performance both with respect to production and comprehension accuracy, and for predicting aspects of lexical processing, including morphological processing, across a wide range of experiments. Since, mathematically, NDL and LDL implement multivariate multiple regression, the ‘discriminative lexicon’ provides a cognitively motivated statistical modeling approach to lexical processing.
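Because NDL and LDL mathematically implement multivariate multiple regression, the mapping between form and meaning vectors can be sketched as a least-squares problem; the tiny binary form and meaning matrices below are invented for illustration and are not the authors’ implementation.

```python
# A minimal numerical sketch of an LDL-style comprehension mapping:
# estimate a linear transformation F such that C @ F approximates S, where
# rows of C are word-form vectors and rows of S are meaning vectors.
# The tiny binary vectors are invented for illustration.
import numpy as np

# Form matrix C: each row codes the presence of (invented) form cues.
C = np.array([
    [1, 0, 0, 1],   # "walk"
    [1, 0, 1, 0],   # "walked"
    [0, 1, 0, 1],   # "jump"
    [0, 1, 1, 0],   # "jumped"
], dtype=float)

# Semantic matrix S: each row codes (invented) meaning dimensions,
# here WALK, JUMP, and PAST.
S = np.array([
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
], dtype=float)

# Optimal (end-state) comprehension mapping: multivariate multiple
# regression, i.e., a least-squares solution.
F, *_ = np.linalg.lstsq(C, S, rcond=None)
print(np.round(C @ F, 2))   # predicted meaning vectors for each form

# The production mapping goes the other way: estimate G with S @ G ~ C.
G, *_ = np.linalg.lstsq(S, C, rcond=None)
print(np.round(S @ G, 2))   # predicted form vectors for each meaning
```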
Article
Game Theory in Pragmatics: Evolution, Rationality, and Reasoning
Michael Franke
Game theory provides formal means of representing and explaining action choices in social decision situations where the choices of one participant depend on the choices of another. Game theoretic pragmatics approaches language production and interpretation as a game in this sense. Patterns in language use are explained as optimal, rational, or at least nearly optimal or rational solutions to a communication problem. Three intimately related perspectives on game theoretic pragmatics are sketched here: (i) the evolutionary perspective explains language use as the outcome of some optimization process, (ii) the rationalistic perspective pictures language use as a form of rational decision-making, and (iii) the probabilistic reasoning perspective considers specifically speakers’ and listeners’ beliefs about each other. There are clear commonalities behind these three perspectives, and they may in practice blend into each other.
At the heart of game theoretic pragmatics lies the idea that speaker and listener behavior, when it comes to using a language with a given semantic meaning, are attuned to each other. By focusing on the evolutionary or rationalistic perspective, we can then give a functional account of general patterns in our pragmatic language use. The probabilistic reasoning perspective invites modeling actual speaker and listener behavior, for example, as it shows in quantitative aspects of experimental data.
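One common way to cash out the probabilistic reasoning perspective is a rational-speech-act style model of nested speaker and listener reasoning; the sketch below uses that style with an invented two-message scalar example and is not necessarily the formalization the article adopts.

```python
# A minimal sketch of probabilistic speaker/listener reasoning about each
# other, in the spirit of rational-speech-act style models (an illustrative
# choice, not necessarily the article's formalization).
import numpy as np

meanings = ["some-but-not-all", "all"]
messages = ["some", "all"]

# Literal semantics: truth of each message (row) for each meaning (column).
truth = np.array([
    [1.0, 1.0],   # "some" is true of both meanings
    [0.0, 1.0],   # "all" is true only of the "all" meaning
])

prior = np.array([0.5, 0.5])   # flat prior over meanings (assumption)
alpha = 4.0                    # speaker rationality parameter (assumption)

# Literal listener: P(meaning | message) proportional to truth * prior.
literal = truth * prior
literal = literal / literal.sum(axis=1, keepdims=True)

# Pragmatic speaker: P(message | meaning) proportional to
# exp(alpha * log P_literal(meaning | message)).
with np.errstate(divide="ignore"):
    speaker = np.exp(alpha * np.log(literal))
speaker = speaker / speaker.sum(axis=0, keepdims=True)

# Pragmatic listener: P(meaning | message) proportional to speaker * prior.
pragmatic = speaker * prior
pragmatic = pragmatic / pragmatic.sum(axis=1, keepdims=True)

# The pragmatic listener strengthens "some" toward "some-but-not-all".
print(np.round(pragmatic, 2))
```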
Article
Generative Grammar
Knut Tarald Taraldsen
This article presents different types of generative grammar that can be used as models of natural languages, focusing on a small subset of all the systems that have been devised. The central idea behind generative grammar may be rendered in the words of Richard Montague: “I reject the contention that an important theoretical difference exists between formal and natural languages” (“Universal Grammar,” Theoria, 36 [1970], 373–398).
Article
Learnability and Learning Algorithms in Phonology
Anne-Michelle Tessier
Phonological learnability deals with the formal properties of phonological languages and grammars, which are combined with algorithms that attempt to learn the language-specific aspects of those grammars. The classical learning task can be outlined as follows: Beginning at a predetermined initial state, the learner is exposed to positive evidence of legal strings and structures from the target language, and its goal is to reach a predetermined end state, where the grammar will produce or accept all and only the target language’s strings and structures. In addition, a phonological learner must also acquire a set of language-specific representations for morphemes, words and so on—and in many cases, the grammar and the representations must be acquired at the same time.
Phonological learnability research seeks to determine how the architecture of the grammar, and the workings of an associated learning algorithm, influence success in completing this learning task, i.e., in reaching the end-state grammar. One basic question is about convergence: Is the learning algorithm guaranteed to converge on an end-state grammar, or will it never stabilize? Is there a class of initial states, or a kind of learning data (evidence), which can prevent a learner from converging? Next is the question of success: Assuming the algorithm will reach an end state, will it match the target? In particular, will the learner ever acquire a grammar that deems grammatical a superset of the target language’s legal outputs? How can the learner avoid such superset end-state traps? Are learning biases advantageous or even crucial to success?
In assessing phonological learnability, the analyst must also consider the many differences between potential learning algorithms. At the core of any algorithm is its update rule, meaning its method(s) of changing the current grammar on the basis of evidence. Other key aspects of an algorithm include how it is triggered to learn, how it processes and/or stores the errors that it makes, and how it responds to noise or variability in the learning data. Ultimately, the choice of algorithm is also tied to the type of phonological grammar being learned, i.e., whether the generalizations being learned are couched within rules, features, parameters, constraints, rankings, and/or weightings.
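To make the notion of an update rule concrete, here is a schematic error-driven reweighting step in the spirit of perceptron-style harmonic-grammar learners; the constraints, candidates, initial weights, and learning rate are all invented, and actual proposals differ in many details.

```python
# A schematic error-driven update rule for weighted constraints: when the
# learner's current grammar picks the wrong output, constraint weights are
# nudged toward favoring the observed (target) form. All names and numbers
# below are invented for illustration.

weights = {"NoCoda": 2.0, "Max": 1.0}   # hypothetical initial state
rate = 0.5                              # hypothetical learning rate

def harmony(violations, weights):
    """Lower (more negative) harmony = worse candidate."""
    return -sum(weights[c] * v for c, v in violations.items())

def update(loser_violations, winner_violations, weights, rate):
    """Nudge weights toward preferring the observed (winner) form."""
    for c in weights:
        weights[c] += rate * (loser_violations[c] - winner_violations[c])
        weights[c] = max(weights[c], 0.0)   # keep weights non-negative

# One learning datum: the target language keeps codas, so /pat/ -> [pat].
candidates = {
    "pat": {"NoCoda": 1, "Max": 0},   # faithful candidate (the observed form)
    "pa":  {"NoCoda": 0, "Max": 1},   # deletion candidate
}

for observed in ["pat"] * 5:   # repeated exposure to the same datum
    predicted = max(candidates, key=lambda c: harmony(candidates[c], weights))
    if predicted != observed:  # error: the grammar picked the wrong form
        update(candidates[predicted], candidates[observed], weights, rate)

print(weights)   # after the error-driven update, NoCoda no longer outweighs Max
```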
Article
Models of Human Sentence Comprehension in Computational Psycholinguistics
John Hale
Computational models of human sentence comprehension help researchers reason about how grammar might actually be used in the understanding process. Taking a cognitivist approach, this article relates computational psycholinguistics to neighboring fields (such as linguistics), surveys important precedents, and catalogs open problems.
Article
Morphology and Phonotactics
Maria Gouskova
Phonotactics is the study of restrictions on possible sound sequences in a language. In any language, some phonotactic constraints can be stated without reference to morphology, but many of the more nuanced phonotactic generalizations do make use of morphosyntactic and lexical information. At the most basic level, many languages mark edges of words in some phonological way. Different phonotactic constraints hold of sounds that belong to the same morpheme as opposed to sounds that are separated by a morpheme boundary. Different phonotactic constraints may apply to morphemes of different types (such as roots versus affixes). There are also correlations between the phonotactic shapes of morphemes and whether they follow certain morphosyntactic and phonological rules, and such shapes may correlate with syntactic category, declension class, or etymological origin.
Approaches to the interaction between phonotactics and morphology address two questions: (1) how to account for rules that are sensitive to morpheme boundaries and structure, and (2) how to determine the status of phonotactic constraints associated with only some morphemes. Theories differ as to how much morphological information phonology is allowed to access. In some theories of phonology, any reference to the specific identities or subclasses of morphemes would exclude a rule from the domain of phonology proper. These rules are either part of the morphology or are not given the status of a rule at all. Other theories allow the phonological grammar to refer to detailed morphological and lexical information. Depending on the theory, phonotactic differences between morphemes may receive direct explanations or be seen as the residue of historical change and not something that constitutes grammatical knowledge in the speaker’s mind.
Article
Network Morphology
Andrew Hippisley
The morphological machinery of a language is at the service of syntax, but the service can be poor. A request may result in the wrong item (deponency), or in an item the syntax already has (syncretism), or in an abundance of choices (inflectional classes or morphological allomorphy). Network Morphology regulates the service by recreating the morphosyntactic space as a network of information sharing nodes, where sharing is through inheritance, and inheritance can be overridden to allow for the regular, irregular, and, crucially, the semiregular. The network expresses the system; the way the network can be accessed expresses possible deviations from the systematic. And so Network Morphology captures the semi-systematic nature of morphology. The key data used to illustrate Network Morphology are noun inflections in the West Slavonic language Lower Sorbian, which has three genders, a rich case system and three numbers. These data allow us to observe how Network Morphology handles inflectional allomorphy, syncretism, feature neutralization, and irregularity. Latin deponent verbs are used to illustrate a Network Morphology account of morphological mismatch, where morphosyntactic features used in the syntax are expressed by morphology regularly used for different features. The analysis points to a separation of syntax and morphology in the architecture of the grammar. An account is given of Russian nominal derivation which assumes such a separation, and is based on viewing derivational morphology as lexical relatedness. Areas of the framework receiving special focus include default inheritance, global and local inheritance, default inference, and orthogonal multiple inheritance. The various accounts presented are expressed in the lexical knowledge representation language DATR, due to Roger Evans and Gerald Gazdar.
Article
New Computational Methods and the Study of the Romance Languages
Basilio Calderone and Vito Pirrelli
Nowadays, computer models of human language are instrumental to millions of people, who use them every day with little if any awareness of their existence and role. Their exponential development has had a huge impact on daily life through practical applications like machine translation or automated dialogue systems. It has also deeply affected the way we think about language as an object of scientific inquiry. Computer modeling of Romance languages has helped scholars develop new theoretical frameworks and new ways of looking at traditional approaches. In particular, computer modeling of lexical phenomena has had a profound influence on some fundamental issues in human language processing, such as the purported dichotomy between rules and exceptions, or grammar and lexicon, the inherently probabilistic nature of speakers’ perception of analogy and word internal structure, and their ability to generalize to novel items from attested evidence. Although it is probably premature to anticipate and assess the prospects of these models, their current impact on language research can hardly be overestimated. In a few years, data-driven assessment of theoretical models is expected to play an irreplaceable role in pacing progress in all branches of language sciences, from typological and pragmatic approaches to cognitive and formal ones.
Article
Psycholinguistic Methods and Tasks in Morphology
Daniel Schmidtke and Victor Kuperman
Lexical representations in an individual mind are not given to direct scrutiny. Thus, in their theorizing of mental representations, researchers must rely on observable and measurable outcomes of language processing, that is, perception, production, storage, access, and retrieval of lexical information. Morphological research pursues these questions utilizing the full arsenal of analytical tools and experimental techniques that are at the disposal of psycholinguistics. This article outlines the most popular approaches, and aims to provide, for each technique, a brief overview of its procedure in experimental practice. Additionally, the article describes the link between the processing effect(s) that the tool can elicit and the representational phenomena that it may shed light on. The article discusses methods of morphological research in the two major human linguistic faculties—production and comprehension—and provides a separate treatment of spoken, written and sign language.
Article
Psycholinguistic Research on Inflectional Morphology in the Romance Languages
Claudia Marzi and Vito Pirrelli
Over the past decades, psycholinguistic aspects of word processing have made a considerable impact on views of language theory and language architecture. In the quest for the principles governing the ways human speakers perceive, store, access, and produce words, inflection issues have provided a challenging realm of scientific inquiry, and a battlefield for radically opposing views. It is somewhat ironic that some of the most influential cognitive models of inflection have long been based on evidence from an inflectionally impoverished language like English, where the notions of inflectional regularity, (de)composability, predictability, phonological complexity, and default productivity appear to be mutually implied. An analysis of more “complex” inflection systems such as those of Romance languages shows that this mutual implication is not a universal property of inflection, but a contingency of poorly contrastive, nearly isolating inflection systems. Far from presenting minor faults in a solid, theoretical edifice, Romance evidence appears to call into question the subdivision of labor between rules and exceptions, the on-line processing vs. long-term memory dichotomy, and the distinction between morphological processes and lexical representations. A dynamic, learning-based view of inflection is more compatible with this data, whereby morphological structure is an emergent property of the ways inflected forms are processed and stored, grounded in universal principles of lexical self-organization and their neuro-functional correlates.
Article
Psycholinguistics and Aging
Michael Ramscar
Healthy aging is associated with many cognitive, linguistic, and behavioral changes. For example, adults’ reaction times slow on many tasks as they grow older, while their memories appear to fade, especially for apparently basic linguistic information such as other people’s names. These changes have traditionally been thought to reflect declines in the processing power of human minds and brains as they age. However, from the perspective of the information-processing paradigm that dominates the study of mind, the question of whether cognitive processing capacities actually decline across the life span can only be scientifically answered in relation to functional models of the information processes that are presumed to be involved in cognition.
Consider, for example, the problem of recalling someone’s name. We are usually reminded of the names of friends on a regular basis, and this makes us good at remembering them. However, as we move through life, we inevitably learn more names. Sometimes we hear these new names only once. As we learn each new name, the average exposure we will have had to any individual name we know is likely to decline, while the number of different names we know is likely to increase. This in turn is likely to make the task of recalling a particular name more complex. One consequence of this is as follows: If Mary can only recall names with 95% accuracy at age 60—when she knows 900 names—does she necessarily have a worse memory than she did at age 16, when she could recall any of only 90 names with 98% accuracy? Answering the question of whether Mary’s memory for names has actually declined (or improved even) will require some form of quantification of Mary’s knowledge of names at any given point in her life and the definition of a quantitative model that predicts expected recall performance for a given amount of name knowledge, as well as an empirical measure of the accuracy of the model across a wide range of circumstances.
Until the early 21st century, the study of cognition and aging was dominated by approaches that failed to meet these requirements. Researchers simply established that Mary’s name recall was less accurate at a later age than it was at an earlier one, and took this as evidence that Mary’s memory processes had declined in some significant way. However, as computational approaches to studying cognitive—and especially psycholinguistic—processes and processing became more widespread, a number of matters related to the development of processing across the life span began to become apparent: First, the complexity involved in establishing whether or not Mary’s name recall did indeed become less accurate with age began to be better understood. Second, when the impact of learning on processing was controlled for, it became apparent that at least some processes showed no signs of decline at all in healthy aging. Third, the degree to which the environment—both in terms of its structure, and its susceptibility to change—further complicates our understanding of life-span cognitive performance also began to be better comprehended. These new findings not only promise to change our understanding of healthy cognitive aging, but also seem likely to alter our conceptions of cognition and language themselves.
Article
Quantitative Methods in Morphology: Corpora and Other “Big Data” Approaches
Marco Marelli
Corpora are an all-important resource in linguistics, as they constitute the primary source for large-scale examples of language usage. This has been even more evident in recent years, with the increasing availability of texts in digital format leading more and more corpus linguistics toward a “big data” approach. As a consequence, the quantitative methods adopted in the field are becoming more sophisticated and various.
When it comes to morphology, corpora represent a primary source of evidence to describe morpheme usage, and in particular how often a particular morphological pattern is attested in a given language. There is hence a tight relation between corpus linguistics and the study of morphology and the lexicon. This relation, however, can be considered bi-directional. On the one hand, corpora are used as a source of evidence to develop metrics and train computational models of morphology: by means of corpus data it is possible to quantitatively characterize morphological notions such as productivity, and corpus data are fed to computational models to capture morphological phenomena at different levels of description. On the other hand, morphology has also been applied as an organization principle to corpora. Annotations of linguistic data often adopt morphological notions as guidelines. The resulting information, either obtained from human annotators or relying on automatic systems, makes corpora easier to analyze and more convenient to use in a number of applications.
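One concrete example of such a corpus-derived metric is Baayen’s category-conditioned productivity measure P, the proportion of hapax legomena among an affix’s tokens; the sketch below computes it over invented token lists.

```python
# A minimal sketch computing Baayen's productivity measure P = V1 / N:
# the number of hapax legomena (types attested exactly once) with a given
# affix, divided by the affix's total token count in the corpus.
# The toy token lists are invented for illustration.
from collections import Counter

corpus_tokens = {
    "-ness": ["happiness", "happiness", "darkness", "sadness", "wugness",
              "greenness", "happiness", "blueness"],
    "-th":   ["warmth", "warmth", "length", "length", "depth", "depth"],
}

for affix, tokens in corpus_tokens.items():
    counts = Counter(tokens)
    n_tokens = len(tokens)                                 # N
    hapaxes = sum(1 for t, c in counts.items() if c == 1)  # V1
    print(f"{affix}: P = {hapaxes}/{n_tokens} = {hapaxes / n_tokens:.2f}")
```

On these toy data the rarely extended affix (-th) receives P = 0, while the affix with many once-attested coinages (-ness) receives a high P, mirroring the intuition that productive patterns keep producing new, low-frequency types.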
Article
Textual Inference
Annie Zaenen
Hearers and readers make inferences on the basis of what they hear or read. These inferences are partly determined by the linguistic form that the writer or speaker chooses to give to her utterance. The inferences can be about the state of the world that the speaker or writer wants the hearer or reader to conclude is pertinent, or they can be about the attitude of the speaker or writer vis-à-vis this state of affairs. The attention here goes to the inferences of the first type. Research in semantics and pragmatics has isolated a number of linguistic phenomena that make specific contributions to the process of inference. Broadly, entailments of asserted material, presuppositions (e.g., factive constructions), and invited inferences (especially scalar implicatures) can be distinguished.
While we make these inferences all the time, they have been studied piecemeal only in theoretical linguistics. When attempts are made to build natural language understanding systems, the need for a more systematic and wholesale approach to the problem is felt. Some of the approaches developed in Natural Language Processing are based on linguistic insights, whereas others use methods that do not require (full) semantic analysis.
In this article, I give an overview of the main linguistic issues and of a variety of computational approaches, especially those stimulated by the RTE challenges first proposed in 2004.
Article
Type Theory for Natural Language Semantics
Stergios Chatzikyriakidis and Robin Cooper
Type theory is a regime for classifying objects (including events) into categories called types. It was originally designed to overcome problems in the foundations of mathematics relating to Russell’s paradox. It has made an immense contribution to the study of logic and computer science and has also played a central role in formal semantics for natural languages since the initial work of Richard Montague building on the typed λ-calculus. More recently, type theories following in the tradition created by Per Martin-Löf have presented an important alternative to Montague’s type theory for semantic analysis. These more modern type theories yield a rich collection of types which take on a role of representing semantic content rather than simply structuring the universe in order to avoid paradoxes.
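As a reminder of what Montague-style typing looks like in practice, the textbook-style derivation below composes a simple sentence by function application over typed meanings (an illustrative example, not drawn from the article):

```latex
% A textbook-style illustration (assumed example): basic types e (entities)
% and t (truth values); <a,b> abbreviates functions from type a to type b.
\[
\begin{aligned}
\mathit{student},\ \mathit{runs} &: \langle e, t \rangle \\
\mathit{every} &: \langle \langle e,t \rangle, \langle \langle e,t \rangle, t \rangle \rangle \\
\mathit{every}(\mathit{student}) &: \langle \langle e,t \rangle, t \rangle \\
\mathit{every}(\mathit{student})(\mathit{runs}) &: t
\end{aligned}
\]
```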
Article
Usage-Based Approaches to Germanic Languages
Martin Hilpert
The theoretical outlook of usage-based linguistics is a position that views language as a dynamic, evolving system and that recognizes the importance of usage frequency and frequency effects in language, as well as the foundational role of domain-general sociocognitive processes. Methodologically, usage-based studies draw on corpus-linguistic methods, experimentation, and computational modeling, often in ways that combine different methods and triangulate the results. Given the availability of corpus resources and experimental participants, there is a rich literature of usage-based studies focusing on Germanic languages, which at the same time has greatly benefited from usage-based research into other language families. This research has uncovered frequency effects based on measurements of token frequency, type frequency, collocational strength, and dispersion. These frequency effects result from the repeated experience of linguistic units such as words, collocations, morphological patterns, and syntactic constructions, which impact language production, language processing, and language change. Usage-based linguistics further investigates how the properties of linguistic structures can be explained in terms of cognitive and social processes that are not in themselves linguistic. Domain-general sociocognitive processes such as categorization, joint attention, pattern recognition, and intention reading manifest themselves in language processing and production, as well as in the structure of linguistic units. In addition to research that addresses the form and meaning of such linguistic units at different levels of linguistic organization, domains of inquiry that are in the current focus of usage-based studies include linguistic variation, first- and second-language acquisition, bilingualism, and language change.