Cedric Boeckx and Pedro Tiago Martins
All humans can acquire at least one natural language. Biolinguistics is the name given to the interdisciplinary enterprise that aims to unveil the biological bases of this unique capacity.
Blocking can be defined as the non-occurrence of some linguistic form, whose existence could be expected on general grounds, due to the existence of a rival form. *Oxes, for example, is blocked by oxen, *stealer by thief. Although blocking is closely associated with morphology, in reality the competing “forms” can be not only morphemes or words but also syntactic units. In German, for example, the compound Rotwein ‘red wine’ blocks the phrasal unit *roter Wein (in the relevant sense), just as the phrasal unit rote Rübe ‘beetroot; lit. red beet’ blocks the compound *Rotrübe. In these examples, one crucial factor determining blocking is synonymy; speakers apparently have a deep-rooted presumption against synonyms. Whether homonymy can also lead to a similar avoidance strategy is still controversial. But even if homonymy blocking exists, it is certainly much less systematic than synonymy blocking.
In all the examples mentioned above, it is a word stored in the mental lexicon that blocks a rival formation. However, besides such cases of lexical blocking, one can observe blocking among productive patterns. Dutch has three suffixes for deriving agent nouns from verbal bases, -er, -der, and -aar. Of these three suffixes, the first one is the default choice, while -der and -aar are chosen in very specific phonological environments: as Geert Booij describes in The Morphology of Dutch (2002), “the suffix -aar occurs after stems ending in a coronal sonorant consonant preceded by schwa, and -der occurs after stems ending in /r/” (p. 122). Contrary to lexical blocking, the effect of this kind of pattern blocking does not depend on words stored in the mental lexicon and their token frequency but on abstract features (in the case at hand, phonological features).
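The pattern blocking described for the Dutch agent-noun suffixes can be sketched as an ordered choice over phonological features. The following is only an illustration under stated assumptions: the function name, the rough segment-list transcriptions, and the decision to test the -aar environment before the -der environment are not taken from Booij's description, and lexical exceptions are ignored.

```python
# Illustrative sketch of pattern blocking among Dutch agent-noun suffixes.
# Assumption: stems are given as lists of phoneme symbols in a rough,
# simplified transcription; the rule ordering is our own choice.

CORONAL_SONORANTS = {"l", "n", "r"}
SCHWA = "ə"


def choose_agent_suffix(stem):
    """Return the agent-noun suffix selected by the final segments of the stem."""
    # -aar: stem ends in a coronal sonorant consonant preceded by schwa
    if len(stem) >= 2 and stem[-1] in CORONAL_SONORANTS and stem[-2] == SCHWA:
        return "-aar"
    # -der: stem ends in /r/
    if stem and stem[-1] == "r":
        return "-der"
    # -er: the default choice, blocked in the two environments above
    return "-er"


# Rough transcriptions of four verb stems (hypothetical simplifications)
wandel = ["w", "ɑ", "n", "d", "ə", "l"]    # cf. wandelaar 'walker'
luister = ["l", "œy", "s", "t", "ə", "r"]  # cf. luisteraar 'listener'
huur = ["h", "y", "r"]                      # cf. huurder 'tenant'
bak = ["b", "ɑ", "k"]                       # cf. bakker 'baker'
```

Nothing in the function consults a list of stored words or their frequencies; the choice depends only on abstract phonological features of the stem, which is what distinguishes pattern blocking from lexical blocking.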
Blocking was first recognized by the Indian grammarian Pāṇini in the 5th or 4th century BCE.
Bracketing paradoxes, constructions whose morphosyntactic and morphophonological structures appear to be irreconcilably at odds (e.g., unhappier), are unanimously taken to point to truths about the derivational system that we have not yet grasped. Consider that the prefix un- must be structurally separate in some way from happier, both for its own reasons (its [n] surprisingly does not assimilate in Place to a following consonant, e.g., u[n]popular) and for reasons external to the prefix (the suffix -er must be insensitive to the presence of un-, as the comparative cannot attach to bases of three syllables or longer, e.g., *intelligenter). Yet un- must simultaneously be present in the derivation before -er is merged, so that unhappier can have the proper semantic reading (‘more unhappy’, and not ‘not happier’). Bracketing paradoxes emerged as a problem for generative accounts of both morphosyntax and morphophonology only in the 1970s. With the rise of restrictions on the behavior of affixes, and of formal devices for describing and representing that behavior (e.g., the Affix-Ordering Generalization, Lexical Phonology and Morphology, the Prosodic Hierarchy), morphosyntacticians and phonologists were confronted with this type of inconsistent derivation in many unrelated languages.
Andrej L. Malchukov
Morphological case is conventionally defined as a system of marking dependent nominals for the type of relationship they bear to their heads. While most linguists would agree with this definition, in practice it is often a matter of controversy whether a certain marker X counts as case in language L, or how many case values language L features. First, the distinction between morphological cases and case particles/adpositions is fuzzy in a cross-linguistic perspective. Second, the distinctions between cases can be obscured by patterns of case syncretism, leading to different analyses of the underlying system. On the functional side, it is important to distinguish between syntactic (structural), semantic, and “pragmatic” cases, yet these distinctions are not clear-cut either, as syntactic cases historically arise from the two latter sources. Moreover, case paradigms of individual languages usually show a conflation between syntactic, semantic, and pragmatic cases (see the phenomenon of “focal ergativity,” where ergative case is used when the A argument is in focus). The composition of case paradigms can be shown to follow a certain typological pattern, which is captured by the case hierarchy, as proposed by Greenberg and Blake, among others. The case hierarchy constrains the way case systems evolve (or are reduced) across languages and derives from relative markedness and, ultimately, from the frequencies of individual cases. The (one-dimensional) case hierarchy is, however, incapable of capturing all recurrent polysemies of individual case markers; rather, such polysemies can be represented through a more complex two-dimensional hierarchy (semantic map), which can also be given a diachronic interpretation.
Languages from at least five genetically unrelated families are spoken in the Caucasus, but there are only three endemic linguistic families belonging to the region: Kartvelian, West Caucasian, and Northeast Caucasian. These families are rather heterogeneous in terms of the number of languages and the distribution of the speakers across them. The Caucasus represents a situation where languages with millions of speakers have coexisted with one-village languages for hundreds of years, and where multilingualism has always been the norm. The richness of Caucasian languages on every linguistic stratum is dazzling: here we find some of the largest consonant inventories, inflectional systems where the mere number of word forms strains credibility (one of the Caucasian languages, Archi, is claimed to have over a million and a half word forms), and challenging syntactic structures. The typological interest of the Caucasian languages and the challenges they present to linguistic theory lie in different areas. Thus, for Kartvelian languages, the number of factors at play in the verbal system makes the task of producing a correct verbal form far from trivial. West Caucasian languages represent an instance of polysynthetic polypersonal verb inflection, which is unusual not only for the Caucasus but for Eurasia in general. East Caucasian languages have large systems of non-finite forms which, unusually, retain the ability to realize agreement in gender and number, while their non-finite nature is determined by the inability to head an independent clause and to express certain morpho-syntactic categories such as illocutionary force and evidentiality. Finally, all Caucasian languages are ergative to some extent.
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on either side of the general divide between generative and functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty.
In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.
Children’s acquisition of language is an amazing feat. Children master the syntax, the sentence structure of their language, through exposure and interaction with caregivers and others but, notably, with no formal tuition. How children come to be in command of the syntax of their language has been a topic of vigorous debate since Chomsky argued against Skinner’s claim that language is ‘verbal behavior.’ Chomsky argued that knowledge of language cannot be learned through experience alone but is guided by a genetic component. This language component, known as ‘Universal Grammar,’ is composed of abstract linguistic knowledge and a computational system that is special to language. The computational mechanisms of Universal Grammar give even young children the capacity to form hierarchical syntactic representations for the sentences they hear and produce. The abstract knowledge of language guides children’s hypotheses as they interact with the language input in their environment, ensuring they progress toward the adult grammar. An alternative school of thought denies the existence of a dedicated language component, arguing that knowledge of syntax is learned entirely through interactions with speakers of the language. Such ‘usage-based’ linguistic theories assume that language learning employs the same learning mechanisms that are used by other cognitive systems. Usage-based accounts of language development view children’s earliest productions as rote-learned phrases that lack internal structure. Knowledge of linguistic structure emerges gradually and in a piecemeal fashion, with frequency playing a large role in the order of emergence for different syntactic structures.
Haihua Pan and Yuli Feng
Cross-linguistic data can add new insights to the development of semantic theories or even induce a shift of research paradigm. The major topics in semantic studies, such as bare noun denotation, quantification, degree semantics, polarity items, donkey anaphora and binding principles, long-distance reflexives, negation, tense and aspect, and eventuality, are all discussed by semanticists working on the Chinese language. The issues of particular interest include, but are not limited to: (i) the denotation of Chinese bare nouns; (ii) categorization and quantificational mapping strategies of Chinese quantifier expressions (i.e., whether the behaviors of Chinese quantifier expressions fit into the dichotomy of A-quantification and D-quantification); (iii) multiple uses of quantifier expressions (e.g., dou) and their implications for the inter-relation of semantic concepts like distributivity, scalarity, exclusiveness, exhaustivity, maximality, etc.; (iv) the interaction among universal adverbials and that between universal adverbials and various types of noun phrases, which may pose a challenge to the Principle of Compositionality; (v) the semantics of degree expressions in Chinese; (vi) the non-interrogative uses of wh-phrases in Chinese and their influence on the theories of polarity items, free choice items, and epistemic indefinites; (vii) how the concepts of E-type pronouns and D-type pronouns are manifested in the Chinese language and whether such pronoun interpretations correspond to specific sentence types; (viii) what devices Chinese adopts to locate time (i.e., does tense interpretation correspond to certain syntactic projections, or is it solely determined by semantic information and pragmatic reasoning); (ix) how the interpretation of Chinese aspect markers can be captured by event structures, possible world semantics, and quantification; (x) how the long-distance binding of Chinese ziji ‘self’ and the blocking effect by first and second person 
pronouns can be accounted for by the existing theories of beliefs, attitude reports, and logophoricity; (xi) the distribution of various negation markers and their correspondence to the semantic properties of predicates with which they are combined; and (xii) whether Chinese topic-comment structures are constrained by both semantic and pragmatic factors or syntactic factors only.
Clinical linguistics is the branch of linguistics that applies linguistic concepts and theories to the study of language disorders. As the name suggests, clinical linguistics is a dual-facing discipline. Although the conceptual roots of this field are in linguistics, its domain of application is the vast array of clinical disorders that may compromise the use and understanding of language. Both dimensions of clinical linguistics can be addressed through an examination of specific linguistic deficits in individuals with neurodevelopmental disorders, craniofacial anomalies, adult-onset neurological impairments, psychiatric disorders, and neurodegenerative disorders. Clinical linguists are interested in the full range of linguistic deficits in these conditions, including phonetic deficits of children with cleft lip and palate, morphosyntactic errors in children with specific language impairment, and pragmatic language impairments in adults with schizophrenia.
Like many applied disciplines in linguistics, clinical linguistics sits at the intersection of a number of areas. The relationship of clinical linguistics to the study of communication disorders and to speech-language pathology (speech and language therapy in the United Kingdom) are two particularly important points of intersection. Speech-language pathology is the area of clinical practice that assesses and treats children and adults with communication disorders. All language disorders restrict an individual’s ability to communicate freely with others in a range of contexts and settings. So language disorders are first and foremost communication disorders. To understand language disorders, it is useful to think of them in terms of points of breakdown on a communication cycle that tracks the progress of a linguistic utterance from its conception in the mind of a speaker to its comprehension by a hearer. This cycle permits the introduction of a number of important distinctions in language pathology, such as the distinction between a receptive and an expressive language disorder, and between a developmental and an acquired language disorder. The cycle is also a useful model with which to conceptualize a range of communication disorders other than language disorders. These other disorders, which include hearing, voice, and fluency disorders, are also relevant to clinical linguistics.
Clinical linguistics draws on the conceptual resources of the full range of linguistic disciplines to describe and explain language disorders. These disciplines include phonetics, phonology, morphology, syntax, semantics, pragmatics, and discourse. Each of these linguistic disciplines contributes concepts and theories that can shed light on the nature of language disorder. A wide range of tools and approaches are used by clinical linguists and speech-language pathologists to assess, diagnose, and treat language disorders. They include the use of standardized and norm-referenced tests, communication checklists and profiles (some administered by clinicians, others by parents, teachers, and caregivers), and qualitative methods such as conversation analysis and discourse analysis. Finally, clinical linguists can contribute to debates about the nosology of language disorders. In order to do so, however, they must have an understanding of the place of language disorders in internationally recognized classification systems such as the 2013 Diagnostic and Statistical Manual of Mental Disorders (DSM-5) of the American Psychiatric Association.
The study of coarticulation—namely, the articulatory modification of a given speech sound arising from coproduction or overlap with neighboring sounds in the speech chain—has attracted the close attention of phonetic researchers for at least the last 60 years. Knowledge about coarticulatory patterns in speech should provide information about the planning mechanisms of consecutive consonants and vowels and the execution of coordinative articulatory structures during the production of those segmental units. Coarticulatory effects involve changes in articulatory displacement over time toward the left (anticipatory) or the right (carryover) of the trigger, and their typology and extent depend on the articulator under investigation (lip, velum, tongue, jaw, larynx) and the articulatory characteristics of the individual consonants and vowels, as well as nonsegmental factors such as speech rate, stress, and language. A challenge for studying coarticulation is that different speakers may use different coarticulatory mechanisms when producing a given phonemic sequence and they also use coarticulatory information differently for phonemic identification in perception. More knowledge about all these research issues should contribute to a deeper understanding of coarticulation deficits in speakers with speech disorders, how the ability to coarticulate develops from childhood to adulthood, and the extent to which the failure to compensate for coarticulatory effects may give rise to sound change.
There are two main theoretical traditions in semantics. One is based on realism, where meanings are described as relations between language and the world, often in terms of truth conditions. The other is cognitivist, where meanings are identified with mental structures. This article presents some of the main ideas and theories within the cognitivist approach.
A central tenet of cognitively oriented theories of meaning is that there are close connections between the meaning structures and other cognitive processes. In particular, parallels between semantics and visual processes have been studied. As a complement, the theory of embodied cognition focuses on the relation between actions and components of meaning.
One of the main methods of representing cognitive meaning structures is to use image schemas and idealized cognitive models. Such schemas focus on spatial relations between various semantic elements. Image schemas are often constructed using Gestalt psychological notions, including those of trajector and landmark, corresponding to figure and ground. In this tradition, metaphors and metonymies are considered to be central meaning-transforming processes.
A related approach is force dynamics. Here, the semantic schemas are construed from forces and their relations rather than from spatial relations. Recent extensions involve cognitive representations of actions and events, which then form the basis for a semantics of verbs.
A third approach is the theory of conceptual spaces. In this theory, meanings are represented as regions of semantic domains such as space, time, color, weight, size, and shape. For example, strong evidence exists that color words in a large variety of languages correspond to such regions. This approach has been extended to a general account of the semantics of some of the main word classes, including adjectives, verbs, and prepositions. The theory of conceptual spaces shows similarities to the older frame semantics and feature analysis, but it puts more emphasis on geometric structures.
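The idea that words correspond to regions of a domain can be illustrated with a toy sketch. It rests on assumptions not made in the text above: regions are derived by assigning each point to its nearest prototype (which partitions the space into convex cells), the color domain is simplified to three numeric dimensions, and the prototype coordinates and function name are invented for illustration.

```python
import math

# Hypothetical prototype points in a simplified three-dimensional color domain.
# The coordinates are illustrative only and carry no empirical claim.
PROTOTYPES = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}


def categorize(point):
    """Return the word whose prototype lies closest to the given point.

    Under this assumption, each word's region is the set of points that are
    nearer to its prototype than to any other prototype.
    """
    return min(PROTOTYPES, key=lambda word: math.dist(point, PROTOTYPES[word]))


reddish = categorize((0.9, 0.2, 0.1))
bluish = categorize((0.1, 0.2, 0.8))
```

Here `reddish` falls within the region of "red" and `bluish` within that of "blue"; the geometric structure of the domain, rather than a list of features, does the classificatory work.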
A general criticism against cognitive theories of semantics is that they only consider the meaning structures of individuals, but neglect the social aspects of semantics, that is, that meanings are shared within a community. Recent theoretical proposals counter this by suggesting that semantics should be seen as a meeting of minds, that is, communicative processes that lead to the alignment of meanings between individuals. On this approach, semantics is seen as a product of communication, constrained by the cognitive mechanisms of the individuals.
Even though the concept of multilingualism is well established in linguistics, it is problematic, especially in light of the actual ways in which repertoires are composed and used. The term “multilingualism” bears in itself the notion of several clearly discernable languages and suggests that regardless of the sociolinguistic setting, language ideologies, social history and context, a multilingual individual will be able to separate the various codes that constitute his or her communicative repertoire and use them deliberately in a reflected way. Such a perspective on language isn’t helpful in understanding any sociolinguistic setting and linguistic practice that is not a European one and that doesn’t correlate with ideologies and practices of a standardized, national language. This applies to the majority of people living on the planet and to most people who speak African languages. These speakers differ from the ideological concept of the “Western monolingual,” as they employ diverse practices and linguistic features on a daily basis and do so in a very flexible way. Which linguistic features a person uses thereby depends on factors such as socialization, placement, and personal interest, desires and preferences, which are all likely to change several times during a person’s life. Therefore, communicative repertoires are never stable, neither in their composition nor in the ways they are ideologically framed and evaluated. A more productive perspective on the phenomenon of complex communicative repertoires puts the concept of languaging in the center, which refers to communicative practices, dynamically operating between different practices and (multimodal) linguistic features. Individual speakers thereby perceive and evaluate ways of speaking according to the social meaning, emotional investment, and identity-constituting functions they can attribute to them. 
The fact that linguistic reflexivity for African speakers might almost always involve the negotiation of the self in a (post)colonial world invites us to consider a critical evaluation, based on approaches such as Southern Theory, of established concepts of “language” and “multilingualism”: languaging is also a postcolonial experience, and this experience often translates into how speakers single out specific ways of speaking as “more prestigious” or “more developed” than others. The inclusion of African metalinguistics and indigenous knowledge is consequently an important task for linguists studying communicative repertoires in Africa or its diaspora.
Modification is a combinatorial semantic operation between a modifier and a modifiee. Take, for example, vegetarian soup: the attributive adjective vegetarian modifies the nominal modifiee soup and thus constrains the range of potential referents of the complex expression to soups that are vegetarian. Similarly, in Ben is preparing a soup in the camper, the adverbial in the camper modifies the preparation by locating it. Notably, modifiers can have fairly drastic effects; in fake stove, the attribute fake induces that the complex expression singles out objects that seem to be stoves, but are not. Intuitively, modifiers contribute additional information that is not explicitly called for by the target the modifier relates to. Speaking in terms of logic, this roughly says that modification is an endotypical operation; that is, it does not change the arity, or logical type, of the modified target constituent. Speaking in terms of syntax, this predicts that modifiers are typically adjuncts and thus do not change the syntactic distribution of their respective target; therefore, modifiers can be easily iterated (see, for instance, spicy vegetarian soup or Ben prepared a soup in the camper yesterday). This initial characterization sets modification apart from other combinatorial operations such as argument satisfaction and quantification: combining a soup with prepare satisfies an argument slot of the verbal head and thus reduces its arity (see, for instance, *prepare a soup a quiche). Quantification as, for example, in the combination of the quantifier every with the noun soup, maps a nominal property onto a quantifying expression with a different distribution (see, for instance, *a every soup). Their comparatively loose connection to their hosts renders modifiers a flexible, though certainly not random, means within combinatorial meaning constitution. The foundational question is how to work their being endotypical into a full-fledged compositional analysis. 
On the one hand, modifiers can be considered endotypical functors by virtue of their lexical endowment; for instance, vegetarian would be born a higher-order function from predicates to predicates. On the other hand, modification can be considered a rule-based operation; for instance, vegetarian would denote a simple predicate from entities to truth-values that receives its modifying endotypical function only by virtue of a separate modification rule. In order to assess this and related controversies empirically, research on modification pays particular attention to interface questions such as the following: how do structural conditions and the modifying function conspire in establishing complex interpretations? What roles do ontological information and fine-grained conceptual knowledge play in the course of concept combination?
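The two analyses contrasted above can be put side by side in a small sketch. Everything in it is an assumption made for illustration: the toy entity sets, the function names, and the restriction to intersective adjectives such as vegetarian (a modifier like fake would not fit the conjunction used here).

```python
from typing import Callable

Entity = str
Pred = Callable[[Entity], bool]  # a predicate: from entities to truth-values

# Toy model (hypothetical data)
SOUPS = {"lentil soup", "chicken soup"}
VEGETARIAN_THINGS = {"lentil soup", "salad"}


def soup(x: Entity) -> bool:
    return x in SOUPS


# Analysis 1: the modifier is lexically a function from predicates to predicates.
def vegetarian_functor(head: Pred) -> Pred:
    return lambda x: x in VEGETARIAN_THINGS and head(x)


# Analysis 2: the modifier is a simple predicate ...
def vegetarian(x: Entity) -> bool:
    return x in VEGETARIAN_THINGS


# ... and a separate modification rule combines it with the head.
def modify(modifier: Pred, head: Pred) -> Pred:
    return lambda x: modifier(x) and head(x)


by_functor = vegetarian_functor(soup)
by_rule = modify(vegetarian, soup)
```

In both cases the result is again a predicate over entities, so the logical type of the head is unchanged and a further modifier could be applied to the output. The analyses differ only in where the endotypical step is located: in the lexical entry or in the rule.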
Compound and complex predicates—predicates that consist of two or more lexical items and function as the predicate of a single sentence—present an important class of linguistic objects that pertain to an enormously wide range of issues in the interactions of morphology, phonology, syntax, and semantics. Japanese makes extensive use of compounding to expand a single verb into a complex one. These compounding processes range over multiple modules of the grammatical system, thus straddling the borders between morphology, syntax, phonology, and semantics. In terms of degree of phonological integration, two types of compound predicates can be distinguished. In the first type, called tight compound predicates, two elements from the native lexical stratum are tightly fused and inflect as a whole for tense. In this group, Verb-Verb compound verbs such as arai-nagasu [wash-let.flow] ‘to wash away’ and hare-agaru [sky.be.clear-go.up] ‘for the sky to clear up entirely’ are preponderant in numbers and productivity over Noun-Verb compound verbs such as tema-doru [time-take] ‘to take a lot of time (to finish).’
The second type, called loose compound predicates, takes the form of “Noun + Predicate (Verbal Noun [VN] or Adjectival Noun [AN]),” as in post-syntactic compounds like [sinsya : koonyuu] no okyakusama ([new.car : purchase] GEN customers) ‘customer(s) who purchase(d) a new car,’ where the symbol “:” stands for a short phonological break. Remarkably, loose compounding allows combinations of a transitive VN with its agent subject (external argument), as in [Supirubaagu : seisaku] no eiga ([Spielberg : produce] GEN film) ‘a film/films that Spielberg produces/produced’—a pattern that is illegitimate in tight compounds and has in fact been considered universally impossible in the world’s languages in verbal compounding and noun incorporation.
In addition to a huge variety of tight and loose compound predicates, Japanese has an additional class of syntactic constructions that as a whole function as complex predicates. Typical examples are the light verb construction, where a clause headed by a VN is followed by the light verb suru ‘do,’ as in Tomodati wa sinsya o koonyuu (sae) sita [friend TOP new.car ACC purchase (even) did] ‘My friend (even) bought a new car’ and the human physical attribute construction, as in Sensei wa aoi me o site-iru [teacher TOP blue eye ACC do-ing] ‘My teacher has blue eyes.’ In these constructions, the nominal phrases immediately preceding the verb suru are semantically characterized as indefinite and non-referential and reject syntactic operations such as movement and deletion. The semantic indefiniteness and syntactic immobility of the NPs involved are also observed with a construction composed of a human subject and the verb aru ‘be,’ as in Gakkai ni wa oozei no sankasya ga atta ‘There was a large number of participants at the conference.’ The constellation of such “word-like” properties shared by these compound and complex predicates poses challenging problems for current theories of morphology-syntax-semantics interactions with regard to such topics as lexical integrity, morphological compounding, syntactic incorporation, semantic incorporation, pseudo-incorporation, and indefinite/non-referential NPs.
Pius ten Hacken
Compounding is a word formation process based on the combination of lexical elements (words or stems). In the theoretical literature, compounding is a matter of controversy, and the disagreement extends to basic issues. In the study of compounding, the questions guiding research can be grouped into four main areas, labeled here as delimitation, classification, formation, and interpretation. Depending on the perspective taken in the research, some of these may be highlighted or backgrounded.
In the delimitation of compounding, one question is how important it is to be able to determine for each expression unambiguously whether it is a compound or not. Compounding borders on syntax and on affixation. In some theoretical frameworks, it is not a problem to have more typical and less typical instances, without a precise boundary between them. However, if, for instance, word formation and syntax are strictly separated and compounding is in word formation, it is crucial to draw this borderline precisely. Another question is which types of criteria should be used to distinguish compounding from other phenomena. Criteria based on form, on syntactic properties, and on meaning have been used. In all cases, it is also controversial whether such criteria should be applied crosslinguistically.
In the classification of compounds, the question of how important the distinction between the classes is for the theory in which they are used poses itself in much the same way as the corresponding question for the delimitation. A common classification uses headedness as a basis. Other criteria are based on the forms of the elements that are combined (e.g., stem vs. word) or on the semantic relationship between the components. Again, whether these criteria can and should be applied crosslinguistically is controversial.
The issue of the formation rules for compounds is particularly prominent in frameworks that emphasize form-based properties of compounding. Proposals include rewrite rules for compounding, generalizations over the selection of the input form (stem or word) and of linking elements, and rules for stress assignment. Compounds are generally thought of as consisting of two components, although these components may consist of more than one element themselves. For some types of compounds with three or more components, for example copulative compounds, a nonbinary structure has been proposed.
The question of interpretation can be approached from two opposite perspectives. In a semasiological perspective, the meaning of a compound emerges from the interpretation of a given form. In an onomasiological perspective, the meaning precedes the formation in the sense that a form is selected to name a particular concept. The central question in the interpretation of compounds is how to determine the relationship between the two components. The range of possible interpretations can be constrained by the rules of compounding, by the semantics of the components, and by the context of use. A much-debated question concerns the relative importance of these factors.
Computational psycholinguistics has a long history of investigation and modeling of morphological phenomena. Several computational models have been developed to deal with the processing and production of morphologically complex forms and with the relation between linguistic morphology and psychological word representations. Historically, most of this work has focused on modeling the production of inflected word forms, leading to the development of models based on connectionist principles and other data-driven models such as Memory-Based Language Processing (MBLP), Analogical Modeling of Language (AM), and Minimal Generalization Learning (MGL). In the context of inflectional morphology, these computational approaches have played an important role in the debate between single- and dual-mechanism theories of cognition. Taking a different angle, computational models based on distributional semantics have been proposed to account for several phenomena in morphological processing and composition. Finally, although several computational models of reading have been developed in psycholinguistics, none of them has satisfactorily addressed the recognition and reading aloud of morphologically complex forms.
Jane Chandlee and Jeffrey Heinz
Computational phonology studies the nature of the computations necessary and sufficient for characterizing phonological knowledge. As a field it is informed by the theories of computation and phonology.
The computational nature of phonological knowledge is important because at a fundamental level it is about the psychological nature of memory as it pertains to phonological knowledge. Different types of phonological knowledge can be characterized as computational problems, and the solutions to these problems reveal their computational nature. In contrast to syntactic knowledge, there is clear evidence that phonological knowledge is computationally bounded to the so-called regular classes of sets and relations. These classes have multiple mathematical characterizations in terms of logic, automata, and algebra with significant implications for the nature of memory. In fact, there is evidence that phonological knowledge is bounded by particular subregular classes, with more restrictive logical, automata-theoretic, and algebraic characterizations, and thus by weaker models of memory.
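As an illustration of what a restrictive subregular class can look like, the following minimal sketch checks well-formedness with a strictly 2-local grammar: a word is accepted if none of its adjacent symbol pairs is forbidden, so the only memory required is a window of two symbols. The function names and the toy constraint set are assumptions made for this example and are not taken from the article.

```python
def bigrams(word):
    """Return the set of adjacent symbol pairs, with '#' marking word edges."""
    padded = "#" + word + "#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def is_well_formed(word, forbidden):
    """Accept a word if it contains no forbidden pair (strictly 2-local check)."""
    return bigrams(word).isdisjoint(forbidden)


# Hypothetical toy constraint: no 'n' directly before 'p' or 'b'.
FORBIDDEN = {"np", "nb"}

print(is_well_formed("anta", FORBIDDEN))  # no forbidden pair
print(is_well_formed("anpa", FORBIDDEN))  # contains 'np'
```

Because the check inspects only a fixed-size window, it needs far less memory than an arbitrary finite-state machine, which is the sense in which such classes correspond to weaker models of memory.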
Computational semantics performs automatic meaning analysis of natural language. Research in computational semantics designs meaning representations and develops mechanisms for automatically assigning those representations and reasoning over them. Computational semantics is not a single monolithic task but consists of many subtasks, including word sense disambiguation, multi-word expression analysis, semantic role labeling, the construction of sentence semantic structure, coreference resolution, and the automatic induction of semantic information from data.
The development of manually constructed resources has been vastly important in driving the field forward. Examples include WordNet, PropBank, FrameNet, VerbNet, and TimeBank. These resources specify the linguistic structures to be targeted in automatic analysis, and they provide high-quality human-generated data that can be used to train machine learning systems. Supervised machine learning based on manually constructed resources is a widely used technique.
A second core strand has been the induction of lexical knowledge from text data. For example, words can be represented through the contexts in which they appear (called distributional vectors or embeddings), such that semantically similar words have similar representations. Or semantic relations between words can be inferred from patterns of words that link them. Wide-coverage semantic analysis always needs more data, both lexical knowledge and world knowledge, and automatic induction at least alleviates the problem.
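The idea of representing words through their contexts can be sketched in a few lines. The example below builds raw co-occurrence count vectors from a tiny invented corpus and compares them with cosine similarity. The corpus, the window size, and the function names are assumptions chosen for illustration; practical systems use far larger corpora and usually weighted or learned vectors.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            context = tokens[start:i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(target, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Invented toy corpus, for illustration only.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "the car needs fuel",
]

vectors = context_vectors(CORPUS)
print(cosine(vectors["cat"], vectors["dog"]))   # shared contexts: higher score
print(cosine(vectors["cat"], vectors["fuel"]))  # no shared contexts: 0.0
```

In this toy setting, "cat" and "dog" share contexts such as "the" and "drinks" and therefore receive similar vectors, whereas "cat" and "fuel" share none.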
Compositionality is a third core theme: the systematic construction of structural meaning representations of larger expressions from the meaning representations of their parts. The representations typically use logics of varying expressivity, which makes them well suited to performing automatic inferences with theorem provers.
Manual specification and automatic acquisition of knowledge are closely intertwined. Manually created resources are automatically extended or merged. The automatic induction of semantic information is guided and constrained by manually specified information, which is much more reliable. And for restricted domains, the construction of logical representations is learned from data.
It is at the intersection of manual specification and machine learning that some of the current larger questions of computational semantics are located. For instance, should we build general-purpose semantic representations, or is lexical knowledge simply too domain-specific, and would we be better off learning task-specific representations every time? When performing inference, is it more beneficial to have the solid ground of a human-generated ontology, or is it better to reason directly with text snippets for more fine-grained and gradual inference? Do we obtain a better and deeper semantic analysis as we use better and deeper manually specified linguistic knowledge, or is the future in powerful learning paradigms that learn to carry out an entire task from natural language input and output alone, without pre-specified linguistic knowledge?
The Word and Paradigm approach to morphology associates lexemes with tables of surface forms for different morphosyntactic property sets. Researchers express their realizational theories, which show how to derive these surface forms, using formalisms such as Network Morphology and Paradigm Function Morphology. The tables of surface forms also lend themselves to a study of the implicative theories, which infer the realizations in some cells of the inflectional system from the realizations of other cells.
There is an art to building realizational theories. First, the theories should be correct, that is, they should generate the right surface forms. Second, they should be elegant, which is much harder to capture, but includes the desiderata of simplicity and expressiveness. Without software to test a realizational theory, it is easy to sacrifice correctness for elegance. Therefore, software that takes a realizational theory and generates surface forms is an essential part of any theorist’s toolbox.
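What such testing software does can be suggested with a deliberately simplified sketch. It is not an implementation of Network Morphology or Paradigm Function Morphology: the rule format (a set of required properties paired with a suffix, with the most specific applicable rule winning), the function names, and the small table of expected forms are all assumptions made for illustration.

```python
def realize(stem, properties, rules):
    """Apply the most specific rule whose conditions all hold of `properties`."""
    applicable = [(cond, suffix) for cond, suffix in rules if cond <= properties]
    if not applicable:
        return stem
    _, suffix = max(applicable, key=lambda rule: len(rule[0]))
    return stem + suffix


def mismatches(expected, rules):
    """Return the cells where the theory's output differs from the attested form."""
    errors = {}
    for (stem, properties), attested in expected.items():
        generated = realize(stem, properties, rules)
        if generated != attested:
            errors[(stem, properties)] = generated
    return errors


# Hypothetical toy theory: a default rule plus two suffixing rules.
RULES = [
    (frozenset(), ""),
    (frozenset({"PL"}), "s"),
    (frozenset({"PAST"}), "ed"),
]

# Attested surface forms the theory is tested against.
EXPECTED = {
    ("cat", frozenset({"PL"})): "cats",
    ("walk", frozenset({"PAST"})): "walked",
    ("go", frozenset({"PAST"})): "went",
}

print(mismatches(EXPECTED, RULES))  # reports the overgenerated form for 'go'
```

The toy theory is simple but incorrect for the irregular verb, which is exactly the kind of trade-off between elegance and correctness that automatic generation makes visible.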
Discovering implicative rules that connect the cells in an inflectional system is often quite difficult. Some rules are immediately apparent, but others can be subtle. Software that automatically analyzes an entire table of surface forms for many lexemes can help automate the discovery process.
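A minimal sketch of this kind of automated discovery follows. It assumes that each lexeme's forms have already been segmented into exponents, and it reports a rule from one cell to another whenever the exponent in the first cell determines the exponent in the second for every lexeme in the table. The data are invented and the function name is hypothetical; real tools must also handle segmentation, stem alternations, and statistical rather than exceptionless implications.

```python
from itertools import permutations


def implicative_rules(paradigms):
    """Find ordered cell pairs (src, tgt) where src's exponent determines tgt's."""
    cells = sorted({cell for forms in paradigms.values() for cell in forms})
    rules = {}
    for src, tgt in permutations(cells, 2):
        mapping = {}
        consistent = True
        for forms in paradigms.values():
            if src not in forms or tgt not in forms:
                continue  # skip lexemes with a gap in either cell
            predicted = mapping.setdefault(forms[src], forms[tgt])
            if predicted != forms[tgt]:
                consistent = False  # same src exponent, different tgt exponent
                break
        if consistent:
            rules[(src, tgt)] = mapping
    return rules


# Invented exponents for four hypothetical lexemes.
PARADIGMS = {
    "lex1": {"SG": "-a", "PL": "-i"},
    "lex2": {"SG": "-a", "PL": "-i"},
    "lex3": {"SG": "-o", "PL": "-i"},
    "lex4": {"SG": "-e", "PL": "-u"},
}

for (src, tgt), mapping in implicative_rules(PARADIGMS).items():
    print(src, "=>", tgt, mapping)
```

In this toy table the singular exponent predicts the plural one, but not the reverse, because the plural exponent "-i" corresponds to two different singular exponents.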
Researchers can use Web-based computerized tools to test their realizational theories and to discover implicative rules.
Connectionism is an important theoretical framework for the study of human cognition and behavior. Also known as Parallel Distributed Processing (PDP) or Artificial Neural Networks (ANN), connectionism advocates that learning, representation, and processing of information in the mind are parallel, distributed, and interactive in nature. It argues for the emergence of human cognition as the outcome of large networks of interactive processing units operating simultaneously. Inspired by findings from neuroscience and artificial intelligence, connectionism is a powerful computational tool, and it has had a profound impact on many areas of research, including linguistics. Since the beginning of connectionism, many connectionist models have been developed to account for a wide range of important linguistic phenomena observed in monolingual research, such as speech perception, speech production, semantic representation, and early lexical development in children. Recently, the application of connectionism to bilingual research has also gathered momentum. Connectionist models are often precise in the specification of modeling parameters and flexible in the manipulation of relevant variables to address theoretical questions; they can therefore provide significant advantages in testing mechanisms underlying language processes.