This article gives a brief introduction to sociolinguistics in China. Chinese sociolinguistics started with the introduction of Western sociolinguistic theories at the end of the 1970s. It did not become mature until the turn of the 21st century. After more than 40 years of development, Chinese sociolinguistics now covers a variety of topics and themes. Among them, the most popular are “language life,” “language planning,” “language variations,” and “urban language studies.” After providing a brief introduction to the historical development of Chinese sociolinguistics, this article primarily focuses on some of the most popular topics in that field. Although Chinese sociolinguistics still relies on the introduction and incorporation of Western sociolinguistic theories, it has gradually formed its own research agenda. In the meantime, it has also attempted to adapt Western theories to the unique Chinese context and made some theoretical and methodological innovations. Especially in view of the growing urbanization and industrialization taking place in China, Chinese sociolinguistics is expected to play an increasingly important role in the country’s future development and lead to more breakthroughs in its theoretical and methodological developments.
Jiayu Wang and Guangyu Jin
Since critical discourse analysis (CDA) was introduced to China, it has developed into an influential field. Studies in CDA in China from the 1990s to 2020 can be delineated through four stages of development. The first stage focused on introducing the theories and concepts in CDA to China’s academia. During the second stage, CDA in China was no longer confined to reviewing theories abroad but was extended to deeper and more extensive theoretical, methodological, and empirical investigations. During the third stage, Chinese scholars in CDA became more concerned with domestic issues than in the previous stages and started to conduct interdisciplinary studies. The fourth stage marked the flourishing of CDA studies in terms of the numbers of studies published and scholars engaged in the field, and in terms of the breadth and the variety of research methods, topics, and disciplines involved. Chinese scholars tend to gear CDA to China’s social, political, and cultural contexts.
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on each side of the general divide between generative vs. functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty. 
In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.
Susan Rvachew and Abdulsalam Alhaidary
Babbling is made up of meaningless speechlike syllables called canonical syllables. Canonical syllables are characterized by the coordination of consonantal and vocalic elements in syllables that have speechlike timing, phonation, and resonance characteristics. Infants begin to babble on average at approximately seven months of age. Babbling continues in parallel with less mature noncanonical vocalizations that make up the majority of utterances through the first year. Babbling also continues in parallel with the emergence of meaningful speech during the second year. Regardless of the language that the infant is learning, most canonical syllables have a CV shape with the consonant being a labial or alveolar stop or nasal and the vowel most likely to be central or low- to mid-front in place (e.g., [bʌ], [da], [mæ]). Approximately 15% of canonical utterances consist of multisyllable strings; in other words, most babbled utterances contain only a single CV syllable. The onset of the canonical babbling stage is crucially dependent upon normal hearing, permitting access to language input and feedback of self-produced speech. Many studies have reported differences in the phonetic and acoustic characteristics of babble produced by infants learning different languages. These differences include the frequency with which certain consonants are produced, the location, size, and shape of the vowel space, and the rhythmic and intonation qualities of multisyllable babbles, in each case reflecting specificities of the input language. However, replications of these findings are rare, and further research is required to better understand the learning mechanisms that underlie language-specific acquisition of articulatory representations during the prelinguistic stage of vocal development.
A fundamental question in epistemological philosophy is whether reason may be based on a priori knowledge—that is, knowledge that precedes and which is independent of experience. In modern science, the concept of innateness has been associated with particular behaviors and types of knowledge, which supposedly have been present in the organism since birth (in fact, since fertilization)—prior to any sensory experience with the environment. This line of investigation has been traditionally linked to two general types of qualities: the first consists of instinctive and inflexible reflexes, traits, and behaviors, which are apparent in survival, mating, and rearing activities. The other relates to language and cognition, with certain concepts, ideas, propositions, and particular ways of mental computation suggested to be part of one’s biological make-up. While both these types of innatism have a long history (e.g., debate by Plato and Descartes), some bias appears to exist in favor of claims for inherent behavioral traits, which are typically accepted when satisfactory empirical evidence is provided. One famous example is Lorenz’s demonstration of imprinting, a natural phenomenon that obeys a predetermined mechanism and schedule (incubator-hatched goslings imprinted on Lorenz’s boots, the first moving object they encountered). Likewise, there seems to be little controversy in regard to predetermined ways of organizing sensory information, as is the case with the detection and classification of shapes and colors by the mind. In contrast, the idea that certain types of abstract knowledge may be part of an organism’s biological endowment (i.e., not learned) is typically met with a greater sense of skepticism. 
The most influential and controversial claim for such innate knowledge in modern science is Chomsky’s nativist theory of Universal Grammar, which aims to define the extent to which human languages can vary, together with the famous Argument from the Poverty of the Stimulus. The main Chomskyan hypothesis is that all human beings share a preprogrammed linguistic infrastructure consisting of a finite set of general principles, which can generate (through combination or transformation) an infinite number of (only) grammatical sentences. Thus, the innate grammatical system constrains and structures the acquisition and use of all natural languages.
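The idea that a finite set of rules can generate an unbounded number of sentences can be illustrated with a minimal sketch. The toy grammar below is a hypothetical example invented for illustration, not a proposal about Universal Grammar; the names `GRAMMAR`, `expand`, and `count_sentences` are assumptions. Because two of its rules embed a phrase inside a phrase of a related type, the number of derivable sentences keeps growing as deeper derivations are allowed.

```python
from itertools import product

# Hypothetical toy grammar: each nonterminal maps to its possible expansions.
# Recursion arises because NP can contain a VP and VP can contain an NP.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["child"], ["dog"]],
    "V": [["sees"], ["sleeps"]],
}


def expand(symbol, depth):
    """Return every word sequence derivable from `symbol` using at most
    `depth` nested rule applications."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it is a word and stands for itself.
        return {(symbol,)}
    if depth == 0:
        # No rule applications left, so nothing can be derived.
        return set()
    results = set()
    for rhs in GRAMMAR[symbol]:
        parts = [expand(s, depth - 1) for s in rhs]
        for combo in product(*parts):
            results.add(tuple(word for part in combo for word in part))
    return results


def count_sentences(depth):
    """Number of distinct sentences derivable within the depth bound."""
    return len(expand("S", depth))


if __name__ == "__main__":
    for depth in range(3, 7):
        print(depth, count_sentences(depth))
```

The grammar has only nine rules, yet raising the depth bound always adds new sentences, which is the sense in which a finite system can have infinite generative capacity.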
A computational learner needs three things: data to learn from, a class of representations to acquire, and a way to get from one to the other. Language acquisition is a very particular learning setting that can be defined in terms of the input (the child’s early linguistic experience) and the output (a grammar capable of generating a language very similar to the input). The input is infamously impoverished. As it relates to morphology, the vast majority of potential forms are never attested in the input, and those that are attested follow an extremely skewed frequency distribution. Learners nevertheless manage to acquire most details of their native morphologies after only a few years of input. That said, acquisition is neither instantaneous nor error-free. Children do make mistakes, and they do so in predictable ways that provide insights into their grammars and learning processes. The most elucidating computational model of morphology learning from the perspective of a linguist is one that learns morphology like a child does, that is, on child-like input and along a child-like developmental path. This article focuses on clarifying those aspects of morphology acquisition that should go into such an elucidating computational model. Section 1 describes the input with a focus on child-directed speech corpora and input sparsity. Section 2 discusses representations with a focus on productivity, developmental paths, and formal learnability. Section 3 surveys the range of learning tasks that guide research in computational linguistics and NLP with special focus on how they relate to the acquisition setting. The conclusion in Section 4 presents a summary of morphology acquisition as a learning problem with Table 4 highlighting the key takeaways of this article.
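Input sparsity and frequency skew can be made concrete with a small sketch. The token list below is invented toy data, not drawn from any child-directed speech corpus, and the function names `paradigm_coverage` and `rank_frequency` as well as the slot labels are assumptions made for illustration. The first function measures what share of the possible lemma-by-slot paradigm cells is attested at all; the second returns token counts in descending order, where a skewed distribution shows up as a few large counts followed by a long tail.

```python
from collections import Counter


def paradigm_coverage(tokens, slots):
    """Fraction of possible (lemma, slot) cells attested at least once.

    tokens: list of (lemma, slot) pairs observed in the input.
    slots:  all inflectional slots assumed to exist in the language.
    """
    attested = set(tokens)
    lemmas = {lemma for lemma, _ in tokens}
    possible = len(lemmas) * len(slots)
    return len(attested) / possible if possible else 0.0


def rank_frequency(tokens):
    """Token counts of attested forms, most frequent first."""
    return [count for _, count in Counter(tokens).most_common()]


# Hypothetical toy input: three verbs, four assumed paradigm slots.
SLOTS = ["PRES", "PAST", "PROG", "PTCP"]
TOKENS = (
    [("walk", "PRES")] * 8
    + [("walk", "PAST")] * 3
    + [("go", "PAST")] * 2
    + [("sing", "PRES")]
)

if __name__ == "__main__":
    print("coverage:", paradigm_coverage(TOKENS, SLOTS))
    print("rank-frequency:", rank_frequency(TOKENS))
```

On real child-directed corpora the same two measurements would be computed over morphologically annotated tokens; the toy data only shows the shape of the calculation.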
Yvan Rose, Laetitia Almeida, and Maria João Freitas
The field of study on the acquisition of phonological productive abilities by first-language learners in the Romance languages has been largely focused on three main languages: French, Portuguese, and Spanish, including various dialects of these languages spoken in Europe as well as in the Americas. In this article, we provide a comparative survey of this literature, with an emphasis on representational phonology. We also include in our discussion observations from the development of Catalan and Italian, and mention areas where these languages, as well as Romanian, another major Romance language, would provide welcome additions to our cross-linguistic comparisons. Together, the various studies we summarize reveal intricate patterns of development, in particular concerning the acquisition of consonants across different positions within the syllable, the word, and in relation to stress, as documented in both monolingual and bilingual first-language learners. The patterns observed across the different languages and dialects can generally be traced to formal properties of phone distributions, as entailed by mainstream theories of phonological representation, with variations also predicted by more functional aspects of speech, including phonetic factors and usage frequency. These results call for further empirical studies of phonological development, in particular concerning Romanian, in addition to Catalan and Italian, whose phonological and phonetic properties offer compelling grounds for the formulation and testing of models of phonology and phonological development.
First-language acquisition of morphology refers to the process whereby native speakers gain full and automatic command of the inflectional and derivational machinery of their mother tongue. Despite language diversity, evidence shows that morphological acquisition follows a shared path in development, evolving from semantically and structurally simplex and non-productive to more complex and productive. The emergence and consolidation of the central morphological systems in a language typically take place between the ages of two and six years, while mature command of all systems and subsystems can take up to 10 more years, and is mediated by the consolidation of literacy skills. Morphological learning in both inflection and derivation is always interwoven with lexical growth, and derivational acquisition is highly dependent on the development of a large and coherent lexicon. Three critical factors underpin the acquisition of morphology. One factor is the input patterns in the ambient language, including various types of frequency. Input provides the context for children to pay attention to morphological markers as meaningful cues to caregivers’ intentions in interactive sociopragmatic settings of joint attention. A second factor is language typology, given that languages differ in the amount of word-internal information they package in words. The “typological impact” in morphology directs children to the ways pertinent conceptual and structural information is encoded in morphological structures. It is thus responsible for great differences among languages in the timing and pace of learning morphological categories such as passive verbs.
Finally, development itself is a central mechanism that drives morphological acquisition from emergence to productivity in three senses: as the filtering device that enables the break into the morphological system, in providing the span of time necessary for the consolidation of morphological systems in children, and in hosting the cognitive changes that usher in mature morphological systems in both speech and writing in adolescents and adults.
Functional categories carry little or no semantic content by themselves and contribute crucially to sentence structure. In the generative framework, they are assumed to mark and head functional projections in the basic hierarchical structure underlying each phrase or sentence. Given their intertwining with grammar, child language researchers have long been attracted by the development of functional categories. To a child, it is important to differentiate functional categories from lexical categories and relate each of them to the hidden hierarchical structure of the phrase or sentence. The learning of a functional category is no easy task and implies the development of different dimensions of linguistic knowledge, including the lexical realization of the functional category in the ambient language, the specific grammatical function it serves, the abstract underlying structure, and the semantic properties of the associated structure. A central issue in the acquisition of functional categories concerns whether children have access to functional categories early in language development. Differing accounts have been proposed. According to the maturational view, functional categories are absent in children’s initial grammar and mature later. In contrast to the maturational view is the continuity view, which assumes children’s continuous access to functional categories throughout language development. Cross-linguistic evidence from production and experimental studies has been accumulated in support of the continuity hypothesis. Mandarin Chinese has a rich inventory of function words, though it lacks overt inflectional markers. De, aspect markers, ba, and sentence final particles are among the most commonly used function words and play a fundamental role in sentence structure in Mandarin Chinese in that they are functional categories that head various functional projections in the hierarchical structure. 
Acquisition studies show that these function words emerge early in development and children’s use of these function words is mostly target-like, offering evidence for the continuity view of functional categories as well as insights into child grammar in Mandarin Chinese.
Connectionism is an important theoretical framework for the study of human cognition and behavior. Also known as Parallel Distributed Processing (PDP) or Artificial Neural Networks (ANN), connectionism advocates that learning, representation, and processing of information in the mind are parallel, distributed, and interactive in nature. It argues for the emergence of human cognition as the outcome of large networks of interactive processing units operating simultaneously. Inspired by findings from neuroscience and artificial intelligence, connectionism is a powerful computational tool, and it has had a profound impact on many areas of research, including linguistics. Since the beginning of connectionism, many connectionist models have been developed to account for a wide range of important linguistic phenomena observed in monolingual research, such as speech perception, speech production, semantic representation, and early lexical development in children. Recently, the application of connectionism to bilingual research has also gathered momentum. Connectionist models are often precise in the specification of modeling parameters and flexible in the manipulation of relevant variables to address theoretical questions; they can therefore provide significant advantages in testing mechanisms underlying language processes.
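What "processing units" connected by adjustable weights means in practice can be shown with a minimal sketch. The code below is a generic single-layer network trained with the delta rule, written for illustration only; it does not reproduce any particular published connectionist model of language, and the class name `Layer`, the learning rate, the epoch count, and the toy patterns are all assumptions. Two output units receive the same input pattern and learn different mappings (logical OR and logical AND) from the same training loop, with knowledge stored only in the connection weights.

```python
import math


def sigmoid(x):
    """Squash a unit's net input into an activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


class Layer:
    """One layer of units, each connected to every input by a weight."""

    def __init__(self, n_in, n_out):
        self.w = [[0.0] * n_in for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        """Compute the activation of every output unit for input x."""
        return [
            sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
            for row, b in zip(self.w, self.b)
        ]

    def train(self, patterns, lr=0.5, epochs=2000):
        """Delta-rule learning: nudge each weight in proportion to the
        unit's error and the activity on the incoming connection."""
        for _ in range(epochs):
            for x, target in patterns:
                out = self.forward(x)
                for j, (o, t) in enumerate(zip(out, target)):
                    err = t - o
                    for i, xi in enumerate(x):
                        self.w[j][i] += lr * err * xi
                    self.b[j] += lr * err


# Hypothetical toy task: unit 0 should learn OR, unit 1 should learn AND.
PATTERNS = [
    ([0, 0], [0, 0]),
    ([0, 1], [1, 0]),
    ([1, 0], [1, 0]),
    ([1, 1], [1, 1]),
]

if __name__ == "__main__":
    net = Layer(2, 2)
    net.train(PATTERNS)
    for x, _ in PATTERNS:
        print(x, [round(o, 2) for o in net.forward(x)])
```

Models of speech perception or lexical development use far larger networks, hidden layers, and richer input encodings, but the parameters that a modeler can specify and manipulate are of the same kind: the number of units, the learning rate, and the training patterns.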