Empirical and theoretical research on language has recently experienced a period of extensive growth. Unfortunately, however, in the case of the Japanese language, far fewer studies, particularly those written in English, have been presented on adult second language (L2) learners and bilingual children. As the field develops, it is increasingly important to integrate theoretical concepts and empirical research findings in the second language acquisition (SLA) of Japanese, so that the concepts and research can eventually be applied to educational practice. This article attempts to: (a) address at least some of the gaps currently existing in the literature, (b) deal with important topics to the extent possible, and (c) discuss various problems with regard to adult learners of Japanese as an L2 and English–Japanese bilingual children. Specifically, the article first examines the characteristics of the Japanese language. Tracing the history of SLA studies, this article then deliberately touches on a wide spectrum of domains of linguistic knowledge (e.g., phonology and phonetics, morphology, lexicon, semantics, syntax, discourse), contexts of language use (e.g., interactive conversation, narrative), research orientations (e.g., formal linguistics, psycholinguistics, social psychology, sociolinguistics), and age groups (e.g., children, adults). Finally, by connecting past SLA research findings in English with recent and current concerns in the SLA of Japanese, with a focus on the past 10 years and including corpus linguistics, this article provides the reader with an overview of the field of Japanese linguistics and its critical issues.
The study of second language phonetics is concerned with three broad and overlapping research areas: the characteristics of second language speech production and perception, the consequences of perceiving and producing nonnative speech sounds with a foreign accent, and the causes and factors that shape second language phonetics. Second language learners and bilinguals typically produce and perceive the sounds of a nonnative language in ways that are different from native speakers. These deviations from native norms can be attributed largely, but not exclusively, to the phonetic system of the native language. Non-nativelike speech perception and production may have both social consequences (e.g., stereotyping) and linguistic–communicative consequences (e.g., reduced intelligibility). Research on second language phonetics over the past ca. 30 years has resulted in a fairly good understanding of causes of nonnative speech production and perception, and these insights have to a large extent been driven by tests of the predictions of models of second language speech learning and of cross-language speech perception. It is generally accepted that the characteristics of second language speech are predominantly due to how second language learners map the sounds of the nonnative to the native language. This mapping cannot be entirely predicted from theoretical or acoustic comparisons of the sound systems of the languages involved, but has to be determined empirically through tests of perceptual assimilation. The most influential learner factors which shape how a second language is perceived and produced are the age of learning and the amount and quality of exposure to the second language. 
A very important and far-reaching finding from research on second language phonetics is that age effects are not due to neurological maturation which could result in the attrition of phonetic learning ability, but to the way phonetic categories develop as a function of experience with surrounding sound systems.
The distinction between representations and processes is central to most models of the cognitive science of language. Linguistic theory informs the types of representations assumed, and these representations are what are taken to be the targets of second language acquisition. Epistemologically, this is often taken to be knowledge, or knowledge-that. Techniques such as Grammaticality Judgment tasks are paradigmatic as we seek to gain insight into what a learner’s grammar looks like. Learners behave as if certain phonological, morphological, or syntactic strings (which may or may not be target-like) were well-formed. It is the task of the researcher to understand the nature of the knowledge that governs those well-formedness beliefs. Traditional accounts of processing, on the other hand, look to the real-time use of language, either in production or perception, and invoke discussions of skill or knowledge-how. A range of experimental psycholinguistic techniques have been used to assess these skills: self-paced reading, eye-tracking, ERPs, priming, lexical decision, AXB discrimination, and the like. Such online measures can show us how we “do” language when it comes to activities such as production or comprehension. There has long been a connection between linguistic theory and theories of processing, as evidenced by the work of Berwick (The Grammatical Basis of Linguistic Performance). The task of the parser is to assign abstract structure to a phonological, morphological, or syntactic string; structure that does not come directly labeled in the acoustic input. Processing studies, such as those of the garden path phenomenon, have revealed that grammaticality and processability are distinct constructs. In some models, however, the boundary between grammar and processing is less clear-cut. Phillips says that “parsing is grammar,” while O’Grady builds an emergentist theory with no grammar, only processing.
Bayesian models of acquisition, and indeed of knowledge, assume that the grammars we set up are governed by a principle of entropy, which governs other aspects of human behavior; knowledge and skill are combined. Exemplar models view the processing of the input as a storing of all phonetic detail that is in the environment, not storing abstract categories; the categories emerge via a process of comparing exemplars. Linguistic theory helps us to understand the processing of input to acquire new L2 representations, and the access of those representations in real time.
Patrice Speeter Beddor
In their conversational interactions with speakers, listeners aim to understand what a speaker is saying, that is, they aim to arrive at the linguistic message, which is interwoven with social and other information, being conveyed by the input speech signal. Across the more than 60 years of speech perception research, a foundational issue has been to account for listeners’ ability to achieve stable linguistic percepts corresponding to the speaker’s intended message despite highly variable acoustic signals. Research has especially focused on acoustic variants attributable to the phonetic context in which a given phonological form occurs and on variants attributable to the particular speaker who produced the signal. These context- and speaker-dependent variants reveal the complex—albeit informationally rich—patterns that bombard listeners in their everyday interactions. How do listeners deal with these variable acoustic patterns? Empirical studies that address this question provide clear evidence that perception is a malleable, dynamic, and active process. Findings show that listeners perceptually factor out, or compensate for, the variation due to context yet also use that same variation in deciding what a speaker has said. Similarly, listeners adjust, or normalize, for the variation introduced by speakers who differ in their anatomical and socio-indexical characteristics, yet listeners also use that socially structured variation to facilitate their linguistic judgments. Investigations of the time course of perception show that these perceptual accommodations occur rapidly, as the acoustic signal unfolds in real time. Thus, listeners closely attend to the phonetic details made available by different contexts and different speakers. The structured, lawful nature of this variation informs perception. 
Speech perception changes over time not only in listeners’ moment-by-moment processing, but also across the life span of individuals as they acquire their native language(s), non-native languages, and new dialects and as they encounter other novel speech experiences. These listener-specific experiences contribute to individual differences in perceptual processing. However, even listeners from linguistically homogeneous backgrounds differ in their attention to the various acoustic properties that simultaneously convey linguistically and socially meaningful information. The nature and source of listener-specific perceptual strategies serve as an important window on perceptual processing and on how that processing might contribute to sound change. Theories of speech perception aim to explain how listeners interpret the input acoustic signal as linguistic forms. A theoretical account should specify the principles that underlie accurate, stable, flexible, and dynamic perception as achieved by different listeners in different contexts. Current theories differ in their conception of the nature of the information that listeners recover from the acoustic signal, with one fundamental distinction being whether the recovered information is gestural or auditory. Current approaches also differ in their conception of the nature of phonological representations in relation to speech perception, although there is increasing consensus that these representations are more detailed than the abstract, invariant representations of traditional formal phonology. Ongoing work in this area investigates how both abstract information and detailed acoustic information are stored and retrieved, and how best to integrate these types of information in a single theoretical model.
Prosody is an umbrella term used to cover a variety of interconnected and interacting phenomena, namely stress, rhythm, phrasing, and intonation. The phonetic expression of prosody relies on a number of parameters, including duration, amplitude, and fundamental frequency (F0). The same parameters are also used to encode lexical contrasts (such as tone), as well as paralinguistic phenomena (such as anger, boredom, and excitement). Further, the exact function and organization of the phonetic parameters used for prosody differ across languages. These considerations make it imperative to distinguish the linguistic phenomena that make up prosody from their phonetic exponents, and similarly to distinguish between the linguistic and paralinguistic uses of the latter. A comprehensive understanding of prosody relies on the idea that speech is prosodically organized into phrasal constituents, the edges of which are phonetically marked in a number of ways, for example, by articulatory strengthening in the beginning and lengthening at the end. Phrases are also internally organized either by stress, that is around syllables that are more salient relative to others (as in English and Spanish), or by the repetition of a relatively stable tonal pattern over short phrases (as in Korean, Japanese, and French). Both types of organization give rise to rhythm, the perception of speech as consisting of groups of a similar and repetitive pattern. Tonal specification over phrases is also used for intonation purposes, that is, to mark phrasal boundaries, and express information structure and pragmatic meaning. Taken together, the components of prosody help with the organization and planning of speech, while prosodic cues are used by listeners during both language acquisition and speech processing. 
Importantly, prosody does not operate independently of segments; rather, it profoundly affects segment realization, making the incorporation of an understanding of prosody into experimental design essential for most phonetic research.
“Tupian” is a common term applied by linguists to a linguistic stock of seven families spread across great parts of South America. Tupian languages share a large number of structural and morphological similarities which make genetic relationship very probable. Four families (Arikém, Mondé, Tuparí, and Ramarama-Puruborá) are still limited to the Madeira-Guaporé region in Brazil, considered by some scholars to be the Tupí homeland. Other families and branches would have migrated, in ancient times, down the Amazon (Mundurukú, Mawé) and up the Xingú River (Juruna, Awetí). Only the Tupí-Guaraní branch, which comprises about 40 living languages, mainly spread to the south. Two Tupí-Guaraní languages played an important part in the Portuguese and Spanish colonisation of South America: Tupinambá on the Brazilian coast and Guaraní in colonial Paraguay. In the early 21st century, Guaraní is spoken by more than six million non-Indian people in Paraguay and in adjacent parts of Argentina and Brazil. Tupí-Guaraní (TG) is an artificial term used by linguists to designate the family composed of eight subgroups of languages, one of them being the Guaraní subgroup and another the extinct Tupinambá and its varieties. Important phonological characteristics of Tupian languages are nasality and the occurrence of a high central vowel /ɨ/, a glottal stop /ʔ/, and final consonants, especially plosives in coda position. Nasality seems to be a common characteristic of all branches of the family. Most of them show phenomena such as nasal harmony, also called nasal assimilation or regressive nasalization by some scholars. Tupian languages have a rich morphology expressed mainly by suffixes and prefixes, though particles are also important to express grammatical categories. Verbal morphology is characterized by generally rich devices of valence-changing formations. Relational inflection is one of the most striking phenomena of TG nominal phrases.
It allows marking the determination of a noun by a preceding adjunct, its syntactical transformation into a nominal predicate, or the absence of any relation. Relational inflection also occurs, in part, in branches and families other than Tupí-Guaraní. Verbal person marking is realized by prefixing in most languages; some languages of the Tuparí and Juruna families, however, use only free pronouns. Tupian syntax is based on the predication of both verbs and nouns. Subordinate clauses, such as relative clauses, are produced by nominalization, while adverbial clauses are formed by specific particles or postpositions on the predicate. Traditional word order is SOV.
Throughout the 20th century, structuralist and generative linguists argued that the study of the language system (langue, competence) must be separated from the study of language use (parole, performance), but this view of language has been called into question by usage-based linguists, who have argued that the structure and organization of a speaker’s linguistic knowledge is the product of language use or performance. On this account, language is seen as a dynamic system of fluid categories and flexible constraints that are constantly restructured and reorganized under the pressure of domain-general cognitive processes that are involved not only in the use of language but also in other cognitive phenomena such as vision and (joint) attention. The general goal of usage-based linguistics is to develop a framework for the analysis of the emergence of linguistic structure and meaning. In order to understand the dynamics of the language system, usage-based linguists study how languages evolve, both in history and in language acquisition. One aspect that plays an important role in this approach is frequency of occurrence. As frequency strengthens the representation of linguistic elements in memory, it facilitates the activation and processing of words, categories, and constructions, which in turn can have long-lasting effects on the development and organization of the linguistic system. A second aspect that has been very prominent in the usage-based study of grammar concerns the relationship between lexical and structural knowledge. Since abstract representations of linguistic structure are derived from language users’ experience with concrete linguistic tokens, grammatical patterns are generally associated with particular lexical expressions.