Until recently, theoretical linguists have paid little attention to the frequency of linguistic elements in grammar and grammatical development. It is a standard assumption of (most) grammatical theories that the study of grammar (or competence) must be separated from the study of language use (or performance). However, this view of language has been called into question by various strands of research that have emphasized the importance of frequency for the analysis of linguistic structure. In this research, linguistic structure is often characterized as an emergent phenomenon shaped by general cognitive processes such as analogy, categorization, and automatization, which are crucially influenced by frequency of occurrence. There are many different ways in which frequency affects the processing and development of linguistic structure. Historical linguists have shown that frequent strings of linguistic elements are prone to undergo phonetic reduction and coalescence, and that frequent expressions and constructions are more resistant to structure mapping and analogical leveling than infrequent ones. Cognitive linguists have argued that the organization of constituent structure and embedding is based on the language users’ experience with linguistic sequences, and that the productivity of grammatical schemas or rules is determined by the combined effect of frequency and similarity. Child language researchers have demonstrated that frequency of occurrence plays an important role in the segmentation of the speech stream and the acquisition of syntactic categories, and that the statistical properties of the ambient language are much more regular than commonly assumed. And finally, psycholinguists have shown that structural ambiguities in sentence processing can often be resolved by lexical and structural frequencies, and that speakers’ choices between alternative constructions in language production are related to their experience with particular linguistic forms and meanings. Taken together, this research suggests that our knowledge of grammar is grounded in experience.
Holger Diessel and Martin Hilpert
Children’s acquisition of language is an amazing feat. Children master the syntax, the sentence structure of their language, through exposure and interaction with caregivers and others but, notably, with no formal tuition. How children come to be in command of the syntax of their language has been a topic of vigorous debate since Chomsky argued against Skinner’s claim that language is ‘verbal behavior.’ Chomsky argued that knowledge of language cannot be learned through experience alone but is guided by a genetic component. This language component, known as ‘Universal Grammar,’ is composed of abstract linguistic knowledge and a computational system that is special to language. The computational mechanisms of Universal Grammar give even young children the capacity to form hierarchical syntactic representations for the sentences they hear and produce. The abstract knowledge of language guides children’s hypotheses as they interact with the language input in their environment, ensuring they progress toward the adult grammar. An alternative school of thought denies the existence of a dedicated language component, arguing that knowledge of syntax is learned entirely through interactions with speakers of the language. Such ‘usage-based’ linguistic theories assume that language learning employs the same learning mechanisms that are used by other cognitive systems. Usage-based accounts of language development view children’s earliest productions as rote-learned phrases that lack internal structure. Knowledge of linguistic structure emerges gradually and in a piecemeal fashion, with frequency playing a large role in the order of emergence for different syntactic structures.
Corpus Phonology is an approach to phonology that places corpora at the center of phonological research. Some practitioners of corpus phonology see corpora as the only object of investigation; others use corpora alongside other available techniques (for instance, intuitions, psycholinguistic and neurolinguistic experimentation, laboratory phonology, the study of the acquisition of phonology or of language pathology, etc.). Whatever version of corpus phonology one advocates, corpora have become part and parcel of the modern research environment, and their construction and exploitation has been modified by the multidisciplinary advances made within various fields. Indeed, for the study of spoken usage, the term ‘corpus’ should nowadays only be applied to bodies of data meeting certain technical requirements, even if corpora of spoken usage are by no means new and coincide with the birth of recording techniques. It is therefore essential to understand what criteria must be met by a modern corpus (quality of recordings, diversity of speech situations, ethical guidelines, time-alignment with transcriptions and annotations, etc.) and what tools are available to researchers. Once these requirements are met, the way is open to varying and possibly conflicting uses of spoken corpora by phonological practitioners. A traditional stance in theoretical phonology sees the data as a degenerate version of a more abstract underlying system, but more and more researchers within various frameworks (e.g., usage-based approaches, exemplar models, stochastic Optimality Theory, sociophonetics) are constructing models that tightly bind phonological competence to language use, rely heavily on quantitative information, and attempt to account for intra-speaker and inter-speaker variation. This renders corpora essential to phonological research and not a mere adjunct to the phonological description of the languages of the world.