Show Summary Details

Page of

Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 05 December 2020

Quantitative Methods in Morphology: Corpora and Other “Big Data” Approacheslocked

  • Marco MarelliMarco MarelliDepartment of Psychology, University of Milan-Bicocca

Summary

Corpora are an all-important resource in linguistics, as they constitute the primary source for large-scale examples of language usage. This has been even more evident in recent years, with the increasing availability of texts in digital format leading more and more corpus linguistics toward a “big data” approach. As a consequence, the quantitative methods adopted in the field are becoming more sophisticated and various.

When it comes to morphology, corpora represent a primary source of evidence to describe morpheme usage, and in particular how often a particular morphological pattern is attested in a given language. There is hence a tight relation between corpus linguistics and the study of morphology and the lexicon. This relation, however, can be considered bi-directional. On the one hand, corpora are used as a source of evidence to develop metrics and train computational models of morphology: by means of corpus data it is possible to quantitatively characterize morphological notions such as productivity, and corpus data are fed to computational models to capture morphological phenomena at different levels of description. On the other hand, morphology has also been applied as an organization principle to corpora. Annotations of linguistic data often adopt morphological notions as guidelines. The resulting information, either obtained from human annotators or relying on automatic systems, makes corpora easier to analyze and more convenient to use in a number of applications.

You do not currently have access to this article

Login

Please login to access the full content.

Subscribe

Access to the full content requires a subscription