Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

The Phonetics of Babbling

  • Susan Rvachew, School of Communication Sciences and Disorders, McGill University
  • and Abdulsalam Alhaidary, College of Applied Medical Sciences, King Saud University


Babbling is made up of meaningless speechlike syllables called canonical syllables. Canonical syllables are characterized by the coordination of consonantal and vocalic elements in syllables that have speechlike timing, phonation, and resonance characteristics. Infants begin to babble on average at approximately seven months of age. Babbling continues in parallel with less mature noncanonical vocalizations that make up the majority of utterances through the first year. Babbling also continues in parallel with the emergence of meaningful speech during the second year. Regardless of the language that the infant is learning, most canonical syllables have a CV shape with the consonant being a labial or alveolar stop or nasal and the vowel most likely to be central or low- to mid-front in place (e.g., [bʌ], [da], [mæ]). Approximately 15% of canonical utterances consist of multisyllable strings; in other words, most babbled utterances contain only a single CV syllable. The onset of the canonical babbling stage is crucially dependent upon normal hearing, permitting access to language input and feedback of self-produced speech. Many studies have reported differences in the phonetic and acoustic characteristics of babble produced by infants learning different languages. These differences include the frequency with which certain consonants are produced, the location, size, and shape of the vowel space, and the rhythmic and intonation qualities of multisyllable babbles, in each case reflecting specificities of the input language. However, replications of these findings are rare and further research is required to better understand the learning mechanisms that underlie language specific acquisition of articulatory representations during the prelinguistic stage of vocal development.


    1. The Development of Babbling

    1.1 Nomenclature

    Figure 1. Classification of infant vocalizations. Figure printed with permission of Rvachew and Alhaidary.

    Figure 2. Spectrograms of canonical babble produced by Arabic learning infants. 2A (left). Reduplicated multisyllables [bɛbɛbɛbɛbɛɪ] produced by 8-month-old. 2B (right). Variegated multisyllables [əweibʌbdoweɪhæ], with syllables 1, 2, and 6 characterized by hoarse voice quality, produced by 14-month-old. Speech samples collected as part of dissertation research reported in Alhaidary (2012) and figure printed with permission of Susan Rvachew and Abdulsalam Alhaidary.

    Dictionary definitions of babbling (Oxford English Dictionary, 2015) highlight two aspects of the vocalizations that the infant is producing during this early stage of speech development: First, these vocalizations are unintelligible or meaningless to the listener; second, these vocalizations have a speechlike form in that they are composed of recognizable speech sounds and syllables. It is not uncommon to see the definition of babble stretched to cover all of the meaningless, nonreflexive utterances produced by the infant. This overgeneralization of the term is inappropriate, as illustrated in Figure 1; therefore, it is necessary to bring some precision to the use of terms (Oller, 2000). Prior to the onset of babbling at approximately seven months of age the infant produces a range of nonreflexive sounds that are excluded from the definition of babbling by virtue of being not recognizably speechlike (e.g., squeals, growls, raspberries). Also excluded are utterances that contain recognizable speech sounds, but not organized into speechlike syllables (i.e., single vowels and marginal babble). Identifying the transition from these less mature types of vocalizations to babbling is supported by a scientific definition of babbling that takes into account the perceptual, acoustic, and articulatory characteristics of these more mature utterances. Canonical babbling is produced with articulatory gestures that serve to alternate between a closed (or relatively closed) and open vocal tract to produce syllables 100 to 500 ms long with formant frequency transitions that have a duration of 25 to 120 ms. The vocalic portions of the syllables are produced with normal phonation and resonance. Canonical babble may be composed of a single syllable meeting these criteria or a rhythmic sequence of syllables, either reduplicated as shown in Figure 2A or variegated as shown in Figure 2B. 
In this article the focus will be on canonical babbling specifically, although a brief overview of how this skill emerges from earlier noncanonical stages of speech development will be provided before proceeding to the details of the phonetic content of this type of infant vocalization. Definitions of technical terms that will appear in this overview and subsequently are shown in Table 1.
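    The acoustic and articulatory criteria for a canonical syllable can be read as a simple checklist. The following is a minimal sketch, assuming a hypothetical `Syllable` record whose fields have already been measured or judged by a listener; the field names are illustrative and the thresholds are the duration ranges stated above.

```python
from dataclasses import dataclass

# Duration ranges (in ms) taken from the definition of canonical babbling in the text.
SYLLABLE_MS = (100, 500)
TRANSITION_MS = (25, 120)


@dataclass
class Syllable:
    """Hypothetical record of one syllable; field names are illustrative assumptions."""
    duration_ms: float        # total syllable duration
    transition_ms: float      # duration of the formant frequency transition
    has_closure: bool         # closed (or relatively closed) vocal tract phase present
    normal_phonation: bool    # vocalic portion produced with normal phonation
    fully_resonant: bool      # vocalic portion produced with full (oral) resonance


def is_canonical(s: Syllable) -> bool:
    """Return True only if every criterion for a canonical syllable is met."""
    return (
        s.has_closure
        and SYLLABLE_MS[0] <= s.duration_ms <= SYLLABLE_MS[1]
        and TRANSITION_MS[0] <= s.transition_ms <= TRANSITION_MS[1]
        and s.normal_phonation
        and s.fully_resonant
    )


well_formed = Syllable(250, 60, True, True, True)
slow_transition = Syllable(250, 200, True, True, True)  # transition too long
```

    Under these assumptions `is_canonical(well_formed)` holds, whereas `slow_transition` fails the transition criterion and would fall among the less mature utterance types.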

    Table 1. Definitions.

    Vowels

    Speech sounds produced with the vocal tract open, permitting unimpeded airflow. Includes point or corner vowels ([i], [u], [a]) and central vowels such as the schwa [ə].

    Consonants

    Speech sounds produced with the vocal tract closed or approximately closed. Includes closures produced with the lips, i.e., labial [m], [b]; with the tongue tip, i.e., alveolar [n], [d]; or with the tongue body against the velum (soft palate), i.e., velar [ɡ], [k].

    Manner classes

    Fricative sounds that are produced with turbulence due to partial closure of the vocal tract ([f], [s]) may be contrasted with stops that involve full closure ([p], [t]) and glides ([w], [j]) or liquids ([l], [ɹ]) that involve partial closure but no turbulence.

    Articulation

    Articulation pertains to the movement of articulators such as the tongue and lips to produce different speech sounds.

    Phonation

    Phonation pertains to the different voice qualities that are produced when forcing air through the vocal folds in the larynx, e.g., breathy, hoarse.

    Resonance

    Resonance is the sound quality that results from the size and shape of the resonating cavity that speech is produced in. Fully resonant vowels are produced in an open oral cavity (mouth) and quasiresonant vowels are produced with air flow through the nasal cavity (nose).

    Formants

    When air resonates in the vocal cavity, sound energy is concentrated at certain frequencies, depending on the size and shape of the resonating cavity. The lowest-frequency concentration is called the first formant (F1), and the next highest concentration of sound energy is called the second formant (F2). The frequency location of these formants is unique to specific speech sounds.

    Unsupervised learning

    Exploratory movements permit the learner to discover the association between movements and their sensory effects.

    Supervised learning

    Goal-directed practice permits the learner to use error signals to gradually improve the selection and implementation of motor plans to achieve the goal.

    Reinforcement learning

    Positive environmental responses to desirable actions increase the likelihood that the learner will repeat the movement in the appropriate context.

    1.2 Approaches to the Study of Infant Speech Development

    Infant vocalizations are commonly transcribed using IPA symbols (International Phonetic Association, 1999) to reduce the continuous stream of acoustic output to a linearly ordered string of segments, each with discrete voice, place, and manner of articulation parameters (e.g., Davis & MacNeilage, 1995; Roug, Landberg, & Lundberg, 1989). The main advantages are that such a description can be produced without specialized equipment or the cooperation of the infant; furthermore, this form of description allows for direct comparison of nonmeaningful with meaningful speech and infant vocalizations with adult speech output. These advantages are outweighed by significant shortcomings, specifically: (1) IPA is subject to listener biases; (2) transcription reliability can be poor in the case of infant speech; (3) similarities among samples are emphasized while differences are obscured; (4) segmental aspects of speech receive the primary focus; (5) utterances that are not grossly speechlike are difficult to transcribe adequately and thus a direct comparison of canonical and precanonical forms is impossible; and (6) assumptions about the correspondence between transcribed phonetic segments and underlying articulatory parameters are inappropriately generalized from adult speech to infant vocalizations (Rvachew & Brosseau-Lapré, 2018).

    An alternative is to use instrumentation to describe the phonetic characteristics of infant vocalizations more directly, as for example in kinematic studies of lip and jaw movements produced by infants during speech (e.g., Moore & Ruark, 1996; Nip, Green, & Marx, 2009) or in studies of the acoustic characteristics of infant vowels (Kent & Murray, 1982; Rvachew, Slawinski, Williams, & Green, 1996). However, accurate acoustic analysis of infant speech is known to be difficult and time-consuming due to the high fundamental frequency of infant speech, even with recent improvements to analysis techniques that are specially adapted to this context (Alku, Pohjalainen, Vainio, Laukkanen, & Story, 2013; Shadle, Nam, & Whalen, 2016; Vallabha & Tuller, 2002). Therefore, the best option is to combine perceptual transcription with instrumental descriptions of infant speech.

    Oller (2000) explains that in order to provide a linguistically meaningful description of infant vocalizations it is necessary to employ an infraphonological framework that defines how acoustic and articulatory parameters are manipulated to produce well-formed syllables. This framework permits the adult listener to identify categories of infant vocalization according to the infraphonological parameters that are violated in their production, as will be described in more detail in section 1.3. Only those vocalizations that meet all requirements for well-formedness, thus qualifying as canonical babble, can be properly submitted to phonetic transcription. Even at this level, however, phonetic transcription may provide excessive detail and overestimate the infant’s phonetic abilities. Ramsdell, Oller, Buder, Ethington, and Chorna (2012) suggest that the actual number of syllabic templates in an infant’s repertoire will be smaller than the repertoire of syllables revealed by detailed phonetic transcription, so that (for example) each of the utterances [baba] [βaβa] [baβa] [babβa] can be considered as part of a single labial obstruent-plus-vowel template.
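    The idea that several narrow transcriptions collapse onto one syllabic template can be illustrated with a short sketch. The segment-to-class table and the rule of merging adjacent segments of the same class are assumptions made here for illustration; they are not the procedure used by Ramsdell et al. (2012).

```python
# Illustrative broad classes: "L" = labial obstruent, "V" = vowel (quality ignored).
# This mapping is an assumption for the sketch, not a published coding scheme.
SEGMENT_CLASS = {
    "b": "L", "p": "L", "β": "L",
    "a": "V", "ʌ": "V", "ə": "V",
}


def template(transcription: str) -> str:
    """Reduce a transcription to a broad template, merging adjacent same-class segments."""
    collapsed = []
    for segment in transcription:
        segment_class = SEGMENT_CLASS[segment]
        if not collapsed or collapsed[-1] != segment_class:
            collapsed.append(segment_class)
    return "".join(collapsed)


utterances = ["baba", "βaβa", "baβa", "babβa"]
templates = {template(u) for u in utterances}
```

    With this mapping all four transcriptions reduce to the same `LVLV` string, so `templates` contains a single member even though the phonetic transcriptions differ.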

    In addition to empirical investigations of infant vocal output, vocal tract models (Ménard, Davis, Boë, & Roy, 2009) and simulations of speech development (Nam, Goldstein, Giulivi, Levitt, & Whalen, 2013) have become increasingly popular methods of testing theories of the mechanisms that underlie the development of early speech. A related methodology employs robots to examine the intersection of infant speech output and adult responses to model developmental processes (Howard & Messum, 2011; Moulin-Frier, Nguyen, & Oudeyer, 2014; Rasilo, Räsänen, & Laine, 2013).

    1.3 Stages in the Development of Babbling

    Four longitudinal studies of prelinguistic speech development have suggested that the infant progresses through a series of stages from reflexive vocalizations to meaningful speech. Oller (1980) classified infant vocalizations in terms of infraphonological features such as pitch, phonation type, resonance pattern, timing, and amplitude. Stark (Stark, 1980; Stark, Rose, & Benson, 1978) used auditory and spectrographic analysis to identify features such as breath direction, phonation type, prosodic features, and vocalic and consonantal elements. Koopmans-van Beinum and van der Stelt (1986) attended to phonatory and articulatory parameters separately to produce a sensorimotor description of infant speech output. Roug et al. (1989) transcribed the samples using IPA and then focused on place, manner, and voicing features. Despite these large procedural variations across studies in English, Swedish, and Dutch language contexts, the studies yielded remarkably similar stage hierarchies and therefore only the stages described by Oller (2000) will be outlined here.

    Five stages are proposed, each marked by differences in the relative frequency of particular utterance types rather than the production of unique categories of vocalization. The utterance types are defined in relation to the canonical syllable (CS) as defined in section 1.1, revealing the systematic accumulation of the principles of well-formedness in syllable production during the first year of life. The resulting set of utterance types makes it possible to classify all human utterances using the same scheme, while not assuming that the infant has the same vocal tract structure, motor control capabilities, articulatory goals, or internalized linguistic categories as the adult talker.

    During the phonation stage in the first month or two after birth, the quasiresonant vowel (QRV) is the primary nonreflexive, nondistress vocalization, produced with the vocal tract in a more or less resting position. The primary difference between QRVs and CSs is the absence in QRVs of clear upper frequency formants and obvious formant transitions that would accompany deliberate shaping of the vocal tract to produce a specific vowel or syllable. Although these utterances sound nasal, the velum may or may not be lowered during production. QRVs are produced with normal phonation and duration and may occur in rhythmic sequences, qualities that are shared with CSs.

    The primitive articulation stage emerges between one and four months of age when QRVs are interrupted by an undifferentiated vocal tract closing gesture to produce an utterance referred to as a goo or coo. IPA transcription of these utterances leads to the common but misleading conclusion that prelinguistic phonetic development proceeds from back to front (velar–coronal–labial) whereas linguistic phonetic development proceeds in the opposite direction with labial consonants acquired before velars (Irwin, 1947a). Such a conclusion does not take into account differences in infant vocal tract morphology or speech motor control and assumes incorrectly that IPA transcription is a reasonable representation of the infant’s articulatory gestures during the first few months of life.

    The expansion stage, lasting from approximately four through seven months, is marked primarily by the appearance of fully resonant vowels (FRV) alongside a large variety of utterance types (raspberries, squeals, growls, yells, whispers, and so on) that give the appearance of free exploration of phonatory and articulatory parameters. Toward the end of this stage marginal babbling appears in the form of consonant–vowel syllables that do not meet the requirements of a CS. These highly variable vocalizations are characterized by unusual timing, phonatory, or resonance parameters that violate the parameters of well-formedness.

    The canonical babbling stage begins on average at seven months, and always by 11 months, in normally developing infants. This stage is heralded by the emergence of canonical syllables, easily identified by parents and other untrained observers especially when in multisyllable form. There is some controversy about whether the production of reduplicated babble, in which all the syllables in the utterance are the same, and variegated babble, in which there are varied consonants and vowels, occurs during overlapping or sequential stages. Some observers have described variegated babble as a more complex utterance that appears subsequent to reduplicated babble (Elbers, 1982; Stoel-Gammon, 1989). In larger sample studies, it has been observed that these two kinds of babbling emerge in parallel (Mitchell & Kent, 1990; Smith, Brown-Sweeney, & Stoel-Gammon, 1989). The impossibility of knowing the infant’s intention makes it difficult to resolve this conflict. Recall that Ramsdell et al. (2012) hypothesized that utterances such as [baba] [babβa] [baβa] can be considered to be members of the same template, that is, phonetically different utterances that would sound roughly similar to the casual listener. This seems to be a reasonable hypothesis if the younger infant produces all three vocalizations as a consequence of imprecise jaw closing gestures. This hypothesis further implies that [babβa] and [baβa] are not in fact more complex in form than [baba]. Although it seems theoretically possible that an older infant might intentionally produce [babβa] and [baβa] in a manner that is qualitatively and functionally differentiated from [baba], it is not clear how intentional babbles can be isolated from accidental versions of the same forms. Further to the issue of intent, the canonical babbling stage coincides with beginning receptive language skills and overlaps with the emergence of intentional communication. Babbled utterances do not immediately serve the communicative needs of the child, however, as often the infant will resort to nonverbal gestures or more primitive forms of vocalization to demand attention or comment on the environment (McCune, Vihman, Roug-Hellichus, Bordenave Delery, & Gogate, 1996). Babbling frequently occurs when the infant is alone or when the adult is not attending to the infant, and therefore these vocalizations are often not communicative in intent (Locke, 1989).

    The final stage is the integrative stage, when babbling coexists with the production of meaningful words, a period covering roughly 12 through 18 months of age. Babbling may be integrated with meaningful forms within the same utterance when babies produce jargon. Another form that is common during this stage is gibberish, consisting of sequences of nonmeaningful syllables produced with prosodic contours that mimic those heard in meaningful phrases.
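    The five stages and their approximate age ranges can be summarized as a small lookup table. This is a sketch only: the month boundaries are rounded from the descriptions above, the 12-month end point for the canonical babbling stage is an assumption (the text gives only its onset and the start of the integrative stage), and the stages are marked by relative frequency of utterance types rather than by sharp boundaries.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    name: str
    start_month: float   # approximate, inclusive
    end_month: float     # approximate, inclusive
    hallmark: str        # utterance type that marks the stage


# Boundaries rounded from the text; the canonical stage end (12) is an assumption.
STAGES = [
    Stage("phonation", 0, 2, "quasiresonant vowels (QRV)"),
    Stage("primitive articulation", 1, 4, "goo or coo"),
    Stage("expansion", 4, 7, "fully resonant vowels (FRV), marginal babbling"),
    Stage("canonical babbling", 7, 12, "canonical syllables (CS)"),
    Stage("integrative", 12, 18, "babble alongside meaningful words"),
]


def typical_stages(age_months: float) -> List[str]:
    """Return the stage names whose approximate age range includes the given age."""
    return [s.name for s in STAGES if s.start_month <= age_months <= s.end_month]
```

    Because the ranges overlap, an age on a boundary maps to two stages; for example, `typical_stages(7)` returns both the expansion and canonical babbling stages.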

    1.4 Maturational Impacts on Babbling

    The infant vocal tract undergoes dramatic reshaping during the first six months of life. Continuous but more gradual anatomical changes occur thereafter until the shape of the vocal tract approximates that of the adult model by the age of six years (Kent & Vorperian, 1995). Developmental changes in the structure of the vocal tract may partly explain the emergence of new utterance types during the first year, especially the shift from the phonation stage with the predominance of quasiresonant vocalizations to the expansion stage with varied vocalizations including fully resonant vowels. The relationship between vocal tract structure and function is reciprocal, however: It is not simply a matter of structural change permitting new utterance types; rather, vocal practice changes muscle strength, contributing in turn to changes in vocal tract structure. Furthermore, developmental changes in the ability to control and coordinate the vocal system impact the infant’s vocal repertoire (for review, see Rvachew & Brosseau-Lapré, 2018).

    A prominent theory attributes early babbling patterns to the primary role of mandibular (lower jaw) movements in syllable production (MacNeilage & Davis, 2000). By this account, the dominant pattern of rhythmic jaw movements provides a syllabic frame for speech production. Frame dominance is hypothesized to restrict early syllables to forms requiring very little differentiation of tongue and lip movements from the dominant jaw movement pattern: specifically, labial consonants with central vowels (e.g., [bʌ]), alveolar consonants with front vowels (e.g., [di]), and velar consonants with back vowels (e.g., [ɡu]). The dominant role of the jaw in early syllable production has been established in kinematic studies (Green, Moore, Higashikawa, & Steeve, 2000). However, there is more variability in the content (the consonants and vowels produced within syllables) than is predicted, and at earlier ages than predicted, suggesting that the phonetic content of babbled syllables is better explained by alternative accounts such as articulatory phonology (Giulivi, Whalen, Goldstein, Nam, & Levitt, 2011; Sussman, Duder, Dalston, & Caciatore, 1999; Sussman, Minifie, Buder, Stoel-Gammon, & Smith, 1996). Nam et al. (2013) predict that syllables will be more frequent in infant and adult speech when they are composed of overlapping consonant and vowel gestures that can be produced synchronously. By their account, infants have control over a variety of vocal tract constrictions involving the jaw and tongue body. Kinematic research with young children has shown that stability in the production of higher order goals (i.e., lip aperture area) is achieved prior to stability in the underlying articulatory gestures that contribute to the higher order goal (Smith & Zelaznik, 2004). Furthermore, practice plays a key role in the emergence of stable higher order goals (Walsh, Smith, & Weber-Fox, 2006).
It would appear that passive maturational mechanisms in the structural or functional domains are insufficient to explain the developmental course of babbling. Therefore, environmental inputs and learning mechanisms will be considered.

    1.5 Environmental Impacts on Babbling

    The most striking evidence that the auditory environment is essential to the development of babbling is provided by studies of hearing-impaired infants. The onset of canonical babbling is significantly delayed in all infants with severe or profound hearing loss (Eilers & Oller, 1994). Even infants who experience the mild fluctuating hearing loss associated with otitis media will demonstrate brief delays in the onset of the canonical babbling stage (Rvachew, Slawinski, Williams, & Green, 1999). The onset of canonical babbling is associated with the severity of the hearing loss and the duration of access to auditory feedback as indexed by time passed since amplification was provided by hearing aids or cochlear implants (von Hapsburg, Davis, & MacNeilage, 2008). Koopmans-van Beinum, Clement, and van den Dikkenberg-Pot (2001) proposed that auditory feedback is necessary for coordination of the phonatory and articulatory systems, thus permitting the emergence of canonical babble. Fagan (2015) found that the emergence of reduplicated babbling and the number of repeated syllables per utterance was associated with access to auditory feedback: Specifically, these speech behaviors were more frequent in hearing infants than those with hearing impairments at the same age; furthermore, these behaviors markedly increased in hearing-impaired infants five months after receiving a cochlear implant. Fagan argued that reduplicated babbling allowed the infants to form auditory-motor representations of speech sounds and syllables.

    The hypothesis that infants abstract articulatory representations for specific sounds or syllables as a consequence of exposure to the ambient language is controversial. Cross-linguistic data are required to resolve the question of whether infants learn to produce language-specific sounds during infancy. Evidence of “babbling drift” can be difficult to find, however, whether comparing babbling samples across language groups via listener judgments (Engstrand, Williams, & Lacerda, 2003), acoustic measurements (Eilers, Oller, & Benito-Garcia, 1984), or phonetic transcription (Vihman, de Boysson-Bardies, Durand, & Sundberg, 1994). In cross-linguistic studies, descriptive metrics must reflect the infant’s focus of attention rather than adult linguistic categories and be sensitive enough to differentiate the language groups of interest. When this challenge is met, distinct cross-linguistic differences in early speech production have been identified at many levels of description, including prosodic (Hallé, De Boysson-Bardies, & Vihman, 1991), segmental (Rvachew, Alhaidary, Mattock, & Polka, 2008), and supra-segmental (Whalen, Levitt, Hsiao, & Smorodinsky, 1995). These cross-linguistic differences will be described in greater detail in section 2.3.

    1.6 Cognitive Impacts on Babbling

    Various learning mechanisms have been proposed to explain the essential role of auditory input in the emergence of canonical babble during the first year of life. Undoubtedly, an important mechanism is unsupervised learning in which stable action–perception links are acquired through random articulatory practice with auditory and somatosensory feedback, in particular during the expansion stage. Ultimately, an internal model of the mapping between inputs to and outputs from the speech motor system is acquired (Wolpert, Ghahramani, & Flanagan, 2001).

    Subsequently, this internal model lays the foundation for supervised learning in which the infant learns through trial and error to achieve speechlike and perhaps language-specific speech targets during the canonical babbling stage (Imada et al., 2006; Kuhl, Ramírez, Bosseler, Lin, & Imada, 2014). Performance is improved in supervised learning because feedback generates error signals in relation to a specified target during practice (Wolpert et al., 2001). Traditionally, supervised learning invokes the notion of an external model for imitation, and at least one study has suggested imitation of point vowels by infants as young as six months of age (Kuhl & Meltzoff, 1996). Self-supervised learning is also possible, especially given the early stabilization of native-language vowel categories in perceptual learning (Kuhl et al., 2008), and indeed most babbling appears to occur in the absence of external models. Moulin-Frier et al. (2014) have described a model in which early learning is intrinsically motivated and focused on self-generated auditory targets; a developmental transition to imitation learning occurs later in development after the achievement of the basic principles of speech production.
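    The progression from unsupervised exploration to supervised, error-driven practice can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the "vocal tract" is a hypothetical one-dimensional linear function, the internal model is a least-squares line, and the learning rate and step count are arbitrary. It is not the model of Moulin-Frier et al. (2014) or of any other cited study.

```python
import random


def vocal_tract(command: float) -> float:
    """Hypothetical plant: maps a motor command to an auditory outcome."""
    return 2.0 * command + 1.0


def explore(n_trials: int, rng: random.Random):
    """Unsupervised phase: try random commands and fit a linear forward model."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_trials)]
    ys = [vocal_tract(x) for x in xs]
    mean_x = sum(xs) / n_trials
    mean_y = sum(ys) / n_trials
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def practice(target: float, slope: float, command: float = 0.0,
             steps: int = 20, rate: float = 0.5) -> float:
    """Supervised phase: use the auditory error to correct the motor command.

    The learned slope converts an error in the outcome into a command change.
    """
    for _ in range(steps):
        error = target - vocal_tract(command)
        command += rate * error / slope
    return command


rng = random.Random(0)
slope, intercept = explore(50, rng)       # internal model from exploration
command = practice(3.0, slope)            # goal-directed practice toward a target
```

    In this toy setting the exploratory phase recovers the mapping between command and outcome, and the practice phase then uses error signals relative to a specified target to converge on a command that produces it.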

    An alternative hypothesis invokes reinforcement learning as the primary mechanism (Howard & Messum, 2011), with speech learning driven by adult mimicry of infant vocalizations that capture adult attention when they approximate phonetic categories in the adult language system (see also Rasilo et al., 2013). Social reinforcement from parents also shapes infant vocal output (Goldstein, King, & West, 2003; Goldstein & Schwade, 2008). These differing learning mechanisms—unsupervised, supervised, and reinforcement learning—are not mutually exclusive, and it is likely that all three play a role in early speech development.

    2. Phonetic Characteristics of Babble

    2.1 Universal Phonetics of Babble

    Cross-linguistic research in babbling reveals many aspects of infant vocalizations that occur regardless of the ambient language input. A common focus of this research is the segmental phonetic content of babble. These studies reveal that stop consonants are by far the most frequently occurring type across a broad range of languages, typically comprising more than 50% of the total consonants produced (Boysson-Bardies & Vihman, 1991; Irwin, 1947b). Either liquids or fricatives will be the least commonly occurring type of consonant in infant babbling. Interestingly, fricatives are likely to occur in postvocalic contexts, possibly as a consequence of a generalized fall in energy over the course of syllable production. With respect to place of consonant articulation, labial and alveolar consonants together account for 80% to 90% of all consonants produced in babble.
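    Proportions of this kind are obtained by tallying transcribed consonants by manner and place. The sketch below assumes a simplified feature table (for instance, [w] is counted as labial) and an invented sample of ten consonants; neither is taken from the cited studies.

```python
from collections import Counter

# Simplified (manner, place) features; an illustrative assumption, not a full IPA chart.
FEATURES = {
    "p": ("stop", "labial"), "b": ("stop", "labial"), "m": ("nasal", "labial"),
    "t": ("stop", "alveolar"), "d": ("stop", "alveolar"), "n": ("nasal", "alveolar"),
    "k": ("stop", "velar"), "ɡ": ("stop", "velar"),
    "s": ("fricative", "alveolar"), "l": ("liquid", "alveolar"),
    "w": ("glide", "labial"), "j": ("glide", "palatal"),
}


def consonant_profile(consonants):
    """Return the share of stops and the share of labial plus alveolar consonants."""
    manner = Counter(FEATURES[c][0] for c in consonants)
    place = Counter(FEATURES[c][1] for c in consonants)
    total = len(consonants)
    return {
        "stop_share": manner["stop"] / total,
        "labial_alveolar_share": (place["labial"] + place["alveolar"]) / total,
    }


# Invented sample for illustration only; not data from any study.
sample = ["b", "d", "b", "m", "d", "n", "b", "d", "ɡ", "w"]
profile = consonant_profile(sample)
```

    For this invented sample, seven of the ten consonants are stops and nine of the ten are labial or alveolar.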

    Vowel segments are similarly restricted, with central and mid- or low-front vowels being preferred across many languages (i.e., English, French, Japanese, Mandarin, Korean, and Swedish: Boysson-Bardies & Vihman, 1991; Buhr, 1980; Kent & Bauer, 1985; Kent & Murray, 1982; Lee, Davis, & MacNeilage, 2010; Chen & Kent, 2010). Vocal tract modeling shows that it is possible to produce the full range of vowels with the infant vocal tract, but most vowels in the theoretical infant vowel space are perceived by the adult as low and front (Ménard, Schwartz, & Boë, 2004). Developmental changes in vocal tract morphology and functional abilities result in a gradual expansion of the vowel space along the F1 dimension in the first year and the F2 dimension in the second year; consequently, the corner vowels appear alongside greater differentiation among vowel categories by approximately 18 months of age (Ishizuka, Mugitani, Kato, & Amano, 2007; Kent & Murray, 1982; Rvachew et al., 2008; Rvachew, Mattock, Polka, & Ménard, 2006; Rvachew et al., 1996).
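    Expansion of the vowel space along each formant dimension can be quantified as the range of F1 and F2 values across an infant's vowel tokens. The following sketch assumes tokens are available as (F1, F2) pairs in Hz; the example values are invented placeholders, not measurements.

```python
def vowel_space_extent(tokens):
    """Return (F1 range, F2 range) in Hz for a list of (F1, F2) vowel tokens."""
    f1_values = [f1 for f1, _ in tokens]
    f2_values = [f2 for _, f2 in tokens]
    return max(f1_values) - min(f1_values), max(f2_values) - min(f2_values)


# Invented placeholder values for illustration only; not measurements from any study.
earlier_tokens = [(700, 2500), (900, 2700), (800, 2900)]
later_tokens = [(500, 2300), (1000, 2700), (800, 3100)]

earlier_extent = vowel_space_extent(earlier_tokens)
later_extent = vowel_space_extent(later_tokens)
```

    Comparing the two extents shows whether the spread of tokens has grown along F1, along F2, or along both between the two sampling points.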

    The syllabic and prosodic structure of babbled utterances has also received attention. Excluding noncanonical utterances containing only a single vowel (which remain the most common speechlike utterance through one year of age; Kent & Bauer, 1985), babbled utterances are overwhelmingly composed of one or more CV syllables (Mitchell & Kent, 1990). Syllables containing a coda consonant or consonant cluster are extremely rare. About 85% of canonical babbles consist of a single CV syllable during the canonical babbling stage (Fagan, 2009). The frequency of multisyllable babbles and the number of repetitions per babbled utterance appear to peak at approximately nine months, decrease through approximately 12 months when first words appear, and then increase again later in the second year, at least in normal hearing English-learning infants (Fagan, 2009, 2015; Smith et al., 1989). Variegated and reduplicated canonical babbling emerge contemporaneously; variegation in manner is more common than variegation in place of articulation (Gildersleeve-Neumann, Davis, & MacNeilage, 2013; Smith et al., 1989).
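    The distinction among single-syllable, reduplicated, and variegated babble, and the share of multisyllable utterances in a sample, can be computed directly once each utterance is represented as a list of transcribed syllables. This is a minimal sketch: treating syllables as identical only when their transcriptions match exactly is a simplifying assumption, and the sample is invented.

```python
from collections import Counter


def classify_babble(syllables):
    """Label an utterance as 'single', 'reduplicated', or 'variegated'."""
    if len(syllables) == 1:
        return "single"
    return "reduplicated" if len(set(syllables)) == 1 else "variegated"


def multisyllable_share(utterances):
    """Return the proportion of utterances that contain more than one syllable."""
    counts = Counter(classify_babble(u) for u in utterances)
    return (counts["reduplicated"] + counts["variegated"]) / len(utterances)


# Invented sample for illustration only; not data from any study.
sample = [["ba"], ["da"], ["bɛ", "bɛ", "bɛ"], ["ba", "da"], ["mæ"]]
share = multisyllable_share(sample)
```

    In the invented sample, two of the five utterances are multisyllabic, one reduplicated and one variegated.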

    Within utterance, pitch contours have been examined as precursors to the linguistic manipulation of fundamental frequency (F0). Depending upon the language environment, the infant will hear variations in F0 to signal lexical contrast or phonological tone. There is little consensus among these studies in reported preferences for pitch contours in babble. The most frequently observed patterns are falling, rising–falling, and level pitch contours (Amano, Nakatani, & Kondo, 2006; Chen & Kent, 2009; Davis, MacNeilage, Matyear, & Powell, 2000; Kent & Murray, 1982; Whalen, Levitt, & Wang, 1991). Explanations for the observed pitch contours also vary: Some researchers place strong emphasis on physiological factors (Kent & Murray, 1982) while others explicitly propose learning from the ambient language (Hallé et al., 1991). It is difficult to determine if infants are deliberately manipulating prosodic contours because the background level of F0 in infant babble is unstable: maturational declines in absolute F0 and variability in F0 occur continuously during infancy; furthermore, the ability to coordinate all the parameters of prosody—pitch, loudness, and duration—develops well into late childhood (Kehoe, Stoel-Gammon, & Buder, 1995; Lee, Potamianos, & Narayanan, 1999; Vorperian & Kent, 2007).

    2.2 Individual Differences Within Language Groups

    Oller (2000) concludes that babbling is strongly “canalized” and therefore little digression in the developmental course is expected in the face of genetic or environmental variation. However, individual differences are known to occur among infants learning the same language. These deviations from the norm are often subtle and qualitatively specific to the source of variation. Differences in access to language input can alter infant vocal output without necessarily disrupting developmental progress toward the achievement of canonical babble during the first year. For example, extreme poverty appears to reduce volubility in infant output (i.e., rate of vocalizing) while not significantly impacting the age of onset for canonical babbling (Oller, Eilers, Basinger, Steffens, & Urbano, 1995). Volubility is also lower among infants who are hypothesized to have a heritable impairment in phonological processing, specifically infants who are eventually diagnosed with delays in language or reading acquisition. These infants also produce less complex babble than infants who show normal speech, language, and reading development, with more vocalizations containing glide consonants and fewer containing reduplicated or variegated true consonants (Lambrecht Smith, Roberts, Locke, & Tozer, 2010; Stoel-Gammon, 1989).

    Early onset otitis media is associated with later emergence of canonical babbling and slower expansion of the vowel space, in comparison to babbling development in infants who experience their first ear infections after the first year (Rvachew et al., 1996, 1999). Sensorineural hearing loss has more significant impacts that include delayed onset of canonical babbling and qualitative differences in the phonetic content of speech: specifically, slower advancement in the complexity of syllable shapes after the onset of the CS stage; a persistently small vowel space restricting the inventory to central vowels; and an unusually high proportion of labial consonants and syllabic consonants (for review, see Ertmer & Nathani Iyer, 2012). When deaf infants receive cochlear implants at an early age (before 30 months), phonetic development may proceed relatively normally, with onset of canonical babble five to ten months after implantation being a good prognostic indicator.

    An important kind of auditory input may be access to feedback of the infant’s own speech during speech practice. Infants who have undergone tracheostomy to bypass an obstructed airway have restricted access to this kind of feedback until the breathing tube is removed (decannulation). Some infants who have experienced long-term tracheostomy have neurological or craniofacial conditions that would otherwise impair speech production, but many do not have complications beyond the tracheostomy. Speech development in this latter group appears to progress through the normal prelinguistic stages of vocal development after decannulation, albeit at a faster pace (Kraemer, Plante, & Green, 2005); specifically, expansion stage vocalizations appear first, followed rapidly by canonical babble consisting of the expected CV syllables favoring stop consonants; fricatives emerge last, and speech therapy may be required to ensure a complete phonetic inventory and accurate speech. Outcomes are associated with the age of initial tracheostomy procedure, duration of cannulation, and age at decannulation (Jiang & Morrison, 2003). When the tracheostomy is performed after one year or decannulation occurs in the first three months of life, speech development will follow a normal trajectory. If the tracheostomy is performed at approximately four months, speech and language delay is likely if cannulation persists throughout the second year of life; outcomes may be good if decannulation occurs before or shortly after the first birthday.

    Many infants who require tracheostomy were born prematurely, and low birth weight impacts speech development even without tracheostomy in this population, which is at risk for subtle but long-term issues with motor coordination. Rvachew, Creighton, Feldman, and Sauve (2005) observed that very low birth weight infants with a history of bronchopulmonary dysplasia demonstrated delayed onset of canonical babbling and a tendency toward unusual rhythmic organization of their babbling (see also Goldfield, 1999). Infants with more frank motor impairments, specifically cerebral palsy, have been observed to produce very short utterances reflecting difficulties with controlled expiration (Levin, 1999). In contrast, infants with cleft palate do not demonstrate difficulties with the rhythmic quality of their vocalizations but do have restricted and unusual phonetic repertoires reflecting impairments in the structural domain (Chapman, Hardin-Jones, Schulte, & Halter, 2001).

    2.3 Cross-Linguistic Differences in Babble

    Many studies have tested the hypothesis of “babbling drift” by searching for cross-linguistic differences in the phonetic characteristics of infant babble that are hypothesized to mirror cross-linguistic differences in adult speech input. Although these studies typically involve small samples of infants and there are disputes about the appropriate description of salient ambient language inputs, some cross-linguistic differences in infant speech have been reported. With respect to consonant manner, for example, French-, Japanese-, and Chinese-learning infants may produce more nasal consonants than English- and Swedish-learning infants in their prelinguistic babble (de Boysson-Bardies & Vihman, 1991; Chen & Kent, 2010). However, nasals are a common consonant type universally, and it is difficult to establish that this cross-linguistic variation exceeds the degree of within-language variation for proportion of stops versus nasals in infant vocalizations. It is not clear that there are reliable differences in place of articulation, but labial place may be predominant in English and French whereas alveolar place seems somewhat more common than labial in Japanese, Mandarin, Korean, and Swedish (de Boysson-Bardies & Vihman, 1991; Chen & Kent, 2005; Kent & Bauer, 1985; Lee et al., 2010). Language-specific differences in voice onset time have been reported, specifically involving a more frequent production of voicing lead by French-learning infants in comparison to English-learning infants (Whalen, Levitt, & Goldstein, 2007).

    Figure 3. An alternative to segment-based approaches to cross-linguistic research is to examine global characteristics of the infant’s speech production, in this case the size and shape of the vowel space. The corners of the vowel space were identified as the vowels with the maximum [F2−F1] (Diffuse corner), minimum [(F1+F2)/2] (Grave corner), and minimum [F2−F1] (Compact corner) values (in mels), and these values were used to calculate the triangular vowel space area. Arabic-learning infants were found to produce larger and more symmetrical vowel spaces than English-learning infants throughout the age range of 10 through 18 months. Data taken from Alhaidary (2012), and figure printed with permission of Susan Rvachew and Abdulsalam Alhaidary.

    The approach to the study of the vowel system has focused on more global characteristics of the infant’s phonetic output. Acoustic analysis of the infant’s vowels permits a description of the infant’s vowel space that does not assume infant knowledge of adult phonetic categories, as shown in Figure 3. For example, developmental changes and cross-linguistic differences in the location of the center of the vowel space in F1 and F2 coordinates have been identified in French- versus English-learning infants (de Boysson-Bardies, Halle, Sagart, & Durand, 1989; Rvachew et al., 2006). Furthermore, expansion of the vowel space toward the corners proceeds differently as a function of the complexity of the input vowel system: specifically, this expansion appears to be faster when the vowel system is simple, as in Arabic, when compared to languages with more complex vowel inventories such as English or French (Alhaidary, 2012; Rvachew et al., 2008). These studies suggest that development of the infant vowel space does not reflect a straightforward process of attempting to match adult phonetic targets; rather, changes in the shape of the vowel space as a whole seem to take into account competition for perceptual attention in different corners of the space.
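
    The corner definitions given in the Figure 3 caption can be illustrated with a minimal Python sketch. It assumes that each vowel token has already been measured and converted to mels, and that the triangular area is computed with the standard shoelace formula; the function names and the sample values are hypothetical and are not taken from the cited studies.

```python
from typing import List, Tuple

Vowel = Tuple[float, float]  # (F1, F2), both assumed to be in mels already


def vowel_space_corners(vowels: List[Vowel]) -> Tuple[Vowel, Vowel, Vowel]:
    """Pick the three corner vowels using the criteria in the Figure 3 caption."""
    diffuse = max(vowels, key=lambda v: v[1] - v[0])        # maximum F2 - F1
    grave = min(vowels, key=lambda v: (v[0] + v[1]) / 2)    # minimum (F1 + F2) / 2
    compact = min(vowels, key=lambda v: v[1] - v[0])        # minimum F2 - F1
    return diffuse, grave, compact


def triangle_area(a: Vowel, b: Vowel, c: Vowel) -> float:
    """Area of the triangle with vertices a, b, c (shoelace formula)."""
    return abs(a[0] * (b[1] - c[1])
               + b[0] * (c[1] - a[1])
               + c[0] * (a[1] - b[1])) / 2


def vowel_space_area(vowels: List[Vowel]) -> float:
    """Triangular vowel space area in squared mels."""
    return triangle_area(*vowel_space_corners(vowels))


# Hypothetical tokens for one infant (invented values, for illustration only).
sample = [(400.0, 1500.0), (450.0, 900.0), (900.0, 1300.0), (600.0, 1200.0)]
print(vowel_space_corners(sample))
print(vowel_space_area(sample))
```

    Because the corners are chosen from the infant's own tokens, the sketch requires no assignment of tokens to adult vowel categories, which is the property emphasized in the text.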

    Some languages are differentiated more by rhythmic characteristics than phonetic content: For example, English is a stress-timed language whereas French is a syllable-timed language. Levitt and Wang (1991) found that French- and English-learning infants produced babble that reflected striking differences in the prosodic organization of the ambient language environment: In particular, babbled utterances by the French-learning infants contained more syllables that were regularly timed excepting the final syllables, which showed more prominent utterance final lengthening when compared to those produced by their English-learning age peers. Further to the topic of cross-linguistic differences in prosody, numerous studies have reported cross-linguistic differences in the frequency of rising and falling tones in babbled disyllables with continuity in the proportion of usage of these patterns into the first word stage (e.g., for Mandarin, see Chen & Kent, 2009; for French vs. Japanese, see Hallé et al., 1991; for English vs. French, see Whalen et al., 1995).
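
    The two rhythmic properties mentioned above, regular timing of nonfinal syllables and utterance-final lengthening, can be expressed as simple duration measures. The following Python sketch is illustrative only: the measures (a final-to-nonfinal duration ratio and a coefficient of variation), the function names, and the sample durations are assumptions for exposition and are not necessarily those used by Levitt and Wang (1991).

```python
from statistics import mean, pstdev
from typing import Sequence


def final_lengthening_ratio(durations_ms: Sequence[float]) -> float:
    """Duration of the final syllable relative to the mean nonfinal duration.

    Values above 1 indicate utterance-final lengthening.
    """
    if len(durations_ms) < 2:
        raise ValueError("need at least one nonfinal and one final syllable")
    return durations_ms[-1] / mean(durations_ms[:-1])


def nonfinal_variability(durations_ms: Sequence[float]) -> float:
    """Coefficient of variation of the nonfinal syllable durations.

    Lower values indicate more regularly timed nonfinal syllables.
    """
    nonfinal = durations_ms[:-1]
    if len(nonfinal) < 2:
        raise ValueError("need at least two nonfinal syllables")
    return pstdev(nonfinal) / mean(nonfinal)


# Hypothetical syllable durations (ms) for one four-syllable babbled utterance.
utterance = [210.0, 190.0, 200.0, 380.0]
print(final_lengthening_ratio(utterance))
print(nonfinal_variability(utterance))
```

    On measures of this kind, the pattern described for the French-learning infants would correspond to lower nonfinal variability together with a higher final lengthening ratio.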

    3. Critical Analysis of Scholarship

    The latter decades of the twentieth century yielded some highly significant advancements in the study of vocal development. An essential breakthrough was the establishment of an objective definition of canonical babbling that integrates acoustic, articulatory, and phonetic factors. An understanding of the developmental course of vocal development during infancy emerged, and the age of onset for the canonical babbling stage has been established with replications across multiple laboratories and language groups. It is now clear that vocal development absolutely requires adequate access to language input: Hearing-impaired infants do not learn to babble in the normal fashion unless hearing is habilitated early in life via hearing aids or cochlear implants. Despite this considerable progress, the learning mechanisms that underpin the acquisition of babbling have not yet been determined, and it remains unclear whether infants are acquiring language-specific articulatory representations for speech sounds during this prelinguistic stage of vocal development.

    Many studies have attempted to test the hypothesis of “babbling drift” by comparing the phonetic characteristics of babble produced by infants learning different languages. These studies are marked by methodological difficulties including small samples of infants, unreliable descriptive metrics, and a complete lack of replication studies for a given language comparison and outcome measure. These problems are exacerbated by the universality of many phonetic factors so that, for example, stop consonants predominate in adult input and in infant output across the many languages that have been studied. Therefore, hypotheses about potential differences (e.g., relatively high numbers of affricates in Mandarin language input to the infant) concern very low frequency events in infant speech. The combined effect of these methodological problems is that it is typically impossible to determine if the observed differences between language groups are greater than the variation that might be observed within a language group (Vihman et al., 1994).
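
    One way to ask whether a between-language difference exceeds within-language variation is a permutation test on per-infant scores. The Python sketch below is a minimal illustration under stated assumptions: it is not a method attributed to the studies cited here, the function name is hypothetical, and the per-infant percentages are invented. It enumerates every possible reassignment of infants to the two groups, which is feasible only for the small samples typical of this literature.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence, Tuple


def exact_permutation_test(group_a: Sequence[float],
                           group_b: Sequence[float]) -> Tuple[float, float]:
    """Return (observed absolute mean difference, two-sided exact p-value).

    The p-value is the share of all possible splits of the pooled infants
    (keeping the group sizes fixed) whose absolute mean difference is at
    least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # A small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return observed, extreme / total


# Invented per-infant percentages of nasal consonants in two language groups.
language_1 = [22.0, 35.0, 18.0, 40.0, 27.0]
language_2 = [15.0, 30.0, 12.0, 25.0, 33.0]
difference, p_value = exact_permutation_test(language_1, language_2)
print(difference, p_value)
```

    With five infants per group there are only 252 possible splits, so the smallest attainable two-sided p-value is 2/252. This limit is a concrete illustration of why small samples make it hard to show that group differences exceed within-group variation.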

    One reason that the sample sizes in these studies are small is that the speech sample analysis techniques are technically difficult, time-consuming, and expensive. Automatic speech analysis tools, as a substitute for phonetic transcription and other methods of hand coding on a segment-by-segment basis, have the potential to improve the efficiency of this work (Oller et al., 2010; VanDam & Silbert, 2016). Acoustic analysis of infant speech has yielded some particularly promising findings and can be conducted reliably (Rvachew, Creighton, Feldman, & Sauve, 2002) but not always accurately, especially when F0 is high (Kelso, Tuller, Vatikiotis-Bateson, & Fowler, 1984). Unfortunately, even the most accurate forms of automatic speech analysis are not yet accurate enough with natural infant speech to avoid a considerable amount of hand coding (Shadle, Nam, & Whalen, 2016). Therefore, the accumulation of data to test the babbling drift hypothesis may continue at a slow pace, especially if phonetic segments continue to be the primary focus of investigation.

    Recent research has attempted to sidestep the difficulty of working with real data by testing hypotheses with simulations and robotic applications (Howard & Messum, 2011; Moulin-Frier et al., 2014; Nam et al., 2013). While theoretically interesting, the outcome of these studies can only be validated in relation to reliable and replicable data recorded from live infants, and therefore the shortage of these data remains a problem even in the face of these considerable technological innovations. Furthermore, observational and simulation studies suffer from a lack of consensus about appropriate units of analysis for comparing infant to adult speech. Typically, adult languages are differentiated by phonological units such as the distribution of phonemes, and then it may be assumed that infant speech will gradually approximate the ambient language distribution of those units. However, there is little evidence that prelinguistic infants attend to phonemes per se. Neither is it clear that speech targets for the supervised learning process in early vocal learning are phonemes. The issue of what the infant is attending to has been addressed since the early days of cross-linguistic research when Vihman et al. (1994) suggested that the most appropriate characterization of ambient language input would be derived from the distribution of target phonemes underlying the infant’s first words. More recently, Masapollo, Polka, and Ménard (2016) demonstrated that infants prefer to listen to vowels produced with infantlike pitch or formant frequencies, in comparison to the same vowels with adultlike spectral characteristics. Ongoing research is required to fully understand the intersection of ambient language inputs, infant attentional preferences, and their vocal output.

    Further Reading

    • Ertmer, D. J., & Nathani Iyer, S. (2012). Prelinguistic vocalizations in infants and toddlers with hearing loss: Identifying and stimulating auditory-guided speech development. In M. Marschark & P. E. Spencer (Eds.), Identifying and stimulating auditory-guided speech development. Oxford Handbooks Online. Oxford University Press. Retrieved from
    • Oller, D. K. (2000). The emergence of the speech capacity. Mahwah, NJ: Lawrence Erlbaum Associates.

    References


    • Alhaidary, A. (2012). Developmental changes in Arabic babbling in relation to English and French babbling (Doctoral Dissertation). Montreal: McGill University.
    • Alku, P., Pohjalainen, J., Vainio, M., Laukkanen, A.-M., & Story, B. H. (2013). Formant frequency estimation of high-pitched vowels using weighted linear prediction. Journal of the Acoustical Society of America, 134, 1295–1313.
    • Amano, S., Nakatani, T., & Kondo, T. (2006). Fundamental frequency of infants’ and parents’ utterances in longitudinal recordings. The Journal of the Acoustical Society of America, 119(3), 1636–1647.
    • Buhr, R. D. (1980). The emergence of vowels in an infant. Journal of Speech and Hearing Research, 23, 73–94.
    • Chapman, K. L., Hardin-Jones, M., Schulte, J., & Halter, K. A. (2001). Vocal development of 9-month-old babies with cleft palate. Journal of Speech, Language, and Hearing Research, 44, 1268–1283.
    • Chen, L., & Kent, R. D. (2005). Consonant–vowel co-occurrence patterns in Mandarin-learning infants. Journal of Child Language, 32, 507–534.
    • Chen, L., & Kent, R. D. (2009). Development of prosodic patterns in Mandarin-learning infants. Journal of Child Language, 36, 73–84.
    • Chen, L., & Kent, R. D. (2010). Segmental production in Mandarin-learning infants. Journal of Child Language, 37(2), 341–371. doi:10.1017/S0305000909009581.
    • Davis, B. L., & MacNeilage, P. F. (1995). The articulatory basis of babbling. Journal of Speech and Hearing Research, 38, 1199–1211.
    • Davis, B. L., MacNeilage, P. F., Matyear, C. L., & Powell, J. K. (2000). Prosodic correlates of stress in babbling: An acoustical study. Child Development, 71(5), 1258–1270.
    • de Boysson-Bardies, B., Halle, P., Sagart, L., & Durand, C. (1989). A crosslinguistic investigation of vowel formants in babbling. Journal of Child Language, 16, 1–17.
    • de Boysson-Bardies, B., & Vihman, M. (1991). Adaptation to language: Evidence from babbling and first words in four languages. Language, 67(2), 297–319.
    • Eilers, R. E., & Oller, D. (1994). Infant vocalizations and the early diagnosis of severe hearing impairment. Journal of Pediatrics, 124(2), 199–203.
    • Eilers, R. E., Oller, D. K., & Benito-Garcia, C. R. (1984). The acquisition of voicing contrasts in Spanish and English learning infants and children: A longitudinal study. Journal of Child Language, 11(2), 313–336. doi:10.1017/S0305000900005791.
    • Elbers, L. (1982). Operating principles in repetitive babbling: A cognitive continuity approach. Cognition, 12(1), 45–63.
    • Engstrand, O., Williams, K., & Lacerda, F. (2003). Does babbling sound native? Listener responses to vocalizations produced by Swedish and American 12- and 18-month-olds. Phonetica, 60, 17–44.
    • Ertmer, D. J., & Nathani Iyer, S. (2012). Prelinguistic vocalizations in infants and toddlers with hearing loss: Identifying and stimulating auditory-guided speech development. In M. Marschark & P. E. Spencer (Eds.), Identifying and stimulating auditory-guided speech development. Oxford Handbooks Online. Oxford University Press. Retrieved from
    • Fagan, M. K. (2009). Mean Length of Utterance before words and grammar: Longitudinal trends and developmental implications of infant vocalizations. Journal of Child Language, 36(3), 495–527. doi:10.1017/S0305000908009070.
    • Fagan, M. K. (2015). Why repetition? Repetitive babbling, auditory feedback, and cochlear implantation. Journal of Experimental Child Psychology, 137, 125–136. doi:10.1016/j.jecp.2015.04.005.
    • Gildersleeve-Neumann, C. E., Davis, B. L., & MacNeilage, P. F. (2013). Syllabic patterns in the early vocalizations of Quichua children. Applied Psycholinguistics, 34(01), 111–134. doi:10.1017/S0142716411000634.
    • Giulivi, S., Whalen, D. H., Goldstein, L. M., Nam, H., & Levitt, A. G. (2011). An articulatory phonology account of preferred consonant-vowel combinations. Language Learning and Development, 7(3), 202–225. doi:10.1080/15475441.2011.564569.
    • Goldfield, E. C. (1999). Prosody during disyllable production of full-term and preterm infants. Ecological Psychology, 11(1), 81–102. doi:10.1207/s15326969eco1101_3.
    • Goldstein, M. H., King, A. P., & West, M. J. (2003). Social interaction shapes babbling: Testing parallels between birdsong and speech. Proceedings of the National Academy of Sciences, 100(13), 8030–8035.
    • Goldstein, M. H., & Schwade, J. A. (2008). Social feedback to infants’ babbling facilitates rapid phonological learning. Psychological Science, 19(5), 515–523. doi:10.1111/j.1467-9280.2008.02117.x.
    • Green, J. R., Moore, C. A., Higashikawa, M., & Steeve, R. W. (2000). The physiologic development of speech motor control: Lip and jaw coordination. Journal of Speech, Language, and Hearing Research, 43(1), 239–255.
    • Hallé, P. A., De Boysson-Bardies, B., & Vihman, M. M. (1991). Beginnings of prosodic organization: Intonation and duration patterns of disyllables produced by Japanese and French infants. Language and Speech, 34, 299–318.
    • Howard, I. S., & Messum, P. (2011). Modeling the development of pronunciation in infant speech acquisition. Motor Control, 15, 85–117.
    • Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. (2006). Infant speech perception activates Broca’s area: A developmental magnetoencephalography study. Neuroreport, 17, 957–962.
    • International Phonetic Association. (1999). Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge, UK: Cambridge University Press.
    • Irwin, O. C. (1947a). Infant speech: Consonant sounds according to place of articulation. Journal of Speech Disorders, 12, 397–401.
    • Irwin, O. C. (1947b). Infant speech: Consonant sounds according to manner of articulation. Journal of Speech Disorders, 12, 402–404.
    • Ishizuka, K., Mugitani, R., Kato, H., & Amano, S. (2007). Longitudinal developmental changes in spectral peaks of vowels produced by Japanese infants. Journal of the Acoustical Society of America, 121, 2272–2282.
    • Jiang, D., & Morrison, G. A. J. (2003). The influence of long-term tracheostomy on speech and language development in children. International Journal of Pediatric Otorhinolaryngology, 67, Supplement 1, S217–S220. doi:10.1016/j.ijporl.2003.08.031.
    • Kehoe, M. M., Stoel-Gammon, C., & Buder, E. H. (1995). Acoustic correlates of stress in young children’s speech. Journal of Speech and Hearing Research, 38, 338–350.
    • Kelso, J. A. S., Tuller, B., Vatikiotis-Bateson, E., & Fowler, C. A. (1984). Functionally specific articulatory cooperation following jaw perturbations during speech: Evidence for coordinative structures. Journal of Experimental Psychology: Human Perception and Performance, 10, 812–832.
    • Kent, R. D., & Bauer, H. R. (1985). Vocalizations of one-year-olds. Journal of Child Language, 12, 491–526.
    • Kent, R. D., & Murray, A. D. (1982). Acoustic features of infant vocalic utterances at 3, 6, and 9 months. Journal of the Acoustical Society of America, 72, 353–365.
    • Kent, R. D., & Vorperian, H. K. (1995). Anatomic development of the craniofacial-oral laryngeal systems: A review. Journal of Medical Speech-Language Pathology, 3, 145–190.
    • Koopmans-van Beinum, F. J., Clement, C. J., & van den Dikkenberg-Pot, I. (2001). Babbling and the lack of auditory speech perception: A matter of coordination? Developmental Science, 4(1), 61–70.
    • Koopmans-van Beinum, F. J., & van der Stelt, J. M. (1986). Early stages in the development of speech movements. In B. Lindblom & R. Zetterstrom (Eds.), Precursors of Early Speech (pp. 37–50). New York: Stockton Press.
    • Kraemer, R., Plante, E., & Green, G. E. (2005). Changes in speech and language development of a young child after decannulation. Journal of Communication Disorders, 38(5), 349–358. doi:10.1016/j.jcomdis.2005.01.002.
    • Kuhl, P. K., Conboy, B. T., Coffey-Corina, S., Padden, D., Rivera-Gaxiola, M., & Nelson, T. (2008). Phonetic learning as a pathway to language: New data and native language magnet theory expanded (NLM-e). Philosophical Transactions of the Royal Society, 363, 979–1000.
    • Kuhl, P. K., & Meltzoff, A. N. (1996). Infant vocalizations in response to speech: Vocal imitation and developmental change. Journal of the Acoustical Society of America, 100(4), 2425–2438.
    • Kuhl, P. K., Ramírez, R. R., Bosseler, A., Lin, J.-F. L., & Imada, T. (2014). Infants’ brain responses to speech suggest analysis by synthesis. Proceedings of the National Academy of Sciences, 111(31), 11238–11245. doi:10.1073/pnas.1410963111.
    • Lambrecht Smith, S., Roberts, J. E., Locke, J. L., & Tozer, R. (2010). An exploratory study of the development of early syllable structure in reading-impaired children. Journal of Learning Disabilities, 43, 294–307.
    • Lee, S., Potamianos, A., & Narayanan, S. (1999). Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America, 105(3), 1455–1468.
    • Lee, S. S., Davis, B. L., & MacNeilage, P. (2010). Universal production patterns and ambient language influences in babbling: A cross-linguistic study of Korean- and English-learning infants. Journal of Child Language, 37, 293–318.
    • Levin, K. (1999). Babbling in infants with cerebral palsy. Clinical Linguistics & Phonetics, 13(4), 249–267.
    • Levitt, A. G., & Wang, Q. (1991). Evidence for language-specific rhythmic influences in the reduplicative babbling of French- and English-learning infants. Language and Speech, 34, 235–249.
    • Locke, J. L. (1989). Babbling and early speech: Continuity and individual differences. First Language, 9, 191–206.
    • MacNeilage, P. F., & Davis, B. L. (2000). On the origin of internal structure of word forms. Science, 288, 527–531.
    • Masapollo, M., Polka, L., & Ménard, L. (2016). When infants talk, infants listen: Pre-babbling infants prefer listening to speech with infant vocal properties. Developmental Science, 19(2), 318–328. doi:10.1111/desc.12298.
    • McCune, L., Vihman, M. M., Roug-Hellichius, L., Bordenave Delery, D., & Gogate, L. (1996). Grunt communication in human infants (Homo sapiens). Journal of Comparative Psychology, 110, 27–37.
    • Ménard, L., Davis, B. L., Boë, L., & Roy, J. (2009). Producing American English vowels during vocal tract growth: A perceptual categorization study of synthesized vowels. Journal of Speech, Language, and Hearing Research, 52, 1268–1285.
    • Ménard, L., Schwartz, J., & Boë, L. (2004). Role of vocal tract morphology in speech development: Perceptual targets and sensorimotor maps for synthesized vowels from birth to adulthood. Journal of Speech, Language, and Hearing Research, 47, 1059–1080.
    • Mitchell, P. R., & Kent, R. D. (1990). Phonetic variation in multisyllable babbling. Journal of Child Language, 17, 247–265.
    • Moore, C. A., & Ruark, J. L. (1996). Does speech emerge from earlier appearing oral motor behaviors? Journal of Speech and Hearing Research, 39(5), 1034–1047.
    • Moulin-Frier, C., Nguyen, S. M., & Oudeyer, P.-Y. (2014). Self-organization of early vocal development in infants and machines: The role of intrinsic motivation. Frontiers in Psychology, 4, 1–18. doi:10.3389/fpsyg.2013.01006.
    • Nam, H., Goldstein, L. M., Giulivi, S., Levitt, A. G., & Whalen, D. H. (2013). Computational simulation of CV combination preferences in babbling. Journal of Phonetics, 41(2), 63–77. doi:10.1016/j.wocn.2012.11.002.
    • Nip, I. S. B., Green, J. R., & Marx, D. B. (2009). Early speech motor development: Cognitive and linguistic considerations. Journal of Communication Disorders, 42, 286–298.
    • Oller, D. K. (1980). The emergence of the sounds of speech in infancy. In G. H. Yeni-Komshian, J. Kavanagh, & C. A. Ferguson (Eds.), Child Phonology (Vol. I: Production, pp. 93–112). New York: Academic Press.
    • Oller, D. K. (2000). The emergence of the speech capacity. Mahwah, NJ: Lawrence Erlbaum Associates.
    • Oller, D. K., Eilers, R. E., Basinger, D., Steffens, M. L., & Urbano, R. (1995). Extreme poverty and the development of precursors to the speech capacity. First Language, 15, 167–187.
    • Oller, D. K., Niyogi, P., Gray, S., Richards, J. A., Gilkerson, J., Xu, D., . . . Warren, S. F. (2010). Automated vocal analysis of naturalistic recordings from children with autism, language delay, and typical development. Proceedings of the National Academy of Sciences, 107(30), 13354–13359. doi:10.1073/pnas.1003882107.
    • Oxford English Dictionary. (2015). Babbling, n.1. Oxford University Press.
    • Ramsdell, H. L., Oller, D. K., Buder, E. H., Ethington, C. A., & Chorna, L. (2012). Identification of prelinguistic phonological categories. Journal of Speech, Language, and Hearing Research, 55(6), 1626–1639. doi:10.1044/1092-4388(2012/11-0250).
    • Rasilo, H., Räsänen, O., & Laine, U. K. (2013). Feedback and imitation by a caregiver guides a virtual infant to learn native phonemes and the skill of speech inversion. Speech Communication, 55(9), 909–931. doi:10.1016/j.specom.2013.05.002.
    • Roug, L., Landberg, I., & Lundberg, L. J. (1989). Phonetic development in early infancy: A study of four Swedish children during the first eighteen months of life. Journal of Child Language, 16(1), 19–40. doi:10.1017/S0305000900013416.
    • Rvachew, S., Alhaidary, A., Mattock, K., & Polka, L. (2008). Emergence of the corner vowels in the babble produced by infants exposed to Canadian English or Canadian French. Journal of Phonetics, 36, 564–577.
    • Rvachew, S., & Brosseau-Lapré, F. (2018). Developmental phonological disorders: Foundations of clinical practice (2nd ed.). San Diego: Plural Publishing.
    • Rvachew, S., Creighton, D., Feldman, N., & Sauve, R. (2002). Acoustic-phonetic description of infant speech samples: Coding reliability and related methodological issues. Acoustics Research Letters Online, 3(1), 24–28.
    • Rvachew, S., Creighton, D., Feldman, N., & Sauve, R. (2005). Vocal development of infants with very low birth weight. Clinical Linguistics & Phonetics, 19(4), 275–294.
    • Rvachew, S., Mattock, K., Polka, L., & Ménard, L. (2006). Developmental and cross-linguistic variation in the infant vowel space: The case of Canadian English and Canadian French. Journal of the Acoustical Society of America, 120(4), 2250–2259.
    • Rvachew, S., Slawinski, E. B., Williams, M., & Green, C. L. (1996). Formant frequencies of vowels produced by infants with and without early onset otitis media. Canadian Acoustics, 24, 19–28.
    • Rvachew, S., Slawinski, E. B., Williams, M., & Green, C. L. (1999). The impact of early onset otitis media on babbling and early language development. Journal of the Acoustical Society of America, 105, 467–475.
    • Shadle, C. H., Nam, H., & Whalen, D. H. (2016). Comparing measurement errors for formants in synthetic and natural vowels. Journal of the Acoustical Society of America, 139, 713–727.
    • Smith, A., & Zelaznik, H. N. (2004). Development of functional synergies for speech motor coordination in childhood and adolescence. Developmental Psychobiology, 45(1), 22–33.
    • Smith, B. L., Brown-Sweeney, S., & Stoel-Gammon, C. (1989). A quantitative analysis of reduplicated and variegated babbling. First Language, 9, 175–190.
    • Stark, R. E. (1980). Stages of speech development in the first year of life. In G. H. Yeni-Komshian, J. Kavanagh, & C. A. Ferguson (Eds.), Child Phonology (Vol. I, pp. 73–92). New York: Academic Press.
    • Stark, R. E., Rose, S. N., & Benson, P. J. (1978). Classification of infant vocalization. British Journal of Disorders of Communication, 13(1), 41–47.
    • Stoel-Gammon, C. (1989). Prespeech and early speech development of two late talkers. First Language, 9, 207–224.
    • Sussman, H. M., Duder, C., Dalston, E., & Cacciatore, A. (1999). An acoustic analysis of the development of CV coarticulation: A case study. Journal of Speech, Language, and Hearing Research, 42(5), 1080–1096.
    • Sussman, H. M., Minifie, F. D., Buder, E. H., Stoel-Gammon, C., & Smith, J. (1996). Consonant–vowel interdependencies in babbling and early words: Preliminary examination of a locus equation approach. Journal of Speech, Language, and Hearing Research, 39, 424–433.
    • Vallabha, G. K., & Tuller, B. (2002). Systematic errors in the formant analysis of steady-state vowels. Speech Communication, 38, 141–160. doi:10.1016/S0167-6393(01)00049-8.
    • VanDam, M., & Silbert, N. H. (2016). Fidelity of automatic speech processing for adult and child talker classifications. PLoS ONE, 11(8), e0160588. doi:10.1371/journal.pone.0160588.
    • Vihman, M. M., de Boysson-Bardies, B., Durand, C., & Sundberg, U. (1994). External sources of individual differences? A cross-linguistic analysis of the phonetics of mothers’ speech to 1-year-old children. Developmental Psychology, 30(5), 651–662.
    • von Hapsburg, D., Davis, B. L., & MacNeilage, P. F. (2008). Frame dominance in infants with hearing loss. Journal of Speech, Language, and Hearing Research, 51(2), 306–320.
    • Vorperian, H. K., & Kent, R. D. (2007). Vowel acoustic space development in children: A synthesis of acoustic and anatomic data. Journal of Speech, Language, and Hearing Research, 50, 1510–1545.
    • Walsh, B., Smith, A., & Weber-Fox, C. (2006). Short-term plasticity in children’s speech motor systems. Developmental Psychobiology, 48, 660–674.
    • Whalen, D. H., Levitt, A., & Goldstein, L. M. (2007). VOT in the babbling of French- and English-learning infants. Journal of Phonetics, 35, 341–352.
    • Whalen, D. H., Levitt, A. G., Hsiao, P., & Smorodinsky, I. (1995). Intrinsic F0 of vowels in the babbling of 6-, 9-, and 12-month-old French and English-learning infants. Journal of the Acoustical Society of America, 97, 2533–2539.
    • Whalen, D. H., Levitt, A. G., & Wang, Q. (1991). Intonational differences between the reduplicative babbling of French- and English-learning infants. Journal of Child Language, 18, 501–516.
    • Wolpert, D. M., Ghahramani, Z., & Flanagan, J. R. (2001). Perspectives and problems in motor learning. Trends in Cognitive Sciences, 5(11), 487–494.