Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 27 June 2022

Phonetic Correlates of Sex, Gender and Sexual Orientation

  • Adrian P. Simpson, Friedrich Schiller University Jena
  • and Melanie Weirich, Friedrich Schiller University Jena


Speech carries a wealth of information about the speaker aside from any verbal message, ranging from emotional state (sad, happy, bored, etc.) to illness (e.g., cold). Central features are a speaker’s gender and their sexual orientation. In part this is an inevitable product of differences in speakers’ anatomical dimensions; for example, on average males have lower pitched voices than females due to longer, thicker vocal folds that vibrate more slowly. Arguably much more information has been learned by a speaker as they construct their gender or identify with a particular sexual orientation. Differences in speech already begin in young children, before any marked gender-related anatomical differences develop, emphasizing the importance of behavioral patterns. Gender, gender identity, and sexual orientation are encoded in speech in a range of different phonetic parameters relating to both phonation (activity of the vocal folds) and articulation (dimensions and configuration of the supraglottal cavities), as well as the use of pitch patterns and differences in voice quality (the way in which the vocal folds vibrate). Differences in the size and configuration of the supraglottal cavities give rise to differences in the size of the acoustic vowel space as well as subtle differences in the production of individual sounds, such as the sibilant [s]. Furthermore, significant and systematic gender-specific differences have been found in the average duration of utterances and individual sounds, which in turn have been found to have a complex relationship to the perception of tempo.


  • Phonetics/Phonology

1. Introduction

Speech is a multiplex signal. Aside from the verbal content, every utterance that leaves a speaker’s mouth contains a wealth of personal information reflecting their individual and anatomical make-up, and perhaps more importantly, learned patterns resulting from a range of social factors. A speaker’s sex, gender, and their sexual orientation give rise to phonetic patterns spanning both nature and nurture. As we will see in this article, although it is relatively straightforward to describe many of the phonetic patterns correlating with a speaker’s sex, gender, and sexual orientation, it is less easy to say which of those patterns are the direct result of physiological/anatomical factors and which have been acquired by and are being produced by a speaker to inform listeners of his/her gender or sexual orientation.

In the following we will use the term gender to cover both gender in its narrower sociocultural sense, as well as biological sex. While superficially, sex and gender appear to be clearly defined and clearly differentiated terms, referring to physical or physiological differences on the one hand, and social or cultural differences on the other, the well-known case of the South African athlete Caster Semenya, who was banned from female competition due to the level of testosterone in her blood, shows in most striking fashion how quickly any demarcation of the two terms can become unclear. In other words, both behavioral and anatomical factors play a role in accounting for differences observed between male and female speakers, and it is rarely the case that either factor is the sole cause for a particular pattern.

2. Biophysical Correlates

Studies have shown that listeners can correctly identify the gender of a randomly chosen sample of normal adult speakers at a rate approaching 100%. Anatomical and physiological changes that take place during puberty give rise to male–female differences in larynx size and position, vocal fold length and thickness, as well as differences in the length of the vocal tract (Goldstein, 1980; Stevens, 1998; Titze, 1989).

These differences in anatomical dimensions have predictable implications for the acoustic characteristics of the signal emanating from the average male or female vocal tract. Longer and thicker male vocal folds vibrate at a lower frequency than shorter, thinner female vocal folds. Using estimates of 1.5 cm (male) and 1 cm (female) for the membranous part of the vocal folds, Stevens (1998, p. 12) predicts the natural frequencies for relaxed vocal folds to be 120 Hz (male) and 200 Hz (female), values in line with male–female averages that have been found in a range of languages (Traunmüller & Eriksson, 1995). Measuring the dimensions of the components involved in speech production is not trivial. This is especially the case for the structures in the larynx, and in particular the vocal folds. It is not surprising, therefore, that we find different estimates depending on the methods used. Earlier and much-cited studies took the dimensions of laryngeal structures from measurements of cadavers. In the first detailed studies of the circumpubertal larynx (Kahane, 1978, 1982), measurements were made of larynxes excised from the cadavers of 20 subjects aged from 9 to 16 years. Likewise, excised larynxes form the basis of the measurements summarized in Hirano (1983) and Hirano, Kurita, and Nakashima (1983). By contrast, in a more recent study of vocal fold dimensions in a sample of Korean adults, Su et al. (2002) measured vocal fold length from endoscopic images of live anaesthetized participants. Estimates of vocal fold length from the two types of study therefore vary somewhat (male: 14.5–16 mm; female: 11–13.5 mm).

Studies describing the dimensions of the vocal tract have also drawn on data from a range of different sources. In her seminal doctoral thesis, Goldstein (1980) draws on data available in the literature to provide values to drive an articulatory synthesis model of the vocal tract. She estimates the typical adult male vocal tract to be 17 cm, with the female vocal tract at 14 cm (Goldstein, 1980, p. 186). The length differences reside almost completely in a longer male pharynx. The longer male vocal tract has lower resonance frequencies. These disproportionate length differences have complex consequences for the acoustic output. For vowels, on average, it has been found that female formants are approximately 20% higher than male formants (Fant, 1966). However, this average value belies a wide range of differences between male and female vowel values, varying between individual formants of individual vowels (Fant, 1966). Fant (1975) presents male–female formant (F1–F3) scaling factors for a number of Indo-European (American English, Swedish, Danish, Dutch, Serbo-Croatian, Italian) and non-Indo-European languages (Estonian, Japanese) (see Figure 1). Initially, a number of attempts were made to explain these nonuniform differences in articulatory terms (Nordström, 1975, 1977), but differences between the scaling factors needed for similar vowels in different languages make clear that simple male–female dimensional differences can only provide part of the answer (e.g., Henton, 1995).
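The link between vocal tract length and resonance frequency can be illustrated with a deliberately crude model that is not taken from the studies cited above: a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n − 1)c / 4L. The sketch below applies this to Goldstein's length estimates of 17 cm and 14 cm. The speed of sound (350 m/s) and the function name are assumptions made for illustration only.

```python
# Idealized quarter-wavelength resonator; an assumption, not the model used by Goldstein (1980).
SPEED_OF_SOUND_M_PER_S = 350.0  # assumed value for warm, humid air in the vocal tract


def uniform_tube_resonances(length_cm, n_resonances=3, c=SPEED_OF_SOUND_M_PER_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    length_m = length_cm / 100.0
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_resonances + 1)]


male = uniform_tube_resonances(17.0)    # length estimate from Goldstein (1980)
female = uniform_tube_resonances(14.0)  # length estimate from Goldstein (1980)

for n, (f_male, f_female) in enumerate(zip(male, female), start=1):
    ratio = f_female / f_male
    print(f"F{n}: male {f_male:.0f} Hz, female {f_female:.0f} Hz, ratio {ratio:.3f}")
```

In this idealization every resonance is scaled by the same factor, 17/14 ≈ 1.21, which is close to the average difference of about 20% reported by Fant (1966). A uniform tube cannot, by construction, reproduce the nonuniform, vowel- and formant-specific scaling factors described above, which is one way of seeing why overall length alone gives only part of the answer.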

Probably the most cited study of gender-specific differences in vocal tract dimensions is Fitch and Giedd (1999). Measurements were taken from mid-sagittal magnetic resonance imaging scans of 129 normal children and young adults (2–25 years). They found differences between males and females arising at puberty in overall vocal tract length and in the relative proportions of the oral and pharyngeal cavity. The large imaging study of Vorperian et al. (2011) includes 605 participants between birth and the age of 19 years. Again, findings point to postpubertal differences between males and females in both the oral and pharyngeal portions of the vocal tract. By contrast, however, they also found prepubertal differences between boys and girls between the ages of 3 and 7 in the oral region of the vocal tract. While the authors therefore point to a possible anatomic basis of prepubertal speech acoustic differences, they also emphasize the need for further research validating anatomic-acoustic correlates by means of vocal tract modeling studies.

The longer pharyngeal cavity in males leads to a longer vertical distance from the gnathion to the palatal plane or, more precisely, the temporo-mandibular joint (Honda & Tiede, 1998). The difference in this distance between a prototypical male and female speaker is visualized in Figure 2 by the green arrow.

The larger distance in males affects the impact of the jaw excursion: we can expect larger jaw displacements for males than for females with the same jaw opening degree. Weirich, Fuchs, Simpson, Winkler, and Perrier (2016) used empirical data from 9 German and 40 English speakers, together with an articulatory model, to investigate the effect of these physiological differences on the jaw opening angle in the production of /a/. They found greater jaw openings in both English and German female speakers, in particular in an accented condition, where jaw opening was found to be maximal. The results of the modeling study revealed that similar jaw opening settings in male and female speakers led to differences in pharyngeal constriction. These differences can result in a complete radico-pharyngeal closure in the male model, which suggests a physiological reason for the smaller jaw openings in males. Smaller jaw openings lead to less clear speech, typically associated with male speech, as well as the shorter segment durations (Hillenbrand, Getty, Clark, & Wheeler, 1995; Simpson, 1998), faster speech rates (Byrd, 1992), and smaller acoustic vowel spaces (Diehl, Lindblom, Hoemeke, & Fahey, 1996; Hillenbrand et al., 1995; Peterson & Barney, 1952; Simpson & Ericsdotter, 2007; Weirich & Simpson, 2014a; Whiteside, 2001) that have been found in male speech.

Figure 1. Female-male scale factors for F1, F2 and F3 in per cent for each of six languages.

Fant, 1975

Figure 2. Distance (green arrow) from gnathion to temporo-mandibular joint. Left: male, right: female.

Reprinted from Weirich et al., 2016, S1588.

Of course, there is also a relation between vocal tract geometry and articulatory space (Fuchs, Winkler, & Perrier, 2008; Winkler, Fuchs, & Perrier, 2006): the length of the speakers’ pharynx affects the articulatory distance between corner vowels and the degrees of freedom in the vertical direction. Articulatory studies dealing with gender-specific differences are less frequent. Simpson (2001, 2002) found differences in acoustic and articulatory diphthong dynamics by analyzing 26 female and 22 male speakers from the University of Wisconsin X-ray Microbeam Speech Production Database (Westbury, 1994). Weirich and Simpson (2014b) showed smaller articulatory vowel spaces in females than in males in 9 German speakers using electromagnetic articulography (EMA). They extended this line of research by investigating articulatory and acoustic undershoot in a German diphthong by means of EMA. Results point to an under-exploitation of the larger male articulatory space during running speech, with males exhibiting more undershoot than females in both articulatory and acoustic terms (Weirich & Simpson, 2018b). Thus, the smaller vocal tract dimensions in females lead to articulatory differences which have to be kept in mind when gender-specific acoustic differences are discussed.

3. Nature versus Nurture

Variation between humans has always been discussed within the nature-nurture framework, where factors are separated into two potential sources: broadly speaking, biology and society. With regard to speech, too, factors have been categorized as either organic or learned (Ladefoged & Broadbent, 1957). We have already discussed some of the organic differences between male and female adults which affect gender-specific variability in speech. However, it is reasonable to assume that in addition to those biophysical inevitabilities, behavioral aspects play a role. First and foremost, cross-linguistic differences have been found in the extent of gender-specific variability in speech (Boe & Rakotofiringa, 1975; Dolson, 1994; Mennen, Schaeffler, & Docherty, 2012; Ordin & Mennen, 2017; Takefuta, Jancosek, & Brunt, 1972; van Bezooijen, 1995; Weirich, Simpson, Öjbro, & Ericsdotter Nordgren, 2019). Differences in mean fundamental frequency between genders are universal; however, van Bezooijen (1995) highlights the social aspect of differences in mean f0 between Dutch and Japanese women: having a high pitch was related to attractiveness ratings in Japanese listeners but not in Dutch listeners. Similarly, Weirich et al. (2019) found larger differences between the genders in mean f0 in German speakers than in Swedish speakers, due to significantly lower mean values for Swedish females than for German females (which were not due to differences in height, but rather seem to reflect a culture-specific gender role concept). Regarding fundamental frequency range, Mennen, Schaeffler, and Docherty (2012) found cross-linguistic differences between age-matched German and English female speakers; Ordin and Mennen (2017) found systematic variation in female Welsh-English bilinguals, but not in male bilinguals.

Voice quality patterns have also been attributed to sociological aspects. For example, for British English, Henton and Bladon (1985, 1988) found males to use more creaky voice than females and females to have a breathier voice quality than males. The authors suggest a sociophonetic explanation, with increased breathiness in females being exploited to enhance attractiveness. However, more recently, creaky voice has been identified as a phonetic cue conveying social meaning in females as well. The study of Yuasa (2010) revealed that frequent creaky voice was found in females from northern California and eastern Iowa and that this voice quality was perceived as educated, urban-oriented, and upwardly mobile. In the framework of creating a persona through speech features (Eckert, 2008; Podesva, 2007), Mendoza-Denton (2011) showed that creaky voice is used to create the image of a hardcore gang girl in females but also a hardcore Chicano gangster in males.

While clear speech has been described as a characteristic of female speech, “mumbling” has been described as being perceived as “macho” (Heffernan, 2010). This reflects a social explanation of various phonetic characteristics related to clear speech typically found in females, such as longer segment durations, slower speech rates, a larger vowel space, and larger contrasts between some consonants (Byrd, 1992; Diehl et al., 1996; Romeo, Hazan, & Pettinato, 2013; Simpson, 1998; Weirich & Simpson, 2014a, 2014b, 2015). Interestingly, the longer segment durations in females contrast with the common stereotype of women as faster speakers, emphasizing the intriguing effect of stereotypes in gender perception. Weirich and Simpson (2014a) point to another potential explanation of this mismatch between perception and data. They found a positive correlation between vowel space size and perceived tempo: speakers with larger acoustic vowel spaces were perceived as speaking faster (within a group of female speakers and therefore independent of gender). The reasons suggested for the non-uniform gender differences in acoustic vowel space sizes include, in addition to the vocal tract size differences already mentioned, behavioral as well as perceptual aspects. The perceptual reasoning links the larger female vowel space to females’ higher fundamental frequency and has become known as the sufficient contrast hypothesis (Diehl et al., 1996). The greater harmonic spacing in voices with higher f0 results in a poorer definition of the spectral envelope of a vowel. It follows that the acoustic contrast between vowel categories gets worse the higher a speaker’s f0 is. Diehl et al. (1996) suggested that the larger vowel space in females is a compensation for their higher f0. Weirich and Simpson (2013) tested this assumption by analyzing average fundamental frequencies and acoustic vowel spaces of 56 female German speakers with a broad range of average f0 values. However, they found no correlation between f0 and vowel space in their speaker sample, suggesting that other factors must be at work. The behavioral reasoning of clearer speech in females (correlating with larger acoustic vowel spaces) has already been mentioned (see also Labov, 1990). In addition, here too, cross-linguistic differences exist, as shown in the studies by Henton (1995), which included English, Dutch, Swedish, and French speakers, and by Weirich et al. (2019), which highlighted gender differences in German and Swedish vowel spaces. We will now come to an aspect of gender-specific variation in speech which is clearly on the nurture side of the biology-sociology dichotomy: gender differences in prepubertal speakers who do not differ in physiological characteristics of the speech apparatus.
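The premise of the sufficient contrast hypothesis discussed above, that a higher f0 samples the spectral envelope more sparsely, can be made concrete with a little arithmetic. The following minimal sketch counts the harmonics available up to an arbitrary cut-off for the two f0 values that Stevens (1998) predicts for relaxed male and female vocal folds. The 4 kHz cut-off and the function name are assumptions chosen for illustration; they do not come from Diehl et al. (1996).

```python
def harmonics_up_to(f0_hz, max_hz=4000.0):
    """Frequencies (Hz) of the harmonics of f0 that lie at or below max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


# f0 values are the predictions cited from Stevens (1998); the cut-off is an assumption.
for label, f0 in [("f0 = 120 Hz", 120.0), ("f0 = 200 Hz", 200.0)]:
    harmonics = harmonics_up_to(f0)
    print(f"{label}: {len(harmonics)} harmonics up to 4 kHz, spaced {f0:.0f} Hz apart")
```

With fewer harmonics falling under each formant peak, the location of that peak is less well specified in the signal. This is the sense in which the definition of the spectral envelope becomes poorer at higher f0.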

4. Development of Gender in the Prepubertal Voice

Perhaps the most interesting aspect of voice and gender is the vocal expression of gender in children, more precisely, in prepubertal speakers. As we have already seen, dimensional differences in vocal tract anatomy, whether it be the length of the vocal tract itself, or the size of the vocal folds, are not present, or minimal. From this we should expect to find little phonetic difference between male and female children that has a biophysical origin (but see Vorperian et al., 2011, for a more detailed analysis of this). However, studies have repeatedly found in listening experiments that participants can generally correctly identify a speaker’s gender in content-neutral utterances at a mean level significantly better than chance. More interestingly, the gender of some speakers is correctly identified at values above 90% (Simpson, Funk, & Palmer, 2017). This suggests that some children are producing acoustic cues that listeners can reliably use for gender identification. As there appears to be no biophysical reason for the phonetic differences the children are making, we must assume that they are employing phonetic features that have been learned (i.e., socially acquired). One possible source of this is the child-directed speech of adults. Foulkes, Docherty, and Watt (2005) showed that British mothers differ in the use of phonetic variants to their two-year-old children depending on the gender of the child. More standard variants were used when speaking to girls than to boys, thereby reflecting the variation found in adult speech with females using more standard variants than males.

4.1 Phonetic Correlates of Child Gender

Before puberty the larynx and vocal tract of boys and girls have similar dimensions (Fitch & Giedd, 1999; Kahane, 1978). However, significant gender-specific differences have been found in participants as young as two and a half years old (McCormack & Knighton, 1996). The differences that have been found concern a range of phonetic parameters.

Beginning in the larynx, differences in both average f0 and pitch patterns have been observed. In the first large-scale study of girls and boys evenly distributed across a wide age range, Hasek et al. (1980) found in sustained /ɑ/ productions from 180 children (90 female; age 5–10) a significant decline in male f0 from 7 years onwards. For girls, however, no such decline was found. This finding is seemingly at odds with the findings of earlier studies which consistently failed to find any gender-specific differences.1 In a later study with a sample size and age span comparable to those of Hasek et al. (1980), Glaze et al. (1988) found in a sample of 121 children (59 male; age: 5–11 years) only a significant overall difference between mean male (226 Hz) and female (238 Hz) f0 but no significant relationships for age. However, in a later study of f0 and other intonational variables in a sample of 80 children (40 female) evenly split across four age groups (3–4, 5–6, 7–8, 9–10), Ferrand and Bloom (1996) were able to replicate the main finding of Hasek, Singh, and Murry (1980). They found a significant decrease in f0 in boys from age 7 but no similar pattern in girls. They also found a decrease in contrastive pitch levels used by the older boys. In data elicited from a picture-naming task using a much smaller sample of 20 children (10 female; age: 6, 8, 10), Whiteside and Hodgson (1999) found a significant age-related difference but no significant gender-specific differences. Analysis of f0 in a similar sized sample of 30 children (15 female; age: 7–10 years) reported in Guzman et al. (2014) also found no significant gender-specific differences. By contrast, in a larger sample of 68 children (39 female; age: 5;0–9;11 years) producing sustained /a/, Brockmann-Bauser, Beyer, and Bohlender (2015) found significantly higher f0 for girls (277 Hz) than boys (262 Hz) at a “medium” loudness, but this difference all but disappeared at a louder SPL of > 80 dBA.
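To put the size of these f0 differences into perspective, the mean values reported above can be converted to a musical interval in semitones, 12 · log2(f_high / f_low). The semitone scale is a convention added here for illustration and is not necessarily the measure used in the cited studies; the adult values are the predictions cited earlier from Stevens (1998), and the function name is hypothetical.

```python
import math


def semitone_difference(f_high_hz, f_low_hz):
    """Interval between two frequencies in semitones (12 semitones = one octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# (female mean, male mean) in Hz, as quoted in the text above.
comparisons = {
    "adults, Stevens (1998) predictions": (200.0, 120.0),
    "children, Glaze et al. (1988)": (238.0, 226.0),
    "children, Brockmann-Bauser et al. (2015)": (277.0, 262.0),
}

for label, (female_hz, male_hz) in comparisons.items():
    print(f"{label}: {semitone_difference(female_hz, male_hz):.2f} semitones")
```

On this scale the significant differences between girls and boys amount to a little under one semitone, compared with almost nine semitones between the adult values. This helps to explain why such differences are easily masked by variation in sample, task, and recording conditions.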

But why should the results be so variable? Even if we take variations in sample sizes, age ranges and elicitation materials into account, there are still considerable differences between the findings of the different studies. There seem to be at least two possible reasons for this. The first is cultural differences in the expression of gender. The studies cited use participants with different linguistic and cultural backgrounds (American English, northern British English and Swiss German speakers). The second, more intriguing possibility is that children’s performance in the recording session, or more precisely their f0, is influenced by the gender of the experimenter. This possibility is raised by Glaze, Bless, Milenkovic, and Susser (1988), speculating about differences between their own findings and those of Hasek et al. (1980). They cite Lieberman (1967)’s findings that the vocalizations of infants as young as 13 months were found to be influenced by the f0 of an interacting adult.

Voice quality differences have also been found. Boys have been found to exhibit a higher prevalence of non-modal hoarse voice quality than girls which has been attributed to a vocal expression of “excessive hyper-activity, anxiety, and spirit of leadership” in boys (e.g., Martins, Hidalgo Ribeiro, Zeponi Fernandes de Mello, Branco, & Mendes Tavares, 2012).2 In an auditory analysis of 1549 first–sixth graders, Yairi, Horton Currin, Bulian, and Yairi (1974) found that although overall 13% were judged to be hoarse, the proportion judged to be hoarse in the boy group (17%) was significantly higher than in the girl group (10%). This finding supported the findings of earlier studies (Baynes, 1966; Senturia & Wilson, 1968). Similar differences in male–female hoarseness were found in a comparable study of hoarseness in 205 Swedish children (104 girls) (Sederholm, 1995). Likewise, in a later study of 217 Swedish children (103 girls; age: 6;4 to 9;10 years), Kallvik et al. (2015) found a higher proportion of male to female hoarseness (15.8 vs. 7.8%). Some of the variation that has been found in rates of hoarseness in children is undoubtedly due in part to differences in method and interrater variability arising from auditory judgements.

Acoustic support for these auditory findings is, however, less conclusive. Ferrand (2000) analyzed the harmonics-to-noise ratio in two vowels /ai/ and /ʌ/ produced in the spontaneous speech of 80 children (40 female) evenly distributed across four age-groups (4, 6, 8, 10 years). Although a significant average difference was found in the direction to be expected from auditory analyses (i.e. higher HNR for female speakers) there is a great deal of variation across age groups and the two vowel qualities which, in turn, may be due to the analysis of vowels in the wide range of different phonological and prosodic contexts present in spontaneous data. However, in a study of four voice parameters (signal-to-noise ratio (SNR), jitter, shimmer, f0) in 121 children (59 male; age: 5–11 years) Glaze et al. (1988) used three repetitions of prolonged /haː/. Although they found significant age- and gender-related differences in f0 (as mentioned at the beginning of this section), few significant differences in the other parameters were found and this was attributed, in part at least, to a high level of interindividual variation.

In one of the few studies adopting an articulatory technique, Robb and Simmons (1990) used electroglottography to assess differences in vocal fold contact during voicing in 26 children (12 female; age: 4;4–6;6 years) who each produced sustained tokens of the vowels /i a u/. In common with many of the studies of f0 we have just seen, the authors failed to find any significant gender-specific differences in f0. However, there was a significant difference in contact quotient (Qc) such that boys had a significantly higher value than girls for /a/ (0.62 vs. 0.54). This finding is in line with similar differences in contact patterns found for adult speakers (Awan & Awan, 2013; Södersten & Lindestad, 1990) and predicted from models of adult vocal fold behavior (Titze, 1989).3

A further feature related to vocal fold activity, where gender-specific patterns have been found, is voice onset time (VOT). In common with studies on f0 and voice quality, results exhibit a good deal of intra- but also interstudy variability. Where significant gender differences are found they do point in the same direction, with female subjects having longer VOT values than males. Factors influencing the amount of variability are not always clear, but apart from small sample sizes, it would seem that younger children are still simply varying in a parameter that has yet to be robustly acquired (Koenig, 2000). However, sample size is not everything. In an attempt to replicate and consolidate the findings of an earlier study (Whiteside & Marshall, 2001), Whiteside, Henry, and Dobbin (2004) found that even after increasing the sample size from 30 to 46 subjects results remained variable and inconclusive. Some evidence for variability decreasing with age is shown by Romeo et al. (2013)’s study of 73 children (39 female; age: 9–14), which found higher VOT values in females across all three age groups analyzed (9–10, 11–12, 13–14). Lundeborg Hammarström, Larsson, Wiman, and McAllister (2012)’s study of a similarly large group of Swedish-speaking children in similar age groups (74 female; age: 7.9–11.8 years) revealed age-related but no gender-specific differences, suggesting that any gender-specific differences found in the English-speaking participants are at least partially behavioral.

With respect to resonance, several studies have shown that girls have larger vowel spaces than boys. Although slightly different methods are used to characterize acoustic vowel qualities, an estimation of the first three or four formant frequencies is shared by all, and when significant differences are found, they are in the same direction, with female values being higher than male values. One consistent finding is higher female F1 values for open vowels (Bennett, 1981; Busby & Plant, 1995; Klein, 2005; Lee, Potamianos, & Narayanan, 1999; Perry, Ohde, & Ashmead, 2001; Whiteside, 2001) as well as higher female F2 and F3 values (Busby & Plant, 1995; Perry et al., 2001; Pettinato, Tuomainen, Granlund, & Hazan, 2016). In a study of 42 7- and 8-year-old children, Bennett (1981) found significantly higher female formant values for all vowel qualities, with the largest percentage differences being present in F1 of the most open vowels /æ/ and /ʌ/ (Bennett, 1981, p. 233). Using formant spacing ($\Delta F = F_{i+1} - F_i$) to characterize vowel space size (in a group of 34 6- to 9-year-old children), Cartei, Cowles, Banerjee, and Reby (2014) also found significantly lower male values, in other words, vowels spaced less far apart acoustically than those of their female congeners.
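The two measures mentioned in this section, formant spacing and vowel space size, can be sketched in a few lines. The example below takes formant spacing as the mean difference between adjacent formants and vowel space size as the area of the polygon spanned by corner vowels in the F1–F2 plane, computed with the shoelace formula. Both definitions are simplifying assumptions; the estimation procedures in the cited studies may differ. The formant values are hypothetical and serve only to show the calculation.

```python
def mean_formant_spacing(formants_hz):
    """Mean difference (Hz) between adjacent formants, i.e. the average of F(i+1) - F(i)."""
    gaps = [high - low for low, high in zip(formants_hz, formants_hz[1:])]
    return sum(gaps) / len(gaps)


def vowel_space_area(corner_vowels):
    """Area (Hz^2) of the polygon spanned by (F1, F2) points, via the shoelace formula."""
    total = 0.0
    n = len(corner_vowels)
    for i in range(n):
        x1, y1 = corner_vowels[i]
        x2, y2 = corner_vowels[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical values for one speaker, for illustration only (not data from any cited study).
formants_of_one_vowel = [600.0, 1700.0, 2900.0]                    # F1, F2, F3 in Hz
corner_vowels = [(350.0, 2800.0), (950.0, 1500.0), (400.0, 1000.0)]  # (F1, F2) of /i a u/

print(f"mean formant spacing: {mean_formant_spacing(formants_of_one_vowel):.0f} Hz")
print(f"vowel space area: {vowel_space_area(corner_vowels):.0f} Hz^2")
```

Because both measures are expressed in Hz, they grow with overall formant height. Comparisons between speaker groups therefore inherit the scaling differences discussed in the section on biophysical correlates.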

4.2 Perception of Child Gender

As we have seen, studies have repeatedly found gender-specific differences between male and female children before they reach puberty. At the same time results have been variable both within and across studies and even attempts at replicating previous findings by the same authors have often failed. Given this high degree of variability, it is interesting to ask how successful listeners are at correctly identifying a child’s gender on the basis of single utterances.

Several studies have demonstrated that adult listeners can correctly identify a child’s gender at a level significantly better than chance in content-neutral utterances, typically at a value around 70% (Bennett & Weinberg, 1979; Curtin & Kiesling, 2004; Günzburger, Bresser, & ter Keurs, 1987; Karlsson & Rothenberg, 1992; Kaya et al., 2017; Nairn, 1995; Perry et al., 2001; Sachs, Lieberman, & Erickson, 1973).
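What "significantly better than chance" means for an identification rate of around 70% can be illustrated with an exact one-sided binomial test against a guessing rate of 50%. The sketch below is a simplified illustration, not the analysis used in the cited studies: the numbers of trials and correct responses are hypothetical, and the test treats all judgements as independent, which real listening experiments with repeated listeners and speakers do not satisfy.

```python
from math import comb


def binomial_p_at_least(successes, trials, chance=0.5):
    """One-sided exact binomial p-value: P(X >= successes) if listeners were guessing."""
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical experiment: 100 independent judgements, 70 of them correct.
p_value = binomial_p_at_least(70, 100)
print(f"P(at least 70 of 100 correct by guessing) = {p_value:.2g}")
```

The same 70% rate based on only 10 judgements (7 correct) would not be distinguishable from guessing in this test, which is one reason why the number of speakers, listeners, and stimuli matters when identification rates are compared across studies.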

As with the acoustic findings, levels of correct gender identification do vary between studies. A number of factors seem to play a role. The stimuli used must attain a degree of phonetic complexity. Single vowels may generally provide enough acoustic information (f0, formants, voice quality) for a reliable identification of adult gender, but for better than chance identification of child gender, longer, more complex utterances are needed (Bennett & Weinberg, 1979; Günzburger et al., 1987; Klein, 2005; Nairn, 1995). So, for instance, Günzburger et al. (1987) had 17 children (11 male; age: 7;6 to 8;9 years) read content-neutral sentences together with sustained productions of the vowels /i a u/. Thirty-eight listeners attained levels significantly above chance for the sentence-length stimuli (71% boys; 76% girls), but only chance levels were reached for the sustained vowel stimuli. Age has also been found to play a role, with correct identification rates increasing with children’s age (Perry et al., 2001; Sachs et al., 1973). However, the gender of the children themselves does not have a predictable effect on reliable identification. Perry et al. (2001) found better rates for boys, whereas in Günzburger et al. (1987) it was the girls who were more reliably identified. Perhaps the most interesting factor affecting the perception of gender, and one suggesting intercultural differences in the vocal differentiation of gender in children’s voices, is to be found in the study reported in Karlsson and Rothenberg (1992). Utterances produced by 54 speakers (24 female; age: 2.5–8 years) of English, Finnish, or Swedish (some bilingual) were played to adult listeners with differing mother tongues (Chinese, English, Finnish, Swedish). While listeners were somewhat better at identifying the gender of speakers of their own mother tongue, the mean gender identification rate of speakers of English or Swedish was again at around 70%. However, the rate of identification of gender in the Finnish children was only slightly better than chance for male speakers and at chance level for female speakers. The authors can only speculate about the reasons for this intercultural difference, but suggest that it may be related to the fact that the Finnish language does not grammatically distinguish between genders, which the Germanic languages English and Swedish both do.

4.3 Masculinity/Femininity, Gender Dysphoria

As will become clear at other places in this review, analysis of sex, gender and sexual orientation has taken an increasingly fine-grained approach to defining various psychosocial aspects of the individual speaker. This is also the case with children. While it is enlightening to study the perception and production of a binary gender classification, a growing number of studies are showing that children also exhibit a wealth of both inter- and intraindividual variation that requires more differentiated categorization. Munson (2015) had 34 boys (age 4–13 years) produce isolated English words containing items with word-initial /s/. The original and two manipulated versions of the words sock or sun were presented to 38 listeners. Besides the original production, [s] with a high peak frequency as well as a fronted [θ]-like /s/ with a diffuse spectrum were produced by a trained phonetician and spliced into the child utterances. Listeners only heard a single stimulus originating from one child. Listeners rated the boys as sounding more girl-like if they produced /s/ with an especially high peak frequency, or with a diffuse spectrum suggesting a frontal misarticulation.

Studies on the speech of children with gender dysphoria and its perception also suggest that children are able to manipulate acoustic parameters used to indexicalize gender and that these cues are interpreted by listeners. Munson, Crocker, Pierrehumbert, Owen-Anderson, and Zucker (2015) had two groups of listeners rate the gender typicality of single words (group 1; 21 listeners) or sentences (group 2; 17 listeners) produced by 15 boys (age: 5.7–12.8 years) with gender identity disorder (GID) and 15 age-matched boys without GID. Listeners’ judgements confirmed the auditory impression that male participants with GID sound less boy-like. From the range of acoustic parameters that were also measured, the authors showed that less boy-like judgements are related to particular spectral characteristics of /s/ (higher centroid frequency, greater spectral diffuseness, a more negatively skewed spectrum).
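The spectral characteristics of /s/ mentioned here are commonly summarized as spectral moments, treating the power spectrum as a probability distribution over frequency. The sketch below is an illustration under stated assumptions, not the measurement procedure of the cited study: the function name is hypothetical, the toy spectra are invented values, and the standard deviation is used as a simple stand-in for spectral diffuseness.

```python
from math import sqrt


def spectral_moments(freqs_hz, power):
    """Return (centroid, sd, skewness) of a power spectrum treated as a distribution."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]
    # First moment: centroid (center of gravity) in Hz.
    centroid = sum(f * w for f, w in zip(freqs_hz, weights))
    # Second central moment: spread around the centroid.
    variance = sum(w * (f - centroid) ** 2 for f, w in zip(freqs_hz, weights))
    sd = sqrt(variance)
    # Third central moment, normalized: skewness (sign gives the tail direction).
    third = sum(w * (f - centroid) ** 3 for f, w in zip(freqs_hz, weights))
    skewness = third / sd**3 if sd > 0 else 0.0
    return centroid, sd, skewness


# Toy spectra (hypothetical values, not measurements).
freqs = [2000, 4000, 6000, 8000, 10000]
low_peak = [1, 6, 2, 1, 0.5]   # energy concentrated near 4 kHz
high_peak = [0.5, 1, 2, 6, 1]  # energy concentrated near 8 kHz

c_low, sd_low, skew_low = spectral_moments(freqs, low_peak)
c_high, sd_high, skew_high = spectral_moments(freqs, high_peak)
print(round(c_low), round(c_high))
```

In this toy example the spectrum with its energy concentrated at higher frequencies has the higher centroid and a negative skewness, which is the direction of the differences described above.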

Cartei et al. (2014) asked 34 children (19 female; age: 6–9 years) to read aloud words normally and then as much “like a boy” or “like a girl” as possible. Although there was no difference in f0 between the male and female children speaking normally, they did make significant f0 changes when imitating the other gender such that the boys adopted a higher f0 when imitating girls, the girls a lower f0 when imitating boys. In a further study, Cartei et al. (2019) asked 72 children (36 female; age: 6–10) to provide voices for comic figures differing in gender stereotypicality. As in the previous study, children varied f0 (and formant spacing in vowels) in the expected directions, lowering f0 and reducing formant spacing to express masculinity, raising f0 and expanding formant spacing for femininity. What all of these studies show is that from an early age, children have careful control over their vocal apparatus. In the perception studies described in section “Perception of Child Gender”, uniform identification of a particular child’s voice suggests that adults agree on a set of acoustic cues that can be reliably used to assign a gender to a particular voice. However, what remained unclear was the degree to which a child is producing these cues intentionally to define its gender. The studies described indicate that children are indeed aware of and can manipulate the cues needed to indexicalize gender.

5. Correlates of Sexual Orientation and Gender Identity

When it comes to gendered speech behavior, most studies have concentrated on the impact of a speaker’s (actual or attributed) sexual orientation. More recently, studies have investigated the role of self-ascribed masculinity/femininity and gender identity in speech. Overall, studies can be categorized as dealing with (a) how speakers convey or index sexual orientation or masculinity/femininity through speech, and (b) how listeners attribute and perceive it. In both cases the power of speakers’ and listeners’ attitudes and stereotypes regarding gender-conforming speech, masculinity/femininity, and sexual orientation has to be kept in mind. The following section concentrates on studies investigating how speakers may index their sexual orientation or gender identity in speech. The section on “Attributing Sexual Orientation and Gender Identity” gives an overview of studies dealing with how listeners attribute and perceive sexual orientation as well as femininity/masculinity.

5.1 Indexing Sexual Orientation and Gender Identity

First, no study has provided evidence that non-straight speakers intend to approximate the speech characteristics of the opposite gender, something that would be expected by lay gender inversion theories (Kite & Deaux, 1987). Second, non-straight speakers do not automatically use a speech style that diverges from prototypical gendered speech (Munson & Babel, 2007). That phonetic markers of sexual orientation are a resource people use to construct and index social identity has been clearly demonstrated by Podesva (2006), who found that the use of phonetic cues of gay speech varies between situations and interlocutors.

Nevertheless, studies have been conducted investigating potential phonetic differences between straight and non-straight speakers. While some have found differences between gay and straight men in mean f0 (Baeck, Corthals, & van Borsel, 2011; Linville, 1998), others have not (Gaudio, 1994; Lerman & Damsté, 1969; Munson, McDonald, DeBoe, & White, 2006; Rendall, Vasey, & McKenzie, 2008; Sulpizio et al., 2015; Valentova & Havlíček, 2013). Another phonetic parameter that has been the focus of attention is the fronted articulation of /s/. In a study of 9 participants, Linville (1998) found that gay men produced /s/ with a higher peak frequency than straight men. Similarly, in a study analyzing the speech of 44 participants (lesbian and bisexual women, heterosexual women, gay and bisexual men, heterosexual men), Munson et al. (2006) found a more negatively skewed /s/ spectrum in gay men. Other studies, however, have failed to show such a difference. For example, Kachel, Simpson, and Steffens (2018) investigated 25 gay and 26 straight German male speakers: while several phonetic parameters such as f0, center of gravity in /s/ and mean F2 correlated with perceived sexual orientation, the only difference based on actual sexual orientation was that straight men showed a lower mean F1 than gay men. Additional evidence has been gathered regarding differences in formants, with gay men showing higher formants than straight men in some vowels (Kachel et al., 2018; Munson et al., 2006) or a general hyperarticulation of the vowel space (Pierrehumbert, Bent, Munson, Bradlow, & Bailey, 2004).

Fewer studies have investigated the relationship between speech and sexual orientation in women, and the results are as inconclusive as those for male speech. Regarding fundamental frequency, some studies have found higher mean f0 in straight women than in non-straight women (lesbian and bisexual) in spontaneous (Camp, 2009; Moonwomon-Baird, 1997) and in read speech (Van Borsel, Vandaele, & Corthals, 2013), while others have not (Munson et al., 2006; Rendall et al., 2008, in read speech; Waksler, 2001, in spontaneous speech). Studies also vary in their findings regarding formant frequencies and vowel space size. Munson et al. (2006) found higher F1 in /ɛ/ and higher F2 in /o u/ in straight than in lesbian women but no difference in vowel space size. The large study of Rendall et al. (2008), which included 125 Canadian speakers, reports vowel-specific higher formant frequencies in straight than in non-straight women (after controlling for body size differences between speaker groups), which leads the authors to conclude that these differences are a “product of broader psycho-behavioral differences between the two groups that are, in turn, continuous with and flow from the physiological processes that affect sexual orientation to begin with” (p. 1). Similarly, Pierrehumbert et al. (2004) found in a data set of American women (16 heterosexual, 16 lesbian, 16 bisexual women) higher formants in back vowels only and, following from that, a more contracted vowel space in straight than in non-straight women, thereby contradicting the lay gender inversion theories (Kite & Deaux, 1987).

Potential factors responsible for this rather blurred picture include methodological differences between studies, for example in speech material (read vs. spontaneous), number of speakers (starting from as few as 2 participants per group) and speaker grouping (e.g., combining lesbians and bisexuals as non-straight). Also, cross-cultural and cross-linguistic differences in gendered speech behavior must not be overlooked. In addition, intra-speaker variation based on style, register or code-switching is a significant factor. Podesva (2007) found that a speaker’s choice to use a certain speech style – here the use of falsetto in constructing a gay identity – depends on situations and interlocutors. In other words, linguistic features are stylistic resources for constructing social meaning (Eckert, 2012). Since most studies are conducted within controlled laboratory settings with an unfamiliar and ‘perceived straight’ experimenter, an uncontrolled bias might affect the reported results. Similarly, Podesva and Van Hofwegen (2014) found that social conservatism and prominent gender norms in an anti-urban and anti-liberal community (a non-urban community in Northern California) constrain the production of /s/ by gay men.

Recently, studies on transgender speakers – individuals whose gender identity does not match the sex they were assigned at birth – have shed new light on the debate on gendered speech. Several studies have been conducted within the framework of speech therapy, often combined with hormone therapy, for example to lower f0 in female-to-male transgender individuals (Damrose, 2009; Hancock, Colton, & Douglas, 2014; van Borsel, de Cuypere, Rubens, & Destaerke, 2000). The few studies carried out by linguists and phoneticians suggest that gender differences in speech should be seen as a sociolinguistic style (Papp, 2011; Zimman, 2013, 2017). Zimman (2017, p. 339) states that “a voice’s gender is not a unidimensional feature, but a cluster of features that take on meaning only in context with one another, leaving them open for recombination and change through stylistic bricolage.” In this context, contradictory results in terms of particular features such as pitch, formants or /s/-production can be explained by the interacting factors of physiological constraints, societal gender norms, acquired traits, speaker style and the situational and functional context.

Furthermore, the recent studies of Kachel, Simpson, and Steffens (2017, 2018) on 54 German males point to the importance of a fine-grained analysis of psychological characteristics in addition to separating speakers by sexual orientation alone. Using a 7-point Kinsey-like scale (1 = exclusively gay, 4 = bisexual, 7 = exclusively straight; modified version from Kinsey, Pomeroy, & Martin, 1948), participants were grouped into gay (Kinsey-like score ≤ 2) and straight (Kinsey-like score ≥ 6), but additionally, within-group variability was analyzed by comparing mainly versus exclusively gay and straight men, respectively. In Kachel et al. (2018) only a few differences were found based on sexual orientation in men (lower mean F1 in straight men). However, self-ascribed masculinity/femininity ratings and the degree of sexual orientation were additional factors influencing gendered phonetic variation within gay and straight men. Similarly, in Kachel et al. (2017) there was no clear distinction between straight and non-straight women; however, intra-group variability in median f0 and formant values was related to the exclusivity of sexual orientation and gender-role self-concept.
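The grouping rule described here can be written down as a small sketch. The thresholds (≤ 2, ≥ 6) come from the text; the function names, the treatment of scores 3 to 5 as ungrouped, and the reading of the scale endpoints (1 and 7) as "exclusively" and of 2 and 6 as "mainly" are assumptions made for illustration.

```python
from typing import Optional


def kinsey_group(score: int) -> Optional[str]:
    """Map a 1-7 Kinsey-like score to 'gay', 'straight', or None (not grouped)."""
    if not 1 <= score <= 7:
        raise ValueError("score must be between 1 and 7")
    if score <= 2:
        return "gay"
    if score >= 6:
        return "straight"
    return None  # assumption: intermediate scores fall into neither group


def exclusivity(score: int) -> Optional[str]:
    """Assumed reading: endpoints 1 and 7 are 'exclusively', 2 and 6 are 'mainly'."""
    if kinsey_group(score) is None:
        return None
    return "exclusively" if score in (1, 7) else "mainly"


# Hypothetical participant scores for illustration.
scores = [1, 2, 4, 6, 7]
labels = [(kinsey_group(s), exclusivity(s)) for s in scores]
print(labels)
```

Keeping the exclusivity label separate from the group label mirrors the two levels of analysis: a between-group comparison and a within-group comparison.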

Weirich and Simpson (2018a) extended this aspect by assessing self-reported gender identity in a group of 37 German straight men and women. They found that men with higher scores on the femininity/expressivity scale have higher mean f0 and larger vowel space sizes than men with lower scores, thereby highlighting the fact that, independent of sexual orientation, the gender identity (in terms of self-ascribed femininity/masculinity) of a speaker is reflected in speech. In general, there are indications that indexing sexual orientation in speech is more important for men than for women. Fasoli, Hegarty, Maass, and Antonio (2018) report that men, more than women, feel that their voice reveals their sexual orientation, mirroring a stronger stereotype concerning gay voices. They also suggest that the strong wish of heterosexual men to index sexual orientation arises because this group has “the most status to defend” (Fasoli et al., 2018, p. 62; see also Bosson & Michniewicz, 2013; Falomir-Pichastor & Mugny, 2009).

5.2 Attributing Sexual Orientation and Gender Identity

It is a very human thing to attribute non-linguistic factors – such as emotional state or personality characteristics – to a speaker based on their voice. Studies on perceived attractiveness, competence or benevolence have been conducted since the 1960s (Aronovitch, 1976; Kramer, 1964; Scherer, 1979). Research on perceived sexual orientation started in the early 1990s (e.g., Gaudio, 1994). It is worth noting that the accuracy of a rating is not the focus of these studies. Also, in attribution studies, stereotypes and shared beliefs of listeners (but also speakers) play a crucial role. Evaluations of a speaker’s sexual orientation often do not stand alone but are embedded in a broader concept of social attributions. For example, Munson et al. (2006) showed a relationship between attributions of height, speech clarity and sexual orientation (e.g., a male speaker who was perceived as tall and less clear was also rated as sounding straight).

While some studies directly ask listeners to identify the sexual orientation of various (either straight, gay or lesbian) speakers (e.g., Linville, 1998), it has become widely accepted to ask for ratings on a scale, not only regarding clearly gradient measures such as masculinity/femininity (Gaudio, 1994; Weirich & Simpson, 2018a), but also with respect to perceived sexual orientation (Munson et al., 2006; Smyth, Jacobs, & Rogers, 2003). Indeed, Munson et al. (2006) found significant differences in listeners’ sexual orientation ratings on a 5-point scale between straight and gay speakers of American English. However, one of the two speakers with the highest gay scores was actually straight. The perception of a male speaker as gay or bisexual was found to be triggered by the phonetic cues of higher formant frequencies and more negatively skewed /s/-spectra. In females, higher formant frequencies and larger vowel spaces were found to correlate with the perception of more-straight speech. Smyth et al. (2003) investigated the degree of sounding gay in 25 Canadian English male speakers. Phonetic cues that were found to correlate with the perception of gay-sounding speech were more peripheral vowels (Rogers, Smyth, & Jacobs, 2001), longer voice-onset-times in stops, longer sibilants with higher peaks and more alveolar variants of /l/ (Smyth & Rogers, 2002). Note though that fundamental frequency was not a correlate of perceived sexual orientation: low mean f0 was perceived as masculine but sometimes also as gay-sounding. Similarly, Levon (2006) found no relationship between variation in fundamental frequency and perceived sexual orientation in British English. Moreover, the study emphasizes the link between perceptions of sexuality and perceptions of personality. For Czech, an influence of fundamental frequency on perceived sexual orientation has been found for male listeners but not for female listeners (Valentova & Havlíček, 2013). For German, Kachel et al. (2018) showed that male speakers were perceived as straighter the lower their median f0. The relationship between fundamental frequency and perceived masculinity in men seems to be less ambiguous and has been found cross-linguistically (e.g., Avery & Liss, 1996; Munson, 2007, for English; Valentova & Havlíček, 2013, for Czech; Weirich & Simpson, 2018a, for German), pointing to a more robust influence of fundamental frequency on perceived masculinity than on sexual orientation. Of course, ratings of masculinity/femininity and sexual orientation have been shown to correlate (for English: Munson, 2007; for Czech: Valentova & Havlíček, 2013); however, correlations were much weaker for male speakers than for female speakers (Munson, 2007).

In the study of Weirich and Simpson (2018a), self-ascribed and perceived masculinity ratings of straight male speakers correlated, but correlations were higher for (straight) female listeners than for (straight) male listeners. Moreover, female listeners used the additional cue of higher F1 in /a/ to attribute masculinity (correctly). Thus, the perception of constructs such as masculinity but also of sexual orientation in speech is not only subjective but varies between certain listener groups, a point that stands out even more so when different languages and cultures are concerned, as described for the effect of fundamental frequency on perceived sexual orientation in men. In addition, the spectral characteristics of /s/ have been found to be relevant regarding men’s perceived sexual orientation in English (Mack & Munson, 2012; Munson et al., 2006), as well as Italian (Sulpizio et al., 2015), while for German contradictory results have been found (Kachel et al., 2018; Sulpizio et al., 2015). Similarly, the effect of vowel formants on perceived sexual orientation seems to be larger in English and German than in Italian (Munson et al., 2006; Rogers et al., 2001; Sulpizio et al., 2015; Weirich & Simpson, 2018a). Thus, questions remain regarding the generalizability of the phonetic parameters found across languages and cultures. What is clear, however, from the studies described, is that gender must not be considered as a binary variable but rather a continuum with nuanced variations between individuals who construct their social identities influenced by (time-dependent) learned behavior, culture- and language-specific factors and shared beliefs and stereotypes.

6. Concluding Remarks

Research on gender effects in speech has developed over the years in line with societal changes. For a long time, the focus was on the nature-nurture dichotomy, that is, the question of which speech parameters are influenced by which factors (learned, socialized or physiologically inevitable).

The physiologically grounded differences between male and female speakers were investigated using a wide range of methods from measuring the length of the vocal folds in excised larynxes, through the use of medical instruments such as X-ray, ultrasound, electromagnetic articulography, electroglottography and MRI to the development of articulatory models synthesizing speech.

More recently, behavioral aspects of constructing gender in speech have come into focus in many, often interdisciplinary, studies. The social meaning of gendered speech behavior includes a variety of social states and traits such as femininity/masculinity, being a hard-core gang girl or a gangster and “macho,” but also signaling education or urbanity. Thus, in line with developments in sociolinguistics regarding the Third Wave of Variation and the demand to interpret linguistic features as stylistic resources for constructing social meaning (Eckert, 2012), gender differences should be treated as reflecting a sociolinguistic style. Gendered speech parameters are part of a larger set of phonetic cues speakers use to construct and index social identity.

Research on gendered speech therefore requires an increasingly fine-grained approach to include various psycho-social aspects of the individual speaker. It needs to incorporate factors such as culture, experimenter gender, situated interaction, code-switching, style, attitudes, and stereotypes regarding gender-conforming speech and the fact that gender is not dichotomous but continuous and fluid.

Further Reading

  • Coates, J. (2015). Women, men and language: A sociolinguistic account of gender differences in language. London: Routledge.
  • Coates, J., & Pichler, P. (2011). Language and gender. A reader (2nd ed.). Chichester, West Sussex, UK; Malden, MA: Wiley-Blackwell.
  • Eckert, P., & Labov, W. (2017). Phonetics, phonology and social meaning. Journal of Sociolinguistics, 21(4), 467–496.
  • Munson, B., & Babel, M. (2019). The phonetics of sex and gender. In W. F. Katz & P. F. Assmann (Eds.), The Routledge handbook of phonetics (pp. 499–525). Taylor and Francis.
  • Simpson, A. P. (2009). Phonetic differences between male and female speech. Language and Linguistics Compass, 3(2), 621–640.


References

  • Aronovitch, C. D. (1976). The voice of personality: Stereotyped judgments and their relation to voice quality and sex of speaker. The Journal of Social Psychology, 99(2), 207–220.
  • Avery, J. D., & Liss, J. M. (1996). Acoustic characteristics of less-masculine-sounding male speech. Journal of the Acoustical Society of America, 99, 3738–3748.
  • Awan, S. N., & Awan, J. A. (2013). The effect of gender on measures of electroglottographic contact quotient. Journal of Voice, 27(4), 433–440.
  • Baeck, H., Corthals, P., & van Borsel, J. (2011). Pitch characteristics of homosexual males. Journal of Voice, 25(5), e211–e214.
  • Baynes, R. A. (1966). An incidence study of chronic hoarseness among children. Journal of Speech and Hearing Disorders, 31(2), 172–176.
  • Bennett, S. (1981). Vowel formant frequency characteristics of preadolescent males and females. Journal of the Acoustical Society of America, 69(1), 231–239.
  • Bennett, S., & Weinberg, B. (1979). Sexual characteristics of preadolescent children’s voices. Journal of the Acoustical Society of America, 65(1), 179–189.
  • Boe, L.-J., & Rakotofiringa, H. (1975). A statistical analysis of laryngeal frequency: Its relationship to intensity level and duration. Language and Speech, 18(1), 1–13.
  • Bosson, J. K., & Michniewicz, K. S. (2013). Gender dichotomization at the level of ingroup identity: What it is, and why men use it more than women. Journal of Personality and Social Psychology, 105(3), 425.
  • Brockmann-Bauser, M., Beyer, D., & Bohlender, J. E. (2015). Reliable acoustic measurements in children between 5;0 and 9;11 years: Gender, age, height and weight effects on fundamental frequency, jitter and shimmer in phonations without and with controlled voice SPL. International Journal of Pediatric Otorhinolaryngology, 79(12), 2035–2042.
  • Busby, P. A., & Plant, G. L. (1995). Formant frequency values of vowels produced by preadolescent boys and girls. Journal of the Acoustical Society of America, 97, 2603–2607.
  • Byrd, D. (1992). Sex, dialects and reduction. UCLA Working Papers in Phonetics, 81, 26–33.
  • Camp, M. (2009). Japanese lesbian speech: Sexuality, gender identity, and language (PhD thesis). The University of Arizona.
  • Cartei, V., Cowles, W., Banerjee, R., & Reby, D. (2014). Control of voice gender in pre-pubertal children. British Journal of Developmental Psychology, 32(1), 100–106.
  • Cartei, V., Garnham, A., Oakhill, J., Banerjee, R., Roberts, L., & Reby, D. (2019). Children can control the expression of masculinity and femininity through the voice. Royal Society Open Science, 6(7), 190656.
  • Curtin, S., & Kiesling, S. F. (2004). Cues to gender in children’s speech. Journal of the Acoustical Society of America, 115, 2607.
  • Damrose, E. J. (2009). Quantifying the impact of androgen therapy on the female larynx. Auris Nasus Larynx, 36(1), 110–112.
  • Diehl, R. L., Lindblom, B., Hoemeke, K. A., & Fahey, R. P. (1996). On explaining certain male-female differences in the phonetic realization of vowel categories. Journal of Phonetics, 24, 187–208.
  • Dolson, M. (1994). The pitch of speech as a function of linguistic community. Music Perception: An Interdisciplinary Journal, 11(3), 321–331.
  • Eckert, P. (2008). Variation and the indexical field. Journal of Sociolinguistics, 12(4), 453–476.
  • Eckert, P. (2012). Three waves of variation study: The emergence of meaning in the study of sociolinguistic variation. Annual Review of Anthropology, 41, 87–100.
  • Falomir-Pichastor, J. M., & Mugny, G. (2009). “I’m not gay. . . . I’m a real man!”: Heterosexual men’s gender self-esteem and sexual prejudice. Personality and Social Psychology Bulletin, 35(9), 1233–1243.
  • Fant, G. (1966). A note on vocal tract size factors and non-uniform F-pattern scalings. STL-QPSR, 4, 22–30.
  • Fant, G. (1975). Non-uniform vowel normalization. STL-QPSR, 2–3, 1–19.
  • Fasoli, F., Hegarty, P., Maass, A., & Antonio, R. (2018). Who wants to sound straight? Sexual majority and minority stereotypes, beliefs and desires about auditory gaydar. Personality and Individual Differences, 130, 59–64.
  • Ferrand, C. T. (2000). Harmonics-to-noise ratios in normally speaking prepubescent girls and boys. Journal of Voice, 14(1), 17–21.
  • Ferrand, C. T., & Bloom, R. L. (1996). Gender differences in children’s intonational patterns. Journal of Voice, 10(3), 284–291.
  • Fitch, W. T., & Giedd, J. (1999). Morphology and development of the human vocal tract: A study using magnetic resonance imaging. Journal of the Acoustical Society of America, 106(3), 1511–1522.
  • Foulkes, P., Docherty, G., & Watt, D. (2005). Phonological variation in child-directed speech. Language, 81(1), 177–206.
  • Fuchs, S., Winkler, R., & Perrier, P. (2008). Do speakers’ vocal tract geometries shape their articulatory vowel space? In R. Sock, S. Fuchs, & Y. Laprie (Eds.), 8th International Seminar on Speech Production (ISSP 2008) (pp. 333–336). Strasbourg: INRA.
  • Gaudio, R. P. (1994). Sounding gay: Pitch properties in the speech of gay and straight men. American Speech, 69, 30–57.
  • Glaze, L. E., Bless, D. M., Milenkovic, P., & Susser, R. D. (1988). Acoustic characteristics of children’s voice. Journal of Voice, 2(4), 312–319.
  • Goldstein, U. (1980). An articulatory model for the vocal tracts of growing children (PhD thesis). Massachusetts Institute of Technology.
  • Günzburger, D., Bresser, A., & ter Keurs, M. (1987). Voice identification of prepubertal boys and girls by normally sighted and visually handicapped subjects. Language and Speech, 30, 47–58.
  • Guzman, M., Muñoz, D., Vivero, M., Marín, N., Ramírez, M., Trinidad Rivera, M., . . . & González, C. (2014). Acoustic markers to differentiate gender in prepubescent children’s speaking and singing voice. International Journal of Pediatric Otorhinolaryngology, 78(10), 1592–1598.
  • Hancock, A., Colton, L., & Douglas, F. (2014). Intonation and gender perception: Applications for transgender speakers. Journal of Voice, 28(2), 203–209.
  • Hasek, C. S., Singh, S., & Murry, T. (1980). Acoustic attributes of children’s voices. Journal of the Acoustical Society of America, 68, 1262–1265.
  • Heffernan, K. (2010). Mumbling is macho: Phonetic distinctiveness in the speech of American radio DJs. American Speech, 85, 67–90.
  • Henton, C. G. (1995). Cross-language variation in the vowels of female and male speakers. In K. Elenius & P. Branderud (Eds.), Proc. XIIIth ICPhS (vol. 4, pp. 420–423). Stockholm: KTH & Stockholm University.
  • Henton, C. G., & Bladon, R. A. W. (1985). Breathiness in normal female speech: Inefficiency versus desirability. Language and Communication, 5, 221–227.
  • Henton, C. G., & Bladon, R. A. W. (1988). Creak as a sociophonetic marker. In L. M. Hyman & C. N. Li (Eds.), Language, Speech and Mind: Studies in Honour of Victoria A. Fromkin (pp. 3–29). London, UK: Routledge.
  • Hillenbrand, J., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97, 3099–3111.
  • Hirano, M. (1983). The structure of the vocal folds. In K. N. Stevens & M. Hirano (Eds.), Vocal fold physiology (pp. 33–43). Tokyo, Japan: University of Tokyo.
  • Hirano, M., Kurita, S., & Nakashima, T. (1983). Growth, development, and aging of human vocal folds. In D. M. Bless & J. H. Abbs (Eds.), Vocal fold physiology: Contemporary research and clinical issues (pp. 22–43). San Diego, CA: College-Hill Press.
  • Honda, K., & Tiede, M. K. (1998). An MRI study on the relationship between oral cavity shape and larynx position. In Proceedings of the 5th International Conference on Spoken Language Processing 2: 437–40, Sydney: ISCA.
  • Kachel, S., Simpson, A. P., & Steffens, M. (2017). Acoustic correlates of sexual orientation and gender-role self-concept in women’s speech. Journal of the Acoustical Society of America, 141(6), 4793–4809.
  • Kachel, S., Simpson, A. P., & Steffens, M. (2018). “Do I sound straight?”: Acoustic correlates of actual and perceived sexual orientation and masculinity/femininity in men’s speech. Journal of Speech, Language and Hearing Research, 61, 1560–1578.
  • Kahane, J. C. (1978). A morphological study of the human prepubertal and pubertal larynx. American Journal of Anatomy, 151(1), 11–19.
  • Kahane, J. C. (1982). Growth of the human prepubertal and pubertal larynx. Journal of Speech, Language and Hearing Research, 25(3), 446–455.
  • Kallvik, E., Lindström, E., Holmqvist, S., Lindman, J., & Simberg, S. (2015). Prevalence of hoarseness in school-aged children. Journal of Voice, 29(2), 260–e1.
  • Karlsson, I., & Rothenberg, M. (1992). Inter-cultural variations in gender-based language differences in young children. STL-QPSR, 1, 1–17.
  • Kaya, H., Salah, A. A., Karpov, A., Frolova, O., Grigorev, A., & Lyakso, E. (2017). Emotion, age, and gender classification in children’s speech by humans and machines. Computer Speech and Language, 46, 268–283.
  • Kinsey, A. C., Pomeroy, W. B., & Martin, C. E. (1948). Sexual behavior in the human male. Philadelphia, PA: Saunders.
  • Kite, M. E., & Deaux, K. (1987). Gender belief systems: Homosexuality and the implicit inversion theory. Psychology of Women Quarterly, 11(1), 83–96.
  • Klein, C. (2005). Acoustic and perceptual gender characteristics in the voices of pre-adolescent children. Phonus (Forschungsbericht Institut für Phonetik, Universität des Saarlandes), 9, 221–329.
  • Koenig, L. (2000). Laryngeal factors in voiceless consonant production in men, women, and 5-year-olds. Journal of Speech, Language and Hearing Research, 43, 1211–1228.
  • Kramer, E. (1964). Personality stereotypes in voice: A reconsideration of the data. The Journal of Social Psychology, 62(2), 247–251.
  • Labov, W. (1990). The intersection of sex and social class in the course of linguistic change. Language Variation and Change, 2, 205–254.
  • Ladefoged, P., & Broadbent, D. E. (1957). Information conveyed by vowels. Journal of the Acoustical Society of America, 39, 65–88.
  • Lee, S., Potamianos, A., & Narayanan, S. (1999). Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. The Journal of the Acoustical Society of America, 105(3), 1455–1468.
  • Lerman, J. W., & Damsté, P. H. (1969). Voice pitch of homosexuals. Folia Phoniatrica et Logopaedica, 21(5), 340–346.
  • Levon, E. (2006). Hearing “gay”: Prosody, interpretation, and the affective judgments of men’s speech. American Speech, 81(1), 56–78.
  • Lieberman, P. (1967). Intonation, perception, and language. Cambridge, MA: MIT.
  • Linville, S. E. (1998). Acoustic correlates of perceived versus actual sexual orientation in men’s speech. Folia Phoniatrica et Logopaedica, 50(1), 35–48.
  • Lundeborg Hammarström, I., Larsson, M., Wiman, S., & McAllister, A. M. (2012). Voice onset time in Swedish children and adults. Logopedics Phoniatrics Vocology, 37(3), 117–122.
  • Mack, S., & Munson, B. (2012). The influence of /s/ quality on ratings of men’s sexual orientation: Explicit and implicit measures of the ‘gay lisp’ stereotype. Journal of Phonetics, 40(1), 198–212.
  • Martins, R. H. G., Hidalgo Ribeiro, C. B., Zeponi Fernandes de Mello, B. M., Branco, A., & Mendes Tavares, E. L. (2012). Dysphonia in children. Journal of Voice, 26(5), 674.e17–e20.
  • McCormack, P. F., & Knighton, T. (1996). Gender differences in the speech patterns of two and a half year old children. In P. McCormack & A. Russell (Eds.), Proceedings Sixth Australian International Conference on Speech Science and Technology (pp. 337–342). Adelaide: Australian Speech, Science and Technology Association.
  • Mendoza-Denton, N. (2011). The semiotic hitchhiker’s guide to creaky voice: Circulation and gendered hardcore in a Chicana/o gang persona. Journal of Linguistic Anthropology, 21(2), 261–280.
  • Mennen, I., Schaeffler, F., & Docherty, G. (2012). Cross-language differences in fundamental frequency range: A comparison of English and German. Journal of the Acoustical Society of America, 131(3), 2249–2260.
  • Moonwomon-Baird, B. (1997). Toward a study of lesbian speech. In A. Livia & K. Hall (Eds.), Queerly phrased. Language, gender, and sexuality (pp. 202–213). New York, NY: Oxford University Press.
  • Munson, B. (2007). The acoustic correlates of perceived masculinity, perceived femininity, and perceived sexual orientation. Language and Speech, 50(1), 125–142.
  • Munson, B. (2015). Variation in /s/ and the perceived gender typicality of children’s speech (paper 0624). In The Scottish Consortium for ICPhS 2015 (Ed.), Proceedings XVIIIth ICPhS. Glasgow: University of Glasgow.
  • Munson, B., & Babel, M. (2007). Loose lips and silver tongues. Language and Linguistics Compass, 1(5), 416–449.
  • Munson, B., Crocker, L., Pierrehumbert, J. B., Owen-Anderson, A., & Zucker, K. J. (2015). Gender typicality in children’s speech: A comparison of boys with and without gender identity disorder. Journal of the Acoustical Society of America, 137(4), 1995–2003.
  • Munson, B., McDonald, E. C., DeBoe, N. L., & White, A. R. (2006). The acoustic and perceptual bases of judgments of women and men’s sexual orientation from read speech. Journal of Phonetics, 34, 202–240.
  • Nairn, M. J. (1995). The perception of gender differences in the speech of 4½–5½ year old children (Vol. 2, pp. 302–305). In K. Elenius & P. Branderud (Eds.), Proceedings XIIIth ICPhS. Stockholm: KTH & Stockholm University.
  • Nordström, P.-E. (1975). Attempts to simulate female and infant vocal tracts from male area functions. STL-QPSR, 16(2–3), 20–33.
  • Nordström, P.-E. (1977). Female and infant vocal tracts simulated from male area functions. Journal of Phonetics, 5, 81–92.
  • Ordin, M., & Mennen, I. (2017). Cross-linguistic differences in bilinguals’ fundamental frequency ranges. Journal of Speech, Language, and Hearing Research, 60(6), 1493–1506.
  • Papp, V. (2011). Speaker gender: Physiology, performance and perception (Unpublished doctoral dissertation). Rice University.
  • Perry, T. L., Ohde, R. N., & Ashmead, D. H. (2001). The acoustic bases for gender identification from children’s voices. Journal of the Acoustical Society of America, 109(6), 2988–2998.
  • Peterson, G. E., & Barney, H. L. (1952). Control methods used in the study of vowels. Journal of the Acoustical Society of America, 24, 175–184.
  • Pettinato, M., Tuomainen, O., Granlund, S., & Hazan, V. (2016). Vowel space area in later childhood and adolescence. Effects of age, sex and ease of communication. Journal of Phonetics, 54, 1–14.
  • Pierrehumbert, J. B., Bent, T., Munson, B., Bradlow, A. R., & Bailey, J. M. (2004). The influence of sexual orientation on vowel production (L). Journal of the Acoustical Society of America, 116, 1905–1908.
  • Podesva, R. J. (2007). Phonation type as a stylistic variable: The use of falsetto in constructing a persona. Journal of Sociolinguistics, 11(4), 478–504.
  • Podesva, R. J., & Van Hofwegen, J. (2014). How conservatism and normative gender constrain variation in inland California: The case of /s/. University of Pennsylvania Working Papers in Linguistics, 20(2), 15.
  • Podesva, R. J. (2006). Phonetic detail in sociolinguistic variation: Its linguistic significance and role in the construction of social meaning (Unpublished doctoral dissertation). Stanford, CA: Stanford University.
  • Rendall, D., Vasey, P. L., & McKenzie, J. (2008). The queen’s English: An alternative, biosocial hypothesis for the distinctive features of “gay speech.” Archives of Sexual Behavior, 37(1), 188–204.
  • Robb, M. P., & Simmons, J. O. (1990). Gender comparisons of children’s vocal fold contact behavior. Journal of the Acoustical Society of America, 88, 1318–1322.
  • Rogers, H., Smyth, R., & Jacobs, G. (2001). Vowel reduction as a cue to distinguishing gay- and straight-sounding male speech. In J. T. Jensen & G. van Herk (Eds.), Proceedings of the 2001 Annual Conference of the Canadian Linguistics Society (pp. 167–176). Ottawa, ON: University of Ottawa.
  • Romeo, R., Hazan, V., & Pettinato, M. (2013). Developmental and gender-related trends of intra-talker variability in consonant production. Journal of the Acoustical Society of America, 134(5), 3781–3792.
  • Sachs, J., Lieberman, P., & Erickson, D. (1973). Anatomical and cultural determinants of male and female speech. In R. W. Shuy & R. W. Fasold (Eds.), Language attitudes (pp. 74–83). Washington, DC: Georgetown University Press.
  • Scherer, K. R. (1979). Personality markers in speech. Cambridge, UK: Cambridge University Press.
  • Sederholm, E. (1995). Prevalence of hoarseness in ten-year-old children. Scandinavian Journal of Logopedics and Phoniatrics, 20(4), 165–173.
  • Senturia, B. H., & Wilson, F. B. (1968). Otorhinolaryngic findings in children with voice deviations: Preliminary report. Annals of Otology, Rhinology & Laryngology, 77(6), 1027–1041.
  • Simpson, A. P. (1998). Phonetische Datenbanken des Deutschen in der empirischen Sprachforschung und der phonologischen Theoriebildung (Work report 33). Instituts für Phonetik und digitale Sprachverarbeitung der Universität Kiel (AIPUK).
  • Simpson, A. P. (2001). Dynamic consequences of differences in male and female vocal tract dimensions. Journal of the Acoustical Society of America, 109(5), 2153–2164.
  • Simpson, A. P. (2002). Gender-specific articulatory-acoustic relations in vowel sequences. Journal of Phonetics, 30(3), 417–435.
  • Simpson, A. P., & Ericsdotter, C. (2007). Sex-specific differences in f0 and vowel space. In W. J. Barry & J. Trouvain (Eds.), Proceedings XVIth ICPhS (pp. 933–936). Saarbrücken: Universität des Saarlandes.
  • Simpson, A. P., Funk, R., & Palmer, F. (2017). Perceptual and acoustic correlates of gender in the prepubertal voice. In Interspeech 2017 (pp. 914–918). Stockholm: ISCA.
  • Smyth, R., Jacobs, G., & Rogers, H. (2003). Male voices and perceived sexual orientation: An experimental and theoretical approach. Language in Society, 32, 329–350.
  • Smyth, R., & Rogers, H. (2002). Phonetics, gender, and sexual orientation. In S. Burelle & Somesfalean (Eds.), Proceedings of the 2002 Annual Conference of the Canadian Linguistic Association (pp. 299–301). Montréal: Dept. de linguistique et de didactique des langues, Université du Québec à Montréal.
  • Södersten, M., & Lindestad, P. A. (1990). Glottal closure and perceived breathiness during phonation in normally speaking subjects. Journal of Speech and Hearing Research, 33(3), 601–611.
  • Stevens, K. N. (1998). Acoustic phonetics. Cambridge, MA: MIT Press.
  • Su, M.-C., Yeh, T. H., Tan, C. T., Lin, C. D., Linne, O. C., & Lee, S. Y. (2002). Measurement of adult vocal fold length. The Journal of Laryngology & Otology, 116(6), 447–449.
  • Sulpizio, S., Fasoli, F., Maass, A., Paladino, M. P., Vespignani, F., Eyssel, F., & Bentler, D. (2015). The sound of voice: Voice-based categorization of speakers’ sexual orientation within and across languages. PLoS One, 10(7), e0128882. doi: 10.1371/journal.pone.0128882
  • Takefuta, Y., Jancosek, E. G., & Brunt, M. (1972). A statistical analysis of melody curves in the intonation of American English. In A. Rigault & R. Charbonneau (Eds.), Proceedings VIIth ICPhS, Montréal (pp. 1035–1039). Paris, La Haye: Mouton.
  • Titze, I. R. (1989). Physiologic and acoustic differences between male and female voices. Journal of the Acoustical Society of America, 85, 1699–1707.
  • Traunmüller, H., & Eriksson, A. (1995). The frequency range of the voice fundamental in the speech of male and female adults (Unpublished manuscript).
  • Valentova, J. V., & Havlíček, J. (2013). Perceived sexual orientation based on vocal and facial stimuli is linked to self-rated sexual orientation in Czech men. PLoS One, 8(12), e82417.
  • van Bezooijen, R. (1995). Sociocultural aspects of pitch differences between Japanese and Dutch women. Language and Speech, 38(3), 253–265.
  • van Borsel, J., de Cuypere, G., Rubens, R., & Destaerke, B. (2000). Voice problems in female-to-male transsexuals. International Journal of Language & Communication Disorders, 35, 427–442.
  • van Borsel, J., Vandaele, J., & Corthals, P. (2013). Pitch and pitch variation in lesbian women. Journal of Voice, 27(5), 656.e13–656.e16.
  • Vorperian, H. K., Wang, S., Schimek, E. M., Durtschi, R. B., Kent, R. D., Gentry, L. R., & Chung, M. K. (2011). Developmental sexual dimorphism of the oral and pharyngeal portions of the vocal tract: An imaging study. Journal of Speech, Language, and Hearing Research, 54, 995–1010.
  • Waksler, R. (2001). Pitch range and women’s sexual orientation. Word, 52(1), 69–77.
  • Weirich, M., Fuchs, S., Simpson, A. P., Winkler, R., & Perrier, P. (2016). Mumbling: Macho or morphology? Journal of Speech, Language, and Hearing Research, 59(6), S1587–S1595.
  • Weirich, M., & Simpson, A. P. (2013). Investigating the relationship between average speaker fundamental frequency and acoustic vowel space size. Journal of the Acoustical Society of America, 134(4), 2965–2974.
  • Weirich, M., & Simpson, A. P. (2014a). Differences in acoustic vowel space and the perception of speech tempo. Journal of Phonetics, 43, 1–10.
  • Weirich, M., & Simpson, A. P. (2014b). Impact and interaction of accent realization and speaker sex on vowel length in German. In A. Leemann, M.-J. Kolly, V. Dellwo, & S. Schmid (Eds.), Trends in phonetics and phonology in German-speaking Europe. Frankfurt, Germany: Peter Lang.
  • Weirich, M., & Simpson, A. P. (2015). Gender-specific differences in sibilant contrast realizations in English and German. In The Scottish Consortium for ICPhS 2015 (Ed.), Proceedings XVIIIth ICPhS (paper 0261). Glasgow: University of Glasgow.
  • Weirich, M., & Simpson, A. P. (2018a). Gender identity is indexed and perceived in speech. PLoS One, 13(12), e0209226.
  • Weirich, M., & Simpson, A. P. (2018b). Individual differences in acoustic and articulatory undershoot in a German diphthong: Variation between male and female speakers. Journal of Phonetics, 71, 35–50.
  • Weirich, M., Simpson, A. P., Öjbro, J., & Ericsdotter Nordgren, C. (2019). The phonetics of gender in Swedish and German. In FONETIK 2019 (pp. 49–53). Stockholm: Stockholm University.
  • Westbury, J. R. (1994). X-ray microbeam speech production database user’s handbook, version 1.0. Madison, WI.
  • Whiteside, S. P. (2001). Sex-specific fundamental and formant frequency patterns in a cross-sectional study. Journal of the Acoustical Society of America, 110, 464–478.
  • Whiteside, S. P., Henry, L., & Dobbin, R. (2004). Sex differences in voice onset time: A developmental study of phonetic context effects in British English. Journal of the Acoustical Society of America, 116, 1179–1183.
  • Whiteside, S. P., & Hodgson, C. (1999). Acoustic characteristics in 6–10-year-old children’s voices: Some preliminary findings. Logopedics Phoniatrics Vocology, 24, 6–13.
  • Whiteside, S. P., & Marshall, J. (2001). Developmental trends in Voice Onset Time: Some evidence for sex differences. Phonetica, 58, 196–210.
  • Winkler, R., Fuchs, S., & Perrier, P. (2006). The relation between differences in vocal tract geometry and articulatory control strategies in the production of French vowels: Evidence from MRI and modelling. In H. C. Yehia, D. Demolin, & R. Laboissière (Eds.), Proceedings of the 7th International Seminar on Speech Production (pp. 509–516). Ubatuba, Brazil: CEFALA.
  • Yairi, E., Horton Currin, L., Bulian, N., & Yairi, J. (1974). Incidence of hoarseness in school children over a 1-year period. Journal of Communication Disorders, 7(4), 321–328.
  • Yuasa, I. P. (2010). Creaky voice: A new feminine voice quality for young urban-oriented upwardly mobile American women? American Speech, 85(3), 315–337.
  • Zimman, L. (2013). Hegemonic masculinity and the variability of gay-sounding speech: The perceived sexuality of transgender men. Journal of Language and Sexuality, 2(1), 1–39.
  • Zimman, L. (2017). Gender as stylistic bricolage: Transmasculine voices and the relationship between fundamental frequency and /s/. Language in Society, 46(3), 339–370.


  • 1. The article contains an exhaustive tabular summary of previous findings, including not only f0 values but also elicitation materials and the instrumental techniques used to measure f0.

  • 2. The study of hoarseness in children’s voices has a long tradition, and good overviews are provided in Sederholm (1995) and Kallvik, Lindström, Holmqvist, Lindman, and Simberg (2015).

  • 3. It should be said that our interpretation of the results in Robb and Simmons (1990) is at odds with theirs: they treat the higher Qc found for the boys’ /a/ as meaning that the glottis is open for longer in the boys than in the girls.