PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, LINGUISTICS (© Oxford University Press USA, 2019. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 22 November 2019

The Motor Theory of Speech Perception

Summary and Keywords

The Motor Theory of Speech Perception is a proposed explanation of the fundamental relationship between the way speech is produced and the way it is perceived. Associated primarily with the work of Liberman and colleagues, it posited the active participation of the motor system in the perception of speech. Early versions of the theory contained elements that later proved untenable, such as the expectation that the neural commands to the muscles (as seen in electromyography) would be more invariant than the acoustics. Support drawn from categorical perception (in which discrimination is quite poor within linguistic categories but excellent across boundaries) was called into question by studies showing means of improving within-category discrimination and finding similar results for nonspeech sounds and for animals perceiving speech. Evidence for motor involvement in perceptual processes nonetheless continued to accrue, and related motor theories have been proposed. Neurological and neuroimaging results have yielded a great deal of evidence consistent with variants of the theory, but they highlight the issue that there is no single “motor system,” and so different components appear in different contexts. Assigning the appropriate amount of effort to the various systems that interact to result in the perception of speech is an ongoing process, but it is clear that some of the systems will reflect the motor control of speech.

Keywords: Motor Theory of Speech Perception, speech perception, speech production, Haskins Laboratories, brain imaging

1. Early Indications

Finding a connection to the motor system when they were studying speech perception was a surprise to its discoverers: “The occasional complexity of the relation between articulation and the resulting sound wave is, for the most part, a nuisance, but it does provide us with a rare opportunity to ask this interesting question: when articulation and sound wave go their separate ways, which way does the perception go? The answer so far is clear. The perception always goes with articulation” (Liberman, 1957, p. 121). The studies that led to this conclusion were the result of an effort to devise a reading machine for the blind at Haskins Laboratories (Liberman, 1996; Shankweiler & Fowler, 2015). The initial expectation for the reading machine (developed before optical character recognition and speech synthesis by rule were available) was that discrete acoustic signals could substitute for letters and that text presented in this acoustic alphabet would enable acoustic “reading.” With great effort, this goal was, to a certain extent, achieved, but with “reading” rates that were no faster than could be obtained (with much less work) with Morse code. Speech, it turned out, is a remarkably efficient way to convey the language being written.

Developing reading machines in which text was transformed to synthetic speech required optical character recognition, spelling-to-pronunciation rules, and speech synthesis from the resulting pronunciations. Many aspects of that goal were eventually reached at Haskins Laboratories, resulting in a system that, though functional, was not marketable (Shankweiler & Fowler, 2015, p. 90). However, research designed to understand how speech efficiently conveys language to perceivers did ensue from that effort. Researchers found that there is no one-to-one correspondence between phonemes and acoustic patterns, and trying to synthesize speech from a fixed set of sounds was almost entirely unsuccessful. Because speakers produce phonemes in overlapping time frames, a process dubbed “coarticulation,” there is massive context sensitivity in the speech wave and no units in the signal corresponding to individual phonemes. By itself, of course, this does not help to explain why natural speech efficiently conveys language to listeners or why attempts to substitute an acoustic alphabet for the natural speech signal failed. Research seeking an explanation led Liberman and colleagues to propose their Motor Theory of Speech Perception.

The discrepancies between the acoustics as specified in the synthetic signals and the linguistic categories perceived by listeners were greatly reduced when articulation was taken as the basis for perception. The extreme differences between the second formant (F2) transitions for /di/ and /du/, with /di/ high in frequency and rising and /du/ low in frequency and falling sharply (Liberman, Delattre, Cooper, & Gerstman, 1954), were quite unexpected. Listeners’ hearing both consonants as /d/ made sense only in relation to the consistency of the place of articulation of the stops. Similarly, the finding that a single frequency of a stop burst could cue /p/ in /i/ and /u/ contexts but /k/ in /ɑ/ contexts (Cooper, Delattre, Liberman, Borst, & Gerstman, 1952) made sense if the overlap of speech articulators was accounted for, but it made very little sense if isolable components of the acoustics were primary cues. The Motor Theory arose as a way to explain such data, not because of any first principles that might have been adduced.

Liberman provided formal statements of a motor theory at a meeting of the Acoustical Society of America in 1956 (Liberman, 1957) and at the Speech Communication Seminar held in Stockholm in 1962 (Liberman, Cooper, Harris, & MacNeilage, 1962). This early motor theory reflected Liberman’s training as a behaviorist, assuming a learning process in which acoustic speech signals were associated with the articulations that produced them. Somehow the articulations underlay the listener’s perceptual experience.

Another line of evidence, which would result in some of the most contentious debates, was that of categorical perception: Discrimination was found to be good when two sounds signaled different speech categories (and different articulations) but poor when they did not (Liberman, Harris, Hoffman, & Griffith, 1957). For stop place of articulation, the lack of intermediate articulations between /b/ (lips), /d/ (tongue tip), and /g/ (tongue body) corresponds well with a reference of the perception to the articulation (Liberman et al., 1962, p. 4). This report also proposed that the articulatory invariance would be found at the level of motor commands to the muscles, not the level of articulator movement.
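The logic linking identification to discrimination can be sketched quantitatively. In the classic Haskins-model analysis associated with Liberman et al. (1957), listeners are assumed to discriminate two stimuli only insofar as they label them differently; otherwise they guess. A minimal Python sketch (the continuum, boundary, and slope values here are invented purely for illustration):

```python
import math

def identification(stimulus, boundary=4.0, slope=2.5):
    """Probability of one category label along an 8-step continuum
    (a logistic identification function; parameters are illustrative)."""
    return 1.0 / (1.0 + math.exp(-slope * (stimulus - boundary)))

def predicted_discrimination(s1, s2):
    """Haskins-model prediction for an ABX task: correct discrimination
    arises only from differential labeling, plus chance guessing:
    P(correct) = 0.5 + 0.5 * (p1 - p2) ** 2."""
    p1, p2 = identification(s1), identification(s2)
    return 0.5 + 0.5 * (p1 - p2) ** 2

# One-step pairs along the continuum: predicted discrimination peaks
# at the category boundary and falls to near chance within categories.
for step in range(1, 8):
    print(step, round(predicted_discrimination(step, step + 1), 3))
```

Under these assumptions, discrimination of pairs straddling the boundary is well above chance, while within-category pairs sit near 0.5, reproducing the categorical pattern described above.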

2. Main Statement (1967), Revision (1985), and Controversy

Like many psychologists, Liberman and colleagues were influenced by the cognitive revolution of the late 1950s and 1960s, and the Motor Theory underwent a corresponding major revision. In one of the most often cited articles in the phonetics literature, Liberman, Cooper, Shankweiler, and Studdert-Kennedy (1967) laid out the main formulation of the new Motor Theory. The overarching goal was to find something invariant underlying the quite variable acoustic signal that might explain listeners’ perception of segments. Because production matched the percept better than the acoustics, the theory was aimed at finding invariance in speech production, along with active participation of the production system in the perception of speech itself. As such, the phonemes of a language were considered to be “encoded” (Liberman et al., 1967, p. 431) in the acoustic signal. A code is necessary, because a “cipher,” in which each phoneme has an invariant form, simply is not supported by the perceptual data.

The main components of the theory were that:

  1. Hearers are also speakers; it would be odd for the two systems to be unrelated.

  2. Both production and perception systems are activated when either speaking or listening.

  3. The variability found at the acoustic level for the “same” phonetic segment will be minimal at the level of motor commands for producing the segment.

  4. In recruiting the motor system in the service of speech perception, speech is a special system. It is a component of the human specialization for language.

Evidence for the theory included two findings mentioned in Section 1: mismatches between phonemes and their acoustic correlates coupled with matches between phonemes and their articulatory correlates, and categorical perception. Another consideration was the poor temporal resolving power of the ear. If discrete, acoustically distinct sounds are presented to the ear at the rate at which phonemes typically occur, the result is an unanalyzable buzz (Liberman et al., 1967, p. 432). If speech perception depends on production, it follows that it should be achieved via a special system, separate from ordinary audition, one that can decode effects of coarticulatory overlap on the acoustic signal. The large time scale over which coarticulation occurs (Fowler, 1980; Magen, 1989; Whalen, 1990) further supports a specialized system that can attribute overlapping information to several articulatory gestures (e.g., Fowler & Smith, 1986). There is room, however, for “unencoded aspects of speech” (Liberman et al., 1967, p. 451) to be treated as typical auditory signals.

Early evidence interpreted as favoring invariance of motor commands was not supported by later studies. For example, MacNeilage (1970) re-examined some of his own earlier EMG (electromyography) data and found more variability than was initially reported. That paper also surveys other EMG studies that showed variability rather than the assumed invariance. It became clear that motor-command invariance was not present, so constancy had to be sought elsewhere.

Categorical perception (CP) was found for certain nonspeech categories, and some speech continua did not show categorical perception, weakening the argument that CP by itself showed a speech specialization (see articles in Harnad, 1987). For example, the auditory rise time distinguishing “pluck” from “bow” sounds on a violin was initially found to be categorically perceived (Cutting & Rosner, 1974), though this turned out to be an artifact of the stimuli (Rosen & Howell, 1981). Color has also been claimed to elicit categorical perception (Bornstein & Korda, 1985), although the necessary discrimination tests were not performed. Some speech continua, such as Voice Onset Time (VOT; Lisker & Abramson, 1964), yielded categorical perception in some studies (e.g., Abramson & Lisker, 1970) but fairly continuous perception in others (e.g., Carney, Widin, & Viemeister, 1977). Some animals, such as chinchillas (Kuhl & Miller, 1978) and budgerigars (Dooling, Okanoya, & Brown, 1989), also exhibited sharp boundaries that matched human ones, suggesting category formation if not categorical perception. Steady-state vowels are typically shown to yield continuous perception (e.g., Fry, Abramson, Eimas, & Liberman, 1962). The vowel effect is compatible with a use of articulation in perception, given that intermediate vowel shapes are quite possible, even if there are preferences for the articulations near the center of each vowel’s configuration. For VOT as well, there is some evidence that there are gradations along the continuum for individuals (Allen, Miller, & DeSteno, 2003) and languages (Cho & Ladefoged, 1999). Although such counterexamples to the articulatory involvement in categorical perception are often taken as refuting the Motor Theory, the main effect that was used as evidence—the categorical perception of stop place—continues to be replicated and thus supports the theory.

The special status of speech perception is, in the 1967 paper, explicitly linked to the recruitment of articulation in perception. Just where that link is to be found is not entirely clear, though it was explicitly placed above the periphery in footnote 30: “We should suppose that the links between perception and articulation exist at relatively high levels of the nervous system. For information about, or reference to, motor activity, the experienced organism need not rely—at least not very heavily and certainly not exclusively—on proprioceptive returns from the periphery, for example, from muscular contractions” (pp. 452–453). Because the periphery is more easily measured than the central functions, this component of the theory could not be tested with techniques available at the time. Further, there is ambiguity about just how high a level would still count as “motoric.” This issue, and the ambiguity of the term “special” in general, would be debated for some time.

Two decades after the 1967 paper, Liberman and Mattingly (1985) revised the theory. They recast it in terms of the notion of modules (Fodor, 1983), which were claimed to have (to a large extent) eight properties: Being domain specific rather than generally cognitive; being encapsulated (that is, not referring to other processes); obligatorily completing their task; being fast; having “shallow” (simple) outputs; being relatively impenetrable to other functions (not allowing the intermediate steps to be accessed); having a regular path in a child’s development; and having a fixed neural architecture. Adopting modularity allowed the special nature of speech to have more theoretical content. For instance, it made possible a comparison between a speech module and other modules, for example, a module for auditory localization; in localization, it is clear that the peripheral signal (interaural time differences) is far removed from the percept (a location in space). In the case of speech, the module takes in complex acoustic patterns and transforms them into linguistic gestures. Unfortunately, the move also brought the controversy surrounding modularity to bear on the Motor Theory as well (see, e.g., Fodor, 2000; Pinker, 1997). Nonetheless, the revisions to the theory included identifying speech “gestures” both as linguistically significant speech actions caused by invariant motor commands and as objects of speech perception; asserting that the perception is of the “intended” gestures (that is, the gestures in their uncoarticulated form; Liberman & Mattingly, 1985, p. 23); and positing that infants are born with a sensitivity to what is a possible phonetic gesture (p. 24).

Further evidence for the revised motor theory was found in (1) listeners’ parsing of coarticulated speech to perceptually separate information for temporally overlapping gestures (e.g., Fowler, 1984; Whalen, 1981); (2) listeners’ integration of gestural information when it is presented piecemeal to the two ears (“duplex perception”; Rand, 1974); (3) perceivers’ integration of cross-modal information for speech gestures (McGurk & MacDonald, 1976); and (4) listeners’ hearing a signal as speech even though it contains no overt indicators that it is, in fact, speech (the only way to perceive it as speech is to parse it successfully as speech); this is seen most dramatically in sinewave speech, which lacks traditionally identified acoustic speech cues (Remez, Rubin, Pisoni, & Carrell, 1981).

One aspect of the position of Liberman and Mattingly (1985) is that recovery of gestures is performed via an analysis-by-synthesis approach (p. 26). The proposed analysis of an incoming signal is compared to a synthetic (that is, mentally constructed) version to ensure that the initial estimate is accurate. It is not clear that such a method (which has never been fully explicated) would function at the relatively high rates of speed at which phonemes are produced (Fowler, Galantucci, & Saltzman, 2003, p. 705).
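Although the mechanism was never fully explicated, the general shape of an analysis-by-synthesis loop can be illustrated. In the sketch below, every name, the toy "forward model," and all numeric values are hypothetical stand-ins, not part of any published account: each candidate gesture is synthesized into a predicted acoustic pattern, and the candidate whose synthesis best matches the input is selected.

```python
# Toy forward model: each gesture maps to an idealized (F2 onset, F2 target)
# pair in Hz. These values are invented purely for illustration.
FORWARD_MODEL = {
    "d-gesture": (2200.0, 1800.0),
    "b-gesture": (1100.0, 1800.0),
    "g-gesture": (1600.0, 1800.0),
}

def synthesize(gesture):
    """Predict the acoustic pattern a candidate gesture would produce."""
    return FORWARD_MODEL[gesture]

def analyze_by_synthesis(observed):
    """Return the gesture whose synthesized pattern is closest (by squared
    error) to the observed input: the core comparison step of the loop."""
    def error(gesture):
        pred = synthesize(gesture)
        return sum((o - p) ** 2 for o, p in zip(observed, pred))
    return min(FORWARD_MODEL, key=error)

print(analyze_by_synthesis((2100.0, 1750.0)))  # closest to the /d/ pattern
```

The concern raised by Fowler et al. (2003) can be seen even in this toy: a full synthesis-and-compare cycle must run for each candidate, which is costly at the rates at which phonemes actually arrive.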

Many of these observations are addressed critically in articles published in an edited book devoted to the motor theory (Mattingly & Studdert-Kennedy, 1991). Although the papers in the book derived from talks given at a conference honoring Liberman, they were written by both proponents and opponents of motor theory. As Jenkins (1991) summarized, “In fact, one speaker [Stephen Crain—DhW] roused the whole group when he said, ‘This is different from the other talks, because we agree with Al’” (p. 441).

In his final theoretical paper, Liberman avoided calling his account a Motor Theory (Liberman & Whalen, 2000), despite continuing to identify phonetic gestures as objects of speech perception. The shift in terminology reflected abandonment of some of the claims of the earlier versions of the Motor Theory. The involvement of specific muscle patterns was replaced by more general neural structures. There was less reliance on categorical perception, though it was pointed out that the higher-than-expected discrimination that is commonly found within categories is potentially due to the inability (indeed, impossibility) of synthesizing intermediate steps with acoustics that are completely natural. The specific characteristics of Fodor’s modules were replaced with a more general description of “vertical” (modular) versus “horizontal” (general) processing. New aspects were an emphasis on the “particulate principle” (Studdert-Kennedy, 1998), namely, that discrete, invariant elements can be combined so that the combination has different properties than the elements themselves: Meaningless phonemes can combine into an unbounded number of meaningful words. Emphasis was also placed on “parity,” the need for a common understanding of what counts as critical, that speakers can become hearers and vice versa, and that such a system entails that both sides of the process co-evolved. Nonetheless, the reference to motor actions as perceptual objects is still prominent in the new account, as is the predicted involvement of the motor system at a neurological level (see Section 4). Thus, depending on how many changes are allowed before a theory is not the same any more, this article can be (and often is) treated as proposing a Motor Theory of Speech Perception.

In a comprehensive assessment of the Motor Theory (Galantucci, Fowler, & Turvey, 2006), the authors assess three aspects of the theory: speech processing is special, perceiving speech is perceiving vocal tract actions, and speech perception involves recruitment of the speech motor system (p. 361). They found that the ambiguity in what is meant by “special” makes it difficult if not impossible to assess the first aspect (but see the neurological evidence in Section 4). They found that substantial evidence had accumulated over the intervening years to support the second aspect; the supporting evidence, however, is not unique to the Motor Theory, being part of the “direct perception” stance as well (e.g., Best, 1995; Fowler, 1986). The evidence for motor involvement is complex and is developed extensively in Galantucci et al. (2006). Here, it is perhaps sufficient to direct the reader to the article and to Section 4, in which neurolinguistic evidence, including studies that have been done since publication of that paper, is examined. The ultimate conclusion was that the claims that perceiving speech is perceiving gestures and that perceiving speech involves the motor system “warrant extended scientific scrutiny but the claim that speech is special, to the extent that it can be evaluated, does not” (Galantucci et al., 2006, p. 373).

3. Other Motor Accounts

Although the Motor Theory as proposed by Liberman and his colleagues is the most prominent theory directly linking speech perception to the motor system, it is not the only one. Just as there is no single defining feature that connects the various versions of the position of Liberman and his colleagues, so too are there sizable differences in what is included in other theories that call themselves, or could be called, motor theories.

An account of speech perception that focused on neurological support for audiovisual integration of speech information has been proposed (Skipper, Nusbaum, & Small, 2006). The account relies heavily on mirror neurons (e.g., Rizzolatti & Craighero, 2004), and it further proposes that the motor system becomes involved in speech perception when the acoustic signal provides insufficient information. The account of Skipper and colleagues is distinguished from Liberman et al.’s Motor Theory, but it does include a variant of analysis-by-synthesis. The mechanisms underlying their version of analysis-by-synthesis are not fully specified (as is true of Liberman’s as well), and neither the triggers for engaging the motor system nor the test for failure of acoustic analysis is addressed. Overall, by engaging the motor system for only some instances of perception and not others, this alternative is not fully a motor theory.

The Perception-for-Action-Control Theory (PACT) (Schwartz, Basirat, Ménard, & Sato, 2012) is identified as a perceptuo-motor theory of speech perception. The authors posit that a perceptuo-motor link structures perceptual categories as they are learned but does not drive perception after acquisition. In this way, motor consistencies between acoustics and articulatory acts that generate them are encoded but not enacted, leading to the expectation that the motor system will not be directly involved in perception. The neurological evidence on this is mixed (see Section 4), but, in any case, this position makes it clear that PACT is not a motor theory of perception.

A later development by the PACT laboratory researchers resulted in the Communicating about Objects using Sensory–Motor Operations (COSMO) account (Moulin-Frier, Diard, Schwartz, & Bessière, 2015). It maintains the PACT emphasis on linking the sensory and motor systems at a representational (but not functional) level. The authors claim that the Motor Theory is a “simplification” (p. 35) of COSMO, in that it putatively addresses only the motor side of communication and not the acoustic. This is a misunderstanding of Liberman’s theory, which relates the acoustic signal directly to motor actions but does not ignore the acoustics. In addition, COSMO fails to make the distinction between coding and online use, so that any effect seen in learning a category is taken as the same kind of evidence as processing an incoming signal (see also Fowler, 2015). Despite its ability to use internal manipulations that mimic effects of brain organization, it does not implement the kind of motor involvement invoked by Liberman et al.’s Motor Theory.

4. Neurological and Neuroimaging Evidence

With the increase in use of brain imaging techniques beginning in the 1970s, it has been possible to bring evidence from them to bear on the debate about the involvement of the motor system in perception. The literature is too extensive to review in full here, but some summaries are available (Badino, D'Ausilio, Fadiga, & Metta, 2014; Fowler & Xie, 2016; Schomers & Pulvermüller, 2016; Scott, McGettigan, & Eisner, 2009; Skipper, Devlin, & Lametti, 2017). Testing “the” motor system is complex. As Skipper et al. (2017) point out, it is not always clear just what brain regions are to be included in the “motor system.” Some researchers limit their analysis to Broca’s area (Geschwind, 1970), even though there is a great deal of intersubject variability in its structure (Amunts et al., 1999) and little direct involvement during speech production (Flinker et al., 2015). Other areas to be considered are primary motor cortex and the premotor cortex (Scott et al., 2009), as well as the supplementary motor area (Hertrich, Dietrich, & Ackermann, 2016). The brain has been hypothesized to have a quite distributed organization for most mental processes (Haxby et al., 2001; Price & Friston, 2002). Related work in the domain of vision indicates that shared processing may be evident even though unique processing can be localized (Shehzad & McCarthy, 2018). The varying levels of what counts as “motor” and the multiplicity of neural pathways make such issues as modularity and the Motor Theory difficult to assess directly.

Rather than taking Broca’s area as the critical region, many researchers examine regions that more directly lead to motor behavior, including the pre-motor region and the motor cortex itself. Some studies show that passive listening to natural speech does not obviously engage the motor regions of the cortex (Arsenault & Buchsbaum, 2015; Benson et al., 2001), while other studies suggest that it does (Correia, Jansma, & Bonte, 2015; Wilson, Saygin, Sereno, & Iacoboni, 2004). One of the most direct studies used transcranial magnetic stimulation (TMS) to show that tongue muscles had an increase in motor-evoked potentials when the participant listened to words with strongly articulated lingual sounds (Fadiga, Craighero, Buccino, & Rizzolatti, 2002). Further, when the signal is made more difficult, by using distorted speech (Adank, 2012), the accented speech of a second-language speaker (Callan, Jones, Callan, & Akahane-Yamada, 2004), or sinewave replicas (Benson, Richardson, Whalen, & Lai, 2006), motor areas are engaged even in passive listening. A more complex pattern emerged when examining nonnative sounds that varied in producibility (Wilson & Iacoboni, 2006): Motor areas were active for listening to the speech sounds (which included native and nonnative segments), but activation increased with decreasing producibility only in sensory areas. Wilson et al. (2004) concluded that speech perception was thus “sensorimotor” (p. 322). In contrast, when active responses are required, the motor system is typically engaged (Lee, Turkeltaub, Granger, & Raizada, 2012; Tremblay & Gracco, 2006). Both the manual response that participants typically provide in this research and the high level of background noise from the MRI (magnetic resonance imaging) system make this connection somewhat ambiguous (Schomers & Pulvermüller, 2016). How the motor system would come to be engaged if it is not normally active also remains unexplained.
Using direct measurements of muscle activity, however, Panouillères, Boyles, Chester, Watkins, and Möttönen (2018) found that the activations of lip and tongue muscles were present whether the stimulus was presented in noise or not, and that the amount of activation did not differ according to noise level. This result suggests that the fMRI results are, perhaps, not sensitive enough to detect activation that is present even for speech in the clear. Given the complexity of the brain and language, the issue will continue to be explored, but there is certainly positive evidence for motor involvement in speech perception.

Evidence from infants is especially informative given that they have not yet begun to produce speech systematically. Behavioral results suggest that there is at least motor awareness if not involvement. Infants prefer to look at a face that matches what they are hearing rather than one that does not (MacKain, Studdert-Kennedy, Spieker, & Stern, 1983). More directly, when 6-month-old infants were prevented from moving either their lips or their tongue tip (by means of specially designed pacifiers), discrimination of labial or lingual nonnative contrasts (respectively) was impaired (Bruderer, Danielson, Kandhadai, & Werker, 2015). This result is striking both because the infants were preverbal and because the sounds being presented were not familiar, occurring, as they did, in an unexperienced foreign language. Neuroimaging results have shown motor involvement in speech perception in even younger children. Dehaene-Lambertz et al. (2006) found activation in Broca’s area that was similar in strength and in its timing relationship to other brain regions when infants listened to sentences. Repetitions of the same sentence elicited more activation than the first presentation, indicating that Broca’s area may be sensitive to motor organization prior to the child’s own experience with articulation. Imada et al. (2006), however, found that motor areas were not active in listening to syllables for their newborn and 6-month-old participants but were for their 12-month-olds. They interpreted this to mean that some productive experience was necessary for engaging the motor regions during listening. Kuhl, Ramírez, Bosseler, Lin, and Imada (2014), on the other hand, found motor involvement for 12-month-olds and adults only when the sounds occurred in the native language, but for both native and nonnative sounds for 6-month-olds.
Although the findings present a complex picture, activation of motor areas is indeed present during listening, indicating that even children with little to no linguistic experience involve motor areas in listening to speech.

Some studies of aphasia have been taken to indicate that speech production can be damaged without impairing perception (for a review, see Scott et al., 2009). Such studies are difficult to interpret because the range of tests made is generally narrow, the size of the brain lesion is typically rather large, and the stages of recovery are difficult to map and predict. For example, if tests are made only under optimal conditions (quiet room, clear speech), deficits in word recognition may be missed, ones that would be detected when slightly degraded stimuli are used instead (Moineau, Dronkers, & Bates, 2005). It may be that subcortical regions are also more heavily involved than typically envisioned (Corbetta et al., 2015; Lieberman, 2009). The evidence from aphasia, while relevant to the Motor Theory, is not compelling enough to decide the issue either way.

Some theorists propose that the connection between the motor system and speech perception is strong but not necessary during online processing (Hickok & Poeppel, 2004). Motor connections would presumably be made during language acquisition (as in PACT or COSMO; see Section 3). Connections to the lexicon, with the activation of words being the ultimate goal of speech recognition, receive more attention in this model. As with other accounts, motor representations are assumed to be used in some circumstance (Hickok & Poeppel, 2004, p. 91), but the details of how such involvement would be invoked and how it would operate when the motor link is typically unused are not addressed.

The Motor Theory was an inspiration for the work on mirror neurons, with the motor link to be partly explained by the presence of mirror neurons (Rizzolatti & Arbib, 1998, p. 189). Mirror neurons have generated, perhaps, even more controversy than the Motor Theory, but they are neither necessary for the theory nor would they be sufficient to prove it. Mirror neurons have been proposed to develop with exposure to an action (Keysers et al., 2003, p. 635), so they cannot be assumed to be a prerequisite, even if they are found to exist. It is also true that the actions that trigger mirror neurons do not allow us to understand how coarticulation is parsed in perception (Lotto, Hickok, & Holt, 2009). (However, the Motor Theory was not changed to accommodate mirror neurons, as suggested by Lotto et al. (2009, p. 112).) Mirror neurons and the Motor Theory are largely independent of each other.

Direct (and immediately reversible) impairment of the motor system can be achieved with TMS, and doing so has been shown to affect speech perception, as predicted by the Motor Theory. An early study showed that tongue muscle response during listening was correlated with the activity shown during tongue motion (Fadiga et al., 2002). Other studies have explored using TMS to disrupt processing in motor areas associated with the tongue and lips (D’Ausilio et al., 2009; Meister, Wilson, Deblieck, Wu, & Iacoboni, 2007; Sato, Tremblay, & Gracco, 2009). In general, the results showed that it is possible to disrupt one articulator (e.g., the lips) and impair perception of the relevant phonemes (labials) while leaving perception of other phonemes intact. Stasenko, Garcea, and Mahon (2013) suggest that the TMS technique is not as localized as would be needed to test the motor theory completely. Other approaches, such as using signal detection theory to doubly dissociate response bias and perception (Smalle, Rogers, & Möttönen, 2015) help clarify the issue. Overall, the evidence is rather strong that TMS applied to speech production areas disrupts perception, as predicted by the Motor Theory.
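The signal detection approach mentioned above separates how well listeners can distinguish stimuli (sensitivity, d′) from how willing they are to give a particular response (criterion). The sketch below is a generic illustration of that computation, not the analysis of Smalle et al. (2015) itself; the response counts and the smoothing correction are invented for the example.

```python
from statistics import NormalDist

def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Compute sensitivity (d') and response bias (criterion c) for a
    yes/no detection task. A log-linear correction (add 0.5 per cell)
    keeps rates away from 0 and 1, avoiding infinite z-scores."""
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Same hit rate, different false-alarm rates: sensitivity and bias dissociate.
print(dprime_and_criterion(40, 10, 5, 45))   # higher d', conservative bias
print(dprime_and_criterion(40, 10, 25, 25))  # lower d', liberal bias
```

Because a TMS-induced change could in principle shift either quantity, reporting both is what allows a perceptual effect to be distinguished from a mere shift in willingness to respond.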

Electrocorticography (ECoG), a technique used only with preoperative seizure patients, has provided results that avoid many of the technical issues that made previous results difficult to interpret (Glanz et al., 2018). Glanz et al. used 64 electrodes to map an approximately 8 × 8 cm region of the brain surface. The electrodes could be recorded from continuously, allowing a wide variety of tests of each electrode region’s motor and perceptual effects, with naturally occurring speech as well as constructed stimuli. Across all eight of their participants, they found regions that were reliably active for both speech production and perception in natural situations; these regions had motor properties for the mouth, and they showed earlier activation than other regions when speech was being prepared. Previous dismissals of motor involvement in perception have addressed only experiments that used degraded stimuli (e.g., Hickok, 2015), but the several studies cited here undermine that objection. Still, even though all of these results are compatible with the Motor Theory, they are not definitive, and the topic continues to be investigated.

The analysis-by-synthesis approach proposed by Liberman and Mattingly (1985, p. 26) has received some support from a brain imaging study, which suggests that, at least during the early acquisition stage, analysis-by-synthesis is a possible mechanism of speech perception (Kuhl et al., 2014). As with the theoretical discussion above, however, this aspect of the theory has never been fully developed, so it cannot be fully tested.

The final aspect of Liberman’s position to be considered is that “speech is special.” Liberman and Whalen (2000) prefer to refer to a specialization for speech, but the issue remains the same: The biological importance of intraspecies communication is best served by a system (perhaps modular) devoted to that communication. It is a reasonable assumption that every species has such a specialization, so requiring humans to depend on more general capacities would make us unique and uniquely ill-adapted (Liberman & Whalen, 2000, p. 193). Neuroimaging evidence for this specialization was found in fMRI studies of speech and nonspeech stimuli (Benson et al., 2001; Whalen et al., 2006). When the complexity of nonspeech sounds increased, there was increased activation in primary auditory cortex (PAC), as expected. When the speech complexity increased, there was no effect on PAC, even though the speech had even more acoustic differences related to complexity than the nonspeech stimuli did. Only a region in the superior temporal gyrus (STG) showed increased activation with increasing speech complexity. This indicates that the speech system is not only separate from the auditory primitives of PAC but actively suppresses their processing by preempting them. This is consistent with behavioral studies (Whalen & Liberman, 1987; Xu, Liberman, & Whalen, 1997), but it further demonstrates that the neural organization is specialized and organized on an unexpected temporal scale: A neurological region (STG) that receives input from another (PAC) still affects the processing of that earlier region. Supporting evidence has been found with event-related potentials (Pérez, Meyer, & Harrison, 2008). Although suppression of processing in an earlier region by a later one seems unintuitive, it has also been found in vision (Murray, Kersten, Olshausen, Schrater, & Woods, 2002).
The degree of specialization remains controversial, but the evidence for specialization itself has accumulated from both behavioral and brain imaging studies.

The brain is highly interactive, and our tools, though impressive, are still quite limited. It is likely that further developments in imaging and modeling will allow a better picture of how the brain processes speech and how much of the production system is involved in perception. Overlap between speech and nonspeech use of the lips and tongue (e.g., Pulvermüller et al., 2006) can be taken as indicating that speech is not entirely special, but the large size of the areas identified by fMRI makes it impossible to decide the issue one way or the other. At the current stage, there is good evidence for some kind of motor involvement, but the details are too complex to provide the kind of definitive answer that Liberman and his colleagues sought.

5. Current Use

Direct reference to the papers detailing the Motor Theory of Speech Perception by Liberman and his colleagues continues, even if it is less clear how often the position is adopted and how often it is rejected. As noted in Section 2, some aspects of the theory have proven untenable or exceedingly controversial: invariance in motor commands, taking categorical perception (in general) as an argument for the special status of speech, perception of the intended utterance (as opposed to perceiving the phonemes for which the signal gives us sufficient evidence after accounting for coarticulation). As also noted in Section 2, Liberman’s last theoretical paper focused on the aspects of the theory that were still viable without using the label Motor Theory; still, his account assumed “distinctly phonetic motor structures to serve as the ultimate constituents of language” (Liberman & Whalen, 2000, p. 195). His notion of “parity” (p. 189), in which speakers and listeners must share a system, leads naturally to thinking of speech perception and speech production as aspects of a single system, with a plausible if debatable contribution of the motor system to the perception of speech. The elaboration of the connection, and, indeed, the definition of “the” motor system, continues, but the basic insight that speech perception involves active participation of neural underpinnings of the motor realization of speech has substantial support.


Acknowledgments

The writing of this entry was supported by NIH grant DC-002717 to Haskins Laboratories. Many thanks to Carol A. Fowler for her insightful comments, as well as for helpful suggestions from two anonymous reviewers; mistakes remain my own.

Further Reading

Fadiga, L., Craighero, L., Buccino, G., & Rizzolatti, G. (2002). Speech listening specifically modulates the excitability of tongue muscles: A TMS study. European Journal of Neuroscience, 15, 399–402.

Fowler, C. A., Shankweiler, D., & Studdert-Kennedy, M. (2016). "Perception of the speech code" revisited: Speech is alphabetic after all. Psychological Review, 123, 125–150. doi: 10.1037/rev0000013

Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13, 361–377.

Liberman, A. M. (1996). Speech: A special code. Cambridge, MA: MIT Press.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461.

Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.

Skipper, J. I., Devlin, J. T., & Lametti, D. R. (2017). The hearing ear is always found close to the speaking tongue: Review of the role of the motor system in speech perception. Brain and Language, 164, 77–105. doi: 10.1016/j.bandl.2016.10.004

Skipper, J. I., Nusbaum, H. C., & Small, S. L. (2006). Lending a helping hand to hearing: Another motor theory of speech perception. In M. Arbib (Ed.), Action to language via the mirror neuron system (pp. 250–285). Oxford: Oxford University Press.


References

Abramson, A. S., & Lisker, L. (1970). Discriminability along the voicing continuum: Cross-language tests. In B. Hála, M. Romportl, & P. Janota (Eds.), Proceedings of the 6th International Congress of Phonetic Sciences, Prague 1967 (pp. 569–573). Prague: Academia.

Adank, P. (2012). The neural bases of difficult speech comprehension and speech production: Two Activation Likelihood Estimation (ALE) meta-analyses. Brain and Language, 122, 42–54. doi: 10.1016/j.bandl.2012.04.014

Allen, J. S., Miller, J. L., & DeSteno, D. (2003). Individual talker differences in voice-onset-time. Journal of the Acoustical Society of America, 113, 544–552. doi: 10.1121/1.1528172

Amunts, K., Schleicher, A., Bürgel, U., Mohlberg, H., Uylings, H. B. M., & Zilles, K. (1999). Broca's region revisited: Cytoarchitecture and intersubject variability. Journal of Comparative Neurology, 412, 319–341. doi: 10.1002/(SICI)1096-9861(19990920)412:2<319::AID-CNE10>3.0.CO;2-7

Arsenault, J. S., & Buchsbaum, B. R. (2015). No evidence of somatotopic place of articulation feature mapping in motor cortex during passive speech perception. Psychonomic Bulletin and Review, 23, 1231–1240.

Badino, L., D'Ausilio, A., Fadiga, L., & Metta, G. (2014). Computational validation of the motor contribution to speech perception. Topics in Cognitive Science, 6, 461–475. doi: 10.1111/tops.12095

Benson, R. R., Richardson, M., Whalen, D. H., & Lai, S. (2006). Phonetic processing areas revealed by sinewave speech and acoustically similar non-speech. NeuroImage, 31, 342–353.

Benson, R. R., Whalen, D. H., Richardson, M., Swainson, B., Clark, V., Lai, S., & Liberman, A. M. (2001). Parametrically dissociating speech and nonspeech perception in the brain using fMRI. Brain and Language, 78, 364–396.

Best, C. T. (1995). A direct realist perspective on cross-language speech perception. In W. Strange & J. J. Jenkins (Eds.), Cross-language speech perception (pp. 171–204). Timonium, MD: York Press.

Bornstein, M. H., & Korda, N. O. (1985). Identification and adaptation of hue: Parallels in the operation of mechanisms that underlie categorical perception in vision and in audition. Psychological Research, 47(1), 1–17. doi: 10.1007/BF00309214

Bruderer, A. G., Danielson, D. K., Kandhadai, P., & Werker, J. F. (2015). Sensorimotor influences on speech perception in infancy. Proceedings of the National Academy of Sciences, 112(44), 13531–13536. doi: 10.1073/pnas.1508631112

Callan, D. E., Jones, J. A., Callan, A. M., & Akahane-Yamada, R. (2004). Phonetic perceptual identification by native- and second-language speakers differentially activates brain regions involved with acoustic phonetic processing and those involved with articulatory–auditory/orosensory internal models. NeuroImage, 22, 1182–1194. doi: 10.1016/j.neuroimage.2004.03.006

Carney, A. E., Widin, G. P., & Viemeister, N. F. (1977). Noncategorical perception of stop consonants differing in VOT. Journal of the Acoustical Society of America, 62, 961–970.

Cho, T., & Ladefoged, P. (1999). Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics, 27, 207–229.

Cooper, F. S., Delattre, P. C., Liberman, A. M., Borst, J. M., & Gerstman, L. J. (1952). Some experiments on the perception of synthetic speech sounds. Journal of the Acoustical Society of America, 24, 597–606.

Corbetta, M., Ramsey, L., Callejas, A., Baldassarre, A., Hacker, C. D., Siegel, J. S., . . . Shulman, G. L. (2015). Common behavioral clusters and subcortical anatomy in stroke. Neuron, 85(5), 927–941. doi: 10.1016/j.neuron.2015.02.027

Correia, J. M., Jansma, B. M. B., & Bonte, M. (2015). Decoding articulatory features from fMRI responses in dorsal speech regions. Journal of Neuroscience, 35, 15015–15025. doi: 10.1523/jneurosci.0977-15.2015

Cutting, J. E., & Rosner, B. S. (1974). Categories and boundaries in speech and music. Perception and Psychophysics, 16, 564–570. doi: 10.3758/bf03198588

D’Ausilio, A., Pulvermüller, F., Salmas, P., Bufalari, I., Begliomini, C., & Fadiga, L. (2009). The motor somatotopy of speech perception. Current Biology, 19, 381–385.

Dehaene-Lambertz, G., Hertz-Pannier, L., Dubois, J., Mériaux, S., Roche, A., Sigman, M., & Dehaene, S. (2006). Functional organization of perisylvian activation during presentation of sentences in preverbal infants. Proceedings of the National Academy of Sciences, 103(38), 14240–14245. doi: 10.1073/pnas.0606302103

Dooling, R. J., Okanoya, K., & Brown, S. D. (1989). Speech perception by budgerigars (Melopsittacus undulatus): The voiced-voiceless distinction. Perception and Psychophysics, 46, 65–71. doi: 10.3758/bf03208075

Fadiga, L., Craighero, L., Buccino, G., & Rizzolatti, G. (2002). Speech listening specifically modulates the excitability of tongue muscles: A TMS study. European Journal of Neuroscience, 15, 399–402.

Flinker, A., Korzeniewska, A., Shestyuk, A. Y., Franaszczuk, P. J., Dronkers, N. F., Knight, R. T., & Crone, N. E. (2015). Redefining the role of Broca’s area in speech. Proceedings of the National Academy of Sciences, 112(9), 2871–2875. doi: 10.1073/pnas.1414491112

Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.

Fodor, J. A. (2000). The mind doesn't work that way: The scope and limits of computational psychology. Cambridge, MA: MIT Press.

Fowler, C. A. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics, 8, 113–133.

Fowler, C. A. (1984). Segmentation of coarticulated speech in perception. Perception and Psychophysics, 36, 359–368.

Fowler, C. A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics, 14, 3–28.

Fowler, C. A. (2015). COSMO's “motor theory” is not the motor theory of Liberman, Cooper, and Mattingly. Journal of Phonetics, 53, 42–45. doi: 10.1016/j.wocn.2015.06.002

Fowler, C. A., Galantucci, B., & Saltzman, E. (2003). Motor theories of perception. In M. A. Arbib (Ed.), The handbook of brain theory and neural networks (pp. 705–707). Cambridge, MA: MIT Press.

Fowler, C. A., & Smith, M. (1986). Speech perception as “vector analysis”: An approach to the problems of segmentation and invariance. In J. Perkell & D. Klatt (Eds.), Invariance and variability in speech processes (pp. 123–136). Hillsdale, NJ: Erlbaum.

Fowler, C. A., & Xie, X. (2016). Involvement of the speech motor system in speech perception. In P. H. H. M. Van Lieshout, B. A. M. Maassen, & H. Terband (Eds.), Speech motor control in normal and disordered speech: Future developments in theory and methodology (pp. 1–24). Rockville, MD: ASHA Press.

Fry, D. B., Abramson, A. S., Eimas, P. D., & Liberman, A. M. (1962). The identification and discrimination of synthetic vowels. Language and Speech, 5, 171–189.

Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13, 361–377.

Geschwind, N. (1970). The organization of language and the brain. Science, 170(3961), 940–944.

Glanz, O., Derix, J., Kaur, R., Schulze-Bonhage, A., Auer, P., Aertsen, A., & Ball, T. (2018). Real-life speech production and perception have a shared premotor-cortical substrate. Scientific Reports, 8(1), 8898. doi: 10.1038/s41598-018-26801-x

Harnad, S. (Ed.). (1987). Categorical perception: The groundwork of cognition. Cambridge: Cambridge University Press.

Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293(5539), 2425–2430. doi: 10.1126/science.1063736

Hertrich, I., Dietrich, S., & Ackermann, H. (2016). The role of the supplementary motor area for speech and language processing. Neuroscience and Biobehavioral Reviews, 68, 602–610. doi: 10.1016/j.neubiorev.2016.06.030

Hickok, G. (2015). The motor system’s contribution to perception and understanding actions: Clarifying mirror neuron myths and misunderstandings. Language and Cognition, 7, 476–484. doi: 10.1017/langcog.2015.2

Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 91, 67–99.

Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. (2006). Infant speech perception activates Broca's area: A developmental magnetoencephalography study. NeuroReport, 17, 957–962.

Jenkins, J. J. (1991). Summary of the conference: Speech is special. In I. G. Mattingly & M. Studdert-Kennedy (Eds.), Modularity and the motor theory of speech perception (pp. 431–442). Hillsdale, NJ: Erlbaum.

Keysers, C., Kohler, E., Umiltà, M. A., Nanetti, L., Fogassi, L., & Gallese, V. (2003). Audiovisual mirror neurons and action recognition. Experimental Brain Research, 153, 628–636.

Kuhl, P. K., & Miller, J. D. (1978). Speech perception by the chinchilla: Identification functions for synthetic VOT stimuli. Journal of the Acoustical Society of America, 63, 905–917.

Kuhl, P. K., Ramírez, R. R., Bosseler, A., Lin, J. L., & Imada, T. (2014). Infants’ brain responses to speech suggest Analysis by Synthesis. Proceedings of the National Academy of Sciences, 111(31), 11238–11245. doi: 10.1073/pnas.1410963111

Lee, Y.-S., Turkeltaub, P., Granger, R., & Raizada, R. D. S. (2012). Categorical speech processing in Broca's area: An fMRI study using multivariate pattern-based analysis. Journal of Neuroscience, 32, 3942–3948.

Liberman, A. M. (1957). Some results of research on speech perception. Journal of the Acoustical Society of America, 29, 117–123.

Liberman, A. M. (1996). Speech: A special code. Cambridge, MA: MIT Press.

Liberman, A. M., Cooper, F. S., Harris, K. S., & MacNeilage, P. F. (1962). A motor theory of speech perception. In Proceedings of the Speech Communication Seminar. Stockholm, Sweden.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461.

Liberman, A. M., Delattre, P. C., Cooper, F. S., & Gerstman, L. J. (1954). The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 68, 1–13.

Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54, 358–368.

Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.

Liberman, A. M., & Whalen, D. H. (2000). On the relation of speech to language. Trends in Cognitive Sciences, 4, 187–196.

Lieberman, P. (2009). Human language and our reptilian brain: The subcortical bases of speech, syntax, and thought. Cambridge, MA: Harvard University Press.

Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422. doi: 10.1080/00437956.1964.11659830

Lotto, A. J., Hickok, G., & Holt, L. (2009). Reflections on mirror neurons and speech perception. Trends in Cognitive Sciences, 13, 110–114.

MacKain, K. S., Studdert-Kennedy, M., Spieker, S., & Stern, D. (1983). Infant intermodal speech perception is a left-hemisphere function. Science, 219, 1347–1349.

MacNeilage, P. F. (1970). Motor control of serial ordering of speech. Psychological Review, 77, 182–196.

Magen, H. S. (1989). An acoustic study of vowel-to-vowel coarticulation in English (Unpublished doctoral dissertation). Yale University, New Haven, CT.

Mattingly, I. G., & Studdert-Kennedy, M. (Eds.). (1991). Modularity and the motor theory of speech perception. Hillsdale, NJ: Erlbaum.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.

Meister, I. G., Wilson, S. M., Deblieck, C., Wu, A. D., & Iacoboni, M. (2007). The essential role of premotor cortex in speech perception. Current Biology, 17, 1692–1696. doi: 10.1016/j.cub.2007.08.064

Moineau, S., Dronkers, N. F., & Bates, E. (2005). Exploring the processing continuum of single-word comprehension in aphasia. Journal of Speech, Language, and Hearing Research, 48, 884–896. doi: 10.1044/1092-4388(2005/061)

Moulin-Frier, C., Diard, J., Schwartz, J.-L., & Bessière, P. (2015). COSMO (“Communicating about Objects using Sensory–Motor Operations”): A Bayesian modeling framework for studying speech communication and the emergence of phonological systems. Journal of Phonetics, 53, 5–41. doi: 10.1016/j.wocn.2015.06.001

Murray, S. O., Kersten, D., Olshausen, B. A., Schrater, P., & Woods, D. L. (2002). Shape perception reduces activity in human primary visual cortex. Proceedings of the National Academy of Sciences of the United States of America, 99, 15164–15169.

Panouillères, M. T. N., Boyles, R., Chesters, J., Watkins, K. E., & Möttönen, R. (2018). Facilitation of motor excitability during listening to spoken sentences is not modulated by noise or semantic coherence. Cortex, 103, 44–54. doi: 10.1016/j.cortex.2018.02.007

Pérez, E., Meyer, G., & Harrison, N. (2008). Neural correlates of attending speech and non-speech: ERPs associated with duplex perception. Journal of Neurolinguistics, 21, 452–471. doi: 10.1016/j.jneuroling.2007.12.001

Pinker, S. (1997). How the mind works. New York: W. W. Norton.

Price, C. J., & Friston, K. J. (2002). Degeneracy and cognitive anatomy. Trends in Cognitive Sciences, 6(10), 416–421. doi: 10.1016/S1364-6613(02)01976-9

Pulvermüller, F., Huss, M., Kherif, F., Moscoso del Prado Martin, F., Hauk, O., & Shtyrov, Y. (2006). Motor cortex maps articulatory features of speech sounds. Proceedings of the National Academy of Sciences, 103, 7865–7870. doi: 10.1073/pnas.0509989103

Rand, T. C. (1974). Dichotic release from masking for speech. Journal of the Acoustical Society of America, 55, 678–680.

Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. (1981). Speech perception without traditional speech cues. Science, 212, 947–950.

Rizzolatti, G., & Arbib, M. A. (1998). Language within our grasp. Trends in Neurosciences, 21, 188–194.

Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27(1), 169–192. doi: 10.1146/annurev.neuro.27.070203.144230

Rosen, S. M., & Howell, P. (1981). Plucks and bows are not categorically perceived. Perception and Psychophysics, 30, 156–168. doi: 10.3758/bf03204474

Sato, M., Tremblay, P., & Gracco, V. L. (2009). A mediating role of the premotor cortex in phoneme segmentation. Brain and Language, 111, 1–7. doi: 10.1016/j.bandl.2009.03.002

Schomers, M. R., & Pulvermüller, F. (2016). Is the sensorimotor cortex relevant for speech perception and understanding? An integrative review. Frontiers in Human Neuroscience, 10(435), 1–18. doi: 10.3389/fnhum.2016.00435

Schwartz, J.-L., Basirat, A., Ménard, L., & Sato, M. (2012). The Perception-for-Action-Control Theory (PACT): A perceptuo-motor theory of speech perception. Journal of Neurolinguistics, 25, 336–354. doi: 10.1016/j.jneuroling.2009.12.004

Scott, S. K., McGettigan, C., & Eisner, F. (2009). A little more conversation, a little less action—Candidate roles for the motor cortex in speech perception. Nature Reviews Neuroscience, 10, 295–302.

Shankweiler, D., & Fowler, C. A. (2015). Seeking a reading machine for the blind and discovering the speech code. History of Psychology, 18, 78–99.

Shehzad, Z., & McCarthy, G. (2018). Category representations in the brain are both discretely localized and widely distributed. Journal of Neurophysiology, 119, 2256–2264. doi: 10.1152/jn.00912.2017

Skipper, J. I., Devlin, J. T., & Lametti, D. R. (2017). The hearing ear is always found close to the speaking tongue: Review of the role of the motor system in speech perception. Brain and Language, 164, 77–105. doi: 10.1016/j.bandl.2016.10.004

Skipper, J. I., Nusbaum, H. C., & Small, S. L. (2006). Lending a helping hand to hearing: Another motor theory of speech perception. In M. Arbib (Ed.), Action to language via the mirror neuron system (pp. 250–285). Oxford: Oxford University Press.

Smalle, E. H. M., Rogers, J., & Möttönen, R. (2015). Dissociating contributions of the motor cortex to speech perception and response bias by using transcranial magnetic stimulation. Cerebral Cortex, 25, 3690–3698. doi: 10.1093/cercor/bhu218

Stasenko, A., Garcea, F. E., & Mahon, B. Z. (2013). What happens to the motor theory of perception when the motor system is damaged? Language and Cognition, 5, 225–238. doi: 10.1515/langcog-2013-0016

Studdert-Kennedy, M. (1998). The particulate origins of language generativity: From syllable to gesture. In J. R. Hurford, M. Studdert-Kennedy, & C. Knight (Eds.), Approaches to the evolution of language: Social and cognitive bases (pp. 202–221). Cambridge: Cambridge University Press.

Tremblay, P., & Gracco, V. L. (2006). Contribution of the frontal lobe to externally and internally specified verbal responses: fMRI evidence. NeuroImage, 33, 947–957. doi: 10.1016/j.neuroimage.2006.07.041

Whalen, D. H. (1981). Effects of vocalic formant transitions and vowel quality on the English [s]-[š] boundary. Journal of the Acoustical Society of America, 69, 275–282.

Whalen, D. H. (1990). Coarticulation is largely planned. Journal of Phonetics, 18, 3–35.

Whalen, D. H., Benson, R. R., Richardson, M., Swainson, B., Clark, V., Lai, S., . . . Liberman, A. M. (2006). Differentiation for speech and nonspeech processing within primary auditory cortex. Journal of the Acoustical Society of America, 119, 575–581.

Whalen, D. H., & Liberman, A. M. (1987). Speech perception takes precedence over nonspeech perception. Science, 237(4811), 169–171.

Wilson, S. M., & Iacoboni, M. (2006). Neural responses to non-native phonemes varying in producibility: Evidence for the sensorimotor nature of speech perception. NeuroImage, 33, 316–325. doi: 10.1016/j.neuroimage.2006.05.032

Wilson, S. M., Saygin, A. P., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7, 701–702.

Xu, Y., Liberman, A. M., & Whalen, D. H. (1997). On the immediacy of phonetic perception. Psychological Science, 8, 358–362.