Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 15 January 2021

Tongue Muscle Anatomy: Architecture and Function

  • Maureen Stone, University of Maryland School of Dentistry


The tongue is composed entirely of soft tissue: muscle, fat, and connective tissue. This unusual composition and the tongue’s 3D muscle fiber orientation result in many degrees of freedom. The lack of bones and cartilage means that muscle shortening creates deformations, particularly local deformations, as the tongue moves into and out of speech gestures. The tongue is also surrounded by the hard structures of the oral cavity, which both constrain its motion and support the rapid small deformations that create speech sounds. Anatomical descriptors and categories of tongue muscles do not correlate with tongue function as speech movements use finely controlled co-contractions of antagonist muscles to move the oral structures during speech. Tongue muscle volume indicates that four muscles, the genioglossus, verticalis, transversus, and superior longitudinal, occupy the bulk of the tongue. They also comprise a functional muscle grouping that can shorten the tongue in the x, y, and z directions. Various 3D muscle shortening patterns produce large- or small-scale deformations in all directions of motion. The interdigitation of the tongue’s muscles is advantageous in allowing co-contraction of antagonist muscles and providing nimble deformational changes to move the tongue toward and away from any position.

1. Introduction

The behavior of tongue muscles has been of interest since tongue dissections identified the complexity of tongue muscle architecture in the 1930s. The activation of tongue muscles and its effect on tongue surface shape has been of interest since lateral X-rays made it feasible to see a 2D tongue surface in the 1950s. With the development of electromyography (EMG) in the 1970s as a viable tool for human muscle measurement, insight into tongue muscle behavior began. With time, however, it became apparent that there was no one-to-one relationship between activity of any single muscle and a specific tongue shape change. In addition, it was observed that the interdigitation of most of the muscles once they entered the tongue proper made it impossible to identify the activity of a specific muscle. In the 1980s and 1990s, imaging techniques, such as ultrasound and magnetic resonance imaging (MRI), began to provide improved information about 2D and 3D tongue shape, muscle anatomy, and muscle shortening. The ability to visualize muscle in detail has led to research that can answer motor control questions about tongue behavior. This article begins to address this theme.

The tongue is unique in the body. It is composed entirely of soft tissue and deforms without changing its volume. The heart, like the tongue, is composed entirely of soft tissue but differs in notable ways. The heart increases and decreases in size as it pushes blood through the body. It also moves much more slowly than the tongue. The heart beats about 72 times per minute or about 1.2 beats per second. The tongue moves at 10–15 phonemes per second, a much faster rate than the heart, even allowing for coarticulation (cf. Liberman, 1996, p. 204). In addition, tongue muscle architecture is very complicated, with muscles forming a mirror image from left to right, and being anisotropic from front to back. This nonuniformity allows the tongue to deform into the many shapes needed to perform its many functions, such as chewing, swallowing, speaking, singing, breathing, and directing air into musical instruments. Chewing requires the tongue to throw food onto the teeth, while adapting to changing bolus properties. Swallowing requires the tongue to propagate the bolus backward, which it does by elevating its surface sequentially from front to back in a manner similar to a transverse waveform, which squeezes the bolus backward. Speaking and singing require massive varieties of complex tongue deformations. Other lingual behaviors that are linguistically functional or just fun are also made by tongue deformation (e.g., clicks, trills, and rolling the tongue). Even breathing requires active tongue involvement; during every inhalation, tongue muscles contract to maintain airway opening (Sauerland & Mitchell, 1975). Why does the tongue have all these roles? The key is its location. The oral cavity is a gateway into the body, that is, an interface between the environment and the internal gastric and respiratory structures. The oral cavity is the entrance for air and food; it is the exit for breath, sound, and GI (gastrointestinal) activity gone wrong. 
The tongue is in the right place to facilitate all these functions.

Figure 1. A midsagittal MR image of the human tongue.

Source: Author.

The tongue is very large and fits tightly inside the oral cavity where it is surrounded by rigid structures (see Figure 1). Figure 1 shows a midsagittal anatomical magnetic resonance image of the tongue surrounded by the teeth, palate, and jaw and occupying most of the oropharyngeal cavity. It is hard to imagine the tongue moving around in such a small space. The small cavity size is not a problem, however, as the jaw lowers to enlarge the oral cavity and facilitate tongue repositioning. Moreover, the proximity of hard structures helps the tongue to produce large numbers of lingual phonemes with minimal internal motion. For example, the jaw performs gross tongue positioning by lowering and raising the tongue to modify the oral and pharyngeal cavity size as needed. The teeth and hard palate do not move; they provide a reliable set of oppositional surfaces to facilitate or obstruct airflow and create sound sources. The hard structures allow the tongue to deform and shape the oral cavity, which provides variable resonator shapes during speech.

The lips and velum are soft muscular structures like the tongue, which deform to shape and constrict the vocal tract. Their movements, although independent, can coordinate with the tongue as during lip-rounded sounds or nasal/nonnasal distinctions. However, the lips and velum are limited in degrees of freedom to opening/closing and rounding/spreading. Thus, the tongue has the most degrees of freedom of any of the oral structures, becoming the most frequently used structure in producing speech phonemes.

Historically, motion occurring on the surface of the tongue has been of the greatest interest to linguists. This focus is important because the tongue surface forms the lower portion of the vocal tract tube and thus is crucial to shaping the vocal tract during speech sounds. It is also the sound source, in conjunction with other oral structures, for many consonants. Instruments that directly measure the tongue, such as Electropalatography, Electromagnetic Articulography, and the X-ray Microbeam, have provided great insight into tongue surface behavior and its mechanisms and have been summarized elsewhere (cf. Stone, 2006).

This article focuses on activity occurring within the tongue, as this provides insight useful for theories of motor control and speech production variability. Although the goal of tongue motion in speech is to position the tongue surface properly along the vocal tract, the mechanism for doing so is the complex, interactive activity of its deformable muscular architecture.

2. Tongue Muscle Structure

The tongue is a deformable, volume-preserving body, which deforms in three directions. It is often called a muscular hydrostat (Kier & Smith, 1985; Smith & Kier, 1989). True hydrostats, such as a squid or octopus, have no bony skeleton; they have a fluid-filled sac that gives them a soft fluid structure. Muscular hydrostats have no skeleton and no fluid-filled sac. They include structures like the elephant’s trunk, tentacles, and tongues. They are composed entirely of soft tissue, that is, muscles, fat, connective tissue, blood vessels, and nerves. These soft tissues give muscular hydrostats structure, and muscle shortening deforms them into complex motions. There are two definitive features of muscular hydrostats that allow them to operate successfully: (a) they have 3D orthogonal muscle orientation, which allows them to deform in any direction, and crucially, to return to their original shape; and (b) they are volume preserving, which means they can change shape but not size. Since size is constant, compression in one location yields expansion in another.
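The volume-preservation constraint can be made concrete with a small sketch. It assumes, purely for illustration, a rectangular block of tissue that deforms uniformly, so that the three stretch ratios must multiply to 1. The function name and the equal sharing of the expansion between the two remaining directions are hypothetical choices, not taken from the cited work.

```python
import math


def compensating_stretch(stretch_x: float) -> float:
    """Return the stretch ratio needed in y and z (shared equally) so that
    a block stretched by `stretch_x` along x keeps its volume.

    For a uniform, volume-preserving deformation: sx * sy * sz == 1.
    Assuming sy == sz, this gives sy = 1 / sqrt(sx).
    """
    if stretch_x <= 0:
        raise ValueError("stretch ratio must be positive")
    return 1.0 / math.sqrt(stretch_x)


# Illustrative value only: compress the block to 80% of its length along x.
sx = 0.8
s_other = compensating_stretch(sx)
volume_ratio = sx * s_other * s_other

print(f"x stretch {sx:.2f} -> y and z stretch {s_other:.3f}")
print(f"volume ratio after deformation: {volume_ratio:.3f}")
```

In this toy case, compression along one axis forces expansion along the other two. A real tongue deforms nonuniformly, so the expansion need not be shared equally or occur in the same region.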

The tongue is a muscular hydrostat. It has no skeleton, bones, or joints. It is made entirely of muscle and other soft tissues. It moves by deforming. Tongue motion requires complex tongue muscle architecture, because speaking, eating, and the other lingual functions use a remarkable number of rapidly changing tongue shapes and biomechanical complexity. Moreover, the tongue makes all its deformations in three directions (x, y, z) by creating regions of compression and expansion, internally and on the surface. Only the human tongue has a rounded surface shape and is situated in a curved vocal tract (see Figure 1). This provides a unique vocal tract structure in which small deformations of the curved tongue can constrict and obstruct the vocal tract tube locally for speech. This suggests that it may be easier to create the constrictions that cause speech resonances with a curved architecture than with a long vocal tract and flat tongue of other primates.

This article considers muscles based on anatomy, size, and interdigitation. Anatomy is the basis for the classical division of tongue muscles into extrinsic and intrinsic groups. Table 1 shows this anatomical division and also the muscle orientations for protruding and retruding the tongue. Extrinsic muscles originate on bones outside the tongue and insert into the tongue surface. Their fibers are located very laterally from back to front (Styloglossus (SG), Hyoglossus (HG), and Palatoglossus (PG)) or very medially from front to back (Genioglossus (GG)). These orientations keep the airway open centrally for food and breath to pass. Extrinsic muscles are bundled, like other skeletal muscles, until they enter the tongue proper; then they interdigitate with the intrinsic muscles.

Table 1. Tongue Muscles and Functions

| Muscle | Origin/Insertion | Function |
|---|---|---|
| Extrinsic Muscles | | |
| Genioglossus (GG) | | GGa depresses anterior tongue; GGp pulls posterior tongue forward; assists tongue protrusion |
| Hyoglossus (HG) | Hyoid bone/tongue | Retrudes and depresses tongue |
| Styloglossus (SG) | Styloid process/tongue | Retrudes and elevates tongue |
| Palatoglossus (PG) | Soft palate/tongue | Lowers velum; raises tongue if velum fixed |
| Intrinsic Muscles | | |
| Superior longitudinal(is) (SL) | Tongue tip/tongue root | Shortens and retrudes tongue; elevates tip |
| Inferior longitudinal(is) (IL) | Tongue tip/tongue root | Shortens and retrudes tongue; depresses tip |
| Verticalis (V) | Tongue dorsum/upper surface of IL | Flattens and widens tongue; protrudes tongue with T |
| Transversus (T) | Median septum/lateral tongue | Narrows and protrudes tongue |

Figure 2. Midsagittal image of tongue muscles of human (left) and cat (right).

Reprinted from Kahane and Folkins (1984).

Intrinsic muscles originate and insert on soft tissue inside the tongue. The intrinsic muscles closely track the lengthwise and crosswise directions of the tongue. Figure 2 shows a parasagittal slice of the human (left) and cat (right) tongue showing some of these muscles. The intrinsic lengthwise muscles, superior and inferior longitudinal (SL, IL), oppose crosswise muscles in two directions. The fibers in the horizontal direction are the transverse (T) muscle. The vertical muscles are the GG muscle and the verticalis (V) muscle (a more lateral muscle not seen in Figure 2). In the mid-tongue region, these horizontal and vertical muscle fibers are organized into about 100 alternating layers repeating from tip to root (Takemoto, 2001). Thus, the central tongue body contains crosswise fibers in the horizontal (T) and vertical (V) directions, and also radial fibers (GG). The body is surrounded by lengthwise muscles on the top (SL), bottom (IL), and sides (HG and SG). These four intrinsic muscles plus the extrinsic GG encompass all the directions of compression and expansion needed by the tongue. The additional extrinsic muscles supplement these fiber directions once interdigitated into the tongue body. The SG interdigitates with T when it inserts into the posterior tongue and also follows the HG on the lateral tongue edge as they both course forward in parallel with SL. When activated together, they pull the tongue backward, with SG also pulling it upward and HG pulling it downward. They deform the tongue as they move it, since it is a nonrigid body.

Tongue muscle size is a good indication of its role in moving and deforming the tongue. Stone et al. (2018) studied 14 in vivo tongues by segmenting 3D tongue muscles from their MRI volumes. Figure 3 shows the relative size of the muscles functionally and structurally. The muscle volumes were characterized in two ways. The first was their structural volume, or physical mass, and the second was their functional volume, or the proportion of the tongue they control. The functional volume measured the volume of the tongue occupied and moved by the muscle, irrespective of whether other interdigitated muscles crossed the same region. The structural volume halved the muscle volume where it was interdigitated with other muscles. The intrinsic muscles are almost entirely interdigitated. The extrinsic muscles are interdigitated only in the tongue body.
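The two volume measures described above can be sketched in a few lines. This is a minimal illustration, assuming that each muscle is summarized by a bundled volume and an interdigitated volume. The function name and the numbers are hypothetical and are not data from Stone et al. (2018).

```python
from typing import Tuple


def muscle_volumes(bundled: float, interdigitated: float) -> Tuple[float, float]:
    """Return (functional, structural) volume for one muscle.

    functional: all tissue the muscle occupies, whether or not another
                muscle crosses the same region.
    structural: bundled tissue counted in full, interdigitated tissue
                counted at half, as in the halving rule described above.
    """
    if bundled < 0 or interdigitated < 0:
        raise ValueError("volumes must be non-negative")
    functional = bundled + interdigitated
    structural = bundled + 0.5 * interdigitated
    return functional, structural


# Made-up volumes in cubic centimetres, for illustration only.
functional, structural = muscle_volumes(bundled=2.0, interdigitated=6.0)
print(f"functional volume: {functional} cm^3")
print(f"structural volume: {structural} cm^3")
```

A fully bundled muscle has equal functional and structural volumes, whereas a heavily interdigitated muscle shows a large gap between the two.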

Figure 3. Functional and structural muscle volumes arranged by size. Bars include bundled fibers (black), interdigitated fibers (gray and light gray) with interdigitated muscles indicated on bar.

Reprinted from Stone et al. (2018), CMBBE (Computer Methods in Biomechanics and Biomedical Engineering).

This figure includes the floor of the mouth muscles, geniohyoid (GH), the anterior belly of digastric (ABD), and the mylohyoid (MH). Even though the floor muscles move the jaw and hyoid bone and do not enter the tongue body, the first two shorten the distance between the jaw and hyoid, which globally elevates the tongue, and the U-shaped MH elevates the tongue as well when shortened.

The structural and functional muscle volumes both show that the GG is the largest tongue muscle and is the workhorse of the tongue. The GG’s size and midline location allow it to control the midsagittal tongue shape almost single-handedly, by creating local and global midline grooving. In the functional muscle volumes (Figure 3, left), three intrinsic muscles, the SL, T, and V, plus the GG, occupy most of the tongue, control 70% of its volume, and maintain all three directions of motion. The SL and Genioglossus Posterior (GGp) muscles control the anterior-posterior (AP) direction, the T controls the left-right (LR) direction, and the V and Genioglossus Anterior (GGa) control the superior-inferior (SI) direction. These four muscles can move the most tissue with the least effort. They comprise a functional grouping that produces large-scale deformations of the tongue in three directions. They may also deform locally to produce smaller deformations, as they are densely innervated throughout their volumes.

Figure 3 also demonstrates another notable feature of the tongue musculature, the extent of its interdigitation. Almost all the muscles within the tongue proper are interdigitated, including the extrinsic muscles. This reduces the structural volume (Figure 3, right) of most of the muscles. The exceptions are IL, which is bundled anteriorly (cf. Abd-el-Malek, 1939), and the floor of mouth muscles: digastric (ABD), mylohyoid (MH), and geniohyoid (GH), whose singular muscle direction is consistent with other skeletal muscles that connect bones to bones.

Interdigitation is very unusual in biology and is limited to structures with no bones, like muscular hydrostats, the lips, the velum, and the heart. Typically, muscles move bones and muscle fibers are bundled and unidirectional. The overwhelming use of bundled muscles in the body suggests that it is the preferred architecture for muscles. Indeed, several disadvantages of interdigitation can be imagined. One potential disadvantage is the difficulty of independent innervation for two orthogonal interwoven muscles that are quite probably antagonistic. The second disadvantage is the reduction in tissue flexibility; the more fiber directions there are, the more difficult it is to bend the structure. The third is increased friction occurring when the muscles shorten and each fiber slides past an orthogonal one. In fact, the tongue is replete with fat and a grid of connective tissue through which its muscle fibers pass and which reduces friction and potential heat as orthogonal fibers shorten.

Figure 4. Comparison between modeled (left) and experimentally obtained (right) muscle tracts. The modeled GG and GH tracts A have approximately the same orientation (color) as the experimental B tractography results. The whole tongue view of the modeled tracts C contains all muscles. The experimental whole tongue tractography D has no transverse fibers due to the single fiber assumption.

Reprinted from Gomez et al. (2018).

Interestingly, the interdigitated muscle groups in the tongue are composed of two orthogonal fiber directions, but not three, as shown in dissections (cf. Abd-el-Malek, 1939; Miyawaki, 1974) and MRI-Diffusion Tensor Images (DTI). Figure 4 shows how the vertical fibers of Genioglossus and Verticalis exist in very thin sheets that alternate with the horizontal fibers of the transversus. DTIs are made by measuring Brownian motion of water in muscle cells. Water moves more rapidly within a cell than across its membrane (osmosis). A long thin cell, such as a muscle or nerve cell, will exhibit faster water motion along its length than across its width. This difference in speed is tracked with DTI to determine fiber direction. The V and GG fibers in Figures 2 and 4 persist to the top of the tongue surface and interdigitate superiorly with the lengthwise fibers of SL. The T fibers, however, stop below the superior longitudinal muscle, thus preventing three orthogonal directions. One can imagine even more disadvantages to three fiber directions than two. Lingual tissue is volume preserving. If 3D interdigitated muscles were activated simultaneously, the 3D shortening would violate volume preservation because one direction must expand. In addition, three fiber directions would mean an even more tightly packed set of fibers in a single region, making it even more difficult for the muscle to expand laterally when shortened and more likely for the muscles to compress local blood vessels. Thus, 2D orthogonality of the tongue muscles is advantageous to its motion requirements, despite its disadvantages. Three interdigitated fiber directions are not found in the tongue, suggesting that these additional disadvantages are too difficult to overcome (Stone et al., 2018).
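The way DTI turns "faster diffusion along the fiber" into a fiber direction can be sketched as follows. A common formulation, assumed here and not spelled out in the text, represents diffusion in each voxel as a symmetric 3x3 tensor whose dominant eigenvector is taken as the local fiber direction. The tensor values and the function name are hypothetical, and power iteration stands in for whatever solver a real DTI pipeline would use.

```python
import math


def principal_direction(tensor, iterations=200):
    """Estimate the dominant eigenvector of a symmetric 3x3 diffusion
    tensor by power iteration.

    The fixed start vector (1, 1, 1) is an illustrative shortcut: the
    method fails to converge on the right axis if that vector happens
    to be orthogonal to the true principal direction.
    """
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        # Multiply the tensor by the current estimate, then renormalize.
        w = [sum(tensor[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("tensor maps the probe vector to zero")
        v = [c / norm for c in w]
    return v


# Hypothetical voxel: diffusion is fastest along z (a vertical fiber),
# slower and equal along x and y. Units are arbitrary.
tensor = [
    [0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 1.7],
]
direction = principal_direction(tensor)
print("estimated fiber direction:", [round(c, 3) for c in direction])
```

A single tensor per voxel yields only one dominant direction, which is consistent with the single-fiber limitation noted in the Figure 4 caption for regions where two fiber directions cross.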

3. Tongue Innervation

Tongue muscle anatomy provides the bases for tongue motion, that is, extrinsic versus intrinsic muscles and protruders versus retruders. In this respect the fiber orientation of the human tongue is not that different from the tongue of other animals, such as the cat (Figure 2, right), despite the shape differences. Humans, however, use their tongue in much more complex ways for skilled speech than other animals do. We can learn (and understand) different languages, dialects, and accents. We produce consistent speech standing up, lying down, walking, and running. To do this requires both distinctive deformations and global categorizations that include subtle phonemic differences across languages and dialects. These tasks require complex synergies of local tongue muscle activations. Despite the simple muscle categorizations based on anatomy, muscle complexity is very prominent in the fiber patterns and the neural composition of the tongue. For example, fiber analysis shows that SL spans the tongue length with short overlapping fibers, not long ones, providing an opportunity for localized fiber innervation and shortening (Slaughter & Sokoloff, 2005). The GG muscle, on the other hand, has long fibers with multiple innervation regions (anterior, medial, posterior), also providing an opportunity for localization of fiber innervation and shortening (Miyawaki, 1975; Mu & Sanders, 1999). In both cases, the innervation can independently activate multiple anterior-to-posterior muscle regions. Indeed, neuroanatomy indicates extensive innervation of the tongue, with as many as 13,000 motoneurons in the human hypoglossal nucleus, the source of tongue motor control (Atsumi & Miyatake, 1987; O’Kusky & Norman, 1995; Wozniak & Young, 1969). Analyses of nerve distribution in the tongue show that of the 12 cranial nerves, 5 have a branch devoted to the tongue, representing sensation, motor control, and taste (see Table 2). 
The 12th cranial nerve, the hypoglossal nerve, is entirely devoted to tongue motor control. This extensive neuromuscular system and complex muscle fiber orientation are compatible with local innervation of muscle fibers and local tongue control, not whole-muscle activation, and would facilitate the extensive variety of deformations available to the tongue for speech.

Table 2. Cranial Nerves That Innervate the Tongue




Tongue Location

anterior 2/3rd
posterior 1/3rd
7 tongue muscles
Chorda tympani: anterior 2/3rd
posterior 1/3rd

Anatomical muscle groupings such as extrinsic/intrinsic muscles or protruder/retruder appear at first glance to be consistent with the neural organization of tongue motor control. McClung and Goldberg (1999, 2000) identified the location of the tongue protruders in the ventral region of the Hypoglossal Nucleus (HGN) and the lateral branch of the hypoglossal nerve. The retruders are represented in the dorsal region of the HGN and the medial branch of the hypoglossal nerve. For a long time, the extrinsic muscles were thought to move the tongue as a rigid body, while intrinsic muscles fine-tuned the tongue shape into subtle, minimal deformations (cf. Hardcastle, 1976; Perkell, 1969). This was a reasonable interpretation as other structures, such as the hands, use extrinsic muscles for power and intrinsic muscles for precision (Long, Conrad, Hall, & Furler, 1970). In addition, extrinsic muscles originate on an immobile rigid bone, which guarantees that activation will cause only the insertion end of the muscle to move. Biomechanically, this would create a direct response to muscle activation and potentially pull the entire tongue toward the bone. The intrinsic muscles connect to two regions of soft tissue and contracting them would cause compression within the tongue and expansion in an opposing direction.

This premise, however, is too simplistic. The tongue muscles, controlled by thousands of motor units, result in myriad shapes observed across languages and even dialects. Many variable shapes occur for even essentially the same phoneme. Speech motion, and even breathing, require shape complexity beyond what simple protrusion and retrusion would imply (Sokoloff, 2004). The variety of tongue deformations used in speech is crucially dependent on the complexity of tongue muscle architecture. Thus, the neurological grouping of protruders versus retruders, for example, is too simple a categorization scheme; speech requires much more complicated tongue deformations than the simple protrusion designed to catch food and convey it into the mouth. Tongue motion becomes differentiated for speech early in life (Giulivi, Whalen, Goldstein, Nam, & Levitt, 2011). “The sucking reflex results in a piston-like protrusion and retraction movement of the tongue. Adult mastication and deglutition patterns emerge around 6 months of age” (Seikel, Konstantopoulos, & Drumright, 2018, p. 331). By adulthood even the simplest speech gestures use co-contraction of tongue muscles despite the simple neurological representation (cf. Baer, Alfonso, & Honda, 1988; MacNeilage & Sholes, 1964; Miyawaki, 1975).

Because speech requires rapid, continuous motion, the control of speech muscle activity must be continuous and rapid. Two types of control systems are available to the tongue: closed-loop and open-loop. A closed-loop system uses sensory feedback to identify and correct errors and to respond reflexively to stimuli. An open-loop system continuously compares the state of a movement with a well-known motor program for that movement, and is a feed-forward, rather than a feedback, process. Closed-loop systems use real-time feedback to adjust a motion automatically toward a reference target. This is particularly helpful for reflexive motions that are protective, and when gaining the skills needed to learn a new task, like speech. Once motor learning is acquired, the skilled behavior can be executed more rapidly (Cooper, 1953). At that point, open-loop programs are utilized to allow faster speech motion.
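The difference between the two control schemes can be illustrated with a toy one-dimensional movement. This sketch assumes the conventional control-theory reading: open-loop control plays back a pre-planned sequence of commands, and closed-loop control senses the remaining error at every step. The functions, gain, and disturbance values are hypothetical, and nothing here models real tongue biomechanics or neural delays.

```python
def open_loop(commands, disturbance=0.0):
    """Play back a pre-planned command sequence with no sensing,
    so a disturbance is never corrected."""
    position = 0.0
    for step in commands:
        position += step + disturbance
    return position


def closed_loop(target, steps, gain=0.5, disturbance=0.0):
    """At every step, sense the remaining error and move a fixed
    fraction (`gain`) of it toward the target."""
    position = 0.0
    for _ in range(steps):
        error = target - position
        position += gain * error + disturbance
    return position


target = 1.0
plan = [0.1] * 10      # motor program learned for an undisturbed movement
push = -0.02           # constant disturbance applied at every step

print(f"open loop, no disturbance:   {open_loop(plan):.3f}")
print(f"open loop, disturbed:        {open_loop(plan, push):.3f}")
print(f"closed loop, disturbed:      {closed_loop(target, 10, disturbance=push):.3f}")
```

In this toy setting the feedback controller ends closer to the target when disturbed, while the open-loop plan is accurate only when conditions match those it was learned under. The sketch omits the cost that matters for speech, namely the time needed to sense and respond at each step.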

Feedback in the human tongue comes from several sources. Muscle spindles are stretch receptors positioned parallel to muscle fibers that detect changes in the length, or stretch, of a muscle. The stretch information is sent, via afferent fibers, to the central nervous system and a motor response is generated in the 12th Cranial Nerve. Muscle spindles provide feedback in a closed-loop system to allow for motion adjustments during speech. Spindles have been observed in human tongue dissections, but they are distributed unequally across the musculature (Cooper, 1953). The tip of the tongue has very few spindles, and they occur only in the V muscle. Near the tongue surface, muscle spindles are found in the SL muscle at mid-tongue but not in the tip or back. Spindles also occur in the GG where it enters the base of the tongue, throughout the T muscle, and in V just lateral to the bulk of the GG. This is a sparse distribution considering the high motor innervation of the tongue muscles. Another potential feedback source is the tonic stretch receptor. A tonic stretch reflex occurs as a rapid, protective response to excessive lengthening in a muscle. Tonic stretch receptors are found in the jaw, but not in the tongue, however (Neilson, Andrews, Guitar, & Quinn, 1979). The tongue also gets sensory input to motor control from the lingual nerve. The tongue tip and blade have fascicles that contain large numbers of low threshold mechanoreceptive afferents responsive to touch (Trulsson & Essick, 1997). Fast-adapting superficial units respond when the tongue contacts other intraoral structures. Slow-adapting deep units encode information about tongue movement without the need for direct contact. They provide feedback about the position and motion of the tongue during speech, swallowing, and other motions.

Tongue muscle activation is a complex dynamic system, not an interlaced set of independent feedback circuits. Therefore, an open-loop system may work better for monitoring and correcting rapid, overlearned behaviors, such as speech. If a speech motion is executed incorrectly, the speaker will hear the error and will slow down and try again. This allows faster speech than waiting for real-time individual muscle feedback to steer the motion. Nonetheless, closed-loop feedback is involved in moving the tongue during speech, providing proprioceptive feedback on an ongoing basis. The loss of closed-loop feedback can be observed when injection of a dental anesthetic removes sensory feedback in the oral cavity. This makes speech motions clumsy and slow (Seikel et al., 2018).

The ideal way to study motor control of a muscle is with EMG, which measures the electrical charge that occurs when a muscle activates. In the tongue, hooked-wire electrodes, consisting of two fine wires adjacently positioned, are inserted in the tongue at the location of an individual muscle. When the muscle activates, it generates an electrical impulse, which is recorded as it passes the wires. The wires record timing and amplitude of the impulses. Usually EMG is used in conjunction with other instruments to relate muscle activity to tongue motion or shape. EMG in the tongue has elucidated motor control of vowels (Alfonso & Baer, 1982; Bell-Berti, Raphael, Pisoni, & Sawusch, 1979; Waltl & Hoole, 2008), and coarticulation (Recasens, 1991). For example, Alfonso and Baer (1982) studied 10 vowels in /əpVp/ context using EMG, acoustics, and lateral cinefluoroscopy. Their EMG data for GG identified the coordinated timing of muscle activity with gestural events, found greater activity for tense than lax vowels, and detailed the use of GGp for tongue raising and fronting. Other studies found phase relationships between articulators during controlled motions (Tuller, Kelso, & Harris, 1982). EMG of the tongue has also been used to study breathing and sleep apnea. Sauerland and Mitchell (1975) were the first to observe that GGp contracts with every inhalation, ensuring airway patency during respiration. Strohl and Redline (1986) measured loss of activity during episodes of obstructive sleep apnea. The use of EMG in the tongue has limitations, however, because tongue muscles are almost entirely interdigitated within the tongue body. Thus, a signal coming from any specific location might originate from either of the two interdigitated muscles. 
Since the two muscles have opposing fiber orientations, they may include protrusor/retrusor and elevator/depressor combinations of muscles, so the EMG signal cannot clearly delineate the exact muscle and expected direction of tissue motion.

To enhance EMG or as a stand-alone technique, tongue modeling has been used to predict the relationship between muscle activation and tongue surface shape (cf. Dang & Honda, 2002; Fang, Fujita, Lu, & Dang, 2009; Lofqvist & Lindblom, 1994). Inverse modeling uses tongue surface shape changes seen in MRI or cinefluoroscopy to estimate the active muscles that caused the change. Predictive modeling estimates the tongue surface shapes that will result from activation of specific muscles. Inverse and predictive modeling are often used together to understand speech motor control, including the role of the tongue in creating vowels (Buchaillard, Perrier, & Payan, 2009; Wu, Dang, & Stavness, 2014), consonants (Harandi, Woo, Stone, Abugharbieh, & Fels, 2017), and allophonic variants (Stavness, Gick, Derrick, & Fels, 2012), and to test theories of motor control (Dang & Honda, 2004; Perrier, Loevenbruck, & Payan, 1996).

4. Tongue Muscle Function

Tongue muscles can be measured by MRI in several different ways. First, anatomical MRI, sometimes called structural MRI, is the classic method for imaging static anatomy, such as the muscles of the tongue. In this method, a lengthwise magnetic field is applied to the body, which causes hydrogen protons to align lengthwise. Hydrogen is most prevalent in the human body as water, found in muscles and fat. To image the hydrogen protons, a radio frequency (RF) pulse is applied to the tissue, which knocks the protons out of alignment by as much as 90 degrees. When the pulse ends, the protons rapidly realign with the magnetic field, emitting a small RF signal of their own. This signal is measured and becomes the basis of the MR image. Tissues with more hydrogen are whiter in the grayscale image. The second MRI method, cine-MRI, records motion in the tongue. It makes an MRI movie by modifying the structural MRI recording process. After the protons are knocked out of alignment, and while they are realigning, a speech task is uttered. Multiple time frames are collected during the speech task, which is timed to occur between the removal of the RF pulse and the full return of the protons to their base orientation. The time frames are reconstructed into a movie of the motion (Stone et al., 2001a). The third MRI method, tagged MRI, provides images of internal tongue tissue motion, from which muscle shortening can be extracted. Tagged MRI tracks a magnetically induced pattern of stripes or a grid, imposed on a volume of soft tissue, as it moves in time. The tags are added to a cine-MRI movie. After the protons are knocked out of alignment, and just before they begin to realign to the magnetic field, a spatial gradient is applied to the tongue. The gradient dephases the protons in rows (in k-space), so that when the protons realign and their emitted signals are read by the MRI machine, the out-of-phase protons are invisible. The reconstructed images have black stripes where the missing protons reside. During speech motions, as the tongue deforms, the magnetic lines move with that deformation. Horizontal and vertical dephasing creates a grid of lines that move with the tissue, which allows identification of tissue point positions over time in a motion sequence. For a more complete explanation, see Stone et al. (2001b). Thus, tissue points along the length of a muscle can be tracked over time during speech and their distances calculated. As the points move closer, they reflect muscle shortening consistent with muscle activation. Interpretation of muscle shortening must be made with care, however, since muscles can shorten passively.
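The distance calculation described above can be sketched in a few lines. This is a minimal illustration, assuming that tracked tissue points along one fiber line are already available as 3D coordinates in millimeters for each time frame; the coordinates and function names are hypothetical and do not come from any published pipeline.

```python
import math

def path_length(points):
    """Sum of straight-line distances between consecutive 3D points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def length_change_percent(frames):
    """Percent change in path length at each time frame, relative to the
    first frame. Negative values indicate shortening."""
    reference = path_length(frames[0])
    return [100.0 * (path_length(f) - reference) / reference for f in frames]

# Hypothetical tracked points (mm) along one fiber line at three time frames.
frames = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (18.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (8.0, 0.0, 0.0), (16.0, 0.0, 0.0)],
]
changes = length_change_percent(frames)
print(changes)  # 0%, -10%, and -20% relative to the first frame
```

A negative value shows only that the points moved closer together. As noted above, it cannot by itself distinguish active contraction from passive shortening.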

As a soft structure whose range of motion is bounded by the oral cavity, the tongue uses small motions. It moves by deforming, and it receives considerable feedback from contact with other structures. The tongue also can use contact with its surrounding structures to ‘brace’ one region as another region moves (cf. Gick, Allen, Roewer-Després, & Stavness, 2017; Stone, 1991; Stone & Lundberg, 1996). Rapid motions of the tongue tip, for example, may depend on tongue body muscles activating to stiffen the posterior tongue to support or increase the rapidity of local tip movement. While any muscle can serve to support or stiffen local tongue regions, the large-volume muscles are well suited to supporting local motions made by smaller muscles.

Many tongue muscles have similar, or redundant, fiber directions. This redundancy is one of the reasons the healthy tongue does not fatigue even during extensive speaking, unlike the jaw and vocal folds. Another reason is that the tongue uses only about 15–25% of its maximal strength during speech (Barlow & Abbs, 1983). Redundancy in location and direction occurs among intrinsic and extrinsic muscles, which allows both muscle groups to be used and to trade off in the same gesture. During protrusion, the tongue can use the intrinsic (T, V) and the extrinsic (GGp) muscles. Similarly, tongue retrusion is executed by the intrinsic (SL, IL) and the extrinsic (SG, HG) muscles. This redundancy also results in overspecification; that is, multiple motor programs might provide the same tongue shapes and acoustic results. Two obvious examples are a ventriloquist and a patient recovering from a speech disorder by training new muscle synergies.

Figure 5. Muscle shortening during speech for /is/ and /əs/.

Source: Author.

Figure 5 shows tagged-MRI evidence for different muscle synergies due to coarticulation. For example, in the root, the three directions of muscle shortening are captured by (a) GGp, with radial fibers that run anterior to posterior in the root, (b) Tp, with left-to-right crosswise fibers, and (c) SL, with lengthwise fibers that run along the upper tongue surface. Coarticulation may cause a different degree of usage of these muscles for the same sound. To explore this, tongue muscle shortening patterns were calculated from 3D tagged-MRI recordings of motion into /s/ during two phonetically driven tasks, “a geese” (/is/) and “a souk” (/əs/). The data for each word were spatially and temporally aligned across 10 subjects and averaged, showing different muscle shortening patterns for the three muscles between words. For example, the SL (red) lengthened in the /ə/ to /s/ gesture, reflecting the anterior tongue lengthening, but not during the /i/ to /s/ gesture, as the latter two sounds are made in a similar location. The posterior muscles also differed between the two /s/ contexts. The posterior GGp and Tp were unchanging in length from /ə/ to /s/, consistent with no posterior tongue shape change between the two sounds. For /i/-/s/, however, the left-to-right Tp fibers (gray) shortened first, which would narrow the posterior tongue and expand the tongue lengthwise. Partway through the gesture, the GGp (orange) shortened to prevent too much posterior tongue backing. Thus, the /s/ gesture exhibited different muscle synergies due to coarticulation in two vowel contexts.

Glossectomy surgery is performed on tongue cancer patients to remove a malignant tumor and a margin of healthy tissue. Studies of postglossectomy tongue motion have augmented our understanding of tongue motor control and tongue-palate interaction. This is an informative population to study, because a small local region of the tongue is affected by the surgery, but the surrounding motor control, sensation, and, of course, hearing are unaffected. Thus, the patients' speech problems stem from an anatomical change: the local loss of tissue at the cancer site. The difference from healthy controls reveals what motions are easy versus difficult to change in speech. Grimm et al. (2017) showed that healthy control subjects were influenced by palate height in the use of apical versus laminal /s/. Low-palate controls were more likely to use an apical /s/, and high-palate controls were more likely to use a laminal /s/. This is consistent with the sensitivity of high tongue body sounds to palate height (cf. Brunner, Fuchs, & Perrier, 2009; Weirich & Fuchs, 2013). Glossectomy patients, on the other hand, tended to use a laminal /s/ irrespective of palate height. The tumor location in these patients was unilateral and posterior to the tongue tip and blade. In the human tongue, muscles flow forward into the tip from the back and base. Thus, the resections likely reduced the motor control of the tongue tip, making tip elevation quite challenging for these patients. This motor control loss probably overcame any presurgical preference for /s/ based on palate height, indicating that the influence of palate height on /s/-type is not insurmountable.

Stone, Langguth, Woo, Chen, and Prince (2014) compared tongue movements from /i/ to /s/ between control subjects and glossectomy patients. They used a principal component analysis (PCA) and a linear discriminant analysis (LDA) to compare motion patterns in the tumor versus nontumor side of the tongue in patients with the smaller versus the larger side of the tongue in controls. PC1 was found to represent the direction of the entire tongue gesture: forward versus downward. PC2 represented independence between tip and body motion. These two PCs accounted for about 68% of the data on each side. When comparing controls to patients, the first two PCs accounted for much more variance on the tumor/small side (14.7%) than the nontumor/large side (1.4%). In other words, the patients tended to compensate for the loss of tissue by modifying behavior of the tumor side, where tissue was removed, not the nontumor side, even though it is stronger and better controlled. The LDA similarly distinguished function between the subject groups, based on all calculated PCs. It found that on the nontumor side, 10 PCs, or 98.7% of the data, were needed for 100% correct discrimination between controls and patients. On the tumor side, only seven PCs, or 92.4% of the data, were needed for 100% discrimination, which further supports greater differences between controls and patients on the tumor side than the nontumor side.
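Statements such as "10 PCs, or 98.7% of the data" refer to the cumulative share of total variance carried by the leading principal components. The arithmetic can be sketched as follows, assuming the per-component variances are already known and sorted from largest to smallest. The variance values and function names here are hypothetical illustrations, not results or code from the study.

```python
from itertools import accumulate

def cumulative_percent(pc_variances):
    """Cumulative percent of total variance after each successive PC."""
    total = sum(pc_variances)
    return [100.0 * c / total for c in accumulate(pc_variances)]

def components_needed(pc_variances, target_percent):
    """Smallest number of leading PCs whose cumulative variance reaches the
    target percent; returns all PCs if the target is never reached."""
    for count, percent in enumerate(cumulative_percent(pc_variances), start=1):
        if percent >= target_percent:
            return count
    return len(pc_variances)

# Hypothetical PC variances, largest first (not values from the study).
variances = [50.0, 20.0, 12.0, 8.0, 5.0, 3.0, 2.0]
print(cumulative_percent(variances))
print(components_needed(variances, 90.0))  # 4 PCs reach 90% for these values
```

Needing fewer components to separate two groups, as reported for the tumor side, suggests that the group difference is concentrated in the dominant patterns of motion rather than spread across many minor ones.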

The anatomy, neurology, and function of the tongue reveal the remarkable coordination of structure, control, and deformation that is used by the tongue to perform all the tasks demanded of it.

5. Conclusion

The tongue is a highly innervated, complexly controlled structure, surrounded by structures that limit its range and direction of motion. Within these limits it plays an essential role in a surprisingly large number of functions, including chewing, swallowing, breathing, speaking, and singing. The anatomy of the human tongue and its boundary structures in the oral cavity are almost certainly optimized for speech. The curved, close oral cavity boundary, coupled with the tongue’s relatively large size, minimizes the tongue motion needed to make constrictions and obstructions in the vocal tract during speech, which maximizes motion efficiency. As part of this efficiency, the tongue deforms in this limited space using an almost unique method, that of a muscular hydrostat. When tongue deformation is insufficient for creating a sound, the jaw can lower the tongue, using translation and rotation to increase airway size and allow many additional sounds to be made. The hard palate, which is the immobile upper boundary, facilitates the production of tongue shapes that require bracing and helps direct the acoustic airflow through and around the oral structures. Phonemic variety is increased even further by the interaction of tongue shape with velar position, as the velum lowers to engage the nasal resonator. In addition, the lips can act with the tongue to create even more phonetic variety, and voicing potentially doubles the consonant pool. Thus, the tongue’s function, crucial to speech, is also enmeshed with the other structures of the vocal tract to ensure the production of a large variety of speech sounds.

Further Reading

  • Kier, W. M., & Smith, K. K. (1985). Tongues, tentacles and trunks: The biomechanics of movement in muscular-hydrostats. Zoological Journal of the Linnean Society, 83, 307–324.
  • Miyawaki, K. (1975). A preliminary report on the electromyographic study of the activity of lingual muscles. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo, 9, 91–106.
  • Mu, L., & Sanders, I. (1999). Neuromuscular organization of the canine tongue. Anatomical Record, 256, 412–424.
  • Sauerland, E. K., & Mitchell, S. P. (1975). Electromyographic activity of intrinsic and extrinsic muscles of the human tongue. Texas Reports on Biology and Medicine, 33, 444–455.
  • Slaughter, K., Li, H., & Sokoloff, A. J. (2005). Neuromuscular organization of the superior longitudinalis muscle in the human tongue. Cells Tissues Organs, 181(1), 51–64. doi: 10.1159/000089968
  • Smith, K. K., & Kier, W. M. (1989). Trunks, tongues and tentacles: Moving with skeletons of muscle. American Scientist, 77, 29–35.
  • Stone, M. (2006). Imaging and measurement of the vocal tract. In K. Brown (Ed.), Encyclopedia of language and linguistics (Vol. 5, pp. 526–539, 2nd ed.). Oxford, UK: Elsevier.
  • Stone, M., Gomez, A. D., Zhuo, J., Tchouaga, A. L., & Prince, J. L. (2019). Quantifying tongue tip shape in apical and laminal /s/: Contributions of palate shape. Journal of Speech, Language, and Hearing Research, 62(9), 3149–3159.
  • Stone, M., Woo, J., Lee, J., Poole, T., Seagraves, A., Chung, M., . . . Blemker, S. S. (2018). Structure and variability in human tongue muscle anatomy. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 6(5), 499–507.
  • Takemoto, H. (2001). Morphological analyses of the human tongue musculature for three-dimensional modeling. Journal of Speech, Language, and Hearing Research, 44, 95–107.

References


  • Abd-el-Malek, S. (1939). Observations on the morphology of the human tongue. Journal of Anatomy, 73, 201–210.
  • Alfonso, P. J., & Baer, T. (1982). Dynamics of vowel articulation. Language and Speech, 25, 151–173.
  • Atsumi, T., & Miyatake, T. (1987). Morphometry of the degenerative process in the hypoglossal nerves in amyotrophic lateral sclerosis. Acta Neuropathologica, 73, 25–31.
  • Baer, T., Alfonso, P. J., & Honda, K. (1988). Electromyography of the tongue muscles during vowels in /apvp/ environment. Annual Bulletin RILP, No. 22, 7–19.
  • Barlow, S. M., & Abbs, J. H. (1983). Force transducers for the evaluation of labial, lingual, and mandibular motion impairments. Journal of Speech and Hearing Research, 26, 616–621.
  • Bell-Berti, F., Raphael, L. J., Pisoni, D. B., & Sawusch, J. R. (1979). Some relationships between speech production and perception. Phonetica, 36(6), 373–383. doi: 10.1159/000259974
  • Brunner, J., Fuchs, S., & Perrier, P. (2009). On the relationship between palate shape and articulatory behavior. The Journal of the Acoustical Society of America, 125(6), 3936–3949.
  • Buchaillard, S., Perrier, P., & Payan, Y. (2009). A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning. The Journal of the Acoustical Society of America, 126(4), 2033–2051. doi: 10.1121/1.3204306
  • Cooper, S. (1953). Muscle spindles in the intrinsic muscles of the human tongue. The Journal of Physiology, 122(1), 193–202.
  • Dang, J., & Honda, K. (2002). Estimation of vocal tract shapes from speech sounds with a physiological articulatory model. Journal of Phonetics, 30(3), 511–532.
  • Dang, J., & Honda, K. (2004). Construction and control of a physiological articulatory model. The Journal of the Acoustical Society of America, 115(2), 853–870.
  • Fang, Q., Fujita, S., Lu, X., & Dang, J. (2009). A model-based investigation of activations of the tongue muscles in vowel production. Acoustical Science and Technology, 30(4), 277–287.
  • Gick, B., Allen, B., Roewer-Després, F., & Stavness, I. (2017). Speaking tongues are actively braced. Journal of Speech, Language, and Hearing Research, 60(3), 494–506.
  • Giulivi, S., Whalen, D. H., Goldstein, L. M., Nam, H., & Levitt, A. G. (2011). An articulatory phonology account of preferred consonant-vowel combinations. Language Learning and Development, 7(3), 202–225.
  • Gomez, A. D., Elsaid, N. M., Stone, M., Zhuo, J., & Prince, J. L. (2018). Laplace-based modeling of fiber orientation in the tongue. Biomechanics and Modeling in Mechanobiology, 17, 1119–1130.
  • Grimm, D., Stone, M., Woo, J., Lee, J., Hwang, J.-H., Bedrosian, G. E., & Prince, J. L. (2017). The effects of palate features and glossectomy surgery on /s/ production. Journal of Speech, Language, and Hearing Research, 60, 3417–3425. doi: 10.1044/2017_JSLHR-S-16-0425
  • Harandi, N. M., Woo, J., Stone, M., Abugharbieh, R., & Fels, S. (2017). Variability in muscle activation of simple speech motions: A biomechanical modeling approach. The Journal of the Acoustical Society of America, 141, 2579–2590.
  • Hardcastle, W. J. (1976). Physiology of speech production. London, UK: Academic Press.
  • Kahane, J., & Folkins, J. (1984). Atlas of speech and hearing anatomy. Toronto, Canada: Charles E. Merrill.
  • Kier, W. M., & Smith, K. K. (1985). Tongues, tentacles and trunks: The biomechanics of movement in muscular-hydrostats. Zoological Journal of the Linnean Society, 83, 307–324.
  • Liberman, A. (1996). Speech: A special code. Cambridge, MA: MIT Press.
  • Lofqvist, A., & Lindblom, B. (1994). Speech motor control. Current Opinion in Neurobiology, 4, 823–826.
  • Long, C., Conrad, P. W., Hall, E. A., & Furler, S. L. (1970). Intrinsic-extrinsic muscle control of the hand in power grip and precision handling. Journal of Bone and Joint Surgery (American volume), 52, 853–867.
  • MacNeilage, P., & Sholes, G. (1964). An electromyographic study of the tongue during vowel production. Journal of Speech and Hearing Research, 7, 209–232.
  • McClung, J. R., & Goldberg, S. J. (1999). Organization of motoneurons in the dorsal hypoglossal nucleus that innervate the retrusor muscles of the tongue in the rat. The Anatomical Record, 254, 222–230.
  • McClung, J. R., & Goldberg, S. J. (2000). Functional anatomy of the hypoglossal innervated muscles of the rat tongue: A model for elongation and protrusion of the mammalian tongue. The Anatomical Record, 260, 378–386.
  • Miyawaki, K. (1974). A study on the musculature of the human tongue. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo, 8, 23–49.
  • Miyawaki, K. (1975). A preliminary report on the electromyographic study of the activity of lingual muscles. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, University of Tokyo, 9, 91–106.
  • Mu, L., & Sanders, I. (1999). Neuromuscular organization of the canine tongue. Anatomical Record, 256, 412–424.
  • Mu, L., & Sanders, I. (2000). Neuromuscular specializations of the pharyngeal dilator muscles: II. Compartmentalization of the canine genioglossus muscle. Anatomical Record, 260(3), 308–325. doi: 10.1002/1097-0185(20001101)260:3<308::AID-AR70>3.0.CO;2-N
  • Neilson, P. D., Andrews, G., Guitar, B. E., & Quinn, P. T. (1979). Tonic stretch reflexes in lip, tongue and jaw muscles. Brain Research, 178(2–3), 311–327.
  • O’Kusky, J. R., & Norman, M. G. (1995). Sudden infant death syndrome: Increased number of synapses in the hypoglossal nucleus. Journal of Neuropathology and Experimental Neurology, 54, 627–634.
  • Perkell, J. S. (1969). Physiology of speech production: Results and implications of a quantitative cineradiographic study. MIT Research Monograph No. 53. Cambridge, MA: MIT Press.
  • Perrier, P., Loevenbruck, H., & Payan, Y. (1996). Control of tongue movements in speech: The equilibrium point hypothesis perspective. Journal of Phonetics, 24, 53–75.
  • Recasens, D. (1991). An electropalatographic and acoustic study of consonant-to-vowel coarticulation. Journal of Phonetics, 19, 177–192.
  • Sauerland, E. K., & Mitchell, S. P. (1975). Electromyographic activity of intrinsic and extrinsic muscles of the human tongue. Texas Reports on Biology and Medicine, 33, 444–455.
  • Seikel, J. A., Konstantopoulos, K., & Drumright, D. G. (2018). Neuroanatomy and neurophysiology for speech and hearing sciences. Plural Publishing.
  • Slaughter, K., Li, H., & Sokoloff, A. J. (2005). Neuromuscular organization of the superior longitudinalis muscle in the human tongue. Cells Tissues Organs, 181(1), 51–64. doi: 10.1159/000089968
  • Smith, K. K., & Kier, W. M. (1989). Trunks, tongues and tentacles: Moving with skeletons of muscle. American Scientist, 77, 29–35.
  • Sokoloff, A. J. (2004). Activity of tongue muscles during respiration: It takes a village? Journal of Applied Physiology, 96, 438–439. doi: 10.1152/japplphysiol.01079.2003
  • Stavness, I., Gick, B., Derrick, D., & Fels, S. (2012). Biomechanical modeling of English /r/ variants. The Journal of the Acoustical Society of America, 131(5), EL355–EL360.
  • Stone, M. (1991). Imaging the tongue and vocal tract. International Journal of Language Communication Disorders, 26, 11–23. doi: 10.3109/13682829109011990
  • Stone, M. (2006). Imaging and measurement of the vocal tract. In K. Brown (Ed.), Encyclopedia of language and linguistics (Vol. 5, pp. 526–539, 2nd ed.). Oxford, UK: Elsevier.
  • Stone, M., Davis, E., Douglas, A., Ness Aiver, M., Gullapalli, R., Levine, W., & Lundberg, A. (2001a). Modeling tongue surface contours from cine-MRI images. Journal of Speech, Language, and Hearing Research, 44, 1026–1040.
  • Stone, M., Davis, E., Douglas, A., Ness Aiver, M., Gullapalli, R., Levine, W., & Lundberg, A. (2001b). Modeling the motion of the internal tongue from tagged cine-MRI images. Journal of the Acoustical Society of America, 109(6), 2974–2982.
  • Stone, M., Gomez, A. D., Zhuo, J., Tchouaga, A. L., & Prince, J. L. (2019). Quantifying tongue tip shape in apical and laminal /s/: Contributions of palate shape. Journal of Speech, Language, and Hearing Research, 62(9), 3149–3159.
  • Stone, M., Langguth, J. M., Woo, J., Chen, H., & Prince, J. L. (2014). Tongue motion patterns in glossectomy and normal speakers: A principal components analysis. Journal of Speech, Language, and Hearing Research, 57(3), 707–717. doi: 10.1044/1092-4388(2013/13-0085)
  • Stone, M., & Lundberg, A. (1996). Three-dimensional tongue surface shapes of English consonants and vowels. The Journal of the Acoustical Society of America, 99(6), 3728–3737.
  • Stone, M., Woo, J., Lee, J., Poole, T., Seagraves, A., Chung, M., . . . Blemker, S. S. (2018). Structure and variability in human tongue muscle anatomy. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 6(5), 499–507.
  • Strohl, K. P., & Redline, S. (1986). Nasal CPAP therapy, upper airway muscle activation, and obstructive sleep apnea. The American Review of Respiratory Disease, 134(3), 555–558.
  • Takemoto, H. (2001). Morphological analyses of the human tongue musculature for three-dimensional modeling. Journal of Speech, Language, and Hearing Research, 44, 95–107.
  • Trulsson, M., & Essick, G. K. (1997). Low-threshold mechano-receptive afferents in the human lingual nerve. Journal of Neurophysiology, 77, 737–748.
  • Tuller, B., Kelso, J. S., & Harris, K. S. (1982). Interarticulator phasing as an index of temporal regularity in speech. Journal of Experimental Psychology: Human Perception and Performance, 8(3), 460–472.
  • Waltl, S., & Hoole, P. (2008). An EMG study of the German vowel system. Proceedings of the 8th International Seminar on Speech Production (pp. 445–448). Strasbourg, France.
  • Weirich, M., & Fuchs, S. (2013). Palatal morphology can influence speaker-specific realizations of phonemic contrasts. Journal of Speech, Language, and Hearing Research, 56, S1894–S1908.
  • Wozniak, W., & Young, P. A. (1969). Further observations on human hypoglossal nerve. Anatomischer Anzeiger, 125, 203–205.
  • Wu, X., Dang, J., & Stavness, I. (2014). Iterative method to estimate muscle activation with a physiological articulatory model. Acoustical Science and Technology, 35(4), 201–212.