- Matthew B. Winn, University of Minnesota
- and Peggy B. Nelson, University of Minnesota
Cochlear implants (CIs) are the most successful sensory implant in history, restoring the sensation of sound to thousands of persons who have severe to profound hearing loss. Implants do not recreate acoustic sound as most of us know it, but they instead convey a rough representation of the temporal envelope of signals. This sparse signal, derived from the envelopes of narrowband frequency filters, is sufficient for enabling speech understanding in quiet environments for those who lose hearing as adults and is enough for most children to develop spoken language skills. The variability between users is huge, however, and is only partially understood.
CIs provide acoustic information that is sufficient for the recognition of some aspects of spoken language, especially information that can be conveyed by temporal patterns, such as syllable timing, consonant voicing, and manner of articulation. They are insufficient for conveying pitch cues and separating speech from noise.
There is a great need for improving our understanding of functional outcomes of CI use beyond measuring percent correct for word and sentence recognition. Moreover, greater understanding of the variability experienced by children, especially children and families from various social and cultural backgrounds, is of paramount importance. Future developments will no doubt expand the use of this remarkable device.
- Applied Linguistics
Cochlear implants (CIs) provide partial restoration of the sensation of hearing to people who would otherwise experience severe to profound deafness. Since the mid-1980s, thousands of adults with severe to profound hearing loss have received CIs, and since 2000 they have been approved for children as young as 1 year of age (NIDCD, 2020) and are now available for children as young as 9 months of age (Cochlear Corp, 2020).
For both children and adults who are deaf, CIs have been the most successful sensory aid to develop or restore spoken communication abilities. However, the nature of that hearing sensation is so dramatically different than typical hearing that many of the basic principles of auditory perception cannot be generalized to those who use CIs. Here we introduce the essential terminology, patterns of performance, and assumptions related to CIs, which will lay the groundwork for informed discussion and interpretation of the literature. The role of CIs in a larger scientific and cultural context has evolved in the past three decades, with meaningful lessons not only about the science of hearing and language but also about how communication is a part of a person’s identity.
2. What a Cochlear Implant Is
Although CIs might look similar to hearing aids, they are much different. Specifically, CIs do not amplify sound and send sound into the ear canal the way a hearing aid does. Instead, CIs directly stimulate the auditory nerve electrically, in a way that is designed to crudely represent the neural activation that might have resulted from typical hearing. The CI is not a new physical cochlea (like a limb prosthesis) but a device that mimics the function of the cochlea and exploits its existing structure. It is a significantly more invasive and risky treatment for hearing loss compared to hearing aids, and thus CI candidates try hearing aids as a first approach before committing to the long and effortful process of using a CI.
A CI consists of a microphone and signal-processing hardware that rest on the outside of the head (usually on the ear) and a small array of physical electrodes that are inserted into the cochlea. The microphone and processor capture the external sound and divide it up into bands of different frequencies. The amplitude of each frequency component is tracked and converted into instructions for the electrodes that stimulate the place in the cochlea that is appropriate for the frequency of the original sound (see Figure 1). In the top panel, the full waveform of a syllable is shown. For simplified illustration, it is filtered into six frequency bands between 100 and 8000 Hz, shown in the second panel. The envelope of each band’s waveform is extracted (shown in the third panel). Finally, a sequence of electrical pulses is generated whose amplitude follows the extracted envelope in each band (lower panel). These pulse trains are delivered through the six corresponding implant electrodes.
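The last two steps of that chain (envelope extraction and pulse generation) can be sketched for a single band as follows. This is a minimal illustration and not any manufacturer's algorithm: the rectify-and-smooth envelope detector, the 200 Hz smoothing cutoff, the 1000 pulses-per-second rate, and the function names are all assumptions made for the example, and the input is assumed to be already band-limited.

```python
import math

def extract_envelope(band_signal, sample_rate, cutoff_hz=200.0):
    """Full-wave rectify the band signal, then smooth it with a one-pole low-pass."""
    # Smoothing coefficient for the assumed cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    envelope, state = [], 0.0
    for sample in band_signal:
        state += alpha * (abs(sample) - state)
        envelope.append(state)
    return envelope

def envelope_to_pulses(envelope, sample_rate, pulse_rate_hz=1000.0):
    """Sample the envelope at a fixed pulse rate; returns (time_s, amplitude) pairs."""
    n_pulses = int(len(envelope) * pulse_rate_hz / sample_rate)
    pulses = []
    for k in range(n_pulses):
        index = int(k * sample_rate / pulse_rate_hz)
        pulses.append((index / sample_rate, envelope[index]))
    return pulses

sample_rate = 16000
# Hypothetical band-limited input: a 1000 Hz tone lasting 50 ms, then 50 ms of silence.
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) if n < 800 else 0.0
        for n in range(1600)]
envelope = extract_envelope(tone, sample_rate)
pulses = envelope_to_pulses(envelope, sample_rate)
```

The pulse amplitudes rise while the tone is present and decay after it stops, so the pulse train carries the timing and intensity of the band but none of its fine waveform detail.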
Instructions to activate electrical pulse trains on each electrode are sent by radio communication from the external device to the internal device. There is usually some recovery time after the surgical implantation of the internal device, and there is almost always a period of “relearning” how to interpret sound after the device is initially activated. CI activations in infants have been popularized for their emotional appeal, but they represent a small minority of the reality of CIs. Initial activation can be very surprising and disorienting, and it is not at all like the moment of immediate clarity from the first time putting on a pair of well-fit eyeglasses.
3. Brief Synopsis of Speech Perception With a Cochlear Implant
Understanding communication with CIs requires familiarity with at least three different main topics: (a) sources of variability, (b) the physical auditory signal in a CI, and (c) the compensation mechanisms that listeners use to make sense of that signal despite its limitations. The following sections introduce the most essential pieces of knowledge, with more detail (and citations) in the paragraphs that follow.
3.1 Speech Perception Among CI Recipients is Variable Across Individuals
Although the signal processing can be broken down and analyzed in detail, the user’s success depends heavily on cognitive and linguistic processing, since the actual auditory signal is sparse and heavily degraded. The factor with arguably the greatest impact on outcome measures is the timing of the individual’s hearing loss relative to their acquisition of spoken language. Those who acquire speech and language in a typical fashion and then lose hearing later in life tend to have much better outcomes than those who have hearing impairment before acquiring spoken language. This trend underscores the importance of linguistic structure to perception. It could also indicate that hearing loss present at birth or arising in early childhood has different consequences than hearing loss acquired later in life, particularly if a young child has limited access to spoken language input prior to receiving the implant.
3.2 Limitations in Sound Quality in CIs With Direct Impact on Ability to Interpret Speech
The most common limitations are degraded frequency representation, poor pitch representation, and limited dynamic range. Sound duration is transmitted relatively well. As a result of these combined factors, there are general trends for poor perception of consonant place of articulation, relatively better perception of voicing, and complicated/unreliable perception of manner of articulation. Perception of vowel quality is poor, although perception of vowel duration is intact. For numerous phonetic contrasts, listeners with CIs can recover a phonetic feature using a different acoustic cue than the cue used by those with typical hearing (DiNino et al., 2020; Moberly et al., 2014; Winn & Litovsky, 2015; Winn et al., 2012, 2013). Nittrouer et al. (2014) and Nittrouer and Lowenstein (2015) addressed the same question by systematically manipulating phonemic contrasts based on frequency cues (place distinction), duration cues (voicing distinction), and amplitude cues (manner distinction). Therefore, perception of a phonetic feature cannot be assumed to be evidence for perception of any particular acoustic property. This is especially true for perceiving sentence-length utterances, where CI listeners rely disproportionately on semantic context or expectations.
3.3 Visual Cues are Especially Important Among People Who Use CIs
Many CI recipients can engage in face-to-face communication but fail to communicate effectively over the phone or with a person who is not facing them directly. This is yet another factor that underscores the importance of nonauditory (cognitive, visual, linguistic, etc.) factors that supplement the distorted auditory signal provided by the implant.
3.4 CI Users Perceive Speech as if it Contains About Eight Frequency Channels
This notion arises from two main pieces of evidence. First, activation of a progressively larger number of electrodes in the CI results in systematically higher performance, but improvements tend to taper off and plateau near eight electrodes (Friesen et al., 2001). Second, in studies that use crude simulation of CI processing with acoustic signals presented to people with typical hearing, speech recognition performance in conditions with eight frequency channels is usually a good match to the performance of better-performing CI listeners in the same task (who would be presented with regular unfiltered speech). Having only eight or fewer channels of information (compared to 24 or more auditory filters in a normal cochlea) results in rough approximations of temporal and amplitude information (within the limits of dynamic range compression), but the spectral structure of the original signal is highly smeared. Although this gross simplification of the signal might intuitively imply that many of the rich details of phonetic structure should be lost and inaccessible, Shannon et al. (1995) demonstrated that basic speech recognition in people with normal hearing can be achieved when the temporal information is conveyed with just a small number of channels.
3.5 Hearing Speech in Noise is the Most Common Difficulty for CI Listeners
Signal processing that discards pitch cues, together with physical constraints that distort the frequency spectrum, results in a general inability to separate speech from background noise, leading to extremely poor performance compared to listeners with typical hearing in the same conditions. Additionally, sound localization (a key part of separating speech from noise) is notoriously poor in CI listeners, even among those who use a device on both ears.
3.6 Speech Communication is Effortful for CI Listeners
The continual need to infer details from a noisy, degraded, or sparse signal requires active cognitive control. Speech processing can be delayed as a person ponders whether they heard one word or another, and this process can jeopardize ongoing attention to the next utterance (Winn & Moore, 2018). Sometimes the effortfulness of communication can result in CI recipients simply withdrawing from social communication altogether, to avoid mental fatigue or the stress of an embarrassing misunderstanding (Hughes et al., 2018). In light of these realities, speech communication with a CI should not be understood simply as normal communication filtered through a CI speech processor.
4. The Structure and Sound of a Cochlear Implant
Sound processing in a CI is based on the concept of the vocoder (Dudley, 1939), which predates the CI by several decades. Like the vocoder, the CI speech processor divides the signal into a small number of frequency bands that span the range that is most important for understanding speech—generally between about 150 and 8000 Hz. Each frequency band corresponds to one of the CI electrodes inside the cochlea. For example, a low-frequency sound of 150 Hz should lead to activation of the electrode closest to the apical end of the cochlea and a high-frequency sound of 6000 Hz should lead to activation of an electrode closer to the basal end of the cochlea. Leveraging the tonotopic arrangement of the cochlea, activation of those electrodes reflects moment-to-moment changes in the intensity of their corresponding frequency bands, as first articulated by Wilson et al. (1993). It is as if a musician is sitting at a piano, listening to a melody and playing that melody back using the piano keys. She translates the perception of each pitch to the specific part of the piano that corresponds to the right note. In the same way, the cochlear implant analyzes a sound to identify its frequency components and then attempts to activate the parts of the cochlea that correspond to those frequencies.
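The frequency-to-electrode assignment described above can be sketched as a simple lookup. The logarithmic spacing of band edges, the count of 22 electrodes, the 150 to 8000 Hz range, and the function names are assumptions chosen for illustration; actual devices use manufacturer-specific frequency allocation tables.

```python
def band_edges(low_hz=150.0, high_hz=8000.0, n_electrodes=22):
    """Log-spaced analysis band edges (assumed spacing); band 0 is the most apical."""
    ratio = high_hz / low_hz
    return [low_hz * ratio ** (i / n_electrodes) for i in range(n_electrodes + 1)]

def electrode_for_frequency(freq_hz, edges):
    """Index of the electrode whose band contains freq_hz, or None if out of range."""
    if freq_hz < edges[0] or freq_hz > edges[-1]:
        return None
    for index in range(len(edges) - 1):
        if freq_hz < edges[index + 1]:
            return index
    return len(edges) - 2  # freq_hz sits exactly on the top edge

edges = band_edges()
apical = electrode_for_frequency(150.0, edges)   # lowest band, apical end
basal = electrode_for_frequency(6000.0, edges)   # a band much closer to the basal end
```

With these assumed settings, 150 Hz falls in the first (most apical) band and 6000 Hz falls in a band near the basal end of the array, mirroring the tonotopic layout the processor is trying to exploit.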
Although it might seem straightforward to represent sound frequencies with electrodes at specific cochlear positions, there are a large number of engineering and biological factors that pose significant challenges to this goal, and which degrade the sound quality in a CI. First, the discretization of frequency bands means that fine frequency distinctions are lost. For example, if one frequency band spans from 1000 Hz to 1200 Hz, then any distinctions within that band are lost. It is akin to an image being downsampled and “pixelated.” So the musician at the piano can only play 22 keys instead of 88; some clarity and richness of detail will be lost. Adding more electrodes does not confer additional benefit, because the activation of each electrode spreads across the cochlea, creating a cluttered and smeared spectrum. It is as though our musician were not playing carefully with precise finger movements but instead pressing her whole hand (or arm) down on the keyboard, playing many notes at once, that are simply in the general range of the note she is aiming for. Because of this smearing, a device with 22 electrodes is typically programmed to only activate eight at any point in time. Technology to sharpen the frequency representation in a CI has been in development for decades but has not yet been widely implemented in actual devices worn by patients.
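One way to restrict stimulation to a subset of electrodes is to pick, in each short analysis frame, the channels with the largest envelope levels and leave the others silent. The sketch below assumes that selection rule (often called peak picking) purely for illustration; the text above does not specify how the eight electrodes are chosen, and the frame values and function name are hypothetical.

```python
def select_active_electrodes(envelopes, n_active=8):
    """Keep the n_active largest channel envelopes and zero the rest."""
    ranked = sorted(range(len(envelopes)), key=lambda i: envelopes[i], reverse=True)
    keep = set(ranked[:n_active])
    return [value if i in keep else 0.0 for i, value in enumerate(envelopes)]

# Hypothetical envelope levels for one analysis frame across 22 channels.
frame = [0.05, 0.10, 0.40, 0.90, 0.70, 0.30, 0.20, 0.15, 0.60, 0.80, 0.50,
         0.12, 0.08, 0.06, 0.25, 0.35, 0.45, 0.11, 0.07, 0.04, 0.03, 0.02]
stimulated = select_active_electrodes(frame)
active_indices = [i for i, value in enumerate(stimulated) if value > 0.0]
```

Only eight of the 22 channels carry stimulation in this frame, which reduces overlap between neighboring electrodes at the cost of dropping the weaker frequency regions entirely.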
Because of limitations imposed by the size and shape of the cochlea, the CI internal component typically cannot be inserted all the way to the apex, where the low-frequency information is normally represented, meaning that the lower-frequency regions of the cochlea remain unstimulated (Canfarotta et al., 2020; Landsberger et al., 2015). As a consequence, most of the frequency energy is shifted upward (and upon first activation of the device, many report that all voices sound like cartoon characters). This is as if the musician at the piano accidentally shifted her whole body to the right, playing every note higher than expected.
4.1 Loudness Perception Can Be Crude and Unsuccessful When Using a CI
A healthy-functioning cochlea has elegant compressive mechanical properties that gracefully scale a large range of input sound intensities onto a tiny and delicate sensory receptor. The cochlear implant bypasses that elegant mechanism and instead provides rather crude electrical stimulation that can kick the system into overdrive. The sensation resulting from electrical stimulation is expansive rather than compressive, meaning that loudness can rapidly grow to uncomfortable levels unless very tightly constrained (Zeng et al., 2002). Very small changes in input might correspond to a complete change from barely audible to way-too-loud. Our musician now has a limited range of dynamics, without much control over whether the notes are played very softly or loud enough to wake up the neighbors.
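The need to constrain electrical loudness tightly can be illustrated by mapping a wide range of acoustic levels onto a narrow range of stimulation levels bounded by a threshold level (T) and a comfort level (C). This is a rough sketch: the linear-in-decibels mapping and every number below are hypothetical, and in practice such levels are set individually for each electrode.

```python
def acoustic_to_electric(level_db, t_level=100.0, c_level=180.0,
                         floor_db=25.0, ceiling_db=65.0):
    """Map an acoustic level in dB onto a stimulation level between T and C.

    All default values are hypothetical and in arbitrary units.
    """
    # Clamp the input so that no sound can drive stimulation above the comfort level.
    clamped = min(max(level_db, floor_db), ceiling_db)
    fraction = (clamped - floor_db) / (ceiling_db - floor_db)
    return t_level + fraction * (c_level - t_level)

quiet = acoustic_to_electric(25.0)      # at the floor: threshold level
moderate = acoustic_to_electric(45.0)   # midway between floor and ceiling
very_loud = acoustic_to_electric(90.0)  # above the ceiling: held at comfort level
```

Any input above the assumed ceiling is held at the comfort level, which protects the listener but also erases intensity differences among louder sounds.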
In addition to the constraints of the frequency processing, physical space, and the surgical placement of the CI, the auditory system of a CI recipient is more likely to have suffered atrophy, either by prolonged periods of deafness or by physical or biological trauma. So even if the CI technology were improved to be perfect, it would be delivering stimulation to a damaged system. Thus, even if our musician were an experienced professional performer, the music will still suffer if the piano is out of tune or missing strings.
Considering all the factors described, it might be assumed that the sound from a cochlear implant would be too severely degraded to faithfully transmit sound details that are needed to understand speech. And, in fact, at the introduction of CI technology into the world of otolaryngology, that was precisely the dominant line of thought. However, holders of such opinions failed to recognize just how powerful the brain can be in transforming a distorted/smeared/shifted/mistuned/dynamically volatile signal into a sensible perception. Clearly, CI users manage to recover the intended spoken message even with a degraded internal representation, leveraging the power of language to constrain expectations and make inferences that support real-world communication.
The history of speech signal processing in CIs has inadvertently taught valuable lessons about how speech perception works. Some of the most intriguing findings were summarized by Studdert-Kennedy (1983) illustrating the challenges faced when attempting to “simplify” speech into its basic acoustic components. Some early versions of implant software attempted to first extract components of the external speech stimulus that were presumed to be key features of speech. Fundamental frequency (F0) and formant peak information (F1/F2) were identified and assigned peaks in pulsatile patterns. The philosophy was that the implant stimulation was so crude and distorted that explicit encoding of key features should make the perception process easier. However, attempts to identify and replicate specific phonetic features of speech have not resulted in superior speech recognition performance. Instead, a phonetically neutral/agnostic approach to conveying the envelope of intensity changes has proven to lead to much greater success. Such an approach is encapsulated in the Continuous Interleaved Sampling (CIS) processing strategy, which transmits the envelope changes in each channel without regard to whether those changes correspond to particular phonetic features. Greater success with this approach has demonstrated that the human brain is much better at extracting meaningful informative signals from a continuous stream of information, compared to explicit attempts by a computer.
5. Patterns of Perception: What Is Heard, What Is Not Heard, What Is Mistaken
The most important observation to make about speech recognition performance with CIs is that it is extremely variable (Gifford & Dorman, 2018). Perception of individual words can be just 5% in one person but 95% in another person, even if those people appear to have a similar audiological history. Performance for whole sentences is typically better, because the syntactic and semantic coherence of a sentence can support inferences about what was spoken. In fact, semantic context appears to be used heavily by CI listeners, underscoring the reliance on top-down cognitive processing when speech input is degraded (e.g., O’Neill et al., 2019).
Perception of individual consonants is relatively less affected by linguistic constraints, and thus is more straightforwardly predictable based on what is known about how the sound is coded. Consonant place of articulation is contrasted mainly by frequency differences (such as formant transitions or spectral peaks in a fricative). Since a CI distorts frequency information so severely, place of articulation is typically the consonant feature that is perceived with the least accuracy. Conversely, the voicing feature is perceived much more reliably, owing to the fact that segment duration plays a large role in voicing contrasts (Winn et al., 2012); consider the duration of the vowel in word pairs like rope/robe, or the duration of consonant cognates such as s [long] and z [short]. It is possible that the robustness of voicing is constrained to English, where duration-based voicing cues tend to be exaggerated relative to other languages. Perception of manner of articulation (the degree of airflow constriction during the consonant) is more complex (Rødvik et al., 2018), as it suffers from the constraints of limited dynamic range. The disparities between slightly different degrees of vocal tract constriction could result in differences in sound amplitude envelopes that are too small to be discriminable by a CI listener, yet there are typically other probabilistic constraints on manner of articulation that render it somewhere in between voicing and place of articulation in terms of difficulty.
Vowel perception with a cochlear implant is notoriously poor (Harnsberger et al., 2001). As vowels typically are contrastive mainly by their spectral properties (e.g., formants) without very informative cues in the amplitude envelope, the listener must rely on the exact auditory domain that is most heavily distorted by the CI. In that way, vowel perception is somewhat akin to perception of consonant place of articulation. However, as the spectral composition of vowels is neatly described even within the constrained frequency-channel analysis in a CI, there are some CI recipients who perceive vowels successfully.
For CI listeners, pitch perception is extremely poor and almost completely absent (Moore & Carlyon, 2005; Zeng, 2002). Pitch perception in people with typical hearing involves a combination of at least three mechanisms: (a) harmonic pitch, driven by the encoding of specific sound components at equally spaced interval frequencies; (b) rate pitch, or the rate of repetition of waveform peaks; and (c) spectral pitch, or the placement of activation in the cochlea. Cochlear implant users’ pitch perception has revealed some complexities of pitch perception that were not fully understood from studies of acoustic hearing alone. Specifically, although harmonic pitch might be intuitively understood as an example of the place code, the poor pitch perception in CI users who have a place code but not harmonicity demonstrates that these are separate concepts. Harmonic pitch is completely absent from a CI because there are no harmonics represented in the cochlear stimulation; the sites of electrical stimulation are simply too broad to form the compact and precise peaks needed for harmonics. As resolved harmonics lie at the heart of a human’s ability to finely distinguish F0 differences on the order of 0.2%, this is a significant loss (Mehta & Oxenham, 2017). Perception of pitch based on rate is much less sensitive, requiring roughly 6% change to be noticeable. Although pitch information can theoretically be encoded by the rate of stimulation in a CI, the real implementation is severely limited in many ways. First, the rate of stimulation in a CI is almost always fixed at a constant rate, and therefore unrelated to the F0 of the sound being heard. F0 can therefore only be encoded by sampling the amplitude modulations that correspond to rate pitch. However, the range of amplitude modulation resulting from voice pitch can be rather shallow and also might be undersampled by the device, such that it might not be detectable. 
Finally, even when experimenters override the internal workings of the device to specifically represent perfectly maximal amplitude modulation at the exact F0 rate of the input speech, there are psychophysical limitations on how high that rate can go before it is simply perceived as an undifferentiated continuous buzz. Unfortunately, that limiting rate—estimated to be around 300 Hz—hovers near the middle-upper end of the F0 range for adult women, potentially jeopardizing the ability of a CI listener to detect pitch of everyday conversation partners.
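The gap between the two discrimination thresholds quoted above (roughly 0.2% with resolved harmonics versus roughly 6% with rate pitch) is easier to appreciate as worked arithmetic. The 200 Hz voice F0 below is a hypothetical example value, and the function names are illustrative.

```python
import math

def smallest_detectable_change_hz(f0_hz, fraction):
    """Smallest F0 change (in Hz) that is noticeable at a given fractional threshold."""
    return f0_hz * fraction

def fraction_to_semitones(fraction):
    """Express a fractional frequency change as a musical interval in semitones."""
    return 12.0 * math.log2(1.0 + fraction)

f0 = 200.0  # hypothetical voice F0 in Hz
harmonic_step_hz = smallest_detectable_change_hz(f0, 0.002)  # about 0.4 Hz
rate_step_hz = smallest_detectable_change_hz(f0, 0.06)       # about 12 Hz
rate_step_semitones = fraction_to_semitones(0.06)            # about 1 semitone
```

A 6% threshold corresponds to roughly one musical semitone, so pitch movements smaller than about a semitone would be expected to go unnoticed when only rate pitch is available.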
As a consequence of poor pitch perception, CI listeners generally are thought to struggle with talker identification (Fuller et al., 2014), prosody and lexical tone (Chatterjee & Peng, 2008), and perceiving speech in noise (since F0 is a very powerful cue to separate one talker from background noise). Perceiving speech in noise is also weakened by the fact that the ability to localize sound is extremely poor in CI listeners (Jones et al., 2014), owing to the lack of synchrony across devices (Kan & Litovsky, 2014). There is thus little to no representation of the binaural cues that are essential for detecting the position of a sound in space. Pitch perception and speech in noise are commonly thought to be the most prominent aspirational goals of the next generation of CI technology.
Because the key element to successful CI signal processing is the representation of changes in the amplitude envelope, many realistic listening scenarios such as background noise and reverberation are known to severely disrupt speech understanding through a cochlear implant (e.g., Nelson & Jin, 2004; Oxenham & Kreft, 2014). The problem is that background noise fills in and masks the gaps and variations in intensity that would normally signal linguistically meaningful changes in the signal. Even what might be presumed to be a very favorable listening condition for persons with typical hearing (such as a +10 dB speech-to-noise ratio) can be very problematic for users of cochlear implants, and reverberation levels that are even as minor as a half-second or less (e.g., Zheng et al., 2011) can cause significant problems. Because of the implant’s poor spectral resolution and the absence of fine temporal and spectral cues, segregating speech from any kind of noise is a very challenging task.
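For readers unfamiliar with how a speech-to-noise ratio such as +10 dB is set up, the sketch below scales a noise signal so that the ratio of speech RMS to noise RMS matches a target value before mixing. The sine-wave stand-in for speech, the Gaussian noise, and the function names are assumptions for illustration only.

```python
import math
import random

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech RMS / noise RMS equals snr_db, then add the two."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise

rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * n / 16000) for n in range(16000)]  # stand-in for speech
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=10.0)
achieved_db = 20.0 * math.log10(rms(speech) / rms(scaled_noise))
```

At +10 dB the speech carries ten times the power of the noise, yet the noise is still strong enough to partially fill the dips in the amplitude envelope on which CI processing depends.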
After the initial activation of a CI, the patient typically goes through a prolonged period of gradual improvement while acclimating to the dramatically new sound quality. The patient essentially has to remap the world’s sounds to new perceived sound qualities. During the first 3 to 12 months, performance on basic speech recognition tasks typically improves steadily and finally reaches a plateau. That plateau can be highly variable across individuals and is the consequence of dozens of factors (Blamey et al., 2013; Lazard et al., 2012). Some reach a stage where their auditory abilities meet their basic communication needs, but many must make compromises, such as avoiding telephones, requiring face-to-face visual cues with good lighting, avoiding crowded and noisy situations, and so forth.
6. Replicating the Sound of a Cochlear Implant
Simulation of cochlear implant processing is done using a vocoder, which breaks down the speech sound into a small number of discrete spectral bands and represents a simplified form of those bands that preserves the changes in intensity but discards the frequency details. Vocoders can vary by their frequency range, number of spectral channels, type of carrier, or degree of temporal precision. Despite these meaningful differences that should always be specified, the term “vocoder” is sometimes used as shorthand for processing that analyzes a frequency range from ~100 Hz to an upper edge of 5000 to 8000 Hz, from which eight channels are extracted and represented by noise. The reason this style of vocoder has prevailed is that performance by typical-hearing listeners on speech processed by an eight-channel noise vocoder is usually a very good match to the performance by better-performing CI listeners in the same task. However, the exact number of spectral channels needed depends on the type of speech that is being heard and what is being asked of the listener.
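A bare-bones eight-channel noise vocoder along these lines can be sketched in plain Python. This is an illustrative approximation rather than a research-grade implementation: the second-order band-pass filters, the logarithmic band spacing, the 50 Hz envelope cutoff, the 100 to 8000 Hz range, and all function names are assumptions, and published vocoders typically use steeper filters and specify each of these parameters explicitly.

```python
import math
import random

def bandpass(signal, sample_rate, low_hz, high_hz):
    """Second-order band-pass biquad centered on the band's geometric mean."""
    center = math.sqrt(low_hz * high_hz)
    q = center / (high_hz - low_hz)
    w0 = 2.0 * math.pi * center / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the b1 coefficient is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x1, x2, y1, y2 = x, x1, y, y1
    return out

def envelope(signal, sample_rate, cutoff_hz=50.0):
    """Rectify and smooth with a one-pole low-pass to track intensity changes."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in signal:
        state += alpha * (abs(x) - state)
        out.append(state)
    return out

def noise_vocode(signal, sample_rate, n_channels=8, low_hz=100.0, high_hz=8000.0, seed=0):
    """Replace each band of the input with band-limited noise shaped by that band's envelope."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in signal]
    ratio = high_hz / low_hz
    edges = [low_hz * ratio ** (i / n_channels) for i in range(n_channels + 1)]
    output = [0.0] * len(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_envelope = envelope(bandpass(signal, sample_rate, lo, hi), sample_rate)
        carrier = bandpass(noise, sample_rate, lo, hi)
        for i in range(len(signal)):
            output[i] += band_envelope[i] * carrier[i]
    return output

sample_rate = 16000
# Hypothetical input: 100 ms of a 440 Hz tone standing in for a speech recording.
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
vocoded = noise_vocode(tone, sample_rate)
```

The output keeps the intensity pattern of each band but replaces its fine frequency detail with noise, which is the property that makes vocoded speech a useful, if crude, stand-in for CI listening.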
7. Outcome Measures
New research is looking beyond spoken language recognition as the primary measure of cochlear implant success. Measures of listening effort (Winn & Moore, 2018) and functional use of hearing (e.g., Kronenberger et al., 2020) may be able to shed light on the realistic benefits and shortcomings of cochlear implants for interpersonal communication and other functional abilities. Laboratory measures of speech recognition can be of little use when individuals withdraw from spoken communication. Measures of social isolation and interaction, and the impact of cochlear implants on quality of life, are emerging as important indicators of cochlear implant success.
O’Neill (2020) showed that intelligibility scores reported in percent correct (keywords) might not indicate the conversational difficulty experienced by a cochlear implant user. Listeners repeated sentences and were scored for their percent words correct; at the same time, they were asked to rate how difficult that listening situation had felt. Listeners with normal hearing sensitivity rated listening as “somewhat difficult” or worse when their scores fell below about 90% correct. For CI users, though, some who scored quite poorly (18% correct) classified the task as only “somewhat difficult” while others who did well (75%) rated the task as “difficult.” Perceived self-efficacy and effort might therefore reflect success relative to expectations, where “18% is better than nothing” and “75% is not as good as I want” are both realistic and common feelings. The relationship between intelligibility and perceived difficulty was very weak, indicating that there is more to realistic communication than simply getting the words correct. Structured interviews with numerous CI recipients by Hughes et al. (2018) showed in vivid detail how individuals gauge the effort, anxiety, and investment needed to socially communicate, and make decisions based on effortfulness. Notably, there was little to no mention of word-recognition accuracy in these patient reports. The CI recipients in that study further went on to describe the feelings of isolation and invisibility involved in spoken communication, underscoring how psychological factors can completely overpower any contribution of the technology itself.
8. Cochlear Implants and Language Development
The primary goal of a cochlear implant for a child who is born deaf is to facilitate communication and language development. But of course not everyone who is deaf chooses to communicate using spoken language. Disagreements still exist between those who advocate early full-time use of sign language to ensure language development during a critical period (e.g., Humphries et al., 2012) and those who emphasize auditory development through the use of cochlear implants and spoken language. Some of those disagreements go far past the scope of this report because they deal with the child’s agency to choose participation in a particular culture or to choose their own path. As most children with congenital deafness are born to parents who have typical hearing, there is no simple and obvious path for socialization and cultural inclusion.
Cochlear implants have had a very positive effect on the language and academic development of children who are deaf (e.g., Geers et al., 2017). Nonetheless, they have not provided full access to spoken language. Evidence from numerous studies is converging on the general finding that for most language measures children with CIs achieve an ability level of about one standard deviation below their peers with normal hearing sensitivity (see, for example, van Wieringen & Wouters, 2015). As might be predicted by the sparse acoustic representation of speech through a CI, language skills requiring access to complex phonological structure are more challenging for children using a CI than are other language skills such as lexical or syntactic abilities (e.g., Nittrouer et al., 2018). Numerous studies have shown that literacy skills for children with severe to profound hearing loss still lag those of their peers, even if they do get a CI (Harris et al., 2017). In general, cochlear implants have not resulted in dramatic widespread academic performance increases for children with profound hearing loss (Marschark et al., 2015).
The debate about whether to use sign language with children who are deaf remains lively. Several retrospective studies of children with and without early sign language exposure suggest that those children without exposure to sign language tend to score better on measures of spoken language than those who were allowed access to sign (e.g., Geers et al., 2017). However, standardized measures of spoken language should not be considered the only measure of communication and interaction. For a child who is deaf, the fundamental need is language access; that is, the child must be able to perceptually receive and cognitively process the auditory and visual signals in the communication setting (e.g., Hall, 2020). Children also use language to express and receive emotions and to establish trust and other social connections, abilities that are not necessarily measured by standardized language tests but which are potentially aided by nonauditory forms of communication. Longitudinal studies of language abilities in school-age children support the idea that earlier implantation leads to better outcomes on average, but there is a very wide range of variability, such that outcomes cannot be predicted solely based on the time of implantation (or any other singular factor). Despite the uncertainty of prognosis, some practices have become standard, such as the tendency to evaluate a child’s abilities based on hearing age rather than on chronological age.
9. Social Impacts and Global Outcome Measures
To understand the social impact of cochlear implants, one must appreciate the difference between deaf (describing someone who does not have typical hearing ability) and Deaf (describing a member of a fully realized community of people whose communication style is central to their identity). In the early years of cochlear implants, the response of the Deaf community was to reject them in favor of sign language (see National Association of the Deaf [NAD] position statement from 1999). Cochlear implants and the emphasis on spoken language were perceived as a threat to Deaf culture and to American Sign Language. Additionally, CIs could be described as a “treatment” for something that was not believed to be a problem. In the decades that have passed since the NAD position statement, some antagonism toward CIs has remained in the Deaf community, but their use is substantially less controversial. In recent years, the NAD has recognized that “deafness is diverse, especially in the choices that deaf adults and parents of deaf children continue to make about the range of communication and assistive technology options.” The amount and quality of the language input matter greatly, and variation in language input may be one of the important factors predicting communication success (e.g., Holt et al., 2020). It would be fair to say that CIs are generally not controversial among people who grew up using spoken language, because the implant would restore an ability that was lost. But the Deaf community primarily includes people who never used spoken language, and for whom a CI may fundamentally change rather than restore a part of their identity.
At the Laurent Clerc National Deaf Education Center at Gallaudet University, cochlear implants are now a part of the conversation, where in the past they may have been rejected:
There is no single language and communication approach appropriate for the diverse children who are using cochlear implant technology. Effective language and communication strategies following implantation should reflect the complex interaction of characteristics specific to each child and family. (Gallaudet Clerc Center FAQ)
10. Conclusions and Future Directions
Cochlear implants can substantially improve quality of life among adults who have lost their hearing, and they can also support near-normal language development in children with congenital hearing loss. The perception of speech is difficult because of various distortions (especially to the spectral peaks and pitch-related cues). Any speech sounds that contrast mainly by frequency (e.g., place of articulation contrasts) will be especially difficult. However, perception of speech with a CI is not simply a less accurate version of speech perception with typical hearing. CI recipients rely heavily on contextual cues, visual cues, and other compensation strategies in situations where they opt in to spoken communication. On many occasions, they will also be economical with their effort, choosing instead to abandon spoken interaction because it is too draining. CIs were once considered highly controversial but are now much more widely accepted.
There are a number of recent and ongoing developments in CI technology to anticipate in the coming years. There are efforts to address the major shortcomings of current devices by partially restoring access to cues for voice pitch and by improving the specificity of the electrical stimulation. Additionally, the candidacy criteria for CIs have gradually become less stringent over the years, so that one need not have complete deafness to be considered for implantation. Many individuals have substantial residual hearing, inviting the use of a hearing aid in addition to the CI. Fortunately, the hearing aid is complementary to the CI in that it transmits low-frequency voice pitch information that complements the high-frequency phonetic information delivered by the implant. In recent years, a large number of implanted patients have even been able to use acoustic hearing in the same ear that is implanted, and they can take advantage of so-called hybrid devices. Bilateral implantation has become the standard of care in many places.
There are more invasive treatment approaches that share a lineage with the cochlear implant. The auditory brainstem implant bypasses the cochlea altogether and directly stimulates the cochlear nucleus; this option is suitable for patients who lack a cochlear structure or lack auditory nerve integrity.
The current frontier of CI innovation is the expansion of candidacy, opening up the doors for adults and children to receive a CI in cases where they would have been denied in past years. Those who retain usable hearing in one or both ears are now routinely considered to have potential benefit from a CI. This even includes individuals who have normal hearing in the ear opposite to the implanted ear, giving rise to a small population of so-called single-sided deaf implant recipients who integrate an acoustic signal in one ear with an electrical signal in the other ear. CIs are also increasingly becoming recognized for their use as a potential treatment of severe tinnitus (ringing in the ears) and auditory nerve dyssynchrony.
In the span of less than 50 years, the idea to electrically stimulate the auditory nerve has evolved from a radical proposal to a routine medical treatment. This has had a transformational effect on quality of life for those with hearing loss, for children acquiring spoken language, and for scientists who study the auditory system. The patterns of speech and language development, especially speech perception, have also offered valuable lessons to linguists, who can marvel at the robustness and flexibility of human language.
Links to Digital Materials
- Croghan, N. B. H., Duran, S. I., & Smith, Z. M. (2017). Re-examining the relationship between number of cochlear implant channels and maximal speech intelligibility. Journal of the Acoustical Society of America, 142(6), EL537–EL543.
- Niparko, J. K., Tobey, E. A., Thal, D. J., Eisenberg, L. S., Wang, N.-Y., Quittner, A. L., & Fink, N. E. (2010). Spoken language development in children following cochlear implantation. JAMA, 303(15), 1498–1506.
- Vermeire, K., Brokx, J. P. L., Wuyts, F. L., Cochet, E., Hofkens, A., & Van de Heyning, P. H. (2005). Quality-of-life benefit from cochlear implantation in the elderly. Otology & Neurotology, 26(2), 188–195.
- Winn, M., & Teece, K. (2021). Slower speaking rate reduces listening effort among listeners with cochlear implants. Ear and Hearing, 42(3), 584–595.
References
- Blamey, P., Artieres, F., Baskent, D., Bergeron, F., Beynon, A., Burke, E., Dillier, N., Dowell, R. C., Fraysse, B., Gallego, S., Govaerts, B., Green, K., Huber, A., Kleine-Punte, A., Maat, B., Marx, M., Mawman, D., Isabelle, M., O’Connor, A. F., … Lazard, D. S. (2013). Factors affecting auditory performance of postlinguistically deaf adults using cochlear implants: An update with 2251 patients. Audiology & Neurotology, 18(1), 36–47.
- Canfarotta, M. W., Dillon, M. T., Buss, E., Pillsbury, H. C., Brown, K. D., & O’Connell, B. P. (2020). Frequency-to-place mismatch: Characterizing variability and the influence on speech perception outcomes in cochlear implant recipients. Ear and Hearing, 41(5), 1349–1361.
- Chatterjee, M., & Peng, S.-C. (2008). Processing F0 with cochlear implants: Modulation frequency discrimination and speech intonation recognition. Hearing Research, 235(1–2), 143–156.
- Cochlear Corp. (2020). Cochlear implant candidacy criteria.
- DiNino, M., Arenberg, J., Duchen, A., & Winn, M. B. (2020). Effects of age and cochlear implantation on spectrally cued speech categorization. Journal of Speech, Language, and Hearing Research, 63(7), 2425–2440.
- Dudley, H. (1939). The automatic synthesis of speech. Proceedings of the National Academy of Sciences, 25(7), 377–383.
- Friesen, L., Shannon, R., Başkent, D., & Wang, X. (2001). Speech recognition in noise as a function of the number of spectral channels: Comparison of acoustic hearing and cochlear implants. Journal of the Acoustical Society of America, 110(2), 1150–1163.
- Fuller, C., Gaudrain, E., Clarke, J., Galvin, J., Fu, Q.-J., Free, R., & Başkent, D. (2014). Gender categorization is abnormal in cochlear implant users. Journal of the Association for Research in Otolaryngology, 15(6), 1037–1048.
- Geers, A., Mitchell, C. M., Warner-Czyz, A., Wang, N.-Y., & Eisenberg, L. (2017). Early sign language exposure and cochlear implantation benefits. Pediatrics, 140(1), e20163489.
- Gifford, R. H., & Dorman, M. F. (2018). Bimodal hearing or bilateral cochlear implants? Ask the patient. Ear and Hearing, 40(3), 501–516.
- Hall, M. L. (2020). The input matters: Assessing cumulative language access in deaf and hard of hearing individuals and populations. Frontiers in Psychology, 11, 1407.
- Harnsberger, J. D., Svirsky, M. A., Kaiser, A. R., Pisoni, D. B., Wright, R., & Meyer, T. A. (2001). Perceptual “vowel spaces” of cochlear implant users: Implications for the study of auditory adaptation to spectral shift. The Journal of the Acoustical Society of America, 109(5), 2135–2145.
- Harris, M., Terlektsi, E., & Kyle, F. E. (2017). Literacy outcomes for primary school children who are deaf and hard of hearing: A cohort comparison study. Journal of Speech, Language, and Hearing Research, 60(3), 701–711.
- Holt, R. F., Beer, J., Kronenberger, W., Pisoni, D., Lalonde, K., & Mulinaro, L. (2020). Family environment in children with hearing aids and cochlear implants: Associations with spoken language, psychosocial functioning, and cognitive development. Ear and Hearing, 41(4), 762–774.
- Hughes, S., Hutchings, H., Rapport, F., McMahon, C., & Boisvert, I. (2018). Social connectedness and perceived listening effort in adult cochlear implant users: A grounded theory to establish content validity for a new patient-reported outcome measure. Ear and Hearing, 39(5), 922–934.
- Humphries, T., Kushalnagar, P., Mathur, G., Napoli, D. J., Padden, C., Rathmann, C., & Smith, S. R. (2012). Language acquisition for deaf children: Reducing the harms of zero tolerance to the use of alternative approaches. Harm Reduction Journal, 9, 16.
- Jones, H., Kan, A., & Litovsky, R. (2014). Comparing sound localization deficits in bilateral cochlear-implant users and vocoder simulations with normal-hearing listeners. Trends in Hearing, 18, 1–16.
- Kan, A., & Litovsky, R. (2014). Binaural hearing with electrical stimulation. Hearing Research, 322, 127–137.
- Kronenberger, W. G., Bozell, H., Henning, S. C., Montgomery, C. J., Ditmars, A. M., & Pisoni, D. B. (2020). Functional hearing quality in prelingually deaf school-age children and adolescents with cochlear implants. International Journal of Audiology, 60(4), 282–292.
- Landsberger, D. M., Svrakic, M., Roland, J. T., Jr., & Svirsky, M. (2015). The relationship between insertion angles, default frequency allocations, and spiral ganglion place pitch in cochlear implants. Ear and Hearing, 36(5), e207–e213.
- Lazard, D., Vincent, C., Venail, F., Van de Heyning, P., Truy, E., Sterkers, O., Skarzynski, P. H., Skarzynski, H., Schauwers, K., O’Leary, S., Mawman, D., Maat, B., Kleine-Punte, A., Huber, A. M., Green, K., Govaerts, P. J., Fraysse, B., Dowell, R., Dillier, N., … Blamey, P. (2012). Pre-, per-, and postoperative factors affecting performance of postlinguistically deaf adults using cochlear implants: A new conceptual model over time. PLoS One, 7(11), e48739.
- Marschark, M., Shaver, D. M., Nagle, K. M., & Newman, L. A. (2015). Predicting the academic achievement of deaf and hard-of-hearing students from individual, household, communication, and educational factors. Exceptional Children, 81(3), 350–369.
- Mehta, A., & Oxenham, A. (2017). Vocoder simulations explain complex pitch perception limitations experienced by cochlear implant users. Journal of the Association for Research in Otolaryngology, 18(6), 789–802.
- Moberly, A. C., Lowenstein, J. H., Tarr, E., Caldwell-Tarr, A., Welling, D. B., Shahin, A. J., & Nittrouer, S. (2014). Do adults with cochlear implants rely on different acoustic cues for phoneme perception than adults with normal hearing? Journal of Speech, Language, and Hearing Research, 57(2), 566–582.
- Moore, B. C. J., & Carlyon, R. P. (2005). Perception of pitch by people with cochlear hearing loss and by cochlear implant users. In C. J. Plack, A. J. Oxenham, R. R. Fay, & A. N. Popper (Eds.), Pitch: Neural coding and perception (pp. 234–277). New York, NY: Springer.
- Nelson, P. B., & Jin, S.-H. (2004). Factors affecting speech understanding in gated interference: Cochlear implant users and normal-hearing listeners. Journal of the Acoustical Society of America, 115(5), 2286–2294.
- NIDCD. (2020). Cochlear implants.
- Nittrouer, S., Caldwell-Tarr, A., Moberly, A. C., & Lowenstein, J. H. (2014). Perceptual weighting strategies of children with cochlear implants and normal hearing. Journal of Communication Disorders, 52, 111–133.
- Nittrouer, S., & Lowenstein, J. H. (2015). Weighting of acoustic cues to a manner distinction by children with and without hearing loss. Journal of Speech, Language, and Hearing Research, 58(3), 1077–1092.
- Nittrouer, S., Muir, M., Tietgans, K., Moberly, A., & Lowenstein, J. (2018). Development of phonological, lexical, and syntactic abilities in children with cochlear implants across the elementary grades. Journal of Speech, Language, and Hearing Research, 61(10), 2561–2577.
- O’Neill, E. (2020). Understanding factors contributing to variability in outcomes of cochlear-implant users (Doctoral dissertation). University of Minnesota Digital Conservancy.
- O’Neill, E., Kreft, H., & Oxenham, A. (2019). Cognitive factors contribute to speech perception in cochlear-implant users and age-matched normal-hearing listeners under vocoded conditions. Journal of the Acoustical Society of America, 146(1), 195–210.
- Oxenham, A. J., & Kreft, H. A. (2014). Speech perception in tones and noise via cochlear implants reveals influence of spectral resolution on temporal processing. Trends in Hearing, 18, 2331216514553783.
- Rødvik, A. K., von Koss Torkildsen, J., Wie, O. B., Storaker, M. A., & Silvola, J. T. (2018). Consonant and vowel identification in cochlear implant users measured by nonsense words: A systematic review and meta-analysis. Journal of Speech, Language, and Hearing Research, 61(4), 1023–1050.
- Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270(5234), 303–304.
- Studdert-Kennedy, M. (1983). Limits on alternative auditory representations of speech. Annals of the New York Academy of Sciences, 405(1), 33–38.
- van Wieringen, A., & Wouters, J. (2015). What can we expect of normally-developing children implanted at a young age with respect to their auditory, linguistic and cognitive skills? Hearing Research, 322, 171–179.
- Wilson, B. S., Finley, C. C., Lawson, D. T., Wolford, R. D., & Zerbi, M. (1993). Design and evaluation of a continuous interleaved sampling (CIS) processing strategy for multichannel cochlear implants. Journal of Rehabilitation Research and Development, 30(1), 110–116.
- Winn, M. B., Chatterjee, M., & Idsardi, W. J. (2012). The use of acoustic cues for phonetic identification: Effects of spectral degradation and electric hearing. Journal of the Acoustical Society of America, 131(2), 1465–1479.
- Winn, M. B., & Litovsky, R. Y. (2015). Using speech sounds to test functional spectral resolution in listeners with cochlear implants. Journal of the Acoustical Society of America, 137(3), 1430–1442.
- Winn, M. B., & Moore, A. (2018). Pupillometry reveals that context benefit in speech perception can be disrupted by later-occurring sounds, especially in listeners with cochlear implants. Trends in Hearing, 22, 1–22.
- Winn, M. B., Rhone, A. E., Chatterjee, M., & Idsardi, W. J. (2013). The use of auditory and visual context in speech perception by listeners with normal hearing and listeners with cochlear implants. Frontiers in Psychology, 4, 824.
- Zeng, F.-G. (2002). Temporal pitch in electric hearing. Hearing Research, 174(1–2), 101–106.
- Zeng, F.-G., Grant, G., Niparko, J., Galvin, J., Shannon, R., Opie, J., & Segel, P. (2002). Speech dynamic range and its effect on cochlear implant performance. Journal of the Acoustical Society of America, 111(1), 377–386.
- Zheng, Y., Koehnke, J., Besing, J., & Spitzer, J. (2011). Effects of noise and reverberation on virtual sound localization for listeners with bilateral cochlear implants. Ear and Hearing, 32(5), 569–572.