PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, NEUROSCIENCE. (c) Oxford University Press USA, 2020. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 23 February 2020

Neural Population Coding of Natural Sounds in Non-flying Mammals

Summary and Keywords

Understanding the principles by which sensory systems represent natural stimuli is one of the holy grails of neuroscience. In the auditory system, the study of the coding of natural sounds has a particular prominence. Indeed, the relationships between neural responses to simple stimuli (usually pure tone bursts)—often used to characterize auditory neurons—and complex sounds (in particular natural sounds) may be complex. Many different classes of natural sounds have been used to study the auditory system. Sound families that researchers have used to good effect in this endeavor include human speech, species-specific vocalizations, an “acoustic biotope” selected in one way or another, and sets of artificial sounds that mimic important features of natural sounds.

Peripheral and brainstem representations of natural sounds are relatively well understood. The properties of the peripheral auditory system play a dominant role, and further processing occurs mostly within the frequency channels determined by these properties. At the level of the inferior colliculus, the highest brainstem station, representational complexity increases substantially due to the convergence of multiple processing streams. Undoubtedly, the most explored part of the auditory system, in terms of responses to natural sounds, is the primary auditory cortex. In spite of over 50 years of research, there is still no commonly accepted view of the nature of the population code for natural sounds in the auditory cortex. Neurons in the auditory cortex are believed by some to be primarily linear spectro-temporal filters, by others to respond to conjunctions of important sound features, or even to encode perceptual concepts such as “auditory objects.” Whatever the exact mechanism is, many studies consistently report a substantial increase in the variability of the response patterns of cortical neurons to natural sounds. The generation of such variation may be the main contribution of auditory cortex to the coding of natural sounds.

Keywords: Auditory system, natural sounds, auditory cortex, inferior colliculus, electrophysiology, neurons, sensory representations

Which Natural Sounds?

This article centers on the auditory system of nonflying mammals. Coding of natural sounds in birds mostly involves responses to bird songs, with strong emphasis on the bird’s own song. The bat auditory system has been probed mostly with echolocation calls. Both of these fields of study are rich and varied, and they merit their own review.

To discuss the representation of natural sounds, it is first necessary to decide what we mean by this term. As an illustration of the problems associated with this term, consider (1) a call of a blackbird in a field in France, (2) the recording of this same call played from a loudspeaker in a lab in Jerusalem, and (3) the stylized version of the blackbird call played on a flute during a performance of “Le merle noir,” the well-known piece for flute and piano by the French composer Olivier Messiaen. While (1) is obviously a natural sound, (2) is a reproduction of such a sound in an environment in which it is certainly not natural, while (3) is arguably natural but in a very different sense. For neuroscientists, the distinction between (1) and (2) is a simple nuisance, while (3) obviously belongs to a different class of sounds. However, for the lab rat who is habituated to human contact, (1) is an experience she will never undergo, while (2) and (3) may be similarly artificial.

As this simple example suggests, the use of natural sounds to study the auditory system requires a good rationale as to which sounds are to be used. For example, it has been argued (e.g., Horikawa & Suga, 1991; Suga, 1992; Theunissen & Elie, 2014) that auditory systems evolved in order to process biologically important sounds. While this is undoubtedly true, such an argument leaves open the question of which natural sounds are biologically important. Cats probably evolved in the Near East, in association with agricultural settlements (Driscoll et al., 2007), but do very well in modern urbanized environments in spite of the huge difference in the acoustic ecology between the two.

More concretely, researchers argued that species-specific vocalizations are of particular importance for animals and would therefore be a good set of natural sounds for studying their auditory system. Some important successes of this approach include the study of bat echolocation calls initiated by Nobuo Suga, leading to the discovery of the combination-sensitive neurons, and the discovery of the special status of the bird's own song (Doupe & Konishi, 1991; Margoliash & Konishi, 1985) in the auditory system of songbirds. The arguments in support of the use of species-specific vocalizations are, again, undoubtedly correct, but ignore the fact that the auditory system is important for much more than just intraspecies communication. For example, Figure 1 illustrates the importance of auditory systems, and what would be considered by most as natural sounds, in a noncommunication task—that of crossing a busy street in a modern city.

Figure 1. The importance of hearing for processing noncommunication sounds. A. A street sign warning car drivers about deaf children crossing the street. The building on the left is the Hattie Friedland (Ki’ach) School for the deaf (Borochov Street, Jerusalem, Israel). B. Part of a road safety campaign, this bus advertisement advises pedestrians preparing to cross a street to (1) stop, (2) listen, and (3) look for approaching cars (Jerusalem, Israel).

A third set of natural sounds that has been used extensively in animal studies is human speech. The rationale for studying the coding of human speech in nonhuman animal models includes the acoustic similarity between human speech and vocalizations of other animals (Elliott & Theunissen, 2009), the assumption that early sensory stations are similar in different mammalian species, including humans, and (most compellingly, at least for me) the simple observation that speech consists of a family of complex but acoustically well-understood sounds, making it easy to manipulate and control, whether or not it is natural to the animal model in which it is used.

A different though less prominent research thread consists of employing selected sets of natural sounds that are believed to be a good sample of the general concept of a “natural sound.” Aertsen, Johannesma, and their colleagues, who initiated this concept (Aertsen, Smolders, & Johannesma, 1979; Smolders, Aertsen, & Johannesma, 1979), coined the name “acoustic biotope” for such a set of sounds. The question of what “the” acoustic biotope is for different species remains unsolved—it would obviously depend on the environment, on the auditory system under consideration (through, e.g., its frequency and temporal selectivity), but also on behavioral patterns of the specific species under consideration, which may result in an active selection of a preferred acoustic environment. The richness and complexity of the full acoustic biotope (if such an object can be characterized at all) requires researchers to make many arbitrary choices when following this approach. For example, the set of studies by Nelken and colleagues that will be described in more detail below started with a set of tonal frequency-modulated bird chirps from field recordings, and the very small set of examples used there included typical representatives for the extent and rate of frequency modulation found in these recordings (Bar-Yosef, Rotman, & Nelken, 2002). Similarly, Liu and Schreiner characterized in detail a subset of mouse pup vocalizations—namely, the ultrasonic isolation calls emitted while pups are separated from their nests—and demonstrated that these were different from adult ultrasonic vocalizations (Liu, Miller, Merzenich, & Schreiner, 2003). They then tested a selected subset of pup vocalizations in electrophysiological experiments (Liu & Schreiner, 2007). 
While the use of representatives is in line with the ideas of an acoustic biotope, the exclusive use of recordings dominated by a single frequency-modulated chirp (Nelken and colleagues) or a single type of call (Liu and Schreiner) is not—such events are in fact rather rare in natural recordings.

A different approach to the study of natural sounds consists of using artificial sounds that carry acoustic features of natural sounds. As an example of this line of work, Schnupp and colleagues studied the joint coding of pitch, timbre, and spatial location. They used four different values for each feature (pitch, timbre, and spatial location) and all possible combinations of these, resulting in a set of 64 sounds that they used to map all fields of ferret auditory cortex (Bizley, Walker, Silverman, King, & Schnupp, 2009). Studies using this approach will be occasionally mentioned here, in spite of the fact that strictly speaking, they do not employ natural sounds.

Finally, what makes natural sounds natural? The discussion above suggests that natural sounds may be natural because the auditory system evolved to process them efficiently. However, an alternative view is possible: natural sounds may be natural because animals are exposed to them; this exposure, coupled with the plasticity of the nervous system, shapes the auditory system to process them efficiently. In the auditory system, developmental plastic mechanisms are obviously at work even at the level of the midbrain (e.g., Anderson, Parbery-Clark, White-Schwoch, & Kraus, 2015; Krizman et al., 2015), and plastic mechanisms operate in adults as well, at least at forebrain structures (Suga & Ma, 2003; Weinberger, 2004, 2011). Plasticity and the coding of natural sounds will be briefly discussed towards the end of this article.

Peripheral Population Representations of Natural Sounds—Spectral, Temporal, and Otherwise

The standard model of the peripheral auditory system in vertebrates includes a bank of filters whose bandwidth is substantially narrower than the full frequency range of hearing of each species, followed by logarithmic compression of the resulting amplitude information (J. Schnupp, Nelken, & King, 2011, Chapter 2) and potentially additional nonlinear mechanisms. Computational implementations are available and, while not perfect, may give a good sense of the general features of the expected activity patterns in the array of auditory nerve fibers (among many others, see Bleeck, Ives, & Patterson, 2004; Meddis et al., 2013; Slaney, 1998; Zilany, Bruce, & Carney, 2014).

Thus, to a first, rough approximation, natural sounds, like all other sounds, are represented on the array of auditory nerve fibers according to their spectro-temporal content. This representation, by time in an array of overlapping frequency channels, is somewhat comparable to a spectrogram and is often called a “neurogram” (J. Schnupp et al., 2011, Chapter 2). While both neurograms and spectrograms are time–frequency representations, the two differ in a number of important ways.

First, spectrograms are computed using filters of equal width, while the auditory filters have a constant Q—this means that their width is approximately a constant fraction of their center frequency (1/6 octave in humans, perhaps 1/2 octave in mice; King et al., 2015; Moore, 1993). This creates filters of wildly different bandwidth at low and high frequencies, affecting among other things their temporal properties: thus high-frequency filters typically have much better temporal resolution than low-frequency filters (J. Schnupp et al., 2011, Chapter 2).
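To make the difference in bandwidth concrete, the following Python sketch computes the edges and width of a band spanning a fixed fraction of an octave around a center frequency. The function names, the example center frequencies, and the rule of thumb that temporal resolution is roughly the reciprocal of the bandwidth are illustrative assumptions, not part of any model cited here.

```python
import math

def constant_q_band(center_hz, width_octaves):
    """Return (low_hz, high_hz, bandwidth_hz) for a band that spans
    `width_octaves` octaves and is centered geometrically on `center_hz`."""
    half = width_octaves / 2.0
    low = center_hz * 2.0 ** (-half)
    high = center_hz * 2.0 ** half
    return low, high, high - low

def rough_time_resolution_ms(bandwidth_hz):
    """Assumed rule of thumb: temporal resolution is about 1 / bandwidth."""
    return 1000.0 / bandwidth_hz

# Hypothetical center frequencies, all with the same 1/6-octave width.
for cf in (500.0, 4000.0, 32000.0):
    low, high, bw = constant_q_band(cf, 1.0 / 6.0)
    print(f"CF {cf:7.0f} Hz: {low:8.1f} to {high:8.1f} Hz, "
          f"width {bw:7.1f} Hz, about {rough_time_resolution_ms(bw):.2f} ms")
```

Because the width is a fixed fraction of the center frequency, the bandwidth in Hz grows in proportion to the center frequency, and under the assumed rule of thumb the temporal resolution improves by the same factor.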

Second, a spectrogram is usually computed by filtering the signal with sine and cosine phases (“in quadrature”), keeping the amplitude but ignoring the phase of the filtered signal. In the peripheral auditory system, the phase is important, at least at sufficiently low frequencies. The traveling wave along the basilar membrane creates phase differences between filters at different frequencies, and further phase shifts are imposed by the nonlinear mechanisms related to the cochlear amplifier. The resulting phase profiles may be important for the processing of natural sounds—for example, Shamma and Klein (2000) suggested a scheme by which spectral pitch templates can form in a natural way, with a significant role for the peculiarities of the pattern of phase changes along the basilar membrane.
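As a rough illustration of filtering in quadrature, the sketch below projects a signal onto a cosine and a sine at a single frequency and returns both the amplitude (which a spectrogram keeps) and the phase (which it discards). It is a single-frequency toy with hypothetical function names and parameter values, not a cochlear model.

```python
import math

def make_tone(amplitude, freq_hz, phase_rad, sample_rate_hz, n_samples):
    """Sampled cosine with a given amplitude and starting phase."""
    return [amplitude * math.cos(2.0 * math.pi * freq_hz * i / sample_rate_hz + phase_rad)
            for i in range(n_samples)]

def quadrature_component(signal, sample_rate_hz, freq_hz):
    """Project `signal` onto cosine and sine at `freq_hz`.
    Returns (amplitude, phase_rad) of that frequency component.
    Exact only if the signal holds a whole number of cycles."""
    n = len(signal)
    cos_sum = 0.0
    sin_sum = 0.0
    for i, x in enumerate(signal):
        angle = 2.0 * math.pi * freq_hz * i / sample_rate_hz
        cos_sum += x * math.cos(angle)
        sin_sum += x * math.sin(angle)
    cos_part = 2.0 * cos_sum / n
    sin_part = 2.0 * sin_sum / n
    return math.hypot(cos_part, sin_part), math.atan2(-sin_part, cos_part)

# Two tones that differ only in phase (values chosen for illustration).
early = make_tone(0.5, 1000.0, 0.3, 8000.0, 800)
late = make_tone(0.5, 1000.0, 1.2, 8000.0, 800)
print(quadrature_component(early, 8000.0, 1000.0))
print(quadrature_component(late, 8000.0, 1000.0))
```

The two tones yield the same amplitude and would be indistinguishable in a spectrogram; only the phase term, which the peripheral auditory system retains at low frequencies, tells them apart.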

Third, nonlinear cochlear mechanisms may contribute essentially to peripheral representations of natural sounds. For example, Young and Sachs, in a pair of classic papers (Sachs & Young, 1979; Young & Sachs, 1979), studied the representation of human vowels in the auditory nerve of cats. In general, vowel identity can be inferred from the formant frequencies (the peaks in its spectral envelope). Young and Sachs found that at low sound levels, the firing rates of auditory nerve fibers whose best frequencies corresponded to formant frequencies were larger than the firing rates of auditory nerve fibers whose best frequencies corresponded to spectral minima. Therefore, vowel identity could be inferred from the rate patterns of auditory nerve fibers, with formants corresponding to the best frequencies of the highly activated auditory nerve fibers. However, at higher sound levels, the firing rate of most auditory nerve fibers reached saturation, and the resulting rate profiles lost their similarity to the spectral envelope of the vowels. In consequence, rate profiles became less useful for identifying vowels.
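The loss of rate contrast at high levels can be illustrated with a toy saturating rate-level function. The sigmoidal shape, all parameter values, and the 20 dB difference between formant peak and spectral minimum are assumptions chosen for illustration; they are not fitted to the data of Young and Sachs, and the sketch ignores suppression and other cochlear nonlinearities.

```python
import math

def firing_rate(level_db, threshold_db=0.0, spont=5.0, max_rate=200.0,
                half_range_db=15.0, slope_db=5.0):
    """Toy sigmoidal rate-level function in spikes/s (all parameters assumed)."""
    midpoint_db = threshold_db + half_range_db
    drive = 1.0 / (1.0 + math.exp(-(level_db - midpoint_db) / slope_db))
    return spont + (max_rate - spont) * drive

def peak_trough_contrast(peak_level_db, peak_minus_trough_db=20.0):
    """Rate difference between a fiber tuned to a formant peak and a fiber
    tuned to a spectral minimum that is `peak_minus_trough_db` weaker."""
    at_peak = firing_rate(peak_level_db)
    at_trough = firing_rate(peak_level_db - peak_minus_trough_db)
    return at_peak - at_trough

for level in (20, 50, 80):
    print(level, "dB:", round(peak_trough_contrast(level), 1), "spikes/s")
```

In this toy model the rate difference between the two fibers is large at low levels and shrinks toward zero once both fibers are driven into saturation, which is the sense in which the rate profile stops resembling the spectral envelope.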

To demonstrate that the activity patterns of auditory nerve fibers still carried information about vowel identity in spite of saturation of the firing rates, Young and Sachs made use of the temporal response patterns of auditory nerve fibers, and specifically of the nonlinear phenomenon known as synchrony capture. Synchrony capture occurs when multiple pure tones, whose frequencies are such that the auditory nerve fiber responds to each of them when played individually, are played together. The responses of an auditory nerve fiber follow the periodicity of the loudest pure tone, and the periodicities corresponding to the other tones may be strongly attenuated in the resulting temporal firing patterns. Thus, the firing pattern of the auditory nerve fiber is “captured” by the stronger tone within its excitatory bandwidth. Because of synchrony capture, all auditory nerve fibers with best frequencies in the neighborhood of a given formant would tend to display the periodicity corresponding to the formant peak, independent of their own best frequency. Young and Sachs used a measure they called Average Synchronized Local Rate (ASLR): the locally smoothed version of the synchrony of the firing rate of each auditory nerve fiber with a tone at its BF. Since fibers tended to synchronize with formant peaks rather than with their own BFs, this measure tended to be maximal near formant peaks, providing the relevant information needed to identify the vowel.
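Synchrony capture can be illustrated with vector strength, a widely used index of phase locking. The sketch below is not the measure defined by Young and Sachs; it only shows how the synchrony of a spike train to different candidate frequencies can be compared. The spike train, the 500 Hz "formant," and the 700 Hz "best frequency" are hypothetical.

```python
import math

def vector_strength(spike_times_s, freq_hz):
    """Vector strength of a spike train relative to `freq_hz`:
    1.0 for perfect phase locking, near 0.0 for no locking."""
    if not spike_times_s:
        return 0.0
    cos_sum = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    sin_sum = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    return math.hypot(cos_sum, sin_sum) / len(spike_times_s)

# Idealized fiber whose firing is captured by a 500 Hz formant peak:
# one spike per cycle of the 500 Hz component.
captured_spikes = [k / 500.0 for k in range(100)]

print("synchrony to formant (500 Hz):", vector_strength(captured_spikes, 500.0))
print("synchrony to own BF (700 Hz): ", vector_strength(captured_spikes, 700.0))
```

A fiber captured in this way shows high synchrony to the formant frequency and little synchrony to its own best frequency, which is the information the temporal code makes available when rates are saturated.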

This result used in crucial ways the tendency of low-BF auditory nerve fibers to phase lock to a periodic sound (J. Schnupp et al., 2011, Chapter 2). Above 4 kHz, auditory nerve fibers, even in cats, do not phase-lock any more, and nevertheless cats are capable of doing fine spectral discriminations even at high frequencies. For example, in spatial hearing, elevation information is carried by spectral distortions caused by the filtering properties of the pinnae together with the head, shoulders, and torso. These spectral distortions, well characterized in the cat (J. J. Rice, May, Spirou, & Young, 1992) as well as in many other species, are most distinct at high frequencies. Cats can use them to identify the elevation of sound sources (Huang & May, 1996), although phase locking is absent. Young and colleagues demonstrated that the details of the spectral envelope at high frequencies can be extracted from rate profiles using a somewhat indirect procedure, not requiring phase locking (J. J. Rice, Young, & Spirou, 1995). They recorded the responses to broadband noise shaped to include these elevation-dependent spectral cues from a number of different elevations, then compared the resulting rate profiles. As in the case of vowels, at low sound levels the rate profile of the array of auditory nerve fibers corresponded nicely with the spectral distortions in the sound. At high sound levels, the rate profiles saturated and lost their similarity to the spectral envelope of the sound. However, differences between rate profiles evoked by different sounds were still similar to the differences between the spectral envelopes of the same sounds. This effect again most probably involves cochlear nonlinearities.

A striking example of the importance of nonlinear cochlear mechanisms in coding species-specific vocalizations may occur in the context of ultrasonic mouse vocalizations (Portfors & Roberts, 2014; Portfors, Roberts, & Jonson, 2009; Roberts & Portfors, 2015). While mouse ultrasonic vocalizations include frequencies as high as 100 kHz, peripheral frequency representations in the mouse, as tested with pure tones, are largely limited to 60 kHz or so (Garcia-Lazaro, Shepard, Miranda, Liu, & Lesica, 2015; Roberts & Portfors, 2015). Nevertheless, many neurons in the mouse dorsal cochlear nucleus, as well as the inferior colliculus, respond to high-frequency vocalizations, even when the frequency content of the vocalization does not overlap with their frequency tuning at all. Portfors and colleagues concluded that these responses are most likely due to the generation of combination tones in the cochlea. Such an effect accounts both for the sensitivity of low-frequency neurons to high-frequency vocalizations and for the specific timing of the neuronal responses, which are often locked to acoustic events that are likely to generate combination tones.
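The arithmetic behind combination tones is simple enough to sketch. The code below computes the two lowest-order difference tones that a nonlinearity can generate from a pair of primaries, the quadratic tone f2 - f1 and the cubic tone 2*f1 - f2. The text above does not specify which of these is most relevant, and the primary frequencies used here are hypothetical.

```python
def combination_tones(f1_hz, f2_hz):
    """Quadratic (f2 - f1) and cubic (2*f1 - f2) difference tones
    for two primaries with 0 < f1 < f2."""
    if not 0 < f1_hz < f2_hz:
        raise ValueError("expected 0 < f1_hz < f2_hz")
    return {
        "f2 - f1": f2_hz - f1_hz,
        "2*f1 - f2": 2 * f1_hz - f2_hz,
    }

# Two hypothetical ultrasonic components, both above the ~60 kHz limit.
for name, freq in combination_tones(70_000, 85_000).items():
    print(f"{name}: {freq / 1000:.0f} kHz")
```

With these example primaries, both difference tones fall below 60 kHz, within the range that is represented peripherally, even though neither primary does.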

Thus, the array of auditory nerve fibers carries information about complex sounds using a number of different codes. Rate codes are mostly applicable at low sound levels. At higher sound levels, temporal codes may operate, at least within the phase-locking regime. In both frequency ranges, differences between spectral envelopes are encoded by differences between rate profiles (Conley & Keilson, 1995; J. J. Rice et al., 1995) even at high sound levels, so that a central station that can integrate the firing rates over time may be able to faithfully follow the dynamical changes in spectral shape as they occur. Finally, cochlear (and potentially other) nonlinearities play a crucial role in shaping the representation of complex sounds in the auditory system.

The Problem of the IC

This relatively neat picture of the coding of natural sounds in the auditory nerve can be carried a bit further. For example, Sachs and colleagues studied the coding of vowels in neurons of the cochlear nucleus, showing that one specific class of neurons, so-called chopper cells, carries a rate representation with a wide dynamic range (Blackburn & Sachs, 1990). They suggested that this is a consequence of such neurons integrating inputs from multiple auditory nerve fibers that have different spontaneous firing rates (correlated with their thresholds and dynamic ranges) (Lai, Winslow, & Sachs, 1994).
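The intuition that pooling fibers with different thresholds widens the dynamic range can be sketched as follows. This is plain averaging of toy sigmoidal rate-level functions, not the model of Lai, Winslow, and Sachs; the thresholds, slopes, and the 10%-to-90% definition of dynamic range are all assumptions made for illustration.

```python
import math

def fiber_rate(level_db, threshold_db, max_rate=200.0, slope_db=4.0):
    """Toy saturating rate-level function of one fiber (parameters assumed)."""
    midpoint_db = threshold_db + 12.0
    return max_rate / (1.0 + math.exp(-(level_db - midpoint_db) / slope_db))

def pooled_rate(level_db, thresholds_db):
    """Average rate over fibers that differ only in threshold."""
    return sum(fiber_rate(level_db, t) for t in thresholds_db) / len(thresholds_db)

def dynamic_range_db(rate_fn, lowest_db=-20, highest_db=120):
    """Level range (1 dB steps) over which the rate climbs from 10% to 90%
    of its total excursion."""
    levels = list(range(lowest_db, highest_db + 1))
    rates = [rate_fn(level) for level in levels]
    span = rates[-1] - rates[0]
    low_cut = rates[0] + 0.1 * span
    high_cut = rates[0] + 0.9 * span
    first = next(l for l, r in zip(levels, rates) if r >= low_cut)
    last = next(l for l, r in zip(levels, rates) if r >= high_cut)
    return last - first

single = dynamic_range_db(lambda level: fiber_rate(level, 0.0))
pooled = dynamic_range_db(lambda level: pooled_rate(level, [0.0, 20.0, 40.0]))
print("single fiber:", single, "dB; pooled input:", pooled, "dB")
```

Each toy fiber saturates over a narrow range of levels, but because the fibers saturate at different levels, their average keeps changing over a much wider range.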

Between the cochlear nucleus and the inferior colliculus, the large midbrain auditory center, auditory information is extensively processed in the superior olive and the nuclei of the lateral lemniscus. Although much is known about the physiology of these stations, they have not been tested in a systematic way with natural sounds. As a rule, these stations are believed to extract sound features that are important for processing in downstream auditory stations. The best understood of these processes is the extraction of interaural disparities (interaural time and level differences) by the major nuclei of the superior olive. Interaural disparities are important for spatial hearing—interaural time differences are crucial for azimuthal localization at low frequencies, and interaural level differences at high frequencies. How these processes are reflected in the processing of natural or naturalistic binaural disparities is largely unknown.
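For orientation, the two interaural disparities can be written down in a few lines. The sketch uses the classical spherical-head approximation for the interaural time difference and the usual decibel definition of the interaural level difference. The head radius (a value typical of humans), the speed of sound, and the function names are assumptions; real disparities depend on frequency and on the anatomy of the species.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature

def spherical_head_itd_s(azimuth_deg, head_radius_m=0.0875):
    """Interaural time difference (seconds) for a distant source, using the
    spherical-head formula (a / c) * (theta + sin(theta)).
    Intended for azimuths between 0 (straight ahead) and 90 degrees."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))

def ild_db(near_ear_rms, far_ear_rms):
    """Interaural level difference in dB from the RMS pressure at each ear."""
    return 20.0 * math.log10(near_ear_rms / far_ear_rms)

for azimuth in (0, 30, 60, 90):
    print(azimuth, "deg:", round(spherical_head_itd_s(azimuth) * 1e6), "microseconds")
print("ILD for a 2:1 pressure ratio:", round(ild_db(2.0, 1.0), 1), "dB")
```

A smaller head radius scales the time differences down proportionally, so the available range of interaural time differences is much narrower in small mammals than this human-sized example suggests.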

Essentially all the numerous auditory brainstem representations converge at the level of the inferior colliculus (IC). Here I consider the best-understood part of the IC, which is its central nucleus (ICc).

One way to think about the ICc is as the functional equivalent of primary visual cortex (V1). Like the inputs to V1, the different projections from the brainstem to ICc are at least partially segregated, but may also be integrated on single neurons (Casseday, Fremouw, & Covey, 2002). Thus, the same neuron in the ICc may respond selectively to sound features that are carried by different brainstem representations. At the population level, the organization may be neatly described as a superposition of multiple parameter maps (Chase & Young, 2005). Within this organization, single neuron responses may be quite complex.

A number of families of natural sounds have been used to study the ICc. One of the best understood cases is the integration of spatial cues. Young and colleagues, especially Chase and Young, studied the coding of noise bursts shaped by all spatial cues (interaural time and level differences as well as spectral shape) by neurons in the ICc. Information about interaural disparities reaches the ICc through projections from the medial superior olive (MSO, carrying mostly low-frequency ITD information) and the lateral superior olive (LSO, mostly ILD information), while spectral information is carried by projections from both the ventral and the dorsal cochlear nuclei (Davis, Ramachandran, & May, 2003; Ramachandran & May, 2002). Further brainstem projections may be involved as well, the best understood being the projection from the dorsal nucleus of the lateral lemniscus (DNLL) (Burger & Pollak, 2001; Kelly & Li, 1997; Kidd & Kelly, 1996). Indeed, neurons in ICc show sensitivity to all of these spatial cues, and many neurons are selective for more than one cue. It would have been highly satisfying to know that neurons that are sensitive to multiple spatial cues, presumably integrating information arriving from multiple projections, do so in a way that makes sense, for example by being sensitive to the ITD, ILD, and spectral structure of sounds arriving from the same direction in space. Remarkably, this is not so: the parameters to which such neurons are sensitive do not match spatially (Chase & Young, 2005, 2006). Thus, neurons in ICc code for spatial cues, but not for the spatial location, of sound sources.

The coding of species-specific vocalizations in ICc has been extensively studied in mice (Woolley & Portfors, 2013). As discussed above, ultrasonic mouse vocalizations evoke responses in many neurons with mismatched frequency tuning. The consequence is a widespread, nontonotopic representation of these vocalizations in the early auditory system (Roberts & Portfors, 2015). In the IC, neurons may be rate-selective (responding to some, but not all, vocalizations) and show temporal response patterns that are also vocalization-specific. Rate responses are at least partially shaped by inhibitory processing in the IC, since pharmacological manipulations of inhibition may change rate selectivity (Mayko, Roberts, & Portfors, 2012). Temporal response properties are not affected much by pharmacological manipulations of inhibition within IC (Dimitrov, Cummins, Mayko, & Portfors, 2014) and are therefore presumably mostly determined by responses in earlier processing stages.

As suggested by these examples, the representation of natural sounds in ICc is rich, based on many different acoustic features. On the other hand, although ICc may contain a privileged representation of some classes of natural sounds (species-specific vocalizations, for example), it is quite clear that the processing mechanisms that generate these representations are not specialized. Instead, general-purpose auditory mechanisms, including cochlear nonlinearities and the complex circuitry of the brainstem as well as that of the IC itself, are responsible for generating this representation (Garcia-Lazaro et al., 2015; Portfors et al., 2009). One consequence of this picture is that, although ICc neurons show great variability in their responses to natural sounds, there is a certain simplicity to the integration mechanisms at the level of the ICc. For example, although sensitive to complex spectro-temporal features, each neuron in ICc tends to be sensitive to a single such feature (Atencio, Sharpee, & Schreiner, 2012).

What are the relationships between the neural representation at the level of the IC and behavioral performance, e.g., the ability of animals to discriminate between two sounds? While this question doesn’t yet have a full answer, it seems that perceptual discrimination is related to neural discrimination at the level of the IC (Kettner & Thompson, 1985; Neilans, Holfoth, Radziwon, Portfors, & Dent, 2014). In this sense, responses in ICc may form a bottleneck between sound and perception—whatever higher processing does with sounds is based on the neural representation at the level of the IC.

Many other questions remain for understanding natural sound representations in the IC. For example, why does the IC represent sounds the way it does? Is it optimal in any sense? Is this representation hard-wired or shaped by experience? And if plastic, what are the rules that shape it? These questions are certain to guide future studies of natural sound coding in the ICc.

Neurons in stations above the IC may inherit their response properties from the IC. Thus, claims about the special status of neuronal properties anywhere above the IC also require showing that these properties are formed in situ rather than being inherited from below. Given the richness and variety of IC sound representations, this may not be an easy task.

Primary Auditory Cortex

The IC projects to the auditory thalamus, which in turn projects to auditory cortex. At least in rodents (Lee & Sherman, 2010) and carnivores (Andersen, Knight, & Merzenich, 1980), these projections form two parallel pathways, the so-called lemniscal pathway that includes ICc and eventually reaches primary auditory cortex (A1), and the nonlemniscal pathway that includes other parts of the IC and eventually reaches mostly nonprimary auditory cortex. The rest of this review will concentrate on the role of A1 in the coding of natural sounds. The relevant literature is very extensive, and it is very hard to do justice to all contributions. I will discuss here three general issues related to the coding of natural sounds in the auditory cortex: first, the nature of the neural code; second, the relationships between neural responses and behavioral discrimination capabilities; and finally, plasticity in the coding of natural sounds in auditory cortex.

The experimental conditions under which auditory cortex is studied turn out to be of some importance. Most studies discussed above have been performed under anesthesia. The auditory cortex is strongly affected by certain anesthetic agents (and less by others; e.g., Moshitch, Las, Ulanovsky, Bar-Yosef, & Nelken, 2006). When unanesthetized, animals may be awake or asleep, engaged in a task or not, attentive to some stimuli and not to others. All of these behavioral parameters may influence the neuronal responses. Nevertheless, the accumulating information about differences between all of these states suggests that the basic sensory coding mechanisms are rather similar among them (even under anesthesia). Instead, wakefulness and attention act as a “gain”: they increase or decrease baseline rates or sensory responses that are present anyway (Maunsell & Treue, 2006; Nir, Vyazovskiy, Cirelli, Banks, & Tononi, 2015). Thus, at the risk of some bias, studies in both anesthetized and awake animals will be discussed together below.

What Are Neurons in Auditory Cortex Doing?

There are many different conceptualizations for the role of auditory cortex in processing natural sounds. In an attempt to impose some order on a rich and varied literature, these can be broadly classified into three families.

The filterbank. Neurons in auditory cortex are considered as spectro-temporal filters with parameters that vary within a range that is relevant for the coding of natural sounds. In consequence, cortical activity patterns form a parametric representation of the incoming sounds, useful for the purpose of coding important families of natural sounds.

A good representative of this approach to auditory cortex can be found in the work of Shamma and colleagues. In their view, the filters operate on the spectro-temporal envelopes of sounds. The filters are characterized by their joint preference for spectral and temporal fluctuation rates. The joint sensitivity to temporal and spectral changes causes neurons to be selective for time-varying spectro-temporal features, such as frequency sweeps. Shamma and colleagues developed a set of artificial sounds that make it possible to rapidly estimate the parameters of these filters (temporally orthogonal ripple combinations, TORCs; Klein, Simon, Depireux, & Shamma, 2006). Cortical neurons can follow only relatively slow temporal fluctuation rates: up to a few tens of Hz at most, but with a strong representation of much slower rates, of the order of 1 Hz (Joris, Schreiner, & Rees, 2004). Similarly, preferred spectral fluctuation rates are low, a few cycles per octave at most. These properties correspond to the range of spectro-temporal fluctuation rates found in many species-specific vocalizations, including human speech (Elliott & Theunissen, 2009; Ru, Chi, & Shamma, 2003). The similarity between neuronal dynamics and natural sound dynamics has been studied in the auditory cortex of macaque monkeys as well (Chandrasekaran, Turesson, Brown, & Ghazanfar, 2010).
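The filterbank view can be made concrete with a minimal linear spectro-temporal filter. In the sketch below, the predicted response at each time bin is a weighted sum of the recent spectrogram followed by half-wave rectification. The tiny binary spectrograms and the hand-built filter are hypothetical and serve only to show how joint spectral and temporal weighting produces selectivity for the direction of a frequency sweep; this is not the estimation procedure used with TORCs.

```python
def strf_response(spectrogram, strf):
    """Linear spectro-temporal filter followed by half-wave rectification.
    spectrogram[f][t]: energy in frequency channel f at time bin t.
    strf[f][lag]: weight on channel f, `lag` bins in the past.
    Returns the predicted response in each time bin."""
    n_freq, n_lag, n_time = len(strf), len(strf[0]), len(spectrogram[0])
    response = []
    for t in range(n_time):
        total = 0.0
        for f in range(n_freq):
            for lag in range(n_lag):
                if t - lag >= 0:
                    total += strf[f][lag] * spectrogram[f][t - lag]
        response.append(max(0.0, total))
    return response

def make_sweep(n_freq, n_time, start, upward):
    """Binary spectrogram of a sweep: one active channel per time bin."""
    spec = [[0.0] * n_time for _ in range(n_freq)]
    for step in range(n_freq):
        channel = step if upward else n_freq - 1 - step
        spec[channel][start + step] = 1.0
    return spec

# Hand-built filter: excitation along an upward trajectory (low channel
# first, high channel last), inhibition everywhere else.
UP_SWEEP_STRF = [
    [-0.5, -0.5, 1.0],   # lowest channel: excited 2 bins in the past
    [-0.5, 1.0, -0.5],   # middle channel: excited 1 bin in the past
    [1.0, -0.5, -0.5],   # highest channel: excited in the current bin
]

up = strf_response(make_sweep(3, 8, 2, upward=True), UP_SWEEP_STRF)
down = strf_response(make_sweep(3, 8, 2, upward=False), UP_SWEEP_STRF)
print("peak response, upward sweep:  ", max(up))
print("peak response, downward sweep:", max(down))
```

Both sweeps contain exactly the same energy in each frequency channel; the filter distinguishes them only because its weights depend jointly on frequency and time.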

Shamma’s work is not the only one along these lines. Working mostly with birds (and therefore not discussed here as extensively as it merits), Theunissen and his coworkers developed a rich understanding of spectro-temporal stimulus processing along the songbird auditory system (Theunissen & Elie, 2014). Fishbach and colleagues (Fishbach, Nelken, & Yeshurun, 2001; Fishbach, Yeshurun, & Nelken, 2003) developed a somewhat similar model around the concept of spectro-temporal edges. Their work was motivated by the experimental observation that neural responses tended to occur near fast-changing spectro-temporal events. Similarly, a series of studies by Xiaoqin Wang and his colleagues demonstrated that neurons in the awake marmoset auditory cortex are selective to many sound features, representing ranges of relevant parameters such as flutter (temporal modulation rates <100 Hz), spectral contrast, and harmonicity (Barbour & Wang, 2003; Bendor & Wang, 2005; Wang, 2013). As in other studies of these kinds, the range of parameters that are represented often corresponds to those found in natural sounds, particularly in species-specific vocalizations.

Combination-sensitive neurons. Nobuo Suga developed the concept of combination-sensitive neurons in his classic studies of the coding of echolocation calls in bat auditory cortex (reviewed in Suga, 1989). The bat auditory cortex has neurons that respond specifically to a combination of the call and its echo, coding relevant parameters such as echo delay (related to the distance to the target) and Doppler shift (related to the speed relative to the target).

In an influential paper, Wang and colleagues (2005) showed that by tuning a number of parameters of sounds, such as frequency, amplitude modulation rate, and frequency modulation rate, they could find, for many neurons in the awake marmoset auditory cortex, preferred combinations that evoked sustained firing rates over many seconds of sound presentation. In contrast, nonpreferred combinations tended to evoke responses that were much more transient. In the halothane-anesthetized cat auditory cortex, Moshitch and Nelken (2016) found results with a similar flavor: neurons showed exquisite sensitivity to interaural time differences of high-frequency, amplitude-modulated sounds, but only when the carrier frequency and amplitude modulation rate were selected appropriately. In many of their neurons, the preferred combination of interaural time difference, carrier frequency, and amplitude modulation rate resulted in sustained responses.

These studies, while employing complex sounds, rarely used natural sounds. Thus, although combination sensitivity may be an important feature of cortical neurons, its contribution to the actual coding of natural sounds is unclear. While sounds with optimal parameter combinations may evoke sustained firing rates over many seconds, natural sounds are predominantly nonstationary. For example, the phonemic rate in human speech is a few Hz, so that the spectro-temporal envelope of speech changes every few hundred ms (Elliott & Theunissen, 2009). Sustained responses engendered by optimal combinations of parameters may or may not be relevant under these conditions.

Auditory object representations. Nelken and colleagues suggested a third view of auditory cortex function, based on studies of cortical responses to mixtures of sounds. They used two types of mixtures to study neuronal responses in the auditory cortex of halothane-anesthetized cats. One type consisted of recordings of bird chirps that also included echoes and background rustling (Bar-Yosef et al., 2002). The other type consisted of combinations of loud amplitude-modulated maskers with low-level tones, motivated by an analysis of a large set of natural sounds that suggested the ubiquitous presence of wideband comodulation in background sounds (Nelken, Rotman, & Bar Yosef, 1999). In both cases, many neurons responded to mixtures with the same temporal patterns that were evoked by one of the components, ignoring other components of the mixture. Remarkably, in many cases the ignored component was an acoustically-dominant component of the sound, resulting in a prominent representation of low-level acoustic components.

Thus, in both cats (Las, Stern, & Nelken, 2005) and rats (Hershenhoren & Nelken, 2016) low-level tones disrupted the locking of the neuronal responses to the envelope of the amplitude-modulated masker even at highly unfavorable signal-to-noise ratios (−20 dB in the rat; even lower in the cat). In consequence, neurons responded to the combination of a loud masker with a low-level tone just as they would respond to the low-level tone by itself, ignoring the masker. In the case of the bird chirps, the dominant acoustic component was the chirp itself, but many neurons responded to the natural sound with the same temporal patterns that were evoked by the background components by themselves (Bar-Yosef & Nelken, 2007; Bar-Yosef et al., 2002; Chechik & Nelken, 2012).

Nelken and colleagues interpreted these results as sensitivity to the ethologically relevant components of the sounds, which they called auditory objects (Nelken & Bar-Yosef, 2008). The term auditory object doesn’t have a unique definition in the literature; in spite of some brave attempts to define it explicitly (Bizley & Cohen, 2013; Griffiths & Warren, 2004; Winkler, Denham, & Nelken, 2009), different authors use it with different meanings. In the studies by the Nelken group, the term was used for clearly distinguishable acoustic components, such as the loud bird chirp and the soft wideband background, or the loud wideband amplitude-modulated masker and the soft pure tone added to it. These components are clearly defined within the specific experimental context, but may be difficult to define in more general settings.

Sensory Representations and Behavioral Discrimination Limits

We are not interested in neural representations only for their own sake. Rather, we are usually interested in the relationships between neural representations and behavior. As discussed above, there is some evidence that neural representations in the IC are linked to behavior in the sense that there is an inverse relationship between similarity of neural representations and behavioral discrimination performance (Neilans et al., 2014).

The situation in A1 is less clear. Single neurons may have somewhat idiosyncratic response properties. At the level of the auditory nerve or even the IC, frequency selectivity has a dominant role in determining neural responses to complex sounds (Chechik et al., 2006), although the frequency selectivity interacts with peripheral nonlinearities as discussed above. In contrast, neurons in A1, even with similar frequency selectivity, may have idiosyncratic responses even to very similar sounds (Chechik et al., 2006). In that study, neurons in A1 usually responded to a fair number of the stimuli (bird chirps and their modifications), but showed essentially no signal correlation—that is, the fact that two neurons responded to one of these stimuli didn’t predict the similarity of their responses to any other sound in that set.
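
As a rough illustration of what a signal correlation measures, the sketch below correlates the trial-averaged responses of two neurons across a stimulus set. It is a minimal Python example under stated assumptions: the spike counts are invented for illustration, the function names are hypothetical, and the cited study may have used a different estimator.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def signal_correlation(counts_a, counts_b):
    """Correlate the mean responses of two neurons across stimuli.

    counts_x[s] is the list of spike counts of a neuron over repeated
    presentations of stimulus s.
    """
    mean_a = [sum(trials) / len(trials) for trials in counts_a]
    mean_b = [sum(trials) / len(trials) for trials in counts_b]
    return pearson(mean_a, mean_b)

# Invented spike counts: 4 stimuli x 3 repeats per neuron.
neuron_1 = [[5, 6, 4], [1, 0, 2], [3, 3, 3], [0, 1, 0]]
neuron_2 = [[4, 5, 6], [2, 1, 0], [3, 2, 4], [1, 0, 0]]  # same mean tuning as neuron_1
neuron_3 = [[1, 2, 0], [4, 3, 5], [0, 1, 1], [3, 4, 2]]  # different tuning

similar = signal_correlation(neuron_1, neuron_2)
different = signal_correlation(neuron_1, neuron_3)
```

A population with essentially no signal correlation would give values scattered around zero for most pairs, even for pairs with similar frequency selectivity.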

In terms of behavioral discrimination abilities, this is in a way very good news—it means that different sounds, even very similar to each other, may activate essentially independent subsets of neurons in the auditory cortex, leading to a potentially very high level of discriminability. Unfortunately, behavioral capabilities have not been tested with the sounds used by Chechik and colleagues (Chechik et al., 2006).

Kilgard and his colleagues conducted one of the most extensive studies comparing neural representations of natural sounds and behavioral abilities (Engineer et al., 2008). They used human speech sounds but conducted their experiments in rats. I discussed above some possible reasons for pairing one species with the vocalizations of another. In this case, the main importance of the use of speech is the fact that it provides a set of complex sounds that is discriminable by at least one auditory system, and whose structure is well understood.

Kilgard and colleagues found that in general, neural response patterns elicited by distinct speech sounds in either awake or anesthetized rats were distinct, and, as in the studies in mouse IC, the degree of similarity of spike patterns evoked by two stimuli across a population of neurons was associated with the degree to which rats could discriminate between these sounds (Centanni, Engineer, & Kilgard, 2013). This was true for both consonants and vowels (Shetake et al., 2011). In addition, they found that the discrimination of consonants using spike patterns required the use of spike timing—spike rates were not sufficient. Thus, contra studies in other sensory systems (Romo & Salinas, 2003), in the auditory cortex, spike timing seems to be important for behavioral performance.
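
The difference between a timing-based and a rate-based readout can be sketched with a toy nearest-template classifier. The Python example below is illustrative only and is not the analysis used in the cited studies; the spike times, the 50 ms window, the 10 ms bin width, and all function names are assumptions.

```python
import math

def bin_spikes(spike_times_ms, window_ms, bin_ms):
    """Count spikes in consecutive bins of width bin_ms within the window."""
    n_bins = int(window_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int(t // bin_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def distances(trial, templates, window_ms, bin_ms):
    """Euclidean distance from a binned trial to each binned template."""
    v = bin_spikes(trial, window_ms, bin_ms)
    out = {}
    for label, template in templates.items():
        w = bin_spikes(template, window_ms, bin_ms)
        out[label] = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))
    return out

def classify(trial, templates, window_ms, bin_ms):
    """Return the label of the nearest template (ties go to the first label)."""
    d = distances(trial, templates, window_ms, bin_ms)
    return min(d, key=d.get)

# Two invented responses with the same spike count but different timing.
templates = {"A": [5.0, 12.0, 40.0], "B": [22.0, 31.0, 38.0]}
trial_from_b = [23.0, 30.0, 39.0]  # a jittered repeat of response "B"

# Fine bins keep spike timing; one bin spanning the window keeps only the rate.
timing_label = classify(trial_from_b, templates, window_ms=50, bin_ms=10)
rate_distances = distances(trial_from_b, templates, window_ms=50, bin_ms=50)
```

With 10 ms bins the trial is assigned to the correct template, whereas with a single 50 ms bin both templates are equally distant, because the two responses differ only in when the spikes occur.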

Later studies by the same group provided additional support and substantially expanded these initial findings. For example, presenting speech in noise resulted in degraded response patterns and degraded behavioral discrimination capabilities, with a correlation between the two (Shetake et al., 2011). Similarly, a number of transgenic rat models of human diseases were associated with degradation of both neural response patterns and behavioral discrimination capabilities (e.g., Engineer et al., 2015). Furthermore, Kilgard and coworkers compared response patterns in IC, A1, and higher auditory fields of the rat. They found that while in IC, as expected, frequency sensitivity to a large extent determined responses, neurons in A1 showed a substantially more diverse set of response patterns, and they argued that this increased diversity is responsible for the good behavioral performance of rats tested with human speech (Ranasinghe, Vrana, Matney, & Kilgard, 2013). These experimental findings compare well with the conclusion of Chechik et al. (2006) in the cat auditory cortex. The representations of speech sounds in higher auditory cortical fields, while showing some quantitative differences, shared many properties with representations in A1, including the importance of spike timing for coding vowels (Centanni et al., 2013).

Temporal patterns may be of importance in the coding of nonhuman vocalizations as well, although the experimental results are somewhat equivocal. Wang and colleagues studied the coding of twitter calls emitted by marmoset monkeys in the auditory cortex of anesthetized marmosets and compared responses to the original vocalization with control stimuli that included time-reversed twitter calls, as well as calls that were sped up or slowed down. In many neurons, these modified vocalizations evoked weaker responses than the original one, suggesting the existence of rate selectivity for the parameters of the natural call. Thus, at least in this case, a rate code was involved (Wang, Merzenich, Beitel, & Schreiner, 1995). Schnupp and colleagues (Schnupp, Hall, Kokelaar, & Ahmed, 2006) tested the same calls in the auditory cortex of anesthetized ferrets and found no rate selectivity to the marmoset twitter calls; rather, all versions of the twitter call evoked similar firing rates. Nevertheless, different calls evoked different temporal response patterns, which could be reliably discriminated from each other. Thus, in ferret auditory cortex, marmoset calls were coded with a time code. However, the marmoset calls had no behavioral meaning to the ferrets, and conceivably, it could be the case that the ferrets were incapable of discriminating between the original and the modified calls. In consequence, Schnupp and colleagues went on to show that ferrets can be trained to discriminate behaviorally between marmoset twitter calls and their modified versions. Nevertheless, even in the auditory cortex of trained ferrets, the neural code was based on temporal response patterns, rather than on overall spike rates, which remained similar for different versions of the same call.

Interestingly, while in animals tested with vocalizations of a different species, time codes seem to dominate the auditory cortex (rats with human speech or ferrets with marmoset calls), the one study that used species-specific vocalizations, that of Wang and coworkers, identified the presence of a rate code. Whether this is a general finding remains to be seen. One interesting piece of evidence comes from the guinea pig auditory cortex. The responses of neurons in guinea pig auditory cortex to guinea pig vocalizations have been studied by a number of groups (Gaucher et al., 2013; Grimsley, Shanbhag, Palmer, & Wallace, 2012; Huetz, Gourevitch, & Edeline, 2011; Huetz, Philibert, & Edeline, 2009; Wallace, Shackleton, Anderson, & Palmer, 2005), and the major finding in these studies is the presence of a diversity of evoked temporal response patterns. Thus, at least in the guinea pig model, there is evidence for the importance of a time code in encoding species-specific vocalizations.

Species-specific vocalizations have been extensively used in studying the mouse auditory cortex. Mice vocalize at ultrasonic frequencies (Ehret, 2005; Holy & Guo, 2005; Ise & Ohta, 2009; Portfors, 2007), and vocalizations play an important role in some intersubject interactions. Pup calls elicit maternal behavior in mothers, but not necessarily in age-matched nonmothers; the neural correlates will be discussed in the context of auditory cortex plasticity. Males emit courtship calls, which have been likened to songs since they show some indication of nontrivial syntax (Holy & Guo, 2005). The coding of these calls has been studied to some extent, but the resulting codes are not yet entirely clear (see Carruthers, Natan, & Geffen, 2013, for an attempt). In any case, correlations between behavioral performance and neural representations have not been studied in this model.

Plasticity in the Representations of Natural Calls in Auditory Cortex

Auditory cortex is highly plastic. For example, a number of manipulations are available for modifying the best frequencies of cortical neurons, going from the highly artificial (coupling activation of the basal forebrain, which causes release of acetylcholine in auditory cortex, with sounds) to some lab version of natural learning (Aizenberg & Geffen, 2013; Bieszczad & Weinberger, 2010; Engineer et al., 2013; Kilgard & Merzenich, 1998; Suga & Ma, 2003; Weinberger, 2004). Unfortunately, natural sounds have not been extensively used in the context of cortical plasticity; Schnupp et al. (2006), discussed in some detail above, is one of the few papers that documented changes in the adult brain caused by auditory discrimination training with natural sounds.

There is one important exception: the changes that occur in auditory cortex following parturition (giving birth) in female mice. Liu and coworkers analyzed a large set of mouse pup vocalizations (Liu et al., 2003) and then used typical and atypical exemplars to study neurons in auditory cortices of anesthetized mothers and nonmothers. They found that mouse mothers show preferential responses to pup calls (Liu, Linden, & Schreiner, 2006; Liu & Schreiner, 2007). In particular, they showed that typical exemplars were more discriminable than atypical exemplars using neuronal responses from the auditory cortex of mothers, but that the same advantage did not exist in the neural responses from the auditory cortex of female mice that had not experienced motherhood.

The differences between the auditory cortex of mothers and nonmother mice extend beyond this result. Rothschild et al. (2013) showed a very large increase in so-called noise correlations in the auditory cortex of mothers. Noise correlations refer to joint fluctuations of multiple neurons around their mean response to each stimulus. Noise correlations are usually positive in auditory cortex (Rothschild, Nelken, & Mizrahi, 2010) and are often considered a liability for stimulus coding by populations of neurons, although under some conditions they do not limit population performance (Shamir & Sompolinsky, 2006). In neurons that are very close to each other (within ~100 microns, recorded using 2-photon microscopy), Rothschild et al. (2013) found average noise correlations of about 0.2. In mothers, noise correlations increased by almost a factor of 2, resulting in very large fluctuations of the total population responses from one stimulus presentation to the next. Interestingly, the fidelity of decoding stimulus identity from the population activity remained essentially the same in mothers and nonmothers. This finding has important implications for our understanding of population dynamics in cortex (Moreno-Bote et al., 2014), but such issues are outside the scope of the current review.
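
The definition of noise correlation, and the reason a higher value inflates the trial-to-trial fluctuations of the summed population response, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the spike counts are invented, the estimator (pooling residuals around each stimulus mean) is one common choice rather than the method of the cited studies, and the variance formula assumes equal single-neuron variances and a uniform pairwise correlation across a hypothetical population of 50 neurons.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def noise_correlation(counts_a, counts_b):
    """Correlate trial-by-trial deviations from each stimulus mean.

    counts_x[s][k] is the spike count on repeat k of stimulus s; the two
    neurons must have been recorded simultaneously so that repeats align.
    """
    res_a, res_b = [], []
    for trials_a, trials_b in zip(counts_a, counts_b):
        mean_a = sum(trials_a) / len(trials_a)
        mean_b = sum(trials_b) / len(trials_b)
        res_a.extend(c - mean_a for c in trials_a)
        res_b.extend(c - mean_b for c in trials_b)
    return pearson(res_a, res_b)

def population_sum_variance(n_neurons, variance, rho):
    """Variance of the summed response of n_neurons with equal variance
    and a uniform pairwise correlation rho (a simplifying assumption)."""
    return n_neurons * variance * (1.0 + (n_neurons - 1) * rho)

# Invented counts: 2 stimuli x 4 simultaneous repeats.
neuron_a = [[5, 7, 6, 6], [1, 3, 2, 2]]
neuron_b = [[4, 6, 6, 4], [1, 2, 0, 1]]
r_noise = noise_correlation(neuron_a, neuron_b)

# Doubling rho from 0.2 to 0.4 in a hypothetical 50-neuron population.
var_low = population_sum_variance(50, 1.0, 0.2)
var_high = population_sum_variance(50, 1.0, 0.4)
```

Under these assumptions the variance of the summed response nearly doubles when the pairwise correlation doubles, because the correlated term grows with the square of the population size while the independent term grows only linearly.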


As this selective and partial review shows, we know a lot about the coding of natural sounds by populations of neurons in the auditory system. I believe that the outline of answers to some fundamental questions can already be perceived.

Probably the most obvious question is whether natural sounds have a special status with respect to the auditory system. As discussed in the introductory section, while in echo-locating bats and songbirds this may be so (the auditory systems of such species show clear biases towards the processing of their own vocalizations), the large majority of studies in other mammals do not support such a bias. An important consequence is that to study the coding of natural sounds, it may be enough to use well-controlled, complex artificial sounds. Such an approach has been used, for example, by Schnupp and collaborators (Schnupp et al., 2006).

Nevertheless, natural sounds are useful to study by virtue of their complexity. Indeed, while there is only weak evidence for specializations for the processing of natural sounds, it is still plausible to assume that general mechanisms for processing complex sounds that do exist in the auditory system should be engaged by natural sounds (Theunissen & Elie, 2014). Thus, the use of natural sounds may be an efficient way of uncovering such mechanisms. The study of the coding of bird songs in the cat auditory cortex by Nelken and collaborators, resulting in the finding of preferred responses to weak components in sound mixtures, may be an example.

How are natural sounds coded in the auditory system? The evidence discussed above suggests that early representations are determined to a large extent “within channel” (that is, information is encoded in narrow frequency bands organized along the tonotopic axis). Such within-channel representations can be quite complex, shaped at least in part by nonlinear cochlear mechanisms. Nevertheless, the cortical representation of natural sounds is very different. A number of studies in cats and rats suggest that in the cortex, natural sounds produce a large diversity of response patterns (Chechik et al., 2006; Ranasinghe et al., 2013). Thus, one possible role of the auditory cortex may be the generation of response diversity. All three conceptualizations of the function of auditory cortex discussed above could result in the generation of response diversity, so that the mechanisms that underlie this process are not understood at this time.

The role of such diversity is also unknown. Nelken’s suggestion that neurons are sensitive to auditory objects is a possibility, but the data supporting it are sparse. The generation of diversity may be useful for a higher station of the auditory system to discriminate between sounds, in the spirit of support vector machines (Noble, 2006), or alternatively the responses of neurons in auditory cortex may represent multiple random measurements on sounds in the spirit of methods of compressed sensing (Ganguli & Sompolinsky, 2012).

Finally, are processing mechanisms of natural sounds innate or learned? I would expect a contribution of both nature and nurture in shaping these mechanisms. The primary auditory cortex is notably plastic (Weinberger, 2004), and it is thus likely to be shaped by life experiences. Precisely how the plastic mechanisms in auditory cortex interact with life-long exposure to the world of natural sounds is, however, unknown.

In the end, the study of the coding of natural sounds is a fascinating link between a mechanistic study of the auditory system on the one hand, and its role in guiding behavior on the other hand. This linking role can be observed throughout the recent history of auditory neuroscience and will certainly continue in the future, to the great profit of our understanding of brains and behaviors.


Aertsen, A. M., Smolders, J. W., & Johannesma, P. I. (1979). Neural representation of the acoustic biotope: On the existence of stimulus-event relations for sensory neurons. Biological Cybernetics, 32(3), 175–185.

Aizenberg, M., & Geffen, M. N. (2013). Bidirectional effects of aversive learning on perceptual acuity are mediated by the sensory cortex. Nature Neuroscience, 16(8), 994–996.

Andersen, R. A., Knight, P. L., & Merzenich, M. M. (1980). The thalamocortical and corticothalamic connections of AI, AII, and the anterior auditory field (AAF) in the cat: Evidence for two largely segregated systems of connections. Journal of Comparative Neurology, 194(3), 663–701.

Anderson, S., Parbery-Clark, A., White-Schwoch, T., & Kraus, N. (2015). Development of subcortical speech representation in human infants. Journal of the Acoustical Society of America, 137(6), 3346–3355.

Atencio, C. A., Sharpee, T. O., & Schreiner, C. E. (2012). Receptive field dimensionality increases from the auditory midbrain to cortex. Journal of Neurophysiology, 107(10), 2594–2603.

Barbour, D. L., & Wang, X. (2003). Contrast tuning in auditory cortex. Science, 299(5609), 1073–1075.

Bar-Yosef, O., & Nelken, I. (2007). The effects of background noise on the neural responses to natural sounds in cat primary auditory cortex. Frontiers in Computational Neuroscience, 1, 3.

Bar-Yosef, O., Rotman, Y., & Nelken, I. (2002). Responses of neurons in cat primary auditory cortex to bird chirps: Effects of temporal and spectral context. Journal of Neuroscience, 22(19), 8619–8632.

Bendor, D., & Wang, X. (2005). The neuronal representation of pitch in primate auditory cortex. Nature, 436(7054), 1161–1165.

Bieszczad, K. M., & Weinberger, N. M. (2010). Representational gain in cortical area underlies increase of memory strength. Proceedings of the National Academy of Sciences of the United States of America, 107(8), 3793–3798.

Bizley, J. K., & Cohen, Y. E. (2013). The what, where and how of auditory-object perception. Nature Reviews Neuroscience, 14(10), 693–707.

Bizley, J. K., Walker, K. M., Silverman, B. W., King, A. J., & Schnupp, J. W. (2009). Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. Journal of Neuroscience, 29(7), 2064–2075.

Blackburn, C. C., & Sachs, M. B. (1990). The representations of the steady-state vowel sound /e/ in the discharge patterns of cat anteroventral cochlear nucleus neurons. Journal of Neurophysiology, 63(5), 1191–1212.

Bleeck, S., Ives, T., & Patterson, R. D. (2004). Aim-mat: The auditory image model in MATLAB. Acta Acustica, 90(4), 781–787.

Burger, R. M., & Pollak, G. D. (2001). Reversible inactivation of the dorsal nucleus of the lateral lemniscus reveals its role in the processing of multiple sound sources in the inferior colliculus of bats. Journal of Neuroscience, 21(13), 4830–4843.

Carruthers, I. M., Natan, R. G., & Geffen, M. N. (2013). Encoding of ultrasonic vocalizations in the auditory cortex. Journal of Neurophysiology, 109(7), 1912–1927.

Casseday, J. H., Fremouw, T., & Covey, E. (2002). The inferior colliculus: A hub for the central auditory system. In D. Oertel, R. R. Fay, & A. N. Popper (Eds.), Integrative functions in the mammalian auditory pathway (Vol. 15, pp. 238–318). New York: Springer.

Centanni, T. M., Engineer, C. T., & Kilgard, M. P. (2013). Cortical speech-evoked response patterns in multiple auditory fields are correlated with behavioral discrimination ability. Journal of Neurophysiology, 110(1), 177–189.

Chandrasekaran, C., Turesson, H. K., Brown, C. H., & Ghazanfar, A. A. (2010). The influence of natural scene dynamics on auditory cortical activity. Journal of Neuroscience, 30(42), 13919–13931.

Chase, S. M., & Young, E. D. (2005). Limited segregation of different types of sound localization information among classes of units in the inferior colliculus. Journal of Neuroscience, 25(33), 7575–7585.

Chase, S. M., & Young, E. D. (2006). Spike-timing codes enhance the representation of multiple simultaneous sound-localization cues in the inferior colliculus. Journal of Neuroscience, 26(15), 3889–3898.

Chechik, G., Anderson, M. J., Bar-Yosef, O., Young, E. D., Tishby, N., & Nelken, I. (2006). Reduction of information redundancy in the ascending auditory pathway. Neuron, 51(3), 359–368.

Chechik, G., & Nelken, I. (2012). Auditory abstraction from spectro-temporal features to coding auditory entities. Proceedings of the National Academy of Sciences of the United States of America.

Conley, R. A., & Keilson, S. E. (1995). Rate representation and discriminability of second formant frequencies for /epsilon/-like steady-state vowels in cat auditory nerve. Journal of the Acoustical Society of America, 98(6), 3223–3234.

Davis, K. A., Ramachandran, R., & May, B. J. (2003). Auditory processing of spectral cues for sound localization in the inferior colliculus. Journal of the Association for Research in Otolaryngology, 4(2), 148–163.

Dimitrov, A. G., Cummins, G. I., Mayko, Z. M., & Portfors, C. V. (2014). Inhibition does not affect the timing code for vocalizations in the mouse auditory midbrain. Frontiers in Physiology, 5, 140.

Doupe, A. J., & Konishi, M. (1991). Song-selective auditory circuits in the vocal control system of the zebra finch. Proceedings of the National Academy of Sciences of the United States of America, 88(24), 11339–11343.

Driscoll, C. A., Menotti-Raymond, M., Roca, A. L., Hupe, K., Johnson, W. E., Geffen, E., et al. (2007). The Near Eastern origin of cat domestication. Science, 317(5837), 519–523.

Ehret, G. (2005). Infant rodent ultrasounds: A gate to the understanding of sound communication. Behavior Genetics, 35(1), 19–29.

Elliott, T. M., & Theunissen, F. E. (2009). The modulation transfer function for speech intelligibility. PLoS Computational Biology, 5(3), e1000302.

Engineer, C. T., Perez, C. A., Carraway, R. S., Chang, K. Q., Roland, J. L., & Kilgard, M. P. (2013). Speech training alters tone frequency tuning in rat primary auditory cortex. Behavioural Brain Research.

Engineer, C. T., Perez, C. A., Chen, Y. H., Carraway, R. S., Reed, A. C., Shetake, J. A., et al. (2008). Cortical activity patterns predict speech discrimination ability. Nature Neuroscience, 11(5), 603–608.

Engineer, C. T., Rahebi, K. C., Borland, M. S., Buell, E. P., Centanni, T. M., Fink, M. K., et al. (2015). Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome. Neurobiology of Disease, 83, 26–34.

Fishbach, A., Nelken, I., & Yeshurun, Y. (2001). Auditory edge detection: A neural model for physiological and psychoacoustical responses to amplitude transients. Journal of Neurophysiology, 85(6), 2303–2323.

Fishbach, A., Yeshurun, Y., & Nelken, I. (2003). Neural model for physiological responses to frequency and amplitude transitions uncovers topographical order in the auditory cortex. Journal of Neurophysiology, 90(6), 3663–3678.

Ganguli, S., & Sompolinsky, H. (2012). Compressed sensing, sparsity, and dimensionality in neuronal information processing and data analysis. Annual Review of Neuroscience, 35, 485–508.

Garcia-Lazaro, J. A., Shepard, K. N., Miranda, J. A., Liu, R. C., & Lesica, N. A. (2015). An overrepresentation of high frequencies in the mouse inferior colliculus supports the processing of ultrasonic vocalizations. PLoS ONE, 10(8), e0133251.

Gaucher, Q., Huetz, C., Gourevitch, B., Laudanski, J., Occelli, F., & Edeline, J. M. (2013). How do auditory cortex neurons represent communication sounds? Hearing Research, 305, 102–112.

Griffiths, T. D., & Warren, J. D. (2004). What is an auditory object? Nature Reviews Neuroscience, 5(11), 887–892.

Grimsley, J. M., Shanbhag, S. J., Palmer, A. R., & Wallace, M. N. (2012). Processing of communication calls in guinea pig auditory cortex. PLoS ONE, 7(12), e51646.

Hershenhoren, I., & Nelken, I. (2016). Detection of tones masked by fluctuating noise in rat auditory cortex. Cerebral Cortex.

Holy, T. E., & Guo, Z. (2005). Ultrasonic songs of male mice. PLoS Biology, 3(12), e386.

Horikawa, J., & Suga, N. (1991). Neuroethology of auditory cortex. Japanese Journal of Physiology, 41(5), 671–691.

Huang, A. Y., & May, B. J. (1996). Sound orientation behavior in cats. II. Mid-frequency spectral cues for sound localization. Journal of the Acoustical Society of America, 100(2 Pt 1), 1070–1080.

Huetz, C., Gourevitch, B., & Edeline, J. M. (2011). Neural codes in the thalamocortical auditory system: From artificial stimuli to communication sounds. Hearing Research, 271(1–2), 147–158.

Huetz, C., Philibert, B., & Edeline, J. M. (2009). A spike-timing code for discriminating conspecific vocalizations in the thalamocortical system of anesthetized and awake guinea pigs. Journal of Neuroscience, 29(2), 334–350.

Ise, S., & Ohta, H. (2009). Power spectrum analysis of ultrasonic vocalization elicited by maternal separation in rat pups. Brain Research, 1283, 58–64.

Joris, P. X., Schreiner, C. E., & Rees, A. (2004). Neural processing of amplitude-modulated sounds. Physiological Reviews, 84(2), 541–577.

Kelly, J. B., & Li, L. (1997). Two sources of inhibition affecting binaural evoked responses in the rat’s inferior colliculus: The dorsal nucleus of the lateral lemniscus and the superior olivary complex. Hearing Research, 104(1–2), 112–126.

Kettner, R. E., & Thompson, R. F. (1985). Cochlear nucleus, inferior colliculus, and medial geniculate responses during the behavioral detection of threshold-level auditory stimuli in the rabbit. Journal of the Acoustical Society of America, 77(6), 2111–2127.

Kidd, S. A., & Kelly, J. B. (1996). Contribution of the dorsal nucleus of the lateral lemniscus to binaural responses in the inferior colliculus of the rat: Interaural time delays. Journal of Neuroscience, 16(22), 7390–7397.

Kilgard, M. P., & Merzenich, M. M. (1998). Cortical map reorganization enabled by nucleus basalis activity. Science, 279(5357), 1714–1718.

King, J., Insanally, M., Jin, M., Martins, A. R., D’Amour, J. A., & Froemke, R. C. (2015). Rodent auditory perception: Critical band limitations and plasticity. Neuroscience, 296, 55–65.

Klein, D. J., Simon, J. Z., Depireux, D. A., & Shamma, S. A. (2006). Stimulus-invariant processing and spectrotemporal reverse correlation in primary auditory cortex. Journal of Computational Neuroscience, 20(2), 111–136.Find this resource:

Krizman, J., Tierney, A., Fitzroy, A. B., Skoe, E., Amar, J., & Kraus, N. (2015). Continued maturation of auditory brainstem function during adolescence: A longitudinal approach. Clinical Neurophysiology, 126(12), 2348–2355.Find this resource:

Lai, Y. C., Winslow, R. L., & Sachs, M. B. (1994). A model of selective processing of auditory-nerve inputs by stellate cells of the antero-ventral cochlear nucleus. Journal of Computational Neuroscience, 1(3), 167–194.Find this resource:

Las, L., Stern, E. A., & Nelken, I. (2005). Representation of tone in fluctuating maskers in the ascending auditory system. Journal of Neuroscience, 25(6), 1503–1513.Find this resource:

Lee, C. C., & Sherman, S. M. (2010). Topography and physiology of ascending streams in the auditory tectothalamic pathway. Proceedings of the National Academy of Sciences of the United States of America, 107(1), 372–377.Find this resource:

Liu, R. C., Linden, J. F., & Schreiner, C. E. (2006). Improved cortical entrainment to infant communication calls in mothers compared with virgin mice. European Journal of Neuroscience, 23(11), 3087–3097.Find this resource:

Liu, R. C., Miller, K. D., Merzenich, M. M., & Schreiner, C. E. (2003). Acoustic variability and distinguishability among mouse ultrasound vocalizations. Journal of the Acoustical Society of America, 114(6 Pt 1), 3412–3422.Find this resource:

Liu, R. C., & Schreiner, C. E. (2007). Auditory cortical detection and discrimination correlates with communicative significance. PLoS Biology, 5(7), e173.Find this resource:

Margoliash, D., & Konishi, M. (1985). Auditory representation of autogenous song in the song system of white-crowned sparrows. Proceedings of the National Academy of Sciences of the United States of America, 82(17), 5997–6000.

Maunsell, J. H., & Treue, S. (2006). Feature-based attention in visual cortex. Trends in Neurosciences, 29(6), 317–322.

Mayko, Z. M., Roberts, P. D., & Portfors, C. V. (2012). Inhibition shapes selectivity to vocalizations in the inferior colliculus of awake mice. Frontiers in Neural Circuits, 6, 73.

Meddis, R., Lecluyse, W., Clark, N. R., Jurgens, T., Tan, C. M., Panda, M. R., et al. (2013). A computer model of the auditory periphery and its application to the study of hearing. Advances in Experimental Medicine and Biology, 787, 11–19; discussion 19–20.

Moore, B. C. J. (1993). Frequency analysis and pitch perception. In W. A. Yost, A. N. Popper, & R. R. Fay (Eds.), Human Psychophysics (pp. 56–115). New York: Springer-Verlag.

Moreno-Bote, R., Beck, J., Kanitscheider, I., Pitkow, X., Latham, P., & Pouget, A. (2014). Information-limiting correlations. Nature Neuroscience, 17(10), 1410–1417.

Moshitch, D., Las, L., Ulanovsky, N., Bar-Yosef, O., & Nelken, I. (2006). Responses of neurons in primary auditory cortex (A1) to pure tones in the halothane-anesthetized cat. Journal of Neurophysiology, 95(6), 3756–3769.

Moshitch, D., & Nelken, I. (2016). The representation of interaural time differences in high-frequency auditory cortex. Cerebral Cortex, 26(2), 656–668.

Neilans, E. G., Holfoth, D. P., Radziwon, K. E., Portfors, C. V., & Dent, M. L. (2014). Discrimination of ultrasonic vocalizations by CBA/CaJ mice (Mus musculus) is related to spectrotemporal dissimilarity of vocalizations. PLoS ONE, 9(1), e85405.

Nelken, I., & Bar-Yosef, O. (2008). Neurons and objects: The case of auditory cortex. Frontiers in Neuroscience, 2(1), 107–113.

Nelken, I., Rotman, Y., & Bar-Yosef, O. (1999). Responses of auditory-cortex neurons to structural features of natural sounds. Nature, 397(6715), 154–157.

Nir, Y., Vyazovskiy, V. V., Cirelli, C., Banks, M. I., & Tononi, G. (2015). Auditory responses and stimulus-specific adaptation in rat auditory cortex are preserved across NREM and REM sleep. Cerebral Cortex, 25(5), 1362–1378.

Noble, W. S. (2006). What is a support vector machine? Nature Biotechnology, 24(12), 1565–1567.

Portfors, C. V. (2007). Types and functions of ultrasonic vocalizations in laboratory rats and mice. Journal of the American Association for Laboratory Animal Science, 46(1), 28–34.

Portfors, C. V., & Roberts, P. D. (2014). Mismatch of structural and functional tonotopy for natural sounds in the auditory midbrain. Neuroscience, 258, 192–203.

Portfors, C. V., Roberts, P. D., & Jonson, K. (2009). Over-representation of species-specific vocalizations in the awake mouse inferior colliculus. Neuroscience, 162(2), 486–500.

Ramachandran, R., & May, B. J. (2002). Functional segregation of ITD sensitivity in the inferior colliculus of decerebrate cats. Journal of Neurophysiology, 88(5), 2251–2261.

Ranasinghe, K. G., Vrana, W. A., Matney, C. J., & Kilgard, M. P. (2013). Increasing diversity of neural responses to speech sounds across the central auditory pathway. Neuroscience, 252, 80–97.

Rice, J. J., May, B. J., Spirou, G. A., & Young, E. D. (1992). Pinna-based spectral cues for sound localization in cat. Hearing Research, 58, 132–152.

Rice, J. J., Young, E. D., & Spirou, G. A. (1995). Auditory-nerve encoding of pinna-based spectral cues: Rate representation of high-frequency stimuli. Journal of the Acoustical Society of America, 97, 1764–1776.

Roberts, P. D., & Portfors, C. V. (2015). Responses to social vocalizations in the dorsal cochlear nucleus of mice. Frontiers in Systems Neuroscience, 9, 172.

Romo, R., & Salinas, E. (2003). Flutter discrimination: Neural codes, perception, memory and decision making. Nature Reviews Neuroscience, 4(3), 203–218.

Rothschild, G., Cohen, L., Mizrahi, A., & Nelken, I. (2013). Elevated correlations in neuronal ensembles of mouse auditory cortex following parturition. Journal of Neuroscience, 33(31), 12851–12861.

Rothschild, G., Nelken, I., & Mizrahi, A. (2010). Functional organization and population dynamics in the mouse primary auditory cortex. Nature Neuroscience, 13(3), 353–360.

Ru, P., Chi, T., & Shamma, S. (2003). The synergy between speech production and perception. Journal of the Acoustical Society of America, 113(1), 498–515.

Sachs, M. B., & Young, E. D. (1979). Encoding of steady-state vowels in the auditory nerve: Representation in terms of discharge rate. Journal of the Acoustical Society of America, 66, 470–479.

Schnupp, J., Nelken, I., & King, A. J. (2011). Auditory neuroscience: Making sense of sound. Cambridge, MA: MIT Press.

Schnupp, J. W., Hall, T. M., Kokelaar, R. F., & Ahmed, B. (2006). Plasticity of temporal pattern codes for vocalization stimuli in primary auditory cortex. Journal of Neuroscience, 26(18), 4785–4795.

Shamir, M., & Sompolinsky, H. (2006). Implications of neuronal diversity on population coding. Neural Computation, 18(8), 1951–1986.

Shamma, S., & Klein, D. (2000). The case of the missing pitch templates: How harmonic templates emerge in the early auditory system. Journal of the Acoustical Society of America, 107(5 Pt 1), 2631–2644.

Shetake, J. A., Wolf, J. T., Cheung, R. J., Engineer, C. T., Ram, S. K., & Kilgard, M. P. (2011). Cortical activity patterns predict robust speech discrimination ability in noise. European Journal of Neuroscience, 34(11), 1823–1838.

Slaney, M. (1998). Auditory toolbox Ver2. Technical report.

Smolders, J. W., Aertsen, A. M., & Johannesma, P. I. (1979). Neural representation of the acoustic biotope. A comparison of the response of auditory neurons to tonal and natural stimuli in the cat. Biological Cybernetics, 35(1), 11–20.

Suga, N. (1989). Principles of auditory information-processing derived from neuroethology. Journal of Experimental Biology, 146, 277–286.

Suga, N. (1992). Philosophy and stimulus design for neuroethology of complex-sound processing. Philosophical Transactions of the Royal Society B, 336, 423–428.

Suga, N., & Ma, X. (2003). Multiparametric corticofugal modulation and plasticity in the auditory system. Nature Reviews Neuroscience, 4(10), 783–794.

Theunissen, F. E., & Elie, J. E. (2014). Neural processing of natural sounds. Nature Reviews Neuroscience, 15(6), 355–366.

Wallace, M. N., Shackleton, T. M., Anderson, L. A., & Palmer, A. R. (2005). Representation of the purr call in the guinea pig primary auditory cortex. Hearing Research, 204(1–2), 115–126.

Wang, X. (2013). The harmonic organization of auditory cortex. Frontiers in Systems Neuroscience, 7, 114.

Wang, X., Lu, T., Snider, R. K., & Liang, L. (2005). Sustained firing in auditory cortex evoked by preferred stimuli. Nature, 435(7040), 341–346.

Wang, X., Merzenich, M. M., Beitel, R., & Schreiner, C. E. (1995). Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: Temporal and spectral characteristics. Journal of Neurophysiology, 74(6), 2685–2706.

Weinberger, N. M. (2004). Specific long-term memory traces in primary auditory cortex. Nature Reviews Neuroscience, 5(4), 279–290.

Weinberger, N. M. (2011). The medial geniculate, not the amygdala, as the root of auditory fear conditioning. Hearing Research, 274(1–2), 61–74.

Winkler, I., Denham, S. L., & Nelken, I. (2009). Modeling the auditory scene: Predictive regularity representations and perceptual objects. Trends in Cognitive Sciences, 13(12), 532–540.

Woolley, S. M., & Portfors, C. V. (2013). Conserved mechanisms of vocalization coding in mammalian and songbird auditory midbrain. Hearing Research, 305, 45–56.

Young, E. D., & Sachs, M. B. (1979). Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers. Journal of the Acoustical Society of America, 66, 1381–1403.

Zilany, M. S., Bruce, I. C., & Carney, L. H. (2014). Updated parameters and expanded simulation options for a model of the auditory periphery. Journal of the Acoustical Society of America, 135(1), 283–286.