Audiovisual Speech Perception and the McGurk Effect
- Lawrence D. RosenblumLawrence D. RosenblumDepartment of Psychology, University of California, Riverside
Research on visual and audiovisual speech information has profoundly influenced the fields of psycholinguistics, perception psychology, and cognitive neuroscience. Visual speech findings have provided some of most the important human demonstrations of our new conception of the perceptual brain as being supremely multimodal. This “multisensory revolution” has seen a tremendous growth in research on how the senses integrate, cross-facilitate, and share their experience with one another.
The ubiquity and apparent automaticity of multisensory speech has led many theorists to propose that the speech brain is agnostic with regard to sense modality: it might not know or care from which modality speech information comes. Instead, the speech function may act to extract supramodal informational patterns that are common in form across energy streams. Alternatively, other theorists have argued that any common information existent across the modalities is minimal and rudimentary, so that multisensory perception largely depends on the observer’s associative experience between the streams. From this perspective, the auditory stream is typically considered primary for the speech brain, with visual speech simply appended to its processing. If the utility of multisensory speech is a consequence of a supramodal informational coherence, then cross-sensory “integration” may be primarily a consequence of the informational input itself. If true, then one would expect to see evidence for integration occurring early in the perceptual process, as well in a largely complete and automatic/impenetrable manner. Alternatively, if multisensory speech perception is based on associative experience between the modal streams, then no constraints on how completely or automatically the senses integrate are dictated. There is behavioral and neurophysiological research supporting both perspectives.
Much of this research is based on testing the well-known McGurk effect, in which audiovisual speech information is thought to integrate to the extent that visual information can affect what listeners report hearing. However, there is now good reason to believe that the McGurk effect is not a valid test of multisensory integration. For example, there are clear cases in which responses indicate that the effect fails, while other measures suggest that integration is actually occurring. By mistakenly conflating the McGurk effect with speech integration itself, interpretations of the completeness and automaticity of multisensory may be incorrect. Future research should use more sensitive behavioral and neurophysiological measures of cross-modal influence to examine these issues.