Al Bregman's Website

What is auditory scene analysis?

The long-term aim of our research has been to understand "auditory scene analysis" (ASA), a process in which the auditory system takes the mixture of sound that it derives from a complex natural environment and sorts it into packages of acoustic evidence, each of which has probably arisen from a single source of sound. This grouping keeps pattern recognition from mixing information from different sources.

We use our sense of hearing to understand the properties of sound-producing events. Often, we are interested in a single stream of events, such as a violin playing, a person talking, or a car approaching. In a natural listening environment, however, the acoustic energy produced by each event sequence is mixed, at the listener's ears, with all the energy arising from concurrent events.

In our research we wanted to understand how the brain could build separate perceptual descriptions of sound-generating events despite this mixing of evidence.

It appears that the first thing the auditory system does is to analyze the incoming array of sound into a large number of frequency components. But then it is left with the following problem: which combination of these components has arisen from a particular source of sound, such as the voice of a particular person continuing over time? Only by putting together the right set of frequency components over time can the identity of the signals be recognized. Otherwise, for example, the recognizer might combine the syllables of two talkers to make a spurious word.
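
The grouping problem can be made concrete with a small numerical sketch. This is a toy illustration, not a model of the auditory system. Everything in it is an assumption made for the example: the sample rate, the two synthetic "sources" (harmonic tones at 200 Hz and 310 Hz), the function names, and above all the grouping rule. That rule assigns a component to a source when its frequency is close to a whole-number multiple of the source's fundamental, and the fundamentals are supplied by hand. Harmonic relatedness is used here only as one plausible cue. A real listener is not told the fundamentals in advance.

```python
import numpy as np

SAMPLE_RATE = 8000   # Hz (hypothetical value chosen for the example)
DURATION = 1.0       # seconds, which gives 1 Hz frequency resolution


def harmonic_tone(f0, n_harmonics, t):
    """Synthetic 'source': a sum of equal-amplitude harmonics of f0."""
    return sum(np.sin(2 * np.pi * f0 * k * t) for k in range(1, n_harmonics + 1))


def component_frequencies(signal, sample_rate, threshold=0.5):
    """Step 1: analyze the mixture into frequency components.

    Returns the frequencies whose normalized amplitude exceeds the threshold.
    """
    amplitudes = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return [float(f) for f, a in zip(freqs, amplitudes) if a > threshold]


def group_by_fundamental(components, fundamentals, tolerance=1.0):
    """Step 2 (assumed rule): assign each component to every fundamental
    of which it is, within `tolerance` Hz, a whole-number multiple."""
    groups = {f0: [] for f0 in fundamentals}
    for f in components:
        for f0 in fundamentals:
            harmonic_number = round(f / f0)
            if harmonic_number >= 1 and abs(f - harmonic_number * f0) <= tolerance:
                groups[f0].append(f)
    return groups


# Two concurrent sources, mixed "at the listener's ears".
t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE
mixture = harmonic_tone(200.0, 4, t) + harmonic_tone(310.0, 4, t)

# The analysis yields one interleaved list with no source labels attached.
components = component_frequencies(mixture, SAMPLE_RATE)

# Grouping recovers one package of evidence per (assumed) source.
groups = group_by_fundamental(components, [200.0, 310.0])

print("components:", components)
for f0, members in groups.items():
    print(f"source with fundamental {f0} Hz:", members)
```

The list printed first is what the analysis alone provides: the components of both sources interleaved in order of frequency. Only the second step sorts them into two packages of four components each. Combining the wrong components, for instance the lowest four regardless of source, would describe a sound that neither source produced, which is the spurious-word problem in miniature.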