Al Bregman's Website

Findings about ASA

The research in the McGill lab uncovered a number of principles of grouping, some similar to those in vision, others unique to auditory perception, that deal with the ASA problem. Some of them resemble the principles of grouping discovered by the Gestalt psychologists in the early twentieth century. Here are some of the findings:

Cues used by the ASA process

The perceptual segregation of different subsets of sounds in a sequence into separate streams depends upon differences in their frequencies, pitches, timbres (spectral envelopes), center frequencies (of noise bands), amplitudes, and locations, and upon the suddenness of the changes of these variables from one sound to the next. Segregation also increases as the duration of silence between sounds with the same properties gets shorter;
The perceptual fusion of simultaneous components to form a single perceived sound depends on their onset and offset synchrony, frequency separation, regularity of spectral spacing, binaural frequency matches, harmonic relations, parallel amplitude modulation, and parallel gliding of components;
Different cues for stream segregation compete to control the grouping, and different cues have different strengths;
Primitive "bottom-up" grouping occurs even when the frequency and timing of the sequence are unpredictable (although a regular rhythm may allow top-down processes to function more effectively);
A bias toward forming a distinct stream builds up with longer exposure to sounds in the same frequency region (this may be a property of top-down processes controlled by attention);
Stream segregation is context-dependent, involving the competition of alternative organizations.

Effects of ASA on perception

A change in perceptual grouping can alter the perception of rhythms, melodic patterns, and overlap of sounds;
Patterns of sounds whose members are distributed into more than one auditory stream are much harder to perceive than those wholly contained within a single stream;
Perceptual organization can affect perceived loudness and spatial location, and can decompose the spectrally complex sensory inputs received by the low-level auditory system into a number of separately perceived sounds;
The rules of ASA try to prevent the crossing of streams in frequency, whether the acoustic material is a sequence of discrete tones or continuously gliding tones;
Known principles of ASA can predict the camouflage of melodies and rhythms when interfering sounds are interspersed or mixed with a to-be-recognized sequence of sounds;
The apparent continuity of sounds through masking noise depends on the parts before and after the noise being perceived as parts of the same sound. This, in turn, depends on ASA principles. The stimuli we have used have included frequency glides, amplitude-varying tones, and narrow-band noises;
A perceptual stream can alter another one by capturing some of its elements;
The apparent spatial position of a sound can be altered if some of its energy becomes grouped with other sounds;
The phenomenon known as comodulation masking release (CMR) does not make the presence of the target more discriminable by simply altering the timbre of the target-masker mixture. It actually increases the subjective experience that the target is present;
Sequential capturing can affect the perception of speech, specifically the integration of perceptually isolated components in speech-sound identification;
The segregation of superimposed vowels increases when they have different pitches and different pitch transitions. We have looked at synthetic vowels that do or do not have harmonic relations between frequency components;
ASA principles help explain the construction of music, e.g., rules of voice leading;
ASA principles are used intuitively by composers to control dissonance in polyphonic music;
The segregation of concurrent streams of visual apparent motion (the "beta" motion of Max Wertheimer) works in exactly the same way as auditory stream segregation.

Competition for a frequency component (partial)

So far I have described sequential and simultaneous grouping as if they were two independent processes. But that is not true. They interact in partitioning the complex and changing mixture of frequencies that reaches our ears. In some cases they may compete. For example, a given partial (D) could either be part of a series of sounds, A, B, C, that has led up to, and is continuing within, the current mixture, or else part of a new sound, D1, that has just entered the mixture. The choice between these two groupings will be based on whether D is a better fit with A, B, and C, or with the other partials of D1.
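The competition described above can be pictured as a comparison of two fit scores. The following Python sketch is only a toy illustration, not a model from the McGill research: the function names, the use of log-frequency proximity for the sequential fit, the use of harmonic mistuning for the simultaneous fit, and the weighting constant are all assumptions made for the example.

```python
import math


def sequential_fit(prior_freqs, d_freq):
    """Hypothetical score: higher when D is close, in octaves,
    to the most recent tone of the preceding series (A, B, C)."""
    distance = abs(math.log2(d_freq / prior_freqs[-1]))
    return 1.0 / (1.0 + distance)


def simultaneous_fit(fundamental, d_freq):
    """Hypothetical score: higher when D lies near an integer
    multiple of the fundamental of the new sound (D1)."""
    ratio = d_freq / fundamental
    harmonic = max(1, round(ratio))
    mistuning = abs(ratio - harmonic) / harmonic
    # The factor 10.0 is an arbitrary weight chosen for illustration.
    return 1.0 / (1.0 + 10.0 * mistuning)


def assign_partial(prior_freqs, fundamental, d_freq):
    """Give D to whichever grouping fits it better."""
    seq = sequential_fit(prior_freqs, d_freq)
    sim = simultaneous_fit(fundamental, d_freq)
    return "sequential" if seq >= sim else "simultaneous"


# D continues a slowly rising series and is mistuned relative to D1.
choice_a = assign_partial([500.0, 520.0, 540.0], 200.0, 560.0)
# D is an octave above the last tone but is an exact harmonic of D1.
choice_b = assign_partial([300.0, 350.0, 400.0], 200.0, 800.0)
```

In the first case the partial is captured by the preceding series. In the second it fuses with the new sound. A real listener would weigh many more cues, such as onset synchrony and spatial location.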

The "old-plus-new" strategy

The auditory system seems to use a powerful principle of organization that I have named "the old-plus-new heuristic". This strategy, carried out by the ASA system, affects the grouping of partials. It works as follows: When a spectrum gets more complex, or parts of it get more intense, especially if this occurs suddenly, the ASA system tries to interpret it as a continuing (old) spectrum to which new components have been added. If the new spectrum can be interpreted in this way, the ASA system can guess which parts of the current sound are the continuation of the old one. Then it can subtract these "old" components out of the mixture and get a clearer picture of what the newly added partials are. For this reason the onset of a new sound is a crucial point at which its properties can be determined, because its onset triggers this decomposition of the set of components present at that time. The onset of the sound is important for another reason as well: the ASA system can determine which partials started at the same time, and use this relation to group them. An important consequence of the old-plus-new analysis is that the new spectrum is heard as a separate sound, joining those that were already present in the old spectrum.
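The subtraction step of the heuristic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how the auditory system computes: spectra are represented as dictionaries mapping a partial's frequency in Hz to a linear power value, the subtraction is done directly on those power values, and the function name `old_plus_new` and the `floor` parameter are hypothetical.

```python
def old_plus_new(old, mixture, floor=1e-9):
    """Split `mixture` into a continuing 'old' part and a 'new' residual.

    Both spectra map frequency in Hz to linear power (an assumption
    made for this sketch).
    """
    continued, new = {}, {}
    for freq, power in mixture.items():
        # The old sound can account for at most the power it had before.
        kept = min(old.get(freq, 0.0), power)
        if kept > floor:
            continued[freq] = kept
        # Whatever the old sound cannot explain is credited to a new sound.
        residual = power - kept
        if residual > floor:
            new[freq] = residual
    return continued, new


# An old two-partial sound, then a mixture in which 400 Hz has grown
# more intense and two partials have been added.
old = {200.0: 1.0, 400.0: 0.5}
mixture = {200.0: 1.0, 400.0: 1.5, 300.0: 0.8, 600.0: 0.4}
continued, new = old_plus_new(old, mixture)
```

Here the 400 Hz partial is split: the portion matching the old level is treated as a continuation, and the added intensity is credited to the new sound along with the partials at 300 Hz and 600 Hz.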

Other auditory phenomena

A number of auditory phenomena have been related to the grouping of sounds into auditory streams. They include speech perception, the perception of the order and other temporal properties of sound sequences, the combining of evidence from the two ears, the perception of numerosity, the perception of patterns by infants, the detection of patterns embedded in other sounds, the perception of simultaneous "layers" of sounds (e.g., in music), the perceived continuity of sounds through interrupting noise, perceived timbre and rhythm, and the perception of tonal sequences (reviewed in Bregman, 1990/1994).