A Book On Auditory Scene Analysis

Bregman, Albert S., Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Massachusetts: The MIT Press, 1990 (hardcover)/1994 (paperback).

In this book, I attempted to integrate the existing research on the perceptual organization of sound by connecting it with the "scene analysis" problem encountered in machine vision. I wanted to show how a large number of auditory phenomena could be viewed as parts of the process of auditory scene analysis (ASA) and could be explained through a limited number of principles of auditory grouping. Separate chapters applied these principles to the study of music and of speech. Chapters 1 and 8 can be read alone by anyone who wants to get a general idea of ASA without all the details.
A large part of this book can be found online.


Chapter 1 introduces the idea of auditory scene analysis (ASA) and links it to the scene analysis problem in vision. The concept of the auditory stream is introduced and compared with the notion of the "object" in vision. ASA depends on two kinds of integration (and segregation) of auditory information: sequential and simultaneous. These are discussed in Chapters 2 and 3, respectively. Chapter 1 also introduces the idea that the grouping of sounds can be affected by the attentional processes of listeners, guided by their schemas (knowledge of types of sounds and their properties). It is also proposed that the more "primitive" or basic processes of auditory integration and segregation are innate.
Chapter 2 discusses sequential integration in detail, describing the history, the methods and the findings that bear on this aspect of ASA. The role of sequential integration is to perceptually connect a subset of the auditory information, collected over time, into a stream that represents a single environmental source of sound. The "streaming" phenomenon is viewed as a laboratory demonstration that exposes many of the principles of sequential integration. The factors that affect it and its consequences for how listeners perceive sounds are described, as are the competition among alternative perceptual organizations and the build-up of grouping over time. Finally, theories of sequential organization are discussed.
Chapter 3 is devoted to the study of the integration of simultaneous events. The role of simultaneous integration is to partition the spectral information received at the same time to form one or more concurrent sounds, each with its own qualities. There is a discussion of the factors that influence this process and also of the perceptual consequences of this type of integration. Causal factors such as harmonic relations and spatial and spectral separations are described. A powerful principle of grouping, the "old-plus-new heuristic", is shown to be involved in such phenomena as the illusory continuity of softer sounds through louder, interfering sounds.
Chapter 4 discusses the role of attention, expectation and schemas in ASA and uses these ideas to explain a number of research findings. Bottom-up and top-down influences on ASA work differently and have to be distinguished.
Chapter 5 uses the concepts developed in the earlier chapters to show how "primitive" auditory organization is involved in creating the architecture or "texture" of a piece of music. The same principles of grouping and segregation that we have studied in the laboratory can throw light on a number of musical phenomena: melodic coherence, compound melodic lines, phrasing, rhythm, the phenomenal dependency of one note on another, harmony, fusion of organ stops, the rules of counterpoint, the crossing of musical lines, the "control of dissonance", and other issues.
Chapter 6 discusses the role of perceptual organization in speech perception, showing how the acoustic bases of auditory organization influence this process. Research is cited that shows how the vowel quality of sounds can be affected by the grouping of spectral components and how the pitch trajectory contributes to the perceived continuity of an utterance. Concurrent speech sounds can be segregated by differences in fundamental frequency, harmonic relations, and spatial separations. The role of speech schemas in the segregation of concurrent sounds is also considered.
Chapter 7 continues the discussion of speech perception by considering the general question of "exclusive allocation" (in which a single piece of auditory input can contribute to the mental description of only one sound at any given moment). The phenomenon of duplex perception of speech seems to violate the principle of exclusive allocation. We examine whether the perception of two concurrent percepts using the same acoustic information can only occur when the two percepts are being built by different mental "modules", such as the speech perception module and a separate sounds-in-space module.
Chapter 8 summarizes the earlier chapters and draws conclusions. It also sets out directions for future research.
The book ends with an extensive set of references, a glossary and an index.

Ordering Information

Search for auditory scene analysis at MIT Press

Copyright ©2008 - Al Bregman