Elsevier

Hearing Research

Volume 151, Issues 1–2, January 2001, Pages 167-187

Neural correlates of auditory stream segregation in primary auditory cortex of the awake monkey

https://doi.org/10.1016/S0378-5955(00)00224-0

Abstract

An important feature of auditory scene analysis is the perceptual organization of sequential sound components, or ‘auditory stream segregation’. Auditory stream segregation can be demonstrated by presenting a sequence of high and low frequency tones in an alternating pattern, ABAB. When the tone presentation rate (PR) is slow or the frequency separation (ΔF) between the tones is small (<10%), a connected alternating sequence ABAB is perceived. When the PR is fast or the ΔF is large, however, the alternating sequence perceptually splits into two parallel auditory streams, one composed of interrupted ‘A’ tones, and the other of interrupted ‘B’ tones. The neurophysiological basis of this perceptual phenomenon is unknown. Neural correlates of auditory stream segregation were examined in A1 of the awake monkey using neuronal ensemble techniques (multiunit activity and current source density). Responses evoked by alternating frequency sequences of tones, ABAB, were studied as a function of PR (5, 10, 20 and 40 Hz). ‘A’ tones corresponded to the best frequency (BF) of the cortical site, while ‘B’ tones were situated away from the BF by an amount ΔF. At slow PRs, ‘A’ and ‘B’ tones evoked responses that generated an overall pattern of activity at the stimulus PR. In contrast, at fast PRs, ‘B’ tone responses were differentially suppressed, resulting in a pattern of activity consisting predominantly of ‘A’ tone responses at half the PR. The magnitude of ‘B’ tone response suppression increased with ΔF. Differential suppression of BF and non-BF tone responses at high PRs can be explained by physiological principles of forward masking. The effect of ΔF is explained by the hypothesis that responses to tones distant from the BF are more susceptible to suppression by BF tones than responses to tones near the BF. These results parallel human psychoacoustics of auditory stream segregation and suggest a cortical basis for the perceptual phenomenon.

Introduction

Acoustic components generated by multiple sound sources often impinge upon the ear simultaneously. A primary task of the auditory system is to determine which elements in the acoustic mixture originate from which sound source, thereby constructing perceptual representations of the original sources. The ease with which the brain assigns sound components to their appropriate sources is illustrated, for example, at a cocktail party: speakers’ voices, music, etc. are perceived as distinct auditory objects, despite the fact that the input to the ear is a complex sound wave arising from the summation of these acoustic signals.

Auditory scene analysis is the process by which the auditory system groups and segregates components of acoustic mixtures to construct perceptual representations of sound sources, or ‘auditory images’ (Bregman, 1990). These auditory images in turn reflect the brain’s determinations of the individuality of the sources generating the auditory signals (Bregman, 1990, Fay, 1998). Auditory scene analysis can be divided into two inter-dependent classes of processes, dealing with the perceptual organization of simultaneously and sequentially occurring acoustic elements, respectively (Bregman, 1990). Many of these processes are considered automatic, or ‘primitive’, in that they are thought to be based upon lower level neurophysiological mechanisms not dependent on learning or attention (Bregman, 1990). Acoustic features utilized by the auditory system in sound source determination are analogous to cues utilized in visual Gestalt perception. For example, acoustic elements arising from different spatial locations, or that are far apart in frequency or time, tend in nature to be generated by different sources and are perceptually segregated by the brain; sound components that are harmonically related or that rise and fall in intensity together (i.e. are co-modulated) tend to arise from a single source and are perceptually grouped. It has been maintained that scene analysis is the essence of hearing (Bregman, 1990, Yost, 1991, Fay, 1998). This assertion rests on the assumption that the world consists of distinct physical objects and events whose perceptual reconstruction from the complex flux of sensory input would clearly be of adaptive value to all organisms (Bregman, 1990).

A classic psychoacoustic phenomenon reflecting sequential organization in auditory scene analysis is called ‘auditory stream segregation’. This phenomenon is illustrated by listening to a sequence of temporally non-overlapping high and low frequency tones in an alternating pattern, ABAB. When the frequency separation (ΔF) between the tones is small (<10%), or their presentation rate (PR) is slow (<10 Hz), listeners perceive a connected and coherent alternating sequence of high and low tones (Fig. 1A). In contrast, when the ΔF is large or the PR is fast, coherence is lost and the alternating sequence perceptually splits into two parallel auditory streams, one composed of interrupted ‘A’ tones, and the other of interrupted ‘B’ tones, each perceived at half the PR (Miller and Heise, 1950, Bregman and Campbell, 1971, Van Noorden, 1975, Anstis and Saida, 1985). An important consequence of sequential stream segregation is that the perceived order of auditory events no longer corresponds to the physical order of acoustic elements (Bregman and Campbell, 1971). Van Noorden (1975) identified three psychoacoustic regions describing the perception of ABAB alternating tone sequences, the boundaries of which were dependent on the temporal and spectral proximity of the tones. The ‘temporal coherence boundary’ defines the ΔF above which the alternating sequence invariably splits into separate perceptual streams. This boundary is dependent on PR such that faster PRs require smaller ΔFs for stream segregation to occur. In contrast, the ‘fission boundary’ defines the ΔF below which segregation is impossible, irrespective of PR, and the alternating sequence is invariably perceived as a connected whole. The region lying between these two boundaries is perceptually ambiguous, since either a segregated or an integrated percept may occur.
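The stimulus and the three perceptual regions described above can be sketched in code. This is a minimal illustration, not the stimulus-generation procedure used in the study: the function names are hypothetical, ΔF is treated as a fraction of the 'A' tone frequency, the 'B' tone is arbitrarily placed above the 'A' tone, and the two boundary values are supplied by the caller because the text gives no numerical boundaries (the temporal coherence boundary in particular depends on PR).

```python
from typing import List, Tuple

Tone = Tuple[float, str, float]  # (onset in seconds, label, frequency in Hz)


def abab_schedule(a_hz: float, delta_f: float, pr_hz: float, n_tones: int) -> List[Tone]:
    """Onsets, labels and frequencies of an alternating ABAB tone sequence.

    Assumptions (not from the paper): delta_f is a fraction of a_hz
    (0.10 means 10%), and the 'B' tone lies above the 'A' tone.
    """
    if a_hz <= 0 or pr_hz <= 0 or n_tones < 0:
        raise ValueError("a_hz and pr_hz must be positive, n_tones non-negative")
    period_s = 1.0 / pr_hz            # one tone onset every 1/PR seconds
    b_hz = a_hz * (1.0 + delta_f)     # 'B' tone separated from 'A' by delta_f
    schedule: List[Tone] = []
    for i in range(n_tones):
        label = "A" if i % 2 == 0 else "B"
        schedule.append((i * period_s, label, a_hz if label == "A" else b_hz))
    return schedule


def stream_rate_hz(schedule: List[Tone], label: str) -> float:
    """Repetition rate of one tone type considered on its own."""
    onsets = [onset for onset, lab, _ in schedule if lab == label]
    if len(onsets) < 2:
        raise ValueError("need at least two tones with this label")
    return (len(onsets) - 1) / (onsets[-1] - onsets[0])


def percept_region(delta_f: float, fission_boundary: float, coherence_boundary: float) -> str:
    """Classify a delta_f against two caller-supplied boundaries.

    Both boundaries are hypothetical inputs; coherence_boundary should be
    the value that applies at the PR of interest.
    """
    if fission_boundary > coherence_boundary:
        raise ValueError("fission boundary must not exceed coherence boundary")
    if delta_f < fission_boundary:
        return "integrated"   # always heard as one connected sequence
    if delta_f > coherence_boundary:
        return "segregated"   # always heard as two streams
    return "ambiguous"        # either percept may occur
```

For a sequence presented at 10 Hz, `stream_rate_hz` returns approximately 5 Hz for either tone type, which is the half-PR rate at which each segregated stream is perceived.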

The neural basis of auditory sequential stream segregation is unknown. Lesions of auditory cortex in animals and humans are associated with impairments in processing of auditory temporal patterns, thus implicating cortical mechanisms in sequential stream segregation (Jerison and Neff, 1953, Cowey and Weiskrantz, 1976, Colombo et al., 1996, Kelly et al., 1996, Griffiths et al., 1997, Liegeois-Chauvel et al., 1998). Several theoretical models of auditory sequential streaming have been proposed; however, these models are as yet uncorroborated by direct physiological evidence (Beauvois and Meddis, 1991, Beauvois and Meddis, 1996, McCabe and Denham, 1997, Beauvois, 1985). Physiological studies have described inhibitory and facilitatory interactions between responses to components of temporally structured sounds in A1 (Shamma and Symmes, 1985, Phillips et al., 1989, Calford and Semple, 1995, Brosch and Schreiner, 1997, Brosch et al., 1999) and support theoretical models based on PR-dependent interactions between responses to successive acoustic elements within and across topographically organized frequency channels (McCabe and Denham, 1997). However, the relevance of these physiological phenomena to auditory stream segregation has not been specifically examined.

A number of investigators have emphasized a role for A1 in encoding onsets of acoustic events via synchronous transient ‘On’ responses (e.g. Creutzfeldt et al., 1980, Phillips, 1993, Phillips, 1995, Eggermont, 1994). If A1 functions as an acoustic ‘event detector’, utilizing concerted activity of neuronal populations as a basic encoding strategy, techniques measuring synchronized activity of neuronal ensembles may be well suited for examining responses to auditory temporal patterns potentially relevant for auditory stream segregation and perceptual ordering of acoustic events. Concerted activity of neuronal ensembles has been shown to reliably represent the functional organization of A1 in the absence of changes in single-unit firing rate and has been proposed as a fundamental mechanism of cortical information processing and transmission (Eggermont, 1994, deCharms and Merzenich, 1996, Lisman, 1997).

Accordingly, the present study utilizes two complementary techniques to measure synchronized synaptic and action potential activity of neuronal populations as they relate to the cortical representation of auditory temporal patterns: multiunit activity (MUA) and current source density (CSD). The goal of the study is to test whether patterns of transient neuronal ensemble responses evoked by alternating tone sequences in A1 of the awake macaque monkey parallel human psychoacoustic data on auditory stream segregation. Such a correspondence would thus support the involvement of A1 in the perceptual organization of sequential sound components. Specifically, we hypothesize that at slow PRs and moderate ΔFs, ‘A’ and ‘B’ tones in alternating sequences should produce temporal response patterns in A1 that reflect the actual PR of sequence tone elements. In contrast, at fast PRs, responses evoked by one of the tones in the alternating sequence should predominate, resulting in an overall pattern of activity at each tonotopic site consisting exclusively of ‘A’ tone or ‘B’ tone responses occurring at half the PR. This PR-dependent segregation of ‘A’ tone and ‘B’ tone responses would thus parallel the perceptual splitting of the alternating sequence into two separate streams, consisting of interrupted ‘A’ tones and interrupted ‘B’ tones, respectively, each occurring at half the PR.
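The hypothesis can be made concrete with a toy calculation on per-tone response amplitudes. The sketch below is purely illustrative and is not the analysis used in the study: the function names and the index are hypothetical, and each tone's response is reduced to a single non-negative amplitude. A train of responses alternating between amplitudes a and b can be split into a part that repeats on every tone (rate = PR) and a part that alternates from tone to tone (rate = PR/2).

```python
from typing import Tuple


def rate_components(resp_a: float, resp_b: float) -> Tuple[float, float]:
    """Split alternating per-tone amplitudes into two parts.

    Returns (common, alternating):
      common      = amplitude shared by every tone, repeating at the PR
      alternating = amplitude that flips sign tone to tone, repeating at PR/2
    so that resp_a = common + alternating and resp_b = common - alternating.
    """
    common = (resp_a + resp_b) / 2.0
    alternating = (resp_a - resp_b) / 2.0
    return common, alternating


def half_rate_index(resp_a: float, resp_b: float) -> float:
    """Hypothetical index of 'A' tone predominance (not from the paper).

    0.0 -> 'A' and 'B' responses are equal, activity follows the PR.
    1.0 -> 'B' responses are fully suppressed, activity follows PR/2.
    """
    total = resp_a + resp_b
    if resp_a < 0 or resp_b < 0 or total <= 0:
        raise ValueError("amplitudes must be non-negative and not both zero")
    return (resp_a - resp_b) / total
```

Under the hypothesis, a site whose best frequency matches the 'A' tone would show an index near 0 at slow PRs and an index approaching 1 at fast PRs as 'B' tone responses are suppressed.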

While auditory stream segregation has not been behaviorally demonstrated in monkeys, its demonstration in humans, starlings (MacDougall-Shackleton et al., 1998), and goldfish (Fay, 1998) suggests that it is a biologically relevant, widespread perceptual phenomenon likely shared by non-human primates as well. Macaques and humans display similar psychophysical thresholds and discrimination abilities for spectral and temporal characteristics of auditory stimuli that would be relevant for processing auditory sequences (Gourevitch, 1970, Pfingst, 1993, Moody, 1994). Moreover, basic features of macaque primary auditory cortical anatomy and physiology are comparable to those of humans (e.g. Galaburda and Sanides, 1980, Steinschneider et al., 1998). These considerations justify the use of macaques as animal models for examining cortical mechanisms contributing to auditory stream segregation. Moreover, since primitive stream segregation is an automatic process, not dependent on learning or attention, its neural correlates can be studied under passive-listening conditions in naive animals. Identifying cortical mechanisms involved in stream segregation may not only provide important physiological constraints on theoretical models of the phenomenon, but may offer insights into the nature of the temporal processing deficits exhibited by some individuals with dyslexia who display aberrant auditory stream segregation (Helenius et al., 1999). Preliminary results of this study have been published in abstract form (Fishman et al., 1999).

Materials and methods

Three adult male macaque monkeys (Macaca fascicularis) were studied using methods described previously (Steinschneider et al., 1992, Steinschneider et al., 1994, Steinschneider et al., 1998). All animals were housed in our AAALAC-accredited Animal Institute under daily veterinary supervision. Using sterile techniques under barbiturate anesthesia, small holes were made in the skull to accommodate epidural matrices comprised of adjacently placed 18-gauge stainless steel tubes that served as

Results

Results are based on 12 electrode penetrations into A1 of three monkeys. The BFs of the cortical sites sampled ranged from 0.5 to 17 kHz. Four additional sites failed to exhibit clearly identifiable responses to BF tones under 20 and 40 Hz SF or AF conditions and were excluded from analysis, since these responses could not be reliably quantified.

Discussion

The present study examined whether patterns of synchronized neuronal ensemble responses evoked by alternating tone sequences in A1 parallel human psychoacoustics of auditory stream segregation. In general, we hypothesized that neural correlates of perceptual streaming in A1 would be characterized by the preferential processing of one tone in an alternating sequence over the other under acoustic stimulation conditions promoting stream segregation. As a consequence of this preferential

Acknowledgements

This research was supported by grants DC00657 and HD01799, and the Institute for the Study of Music and Neurologic Function of Beth Abraham Hospital. We are grateful to Dr. Steven Walkley, May Huang, Linda O’Donnell, Shirley Seto and Dr. Elena Zotova for providing excellent technical, secretarial and histological assistance.

References (64)

  • M. Steinschneider et al., 1992. Cellular generators of the cortical auditory evoked potential initial component. Electroencephalogr. Clin. Neurophysiol.
  • W.A. Yost, 1991. Auditory image perception and analysis: the basis for hearing. Hear. Res.
  • C. Alain et al., 1994. Signal clustering modulates auditory cortical activity in humans. Percept. Psychophys.
  • S. Anstis et al., 1985. Adaptation to auditory streaming of frequency-modulated tones. J. Exp. Psychol. Hum. Percept. Perf.
  • Arezzo, J.C., Vaughan, H.G., Jr., Kraut, M.A., Steinschneider, M., Legatt, A.D., 1986. Intracranial generators of...
  • M.W. Beauvois, 1985. The effect of tone duration on auditory stream formation. Percept. Psychophys.
  • M.W. Beauvois et al., 1991. A computer model of auditory stream segregation. Q. J. Exp. Psychol.
  • M.W. Beauvois et al., 1996. Computer simulation of auditory stream segregation in alternating-tone sequences. J. Acoust. Soc. Am.
  • M.-C. Botte et al., 1997. Perceptual attenuation of nonfocused auditory streams. Percept. Psychophys.
  • Bregman, A.S., 1990. Auditory Scene Analysis: the Perceptual Organization of Sound. MIT Press, Cambridge,...
  • A.S. Bregman et al., 1971. Primary auditory stream segregation and perception of order in rapid sequences of tones. J. Exp. Psychol.
  • M. Brosch et al., 1997. Stimulus-dependent modulations of correlated high-frequency oscillations in cat visual cortex. Cereb. Cortex.
  • M. Brosch et al., 1997. Time course of forward masking tuning curves in cat primary auditory cortex. J. Neurophysiol.
  • M. Brosch et al., 1999. Processing of sound sequences in macaque auditory cortex: response enhancement. J. Neurophysiol.
  • M.B. Calford et al., 1995. Monaural inhibition in cat auditory cortex. J. Neurophysiol.
  • M. Colombo et al., 1996. The effects of superior temporal cortex lesions on the processing and retention of auditory information in monkeys (Cebus apella). J. Neurosci.
  • O. Creutzfeldt et al., 1980. Thalamocortical transformation of responses to complex auditory stimuli. Exp. Brain Res.
  • H. Dai et al., 1991. Effective attenuation of signals in noise under focused attention. J. Acoust. Soc. Am.
  • R.C. deCharms et al., 1996. Primary cortical representation of sounds by the coordination of action-potential timing. Nature
  • J.J. Eggermont, 1994. Neural interaction in cat primary auditory cortex II. Effects of sound stimulation. J. Neurophysiol.
  • Y.I. Fishman et al., 1999. Neural correlates of auditory stream segregation in primary auditory cortex (A1) of the awake monkey. Soc. Neurosci. Abstr.
  • Y.I. Fishman et al., 2000. Complex tone processing in primary auditory cortex of the awake monkey. I. Neural ensemble correlates of roughness. J. Acoust. Soc. Am.