Integration and Segregation in Auditory Stream Analysis





Integration and segregation in auditory scene analysis

Elyse S. Sussman a)

Department of Neuroscience, Albert Einstein College of Medicine, 1410 Pelham Parkway South, Bronx, New York 10461

Received 12 January 2004; revised 29 November 2004; accepted 10 December 2004

Assessment of the neural correlates of auditory scene analysis, using an index of sound change detection that does not require the listener to attend to the sounds (a component of event-related brain potentials called the mismatch negativity, MMN), has previously demonstrated that segregation processes can occur without attention focused on the sounds and that within-stream contextual factors influence how sound elements are integrated and represented in auditory memory. The current study investigated the relationship between the segregation and integration processes when they were called upon to function together. The pattern of MMN results showed that the integration of sound elements within a sound stream occurred after the segregation of sounds into independent streams and, further, that the individual streams were subject to contextual effects. These results are consistent with a view of auditory processing that suggests that the auditory scene is rapidly organized into distinct streams and the integration of sequential elements into perceptual units takes place on the already formed streams. This would allow for the flexibility required to identify changing within-stream sound patterns, needed to appreciate music or comprehend speech. © 2005 Acoustical Society of America. DOI: 10.1121/1.1854312

PACS numbers: 43.66.Ba, 43.66.Mk, 43.64.Bt SEM Pages: 1285–1298

I. INTRODUCTION

From infancy we experience a complex auditory environment made up of several simultaneously active sound sources that often overlap in many acoustic parameters. Yet we experience a coherent auditory environment made up of identifiable auditory events. Auditory scene analysis (ASA) involves the ability to disentangle the mixture of sound input, integrating sensory inputs that belong together and segregating those inputs that originate from different sources (Bregman, 1990). Accordingly, segregation and integration are two fundamental aspects of ASA. The focus of the current paper is on the interaction between these two important auditory processes in ASA.

A. Auditory memory

A crucial part of the auditory scene analysis process is the ability to connect elements over time (integration processes), allowing us to recognize a series of footsteps or to understand spoken speech. Transient memory plays a critical role in this process especially because auditory processing relies so heavily on the temporal domain. This is easily demonstrated by thinking about how we understand a spoken sentence. Once each word of a sentence is spoken (i.e., the physical source of auditory information is completed), only the neural trace of the physical sound information is retained. To get to the meaning of the sentence we have to access the memory of the previous words that were spoken.¹ Consider, however, that processing a speech stream most often occurs within the context of other simultaneous sound streams, whether on a city street, at the office, or in a department store. The focus of the current study is on understanding the relationship between the integration and segregation processes, specifically, to focus on the order in which the following processes are hypothesized to occur: (a) the segregation of input into distinct streams (perceived environmental sources) and then (b) the integration of elements necessary for making sense of the sequential patterns within a particular sound stream. In understanding how this memory operates in facilitating ASA, it is important to understand the relationship between these processes in terms of how auditory events are represented and stored in memory. Event-related brain potentials (ERPs) were used in the current investigation to explore the contents of transient auditory memory and to take advantage of the ability to use this method to observe the relationship between these processes when the listener has no task involving the sounds.

B. Event-related brain potentials

ERPs are noninvasive electrophysiological measures of cortical responses to sensory events that provide distinctive signatures for stimulus change. Change is an important cue for the auditory system. Because ERPs provide such high temporal resolution (in the order of milliseconds) and are time-locked to stimulus events, we can gain information about the timing of certain cognitive processes evoked by a given sound. Of particular importance for observing unattended sound processing is the mismatch negativity (MMN) component of ERPs (Näätänen, 1992). MMN reflects sound change detection and is elicited even when the sounds have no relevance to ongoing behavior. The change detection process underlying MMN generation is dependent upon auditory memory. It uses sensory representations of the acoustic regularities extracted from the sound sequence (often called the "standard"). These sensory representations of the standard form the basis for the change detection process. Incoming

a) Electronic mail: [email protected]

1285   J. Acoust. Soc. Am. 117 (3), Pt. 1, March 2005   0001-4966/2005/117(3)/1285/14/$22.50   © 2005 Acoustical Society of America


sounds that deviate from the neural trace of the standard elicit MMN, which is generated in the auditory cortex and is usually evoked within 200 ms of change detection. Thus, MMN represents an early process of change detection based upon a memory of the previous sound stimulation. An indication that the mechanisms underlying the MMN system involve memory processes is found in evidence that MMN generation relies on current flow through active N-methyl-D-aspartate (NMDA) channels (Javitt et al., 1996). Javitt et al. showed that blockage of NMDA, which has been associated with the cellular mechanisms involved in memory processes (Compte et al., 2000; Kauer et al., 1988; Pulvirenti, 1992), prevented MMN generation without disturbing elicitation of the prior obligatory cortical response to sound onsets. This suggests a link between working memory processes and the cortical auditory information-processing network involved in MMN elicitation.

The main neural generators of the MMN are located bilaterally in the supratemporal plane, as determined by dipole-modeling of electric (Scherg et al., 1989; Ha et al., 2003) and magnetic responses (Sams and Hari, 1991; Woldorff et al., 1998), scalp-current density maps of scalp-recorded ERPs (Giard et al., 1990), functional magnetic resonance imaging (fMRI; Opitz et al., 1999), and intracortical ERP recordings (Halgren et al., 1995; Kropotov et al., 1995). The location of the generators in the auditory cortex accounts for the observed scalp topography of the waveform, which is maximally negative over the fronto-central scalp locations and inverts in polarity below the Sylvian fissure.

C. Assessing neural representations of sound organization

MMN can be used to probe the neural representations extracted from the ongoing sound input because the response to a particular sound is based upon the memory of the previous sounds. This is illustrated in Fig. 1. In the well-known auditory oddball paradigm, an "oddball" or infrequent sound is presented randomly among instances of a frequently repeating sound. The oddball elicits MMN because it is detected as deviating, in some sound feature (frequency, intensity, duration, spatial location), from the frequently repeating sound. In the top panel of Fig. 1, a regular auditory oddball paradigm is shown. The letter "X" represents a tone of one frequency and the letter "O" represents a tone of a different frequency. The oddball (O) would elicit MMN because it has a different frequency than that of the standard (X). In the bottom panel of Fig. 1, the same ratio of O to X sounds is presented, but instead of presenting the O tone randomly, it occurs every fifth tone in the sequence. If the brain detects the regularity, in this case the five-tone repeating pattern (X-X-X-X-O-X-X-X-X-O...), then no MMN will be elicited by the O tone (the "oddball" in the randomized context) because the O tone is part of the standard repeating five-tone regularity. It is not deviant. Thus, it is important to note that MMN is not ipso facto elicited when there is a situation of a frequent and an infrequent tone presented in a sound sequence. What is crucial to its elicitation is what is detected as the regularity in the sound sequence and stored in auditory memory. Thus, MMN is highly dependent upon the context of the stimuli, whether that context is detected without attention focused on the sounds (Sussman et al., 1999a, 1998a) or whether the context is influenced by attentional control (Sussman et al., 1998b, 2002a).
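The distinction between the two panels of Fig. 1 can be sketched computationally. The following Python fragment is an illustration, not part of the original study: it generates the two sequence types and applies a toy "regularity detector" that treats a tone as deviant only when it is not predicted by the learned five-tone cycle, mirroring why the patterned O elicits no MMN.

```python
import random

def random_oddball(n, p_deviant=0.2, seed=0):
    """Standard 'X' with an oddball 'O' inserted at random positions (20%)."""
    rng = random.Random(seed)
    return ['O' if rng.random() < p_deviant else 'X' for _ in range(n)]

def patterned_oddball(n):
    """Same proportion of 'O' tones, but every fifth tone: X-X-X-X-O repeating."""
    return [('O' if (i + 1) % 5 == 0 else 'X') for i in range(n)]

def prediction_errors(seq, cycle=5):
    """Toy regularity detector: if the sequence repeats with the given cycle
    length, every tone is fully predicted and no deviance signal (the MMN
    analog here) is produced; unpredicted tones count as deviant events."""
    errors = 0
    for i in range(cycle, len(seq)):
        if seq[i] != seq[i - cycle]:   # tone not predicted by the cycle
            errors += 1
    return errors

print(prediction_errors(patterned_oddball(100)))   # 0: O is part of the regularity
print(prediction_errors(random_oddball(100)) > 0)  # True: random oddballs deviate
```

The point of the sketch is that deviance is defined relative to the detected regularity, not to raw tone probabilities: the same 20% of O tones yields zero prediction errors in the patterned sequence.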

Thus, the change detection process is far more complex than is superficially evident from the auditory oddball paradigm, the most common paradigm used to elicit MMN. Simple sensory discrimination of two tones (i.e., being able to detect whether one tone is different from another in any sound feature) requires only a single presentation of each sound. Repetition is not necessary for this type of discrimination. However, the MMN process is based upon repetition and the detection of the regularities in the acoustic input. Even with the seemingly simple auditory oddball paradigm, deviance detection is based upon the memory of the larger context of the sounds and not simply on the detection of two different tones in the sound sequence. Specifically because MMN relies on the larger auditory context (sequential processing of auditory events), it can be used to assess how the acoustic information is structured and stored in memory.

D. Segregation of sounds to create distinct sources

Segregation of sounds to distinct auditory sources is a key function of the auditory system. Consider that the acoustic information entering one's ears is a mixture of all the sounds in the environment, without separation. Decomposing the auditory input is therefore a crucial step, allowing us to detect a single voice in a crowd or to distinguish a voice coming from the left or right in the proverbial "cocktail party" setting. For that reason, the process of disentangling the sound into distinct sources plays a critical role in how we experience the auditory environment. Although there remains controversy over the role of attention in the stream segregation process (Botte et al., 1997; Bregman, 1990; Brochard et al., 1999; Carlyon et al., 2001; Jones et al., 1978, 1999; Macken et al., 2003; Sussman et al., 1999a; Winkler et al., 2003), there is considerable ERP evidence to suggest that the segregation of auditory input to distinct sound streams can occur without attention focused on the sounds, and that auditory memory can hold information about multiple sound streams independently (Sussman et al., in press, submitted, 1999a; Ritter et al., 2000; Winkler et al., 2003). This basic finding has been replicated in various paradigms using different segregation cues, such as frequency proximity (Sussman et al., 2001; Winkler et al., 2003), temporal proximity (Sussman et al., 1999a), and a combination of spatial location, frequency, and intensity (Ritter et al., 2000). These studies provide evidence in humans that auditory stream segregation mechanisms are part of an early, primitive process. This is consistent with evidence from animal studies demonstrating that the basic stream segregation mechanisms are part of vertebrates' auditory systems (Fay, 2000; Hulse et al., 1997; Fishman et al., 2001). It is functionally parsimonious that the basic stream segregation process would occur automatically (calculated on the basis of the acoustic characteristics of the sensory input), so that attention, a limited human resource, could be used to identify patterns within the already organized streams, needed for understanding speech or listening to music.

FIG. 1. Contextual effects on MMN. The top row displays a typical "auditory oddball paradigm," with an "oddball" tone (represented by the letter "O") presented randomly (20%) amongst a frequent tone (represented by the letter "X"; 80%). The bottom row displays the same percentage of oddballs (O) together with the frequent tones (80%) except that the O tone is presented in a recurring pattern: it occurs every fifth tone. See text for further discussion.
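The frequency-proximity and tempo cues mentioned above can be caricatured in a few lines of code. This is an illustrative heuristic, not a model from this paper; the boundary values are rough placeholders in the spirit of van Noorden (1975)-style coherence boundaries.

```python
def predicted_streams(delta_semitones, soa_ms):
    """Heuristic: an alternating high-low tone sequence tends to split
    into two perceptual streams when the frequency separation is large
    and the pace (onset-to-onset interval) is fast. Threshold values
    here are illustrative assumptions only."""
    if delta_semitones < 4:      # small separation: heard as one stream
        return 1
    if soa_ms > 150:             # slow pace: heard as one stream
        return 1
    return 2                     # large separation at a fast pace: segregates

# Experiment 1 of this study alternated 440 and 1568 Hz tones
# (about 22 semitones apart) at a 75-ms SOA:
print(predicted_streams(22, 75))   # 2 -> predicted to segregate
```

Under this caricature, the stimulus parameters used in experiment 1 fall well inside the segregation region, consistent with the paper's statement that the fast rate and wide frequency separation induce streaming.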

E. Integration of sound elements within a single stream

Auditory integrative processes have predominantly focused on how successive elements influence a single percept of sound, such as with the phenomena of loudness summation (Zwislocki, 1969) and auditory recognition masking (Massaro, 1975). The perception of a sound event is often determined by the sounds that surround it. Recently, however, it was demonstrated that the larger context of a sequence of sounds could affect processing of the individual sound components even when not in direct temporal proximity (Sussman et al., 2002b; Sussman and Winkler, 2001). It is this context-dependent integration process that is the focus of the current study and described in detail below.

The paradigm of the current study was based on previous work (Sussman et al., 1999b, 2002b; Sussman and Winkler, 2001) showing that two different deviations from the same repetitive standard, occurring successively within a less than 200-ms interval (called a "double deviant"), elicit only one MMN. However, when double deviants were presented within a stimulus sequence that also contained single deviants (i.e., sounds that differ from the repetitive standard in one of the two ways of the double deviants), the double deviants elicited two discrete MMNs, which were separated by the temporal distance of the successive deviations (a schematic of the basic paradigm used in these studies is shown in Fig. 2). Initially, we hypothesized that the successive deviants were integrated into a single deviant event because they were indistinguishable as discrete events when subjects ignored the sounds. This led to a prediction that if subjects could actively discriminate the two successive deviants in the blocked context, then the double deviants would elicit two MMNs. However, this turned out not to be the case. We found that even when subjects were trained to discriminate the two successive deviants, only one MMN was still elicited by them (Sussman et al., 2002b). The integration phenomenon occurred solely as a function of changing the stimulus-driven context (i.e., blocked versus mixed), irrespective of whether attention was focused on the sounds.

The results of these studies are consistent with a context-based interpretation of auditory event formation in that the relevant information influencing event formation can be calculated over a longer time. In the blocked context, the second deviant is fully predicted by the first. Every time a deviant occurs another one follows it. Thus, the second of the successive deviants does not provide new information about "deviancy." In contrast, when the double deviants are mixed with single deviants in the sequence, in the mixed context, the second successive deviant is not fully predicted; it carries new information such that the second deviant distinguishes the two types of deviants presented in the block. The presence or absence of single deviants in the sound sequence creates different contexts for the evaluation of the double deviants and, therefore, the double deviants are processed either as unitary events or as two successive events depending on which context they occur in. Additionally, the MMN process appears to be predominantly influenced by the context of a sequence of sounds in a longer-term period than previously thought (estimated to be within 20 s in Sussman and Winkler, 2001). This global context effect on MMN has also been shown in Sussman et al. (2003).
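The information-theoretic intuition here, that the second deviant is fully predicted in the blocked context but not in the mixed context, can be made concrete with a small simulation. The fragment below is a sketch only (the event proportions are simplified relative to the actual 5%-7.5% rates in the study): it builds the two sequence types and measures the conditional probability that a deviant is followed by another deviant.

```python
import random

def make_sequence(n_events, mixed, seed=0):
    """Build a tone-label list: in the 'blocked' context every deviant is a
    double deviant (D, D); in the 'mixed' context single and double deviants
    occur equally often. 'S' marks runs of standards between deviant events.
    Illustrative proportions only."""
    rng = random.Random(seed)
    seq = []
    for _ in range(n_events):
        seq.extend(['S'] * rng.randint(4, 8))      # run of standards
        if mixed and rng.random() < 0.5:
            seq.append('D')                        # single deviant
        else:
            seq.extend(['D', 'D'])                 # double deviant
    return seq

def p_second_given_first(seq):
    """P(next tone is also deviant | a deviant run has just started).
    A value of 1.0 means the second deviant is fully predicted and so
    carries no new 'deviancy' information."""
    starts = [i for i in range(1, len(seq) - 1)
              if seq[i] == 'D' and seq[i - 1] == 'S']
    followed = sum(seq[i + 1] == 'D' for i in starts)
    return followed / len(starts)

print(p_second_given_first(make_sequence(500, mixed=False)))  # 1.0
print(p_second_given_first(make_sequence(500, mixed=True)))   # about 0.5
```

The blocked sequence yields a conditional probability of exactly 1.0, matching the paper's argument that the second deviant there is redundant, whereas in the mixed sequence it hovers near 0.5 and is therefore informative.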

These data, individually assessing the neural correlates of segregation and integration in ASA, suggest that (1) segregation can occur without attention focused on the sounds and that (2) within-stream contextual factors influence how events are represented in auditory memory.

In the current study, three experiments were conducted to investigate the relationship between the segregation and integration processes when they were called upon to function together: segregation of sounds, by spectral (experiments 1 and 2) or nonspectral (experiment 3) cues, and integration of double deviants within a stream. The blocked and mixed condition paradigms of Sussman et al. (2002b; see Fig. 2) were used within an alternating sequence of high- and low-frequency sounds (experiments 1 and 2) and within an alternating sequence of bandpass-filtered noise and harmonic complexes (experiment 3). The sounds were presented at a fast rate that would induce stream segregation and the integration paradigm was imposed within the low-frequency sound stream (experiment 1), within both high- and low-frequency streams (experiment 2), and within the complex tone stream (experiment 3). I hypothesized that the acoustic characteristics of the input (the stimulus-driven cues) would be used to separate sounds into distinct sources prior to the integration of elements and the formation of sound events. If this were true, then the context of the within-stream sounds should influence the response to the double deviants such that one MMN would be elicited in the blocked within-stream context and two MMNs should be elicited in the mixed within-stream context. If integration operates at a global level, then this pattern of results should not be observed, because the intervening high tones between the double deviants should interfere with the integration process due to the alternation of the high and low tones (tested in the control conditions of experiment 1).

FIG. 2. A schematic representation showing the blocked and mixed sequences used in Sussman et al. (1999b, 2002b). In the blocked presentation, every time a deviant occurs another deviant follows it (termed "double deviants"). In the mixed presentation, both single and double deviants occur randomly in the sequence. See text for further discussion.

II. EXPERIMENT 1. SEGREGATION AND INTEGRATION (LOW STREAM)

Four conditions were presented in experiment 1, in which the context was manipulated within the low tones stream. In one condition the low stream of sounds had only double deviants (alternating-blocked) and in another condition double deviants were presented along with single deviants (alternating-mixed). Two control conditions were used to ascertain the effect of an intervening sound within a single stream (low only—standard intervening condition) and to determine whether the integration of elements was initiated by a time constant or by the number of elements occurring successively (low only—triple deviant condition).

A. Participants

Thirteen adults (five males, eight females) between the ages of 23 and 42 years, with no reported hearing loss or neurological disorders, were paid to participate in the study. All subjects passed a hearing screening. Participants were informed of the experimental protocol before they signed a consent form. The study was conducted at the Albert Einstein College of Medicine.

B. Stimuli

Three tones (one high frequency and two low frequency) were used to create the stimulus sequences. They were 50 ms in duration with an intensity of 80 dB SPL (calibrated with a Brüel & Kjær 2209 sound-level meter). The high-frequency tone was 1568 Hz (H) and the low-frequency tones were 440 Hz (L) and 494 Hz (LD). Onset and offset ramps of 7.5 ms were created using a Hanning window for all tones. Stimuli were presented in four conditions: (a) alternating-blocked, (b) alternating-mixed, (c) low only—standard intervening, and (d) low only—triple deviant. In the two alternating conditions, H and L alternated in a regular fashion at a rate of 75 ms SOA (onset to onset), except when the deviant LD occurred. Thus, a low tone or a high tone occurred once every 150 ms. In the alternating-blocked condition, the deviant LD always occurred twice in succession in the low stream (e.g., H L H L H L H LD H LD H L H L); these will be called "double deviants" (see Fig. 3). Double deviants occurred randomly 7.5% of the time. In the alternating-mixed condition, LD was randomly presented both singly (these will be called "single deviants"; 5%) and twice in succession (5% each; e.g., H L H L H L H LD H LD H L H L H LD H L; see Fig. 3). LD was presented 15% of the time overall in both conditions. Thus, the "blocked" and "mixed" nomenclature refers to the low stream only, specifically to the deviants and whether there are only double deviants in the sequence (blocked) or whether single deviants are mixed with double deviants (mixed) in the same sequence.
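Stimuli with these parameters could be synthesized as follows. The frequencies, durations, ramps, and SOA come from the text; the 44.1-kHz sample rate and tone level are assumptions of this sketch, not values reported in the paper.

```python
import numpy as np

FS = 44100          # sample rate (Hz); an assumption, not stated in the paper

def tone(freq_hz, dur_ms=50.0, ramp_ms=7.5, level=0.5):
    """50-ms tone with 7.5-ms raised-cosine (Hanning) onset/offset ramps."""
    n = int(FS * dur_ms / 1000)
    t = np.arange(n) / FS
    y = level * np.sin(2 * np.pi * freq_hz * t)
    nr = int(FS * ramp_ms / 1000)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(nr) / nr))   # rises 0 -> 1
    y[:nr] *= ramp
    y[-nr:] *= ramp[::-1]
    return y

# Tones from experiment 1: H = 1568 Hz, L = 440 Hz, deviant LD = 494 Hz
TONES = {'H': tone(1568), 'L': tone(440), 'LD': tone(494)}

def render(labels, soa_ms=75.0):
    """Place successive tones at a 75-ms onset-to-onset asynchrony."""
    step = int(FS * soa_ms / 1000)
    out = np.zeros(step * (len(labels) - 1) + len(TONES['H']))
    for i, lab in enumerate(labels):
        y = TONES[lab]
        out[i * step: i * step + len(y)] += y
    return out

# A blocked-context fragment containing one double deviant:
wav = render(['H', 'L', 'H', 'LD', 'H', 'LD', 'H', 'L'])
```

Because the SOA (75 ms) exceeds the tone duration (50 ms), successive tones never overlap, so the additive placement in `render` is safe.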

C. Procedures

Participants sat in a comfortable chair approximately 1 m from a TV monitor in a sound-attenuated booth. They were instructed to watch a closed-captioned video, itself containing no sounds, and to ignore the sounds. Three thousand stimuli were presented in each of the four conditions (3000 × 4 = 12,000 stimuli in all). There were two runs of 1500 stimuli per condition, totaling eight runs. The eight runs were presented randomly, counterbalanced across subjects to avoid any possible order effects.

D. Electroencephalogram (EEG) recording and data analysis

EEG was recorded with an electrode cap using the following scalp locations (10–20 system): Fz, Cz, Pz, F3, F4, C3, C4, and both mastoids (LM and RM, for the left and right mastoids, respectively). Horizontal eye movements were measured by recording the horizontal electro-oculogram (HEOG) between electrodes F7 and F8 using a bipolar montage, and vertical eye movements using a bipolar montage between FP1 and an external electrode placed beneath the left eye (VEOG). The common reference electrode was attached to the tip of the nose. Eye movement was monitored with VEOG and HEOG. HEOG was monitored for eye saccade

FIG. 3. A schematic diagram of the stimulus paradigm used in experiment 1. The alternating pattern of low (L) and high (H) tones that was used to create the alternating conditions is shown in the left column. "L" represents a tone of one frequency and "LD" represents a tone of a slightly higher pitch than the L tones. "H" represents tones of much higher frequency than the L and LD tones. The critical variable across conditions is the response to the "double deviants," the two successive "LD" tones occurring in the low stream of the alternating conditions. Two control conditions are displayed in the right column. The low-only conditions were presented using the same frequency tones as in the alternating conditions; however, no H tones were presented in these conditions.



movement to ensure that participants were reading the captions on the video. The EEG was digitized (Nicolet amplifiers) at a sampling rate of 250 Hz (bandpass 0.05–100 Hz) and then off-line filtered between 1 and 30 Hz. Epochs of 600 ms, with a 100-ms prestimulus and 500-ms poststimulus period, were averaged separately for the standards and double deviants. The double deviants (and triple deviant) were averaged from the onset of the first of the successive deviants, thus the response to the successive deviants could be observed within this period. Epochs with an electrical change exceeding 100 μV were excluded from further averaging to remove the trials contaminated by artifacts of noncortical origin.

For each participant, the artifact-free epochs were then averaged separately for the four conditions and two stimulus types (standard and deviant). The mean amplitude in the 100-ms prestimulus period was subtracted from each point of the averaged ERP responses, which served as the reference from which the evoked responses were measured.

The difference waveforms were obtained by subtracting the ERP response to the standard (e.g., L-H-L in the alternating condition) from the ERP response to the deviants. In the mixed conditions, double deviants (LD-H-LD) were averaged separately from single deviants (LD-H-H). The intervals used for statistically evaluating the presence of the potential MMNs were determined from the group mean deviant-minus-standard waveforms at Fz (the site of greatest signal-to-noise ratio), taking a 40-ms window centered on the peak of the MMNs from the conditions in which two successive MMNs were elicited (the low only—standard intervening and alternating-mixed conditions). The mean amplitude obtained in the intervals used for the two low-only conditions (range 1: 116–156 ms and range 2: 304–344 ms) and for the two alternating conditions (range 1: 140–180 ms and range 2: 320–360 ms) was measured for each subject individually and then used to verify the presence of MMN, for each range and condition separately, with one-sample, one-tailed t-tests using the data from the Fz electrode site. Analysis of variance (ANOVA) for repeated measures with factors of condition (blocked versus mixed) and range (range 1 versus range 2) was used to compare the mean voltage. Greenhouse–Geisser correction was reported when appropriate. Tukey HSD post hoc tests were calculated.
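The analysis pipeline just described (baseline correction, deviant-minus-standard difference waves, mean amplitude in a 40-ms window, one-sample t-test against zero) can be sketched as follows. This is an illustrative reimplementation with simulated data, not the study's actual analysis code; the simulated MMN shape is an assumption.

```python
import numpy as np

FS = 250        # EEG sampling rate (Hz), from the text
PRE_MS = 100    # 100-ms prestimulus baseline

def baseline_correct(epochs, pre_ms=PRE_MS, fs=FS):
    """Subtract the mean of the prestimulus period from each epoch.
    epochs: (n_trials, n_samples); each epoch starts 100 ms prestimulus."""
    n_pre = int(fs * pre_ms / 1000)
    return epochs - epochs[:, :n_pre].mean(axis=1, keepdims=True)

def mean_amplitude(diff_wave, win_ms, pre_ms=PRE_MS, fs=FS):
    """Mean voltage of a difference waveform in a poststimulus latency
    window, e.g., range 1 of the alternating conditions: (140, 180) ms."""
    lo = int(fs * (pre_ms + win_ms[0]) / 1000)
    hi = int(fs * (pre_ms + win_ms[1]) / 1000)
    return diff_wave[lo:hi].mean()

def one_sample_t(values):
    """One-sample t statistic against zero, computed across subjects
    (MMN is verified if the window mean is reliably negative)."""
    v = np.asarray(values, dtype=float)
    return v.mean() / (v.std(ddof=1) / np.sqrt(len(v)))

# Toy check: deviant epochs carry a negative deflection peaking at 160 ms
t = np.arange(-PRE_MS, 500, 1000 / FS)                  # epoch time axis (ms)
mmn = -2.0 * np.exp(-((t - 160) ** 2) / (2 * 25 ** 2))  # simulated MMN (uV)
std_ep = np.zeros((20, t.size)) + 5.0                   # flat standard + offset
dev_ep = std_ep + mmn
diff = baseline_correct(dev_ep).mean(0) - baseline_correct(std_ep).mean(0)
print(mean_amplitude(diff, (140, 180)) < 0)             # True: MMN-like negativity
```

Note that the window bounds are expressed relative to stimulus onset and shifted by the 100-ms baseline when indexing the epoch, matching how the 140–180 ms range is defined in the text.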

E. Results and discussion

The ERP results for the low-only and alternating conditions are displayed in Figs. 4 and 5, respectively (also see Table I).

1. Low-only conditions

In the standard-intervening condition, the two deviants elicited separate MMNs (t(12)=6.14, p<0.01 and t(12)=2.11, p<0.03; ranges 1 and 2, respectively), approximately 150 ms from each other (see Fig. 4, top). This demonstrates that the two deviants were processed as separate events when there was just one standard tone intervening between them within the single stream. In the triple-deviant condition, the triple deviant elicited only one MMN (t(12)=6.87, p<0.01 and t(12)<1, p=0.72; ranges 1 and 2, respectively; see Fig. 4, bottom), suggesting that event formation within the integrating period operates independently of the number of elements falling within the window. The single presentation of a standard tone between the two deviants (e.g., LD-L-LD) resulted in the deviants being processed as two separate events in the standard-intervening condition, compared to the triple-deviant condition, in which all three of the elements presented in the same time window were processed as a single deviant event (LD-LD-LD). This provides evidence that event formation operates on the basis of the type of elements falling within the integrating period, not simply integrating

FIG. 4. Experiment 1—Low tones only conditions. Grand mean event-related brain potentials elicited by the L (thin line) and LD (thick line) tones in the low-only conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs. These two traces (Fz and LM) provide the recording sites with the largest signal-to-noise ratio for the MMN component as well as indicating the topographical information needed to demonstrate that the source of the response was generated in the auditory cortex (see the Introduction). The response to the successive deviants can be observed within the epoch.

FIG. 5. Experiment 1—Low tones in the alternating conditions. Grand mean event-related brain potentials elicited by the L (thin line) and LD (thick line) tones in the alternating conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs.

1289 J. Acoust. Soc. Am., Vol. 117, No. 3, Pt. 1, March 2005. Elyse S. Sussman: Integration and segregation


all elements by a time constant. The integration or segregation of successive elements within the low stream was governed by the nature of the information: elements carrying the same information were integrated into a single auditory event, whereas elements that differed within that same time period were distinguished from each other. This suggests that the underlying time constant that tends to integrate information operates on the basis of the distinctiveness of the information in the larger context.

2. Alternating conditions

In the blocked condition, double deviants in the low stream elicited one MMN (t(12)=2.52, p<0.02 and t(12)<1, p=0.50; ranges 1 and 2, respectively; see Fig. 5, top row). The double deviants were integrated into a single deviant event even though there was a tone intervening between them. This indicates that the streams were segregated prior to the integration process. In contrast, in the mixed condition, the same double deviant events elicited two MMNs (t(12)=2.33, p<0.02 and t(12)=5.05, p<0.01; ranges 1 and 2, respectively; Fig. 5, bottom row) when single deviants also occurred within the low stream. ANOVA comparing the mean voltages of the difference waveforms in ranges 1 and 2 revealed an interaction between condition (blocked versus mixed) and range (F(1,12)=6.48, p<0.03). Post hoc analyses showed that this interaction was due to a larger mean amplitude for range 1 than range 2 in the blocked condition and no difference between the mean amplitudes in ranges 1 and 2 of the mixed condition. This confirms that one MMN was elicited in the blocked condition and two MMNs were elicited in the mixed condition. This result demonstrates that the presence of the single deviants in the low stream altered the MMN response to the double deviants in the same way as occurs when no intervening high tones are presented in the stimulus block, i.e., when the low stream is presented alone at the same stimulus rate (e.g., Sussman et al., 2002b).

The double deviants in the alternating-blocked condition elicited one MMN, but they elicited two MMNs in the alternating-mixed condition. Double deviants in both alternating conditions should have elicited two MMNs if the successive deviant-standard-deviant pattern had been processed as a single stream, as demonstrated by the results of the low-only conditions. However, the presence of the single deviants within the low stream of the mixed condition altered the MMN response to the double deviants. This difference in the MMN response to the same double deviants in the blocked and mixed conditions demonstrates that the response was dependent upon the context of the other tones within the low stream, exactly as when the low stream is presented alone (e.g., Sussman et al., 2002a, b). That is, this contextual effect occurred despite the fact that there were intervening high tones between the low sounds. Thus, the results indicate that the segregation into distinct frequency streams occurred prior to the integration of elements within streams.

III. EXPERIMENT 2. SEGREGATION AND INTEGRATION IN BOTH HIGH AND LOW STREAMS

A. Rationale

The within-stream contextual effects observed in experiment 1 occurred within only one of the two possible streams. This leaves open the question of whether integration effects occurred within-stream because there was only one stream with changing information. Would the complexity, having

TABLE I. Experiment 1. Mean amplitude values (in μV) and standard deviations (in parentheses) of the grand-averaged deviant-minus-standard difference waveforms in ranges 1 and 2 for all conditions at all electrode sites.

            Low only:                Low only:                Alternating-             Alternating-
            Standard intervening     Triple deviant           Blocked                  Mixed
Electrode   Range 1      Range 2     Range 1      Range 2     Range 1      Range 2     Range 1      Range 2
Fz          1.90 (1.36)  0.72 (1.05) 2.59 (1.12)  0.10 (1.22) 0.77 (1.47)  0.05 (1.20) 0.75 (0.99)  0.82 (0.58)
Cz          1.46 (1.15)  0.50 (1.08) 1.95 (0.84)  0.19 (1.04) 0.62 (1.20)  0.09 (0.90) 0.54 (0.84)  0.91 (0.59)
Pz          0.66 (0.92)  0.12 (0.95) 0.87 (0.58)  0.32 (0.78) 0.26 (0.99)  0.08 (0.87) 0.38 (0.82)  0.82 (0.67)
F3          1.54 (1.22)  0.58 (0.93) 2.21 (0.95)  0.05 (1.03) 0.60 (1.34)  0.19 (1.18) 0.53 (0.90)  0.80 (0.57)
F4          1.83 (1.39)  0.51 (1.09) 2.46 (1.18)  0.21 (1.28) 0.85 (1.51)  0.08 (1.06) 0.85 (1.07)  0.85 (0.70)
C3          1.10 (1.05)  0.27 (0.85) 1.45 (0.65)  0.11 (0.88) 0.34 (1.20)  0.25 (0.89) 0.43 (0.83)  0.79 (0.51)
C4          1.33 (1.21)  0.23 (1.05) 1.60 (0.89)  0.07 (0.98) 0.57 (1.07)  0.14 (0.81) 0.60 (0.77)  0.84 (0.60)
LM          0.66 (0.60)  0.19 (0.69) 0.76 (0.76)  0.52 (0.65) 0.58 (0.62)  0.02 (0.64) 0.03 (0.54)  0.43 (0.61)
RM          0.85 (0.86)  0.32 (0.82) 1.26 (0.60)  0.30 (0.45) 0.65 (0.56)  0.13 (0.58) 0.01 (0.59)  0.33 (0.74)


deviants in both streams, created by simultaneously changing contexts within the high and low streams, modify processing of the sounds? If complexity could negate the integration effects mediated by changing the contextual information, then a global effect on sound processing could account for the results and could not be excluded as an interpretation of the results of experiment 1. If, however, the same pattern of integration effects were found for double deviants occurring within both the low and high streams (double deviants eliciting one MMN in blocked and two MMNs in mixed contexts), then the conclusions of experiment 1 would be expanded and strengthened. To test this, the same basic paradigm of experiment 1 was used except that deviants occurred in both streams.

B. Participants

Twelve adults (five males, seven females) between the ages of 22 and 37 years, with no reported hearing loss or neurological disorders, were paid to participate in the study. All subjects passed a hearing screening. Participants were informed of the experimental protocol before they signed a consent form. The study was conducted at the Albert Einstein College of Medicine. Two subjects' data were removed due to excessive eye artifact. One subject's data were removed due to experimental error. The data from the remaining nine subjects are reported. None of the subjects that participated in experiment 2 participated in experiment 1.

C. Stimuli

Four tones (two high frequency and two low frequency), 50 ms in duration (tone intensity was calibrated with a Brüel & Kjær 2209 sound-level meter at a level of 80 dB SPL), were used to create the stimulus sequences. The high-frequency tones were 1568 Hz (H) and 1760 Hz (HD) and the low-frequency tones were 440 Hz (L) and 494 Hz (LD). Onset and offset ramps of 7.5 ms were created with a Hanning window for all tones. Stimuli were presented in two conditions: (a) alternating-blocked and (b) alternating-mixed. H and L alternated in a regular fashion at a rate of 75 ms SOA (onset-to-onset). Thus, a low tone or a high tone occurred once every 150 ms. In the alternating-blocked conditions, double deviants (LD or HD) always occurred twice in succession (presented randomly 6% of the time in their respective stream), but deviants never overlapped across streams. In the alternating-mixed conditions, HD and LD were presented randomly as single deviants (4%) and twice in succession (4% each). Thus, HD and LD were presented 12% overall in both conditions. The "blocked" and "mixed" nomenclature refers to the presentation of the deviants in both the high- and low-frequency streams.
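The tone gating and sequence construction described above can be sketched as follows. The sampling rate, the function names, and the simplified randomization are assumptions; the published stimuli were built with the laboratory's own tools and additional constraints (e.g., deviants never overlapping across streams), so this only illustrates the blocked-condition idea of a within-stream double deviant separated by an intervening high tone.

```python
import numpy as np

def make_tone(freq_hz, dur_ms=50.0, ramp_ms=7.5, fs=44100):
    """50-ms sinusoid with 7.5-ms Hanning onset/offset ramps (fs assumed)."""
    n = int(round(fs * dur_ms / 1000))
    t = np.arange(n) / fs
    tone = np.sin(2 * np.pi * freq_hz * t)
    n_ramp = int(round(fs * ramp_ms / 1000))
    # A Hanning window of length 2*n_ramp supplies matching rise/fall halves
    win = np.hanning(2 * n_ramp)
    tone[:n_ramp] *= win[:n_ramp]
    tone[-n_ramp:] *= win[n_ramp:]
    return tone

def alternating_sequence(n_tones, p_double=0.06, seed=0):
    """Alternate H and L at a 75-ms SOA; with probability p_double a low
    position starts an LD double deviant (LD-H-LD, i.e., the two LDs are
    successive within the low stream)."""
    rng = np.random.default_rng(seed)
    seq, pending = [], 0
    for i in range(n_tones):
        if i % 2 == 0:                    # high-stream position
            seq.append('H')
        elif pending:                     # second tone of a double deviant
            seq.append('LD')
            pending = 0
        elif rng.random() < p_double:     # start a double deviant
            seq.append('LD')
            pending = 1
        else:
            seq.append('L')
    return seq
```

Each label would then be mapped to its rendered tone (H, HD, L, or LD) and the tones concatenated at the 75-ms onset-to-onset pace.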

D. Procedures

Participants sat in a comfortable chair approximately 1 m from a TV monitor in a sound-attenuated booth. They were instructed to watch a closed-captioned video, itself containing no sounds, and to pay no attention to the sounds. Six thousand stimuli were presented in the blocked condition in two runs of 3000 stimuli, and 15 000 stimuli in five runs of 3000 stimuli were presented in the alternating condition, totaling seven runs. The seven runs were presented randomly, counterbalanced across subjects to avoid any possible order effects.

E. Electroencephalogram (EEG) recording and data analysis

Recording parameters matched those of experiment 1 with the following exceptions. The EEG was digitized (Neuroscan Synamps amplifiers) at a sampling rate of 500 Hz (bandpass 0.05–100 Hz) and then off-line filtered between 1 and 15 Hz. Epochs with an electrical change from baseline exceeding 75 μV were excluded from further averaging.

For each participant, the artifact-free epochs were then averaged separately for two conditions (blocked and mixed), two frequency ranges (high and low), and two stimulus types (standards and double deviants).

The difference waveforms were obtained by subtracting the ERP response to the standard from the ERP response to the deviant. The intervals used for statistically evaluating the presence of the potential MMNs were determined from the group mean deviant-minus-standard waveforms at Fz (the site of greatest signal-to-noise ratio), taking a 40-ms window centered on the peak of the MMNs for range 1 and range 2, which can be observed in the epoch. Range 2 intervals for the blocked conditions were obtained from the corresponding mixed conditions, in which two MMNs were elicited. The intervals used were, for the blocked—high condition, range 1: 106–146 ms and range 2: 270–310 ms; for the blocked—low condition, range 1: 126–166 ms and range 2: 304–344 ms; for the mixed—high condition, range 1: 100–140 ms and range 2: 270–310 ms; and for the mixed—low condition, range 1: 126–166 ms and range 2: 304–344 ms. The mean amplitudes in these intervals were measured for each subject individually and were then used to verify the presence of MMN. Because the predictions for the MMNs were a priori, one-sample, one-tailed Student's t-tests were used to determine whether the mean amplitudes obtained in these latency ranges were significantly greater than zero, separately for each range and condition. ANOVA for repeated measures with factors of condition (blocked versus mixed), frequency (high versus low), and range (range 1 versus range 2) was used to compare mean voltage. Greenhouse–Geisser correction was reported when appropriate. Tukey HSD post hoc tests were calculated.

F. Results and discussion

The ERP results for the alternating-blocked and alternating-mixed conditions are displayed in Figs. 6 and 7, respectively (also see Table II).

1. Alternating-blocked condition

Double deviants elicited one MMN in the low stream (t(9)=5.1, p<0.01 and t(9)<1, p=0.43; ranges 1 and 2, respectively; see Fig. 6, top row) and one MMN in the high stream (t(9)=2.0, p<0.05 and t(9)=1.0, p=0.16; ranges 1 and 2, respectively; see Fig. 7, top row). Consistent with the results of experiment 1, blocked-condition double deviants elicited one MMN.


2. Alternating-mixed condition

When single deviants also occurred within their respective streams, double deviants elicited two MMNs in the low stream (t(9)=4.8, p<0.01 and t(9)=4.8, p<0.01; ranges 1 and 2, respectively; see Fig. 6, bottom row) and two MMNs in the high stream (t(9)=1.8, p<0.05 and t(9)=2.0, p<0.05; ranges 1 and 2, respectively; see Fig. 7, bottom row). The ANOVA comparing the mean voltage revealed an interaction of condition and range (F(1,8)=6.37, p<0.04) and no interaction with frequency. Post hoc analyses showed that the mean voltage of range 1 was larger than that in range 2 for the blocked conditions, but the mean voltages of the two ranges were not significantly different from each other in the mixed conditions. The presence of the single deviants in the mixed condition altered the MMN response to the double deviants, as compared to the blocked conditions, independently in each stream.

The results of experiment 2 rule out complexity as an explanation for the within-stream integration effects found in experiment 1, and extend the results of experiment 1 to show that integration processes are carried out concurrently on multiple streams of auditory information.

IV. EXPERIMENT 3. SEGREGATION BY NONSPECTRAL CUE

A. Rationale

Experiments 1 and 2 used pure-tone stimuli with a large (15 semitone) difference between high and low sounds. It is therefore possible to consider that the integration effects may have taken place by frequency range, and were not mediated by streaming. An alternative interpretation would be that, with no overlapping excitation of hair cells within the cochlea, frequency effects rather than segregation effects may explain the integration effects of experiments 1 and 2. This interpretation of the current results would characterize frequency as the dominant factor influencing the integration processes, independent of segregation. Thus, another experiment was conducted to assess the integration effect when stream segregation was cued by a factor other than spectral information. The same basic paradigm of experiment 1 was used, except that the alternating sequence of sounds included complex tones and narrowband noise bursts with overlapping spectral composition. The noise and tones were expected to segregate to separate streams (Singh and Bregman, 1997; Iverson, 1995; Cusack and Roberts, 1999, 2000). If the integration were a frequency-specific phenomenon in the previous two experiments, that is, if the integration process did not depend on the initial segregation of the information, then two MMNs should be elicited in both blocked and mixed contexts, because of the effect of the intervening tones (see experiment 1, discussion of low-only control conditions). This result would be consistent with a peripheral channeling explanation (Beauvois and Meddis, 1991; Hartmann and Johnson, 1991; Rose and Moore, 2000). On the other hand, if a similar pattern of MMN results occurs as did in experiments 1 and 2, then it could be concluded that the segregation process preceded the integration process. Or, stated differently, we could conclude that the integration of sequential sound elements takes place within the distinct sound streams.

B. Participants

Twelve adults (five males, seven females) between the ages of 22 and 37 years, with no reported hearing loss or neurological disorders, were paid to participate in the study. All subjects passed a hearing screening. Participants were informed of the experimental protocol before they signed a consent form. The study was conducted at the Albert Einstein College of Medicine. Two subjects' data were removed due to excessive eye artifact. One subject's data were removed due to experimental error. The data from the remaining nine

FIG. 6. Experiment 2—Low tones in the alternating conditions. Grand mean event-related brain potentials elicited by the L (thin line) and LD (thick line) tones in the blocked conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs.

FIG. 7. Experiment 2—High tones in the alternating conditions. Grand mean event-related brain potentials elicited by the H (thin line) and HD (thick line) tones in the alternating conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs.


subjects are reported. Eight of these subjects also participated in experiment 2. None of the subjects participated in experiment 1.

C. Stimuli

Two types of complex sounds were used (created with Adobe Audition 1.0 software): harmonic complexes (called "tones") and bandpass-filtered noise bursts (called "noise"). The harmonic complexes had the same fundamental frequency (f0) as the low tones of experiments 1 and 2 (440 Hz for the standard sound and 494 Hz for the deviant sound) and were composed of the first three harmonics with equal amplitudes. Tone 1 (T) consisted of 440, 880, and 1320 Hz, and tone 2 (TD) consisted of 494, 988, and 1482 Hz. The other type of complex sound (N) was created by a narrow-band fast Fourier transform (FFT) filter applied to broadband noise between 440 and 1320 Hz. Thus, it overlapped the same spectral region (440–1320 Hz) as the standard complex tone (T). Tone duration for all tones was 50 ms, with onset and offset ramps of 7.5 ms made with a Hanning window. Sound intensity of the tones and noise was equated and presented at 77 dB SPL (Brüel & Kjær 2209 sound-level meter).
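The two stimulus types can be approximated as below: a three-harmonic complex built on f0, and a noise burst band-limited to 440–1320 Hz by zeroing FFT bins outside the band. This is a sketch under assumed parameters (sampling rate, brick-wall FFT filtering, peak normalization); the original stimuli were made in Adobe Audition and level-matched with a sound-level meter, so the details here are illustrative only.

```python
import numpy as np

def harmonic_complex(f0, n_harmonics=3, dur_ms=50.0, fs=44100):
    """Equal-amplitude sum of the first n_harmonics of f0
    (e.g., f0=440 gives components at 440, 880, and 1320 Hz)."""
    t = np.arange(int(round(fs * dur_ms / 1000))) / fs
    x = sum(np.sin(2 * np.pi * f0 * k * t) for k in range(1, n_harmonics + 1))
    return x / n_harmonics                 # keep peak amplitude within +/-1

def bandpass_noise(lo_hz=440.0, hi_hz=1320.0, dur_ms=50.0, fs=44100, seed=0):
    """Broadband noise band-limited to lo_hz-hi_hz by zeroing FFT bins,
    overlapping the spectral region of the standard complex tone."""
    n = int(round(fs * dur_ms / 1000))
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    spectrum[(freqs < lo_hz) | (freqs > hi_hz)] = 0.0   # brick-wall filter
    noise = np.fft.irfft(spectrum, n)
    return noise / np.max(np.abs(noise))   # peak-normalize
```

Both signals would then be gated with the same 7.5-ms Hanning ramps used for the pure tones.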

Stimuli were presented in four conditions: tones-only blocked, tones-only mixed, alternating-blocked, and alternating-mixed. The tones-only conditions were control conditions to verify that the auditory context would have a similar effect on complex tones as it did with the pure-tone stimuli in previous studies (i.e., double deviants eliciting one MMN in the blocked context and two MMNs in the mixed context). In the event that the effects of context on integration of sequential sounds were specific to pure tones, the control conditions were needed to determine this. The sound sequence was presented at a 150-ms onset-to-onset pace, the pace at which the complex tones would occur in the alternating conditions (alternating-blocked and alternating-mixed) of the main experiment. Three thousand stimuli were presented in two runs of 1500 stimuli each for the tones-only blocked condition, and 6000 stimuli in four runs of 1500 stimuli for the tones-only mixed condition.

In the alternating conditions, tone and noise stimuli were alternated (T-N-T-N-T-N…) at an onset-to-onset pace of 75 ms. The alternating conditions used the same basic paradigm as experiment 1, with the context changes occurring only in the tones stream. Six thousand stimuli were presented in two runs of 3000 stimuli for the alternating-blocked condition and 24 000 stimuli in four runs for the alternating-mixed condition. Deviants (TD) occurred 15% overall in all conditions (7.5% occurrence of double deviants in the blocked conditions, and 8% single deviants and 7% double deviants in the mixed conditions).

D. Procedures

Participants sat in a comfortable chair approximately 1 m from a TV monitor in a sound-attenuated booth. They were instructed to watch a silent closed-captioned video and to pay no attention to the sounds. Three thousand stimuli were presented in two runs of 1500 stimuli each for the tones-only blocked condition, and 6000 stimuli in four runs of 1500 stimuli for the tones-only mixed condition. For the alternating conditions, 6000 stimuli were presented in two runs of 3000 stimuli for the alternating-blocked condition and 24 000 stimuli in four runs for the alternating-mixed condition. Deviants occurred 15% overall in all conditions (7.5% occurrence of double deviants in the blocked conditions, and 8% single deviants and 7% double deviants in the mixed

TABLE II. Experiment 2. Mean amplitude values (in μV) and standard deviations (in parentheses) of the grand-averaged deviant-minus-standard difference waveforms in ranges 1 and 2 for all conditions at all electrode sites.

            Alternating-Blocked      Alternating-Blocked     Alternating-Mixed        Alternating-Mixed
            (high)                   (low)                   (high)                   (low)
Electrode   Range 1      Range 2     Range 1      Range 2     Range 1      Range 2     Range 1      Range 2
Fz          0.67 (1.01)  0.25 (0.74) 0.97 (0.57)  0.04 (0.73) 0.35 (0.62)  0.43 (0.64) 1.13 (0.70)  0.59 (0.37)
Cz          0.09 (1.15)  0.17 (1.05) 0.60 (0.43)  0.16 (0.77) 0.21 (0.76)  0.17 (0.81) 0.89 (0.58)  0.32 (0.59)
Pz          0.68 (0.95)  0.04 (0.96) 0.29 (0.68)  0.46 (0.60) 0.26 (1.06)  0.23 (0.88) 0.13 (1.11)  0.08 (1.00)
F3          0.54 (0.98)  0.13 (0.65) 0.75 (0.45)  0.12 (0.71) 0.32 (0.53)  0.27 (0.60) 0.89 (0.69)  0.39 (0.43)
F4          0.55 (0.86)  0.21 (0.56) 0.87 (0.69)  0.20 (0.77) 0.21 (0.67)  0.21 (0.63) 0.98 (0.63)  0.52 (0.42)
C3          0.02 (0.88)  0.04 (0.87) 0.35 (0.31)  0.31 (0.58) 0.07 (0.65)  0.05 (0.82) 0.67 (0.70)  0.17 (0.70)
C4          0.13 (1.03)  0.05 (0.79) 0.35 (0.60)  0.35 (0.81) 0.06 (0.87)  0.04 (0.83) 0.62 (0.60)  0.13 (0.62)
LM          1.07 (0.77)  0.20 (0.66) 1.24 (0.91)  0.52 (0.31) 0.86 (0.69)  0.56 (0.55) 0.79 (1.01)  0.45 (0.84)
RM          0.85 (0.50)  0.31 (0.50) 1.11 (0.62)  0.54 (0.58) 0.68 (0.79)  0.55 (0.60) 0.75 (0.77)  0.42 (0.73)


conditions). The order of the four conditions was counterbalanced across subjects to avoid any possible order effects.

E. Electroencephalogram (EEG) recording and data analysis

Data recording parameters were the same as those used in experiment 2. For each participant, the artifact-free epochs were averaged for the complex tones separately by stimulus type (standards, T, and double deviants, TD) in the four conditions (tones-only blocked, tones-only mixed, alternating-blocked, and alternating-mixed).

The difference waveforms were obtained by subtracting the ERP response to the standard from the ERP response to the deviant. The intervals used for statistically evaluating the presence of the potential MMNs were determined from the group mean deviant-minus-standard waveforms at Fz (the site of greatest signal-to-noise ratio), taking a 40-ms window centered on the peak of the MMNs where elicited. For the blocked conditions, range 2 intervals from the mixed conditions were used to measure the presence of MMN in range 2 of the blocked conditions, because MMN was expected to be elicited in the mixed conditions. For the tones-only blocked condition the mean amplitude was obtained in the following intervals, range 1: 104–144 ms and range 2: 282–322 ms; for the tones-only mixed condition the intervals were range 1: 100–140 ms and range 2: 282–322 ms; for the alternating-blocked condition the intervals were range 1: 104–144 ms and range 2: 258–298 ms; and for the alternating-mixed condition the intervals were range 1: 100–140 ms and range 2: 258–298 ms. Intervals were measured for each subject individually and then used to verify the presence of the MMN using a repeated-measures ANOVA with factors of condition (alternating-blocked, tones-only blocked, alternating-mixed, tones-only mixed), peak (range 1 versus range 2), and stimulus type (deviant versus standard) to determine in which conditions and which peaks the standard and deviant waveforms differed from each other. ANOVA with factors of condition (blocked versus mixed), control (alternating versus tones-only), and range (range 1 versus range 2) was calculated on the difference waveforms to compare mean voltages. Greenhouse–Geisser correction was reported when appropriate. Tukey HSD post hoc tests were calculated.

F. Results and discussion

The ERP results for the tones-only and alternating conditions are displayed in Figs. 8 and 9, respectively (also see Table III).

The ANOVA revealed a three-way interaction between condition, stimulus type, and range (F(3,24)=6.25, p<0.01), showing that the significant difference between the standard and deviant (the MMN) was dependent upon the condition and range. Tukey HSD post hoc comparisons revealed that MMNs were elicited in both range 1 and range 2 of the mixed conditions (alternating-mixed and tones-only mixed; see Figs. 8, bottom row, and 9, bottom row), whereas MMNs were elicited only in range 1 of the blocked conditions (alternating-blocked and tones-only blocked; see Figs. 8, top row, and 9, top row). This pattern of results is consistent with experiments 1 and 2, in which two MMNs were elicited by double deviants in the mixed context and one MMN was elicited in the blocked context.

The ANOVA comparing the mean voltage of the deviant-minus-standard difference waveforms revealed an interaction between condition and range, with no other interactions (F(1,8)=21.11, p<0.002). Post hoc analyses showed that the mean voltage of range 1 was larger than that of range 2 for the blocked conditions and there was no difference in mean voltage between ranges 1 and 2 in the mixed conditions. This substantiates the finding that one MMN was elicited in the blocked conditions and two MMNs were elicited in the mixed conditions.

FIG. 8. Experiment 3—Tones-only conditions. Grand mean event-related brain potentials elicited by the T (thin line) and TD (thick line) tones in the tones-only conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs.

FIG. 9. Experiment 3—Alternating conditions. Grand mean event-related brain potentials elicited by the T (thin line) and TD (thick line) tones in the alternating conditions are displayed in the left column. The deviant-minus-standard difference curves are displayed in the right column with Fz (thick line) overlaid with the left mastoid (LM; thin line). Arrows point to significant MMNs.


The results extend the findings of our previous studies using pure tones (e.g., Sussman et al., 2002b), now demonstrating with complex tones the same contextual effects on within-stream integration processes observed with pure tones. Additionally, the results demonstrate stream segregation by a nonspectral cue. This suggests that the stream segregation process involves more than simple frequency resolution by peripheral mechanisms. This finding is consistent with other studies showing that stream segregation is not solely dependent upon peripheral channeling (Bey and McAdams, 2003, Experiment 1; Cusack and Roberts, 2000; Iverson, 1995; Moore and Gockel, 2002; Roberts et al., 2002; Vliegen and Oxenham, 1999). The results provide strong support for the notion that segregation occurs prior to integration and that integration effects operate within already segregated sound streams. The interplay between these two processes is thus observed in complex sound situations, such as would be needed for processing speech and other environmental sounds.

V. GENERAL DISCUSSION

Three experiments were conducted to investigate the timing of two organizational processes of ASA, segregation of streams and integration of within-stream elements, when they were called upon to function concurrently and the listener had no task to perform with the sounds. The within-stream contextual effects, the integration of successive elements into a single or double deviant event as indexed by MMN, followed the same pattern of results as has been demonstrated for temporal integration processes when only one sound stream was present (Sussman et al., 2002b). This indicates that the segregation into independent streams occurred prior to the integration and formation of the auditory units within streams. The results demonstrate that (1) multiple sound streams are maintained independently, (2) stream segregation occurs prior to within-stream integration or event formation, and (3) auditory context effects act on the already formed streams.

Experiments 1 and 2 demonstrated that the integration process operates on two streams independently, as though each stream occurred alone. These results show that the integration occurred on the already segregated streams, the segregation process taking place at an earlier level of processing than the integration process. Experiment 3 confirmed that the segregation process preceded the integration of information within streams. Given that the spectral compositions of the two sets of sounds were similar in experiment 3, if the sounds had been processed as part of the same stream, then the effects of intervening tones on within-stream processes would have been observed (experiment 1, control conditions). Thus, the results of experiment 3 provided strong support to conclude that the segregation processes mediated the integration processes.

The timing of the segregation and integration processes demonstrated in the current and previous studies suggests the following model. A neural representation of the sound input is initially formed regarding how many sources are present in the signal. This is calculated according to the acoustic characteristics of the input (the frequency, intensity, duration, and spatial location features, as well as the timing of the input). Extraction of the acoustic regularities is then established on the already formed streams (i.e., the context is determined). The long-term effects of the auditory context mediate local, within-stream event formation (Sussman and Winkler, 2001). The integration processes, or sound event formation, operate after the within-stream context has

TABLE III. Experiment 3. Mean amplitude values in μV (standard deviations in parentheses) of the grand-averaged deviant-minus-standard difference waveforms in ranges 1 and 2 for all conditions at all electrode sites.

           Tones-only—Blocked       Tones-only—Mixed         Alternating—Blocked      Alternating—Mixed
Electrode  Range 1      Range 2     Range 1      Range 2     Range 1      Range 2     Range 1      Range 2
Fz         0.72 (0.84)  0.13 (0.50) 0.63 (0.31)  0.81 (0.39) 1.02 (0.46)  0.34 (0.47) 0.76 (0.76)  0.64 (0.44)
Cz         0.40 (0.58)  0.28 (0.51) 0.42 (0.41)  0.78 (0.56) 0.71 (0.64)  0.26 (0.52) 0.66 (0.88)  0.39 (0.64)
Pz         0.09 (0.55)  0.31 (0.59) 0.10 (0.88)  0.36 (0.96) 0.14 (0.82)  0.14 (0.63) 0.10 (0.92)  0.05 (0.90)
F3         0.64 (0.76)  0.14 (0.42) 0.49 (0.33)  0.59 (0.43) 0.87 (0.44)  0.23 (0.35) 0.71 (0.83)  0.49 (0.34)
F4         0.66 (0.81)  0.18 (0.59) 0.53 (0.37)  0.81 (0.45) 0.89 (0.40)  0.23 (0.44) 0.60 (0.72)  0.45 (0.49)
C3         0.34 (0.58)  0.23 (0.49) 0.32 (0.45)  0.61 (0.59) 0.60 (0.66)  0.22 (0.55) 0.50 (0.88)  0.21 (0.63)
C4         0.24 (0.53)  0.33 (0.54) 0.13 (0.50)  0.55 (0.71) 0.61 (0.55)  0.11 (0.54) 0.28 (0.75)  0.12 (0.55)
LM         0.56 (0.82)  0.23 (0.65) 0.48 (0.73)  0.23 (0.71) 0.92 (0.66)  0.21 (0.65) 0.58 (0.67)  0.76 (0.72)
RM         0.46 (0.67)  0.28 (0.45) 0.50 (0.62)  0.32 (0.69) 0.76 (0.51)  0.30 (0.53) 0.70 (0.66)  0.73 (0.70)

J. Acoust. Soc. Am., Vol. 117, No. 3, Pt. 1, March 2005, p. 1295. Elyse S. Sussman: Integration and segregation.


been determined. Thus, whether a sound element is processed as one or two events depends on the surrounding information within the same stream. The MMN process uses this information—the representation of the segregated streams and the detected sound regularities within streams—in determining what is deviant in the sound input.
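As a rough computational caricature of this two-stage model (a sketch, not the author's implementation: stream assignment here is a simple frequency split, and the within-stream "standard" is just the most frequent tone, both hypothetical simplifications), sounds are first partitioned into streams, and deviance is evaluated only afterward, against the regularities of each stream:

```python
from collections import Counter

def segregate(tones, boundary_hz=1000.0):
    """Stage 1: assign each tone to a stream by gross frequency (a simplified
    stand-in for the acoustic cues that drive stream segregation)."""
    streams = {"low": [], "high": []}
    for f in tones:
        streams["low" if f < boundary_hz else "high"].append(f)
    return streams

def detect_deviants(stream):
    """Stage 2: establish the within-stream regularity (the 'standard'),
    then flag tones that differ from it (a caricature of MMN elicitation)."""
    if not stream:
        return []
    standard = Counter(stream).most_common(1)[0][0]
    return [f for f in stream if f != standard]

tones = [440, 440, 2000, 440, 494, 2000, 440, 2093, 2000, 440]
streams = segregate(tones)
deviants = {name: detect_deviants(s) for name, s in streams.items()}
# Each stream is judged against its own standard: 494 Hz is deviant only
# relative to the low stream's 440-Hz standard, and 2093 Hz only relative
# to the high stream's 2000-Hz standard.
```

The point of the ordering is that swapping the stages would give a different answer: a 494-Hz tone is not deviant with respect to the mixed sequence as a whole, only with respect to its own stream.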

A. Implications for speech processing

Extending the notion that the integration of sound elements into perceptual units proceeds on the already segregated information, the data discussed here support the view that sound elements are integrated into linguistic units (phonemes, syllables, and words) after the initial segregation of the input into distinct sound sources. Speech processing, according to this model, would rely, at least in part, on the primitive ASA processes (Brown and Cooke, 1994; Rosenthal and Okuno, 1998).

When more than one voice occurs simultaneously, segregation by acoustic parameters (e.g., f0) serves to distinguish the separate voices in the auditory environment, whereas the integration of successive sound elements within the separate voices serves to distinguish speech units. There are additional segmentation processes that are also necessary for within-stream comprehension, those that allow us to distinguish words within a speech stream or individual footsteps within a series. For example, hearing the difference between ‘‘topic’’ and ‘‘top pick’’ is determined by the duration of the silent closure (less than 150 ms was needed to hear one word, and between 150 and 300 ms was needed to hear two words; Pickett and Decker, 1960). In fact, the timing that made the difference between hearing one or two words is consistent with our finding in which the same timing made the difference between tones (pure or complex) being represented as one or two deviant events. Two successive deviants elicited one MMN at a 150-ms interval between them, whereas the two successive deviants elicited two MMNs when they were separated by 300 ms (Sussman et al., 1999b). Further research would be needed, however, to clarify whether pure acoustic cues function similarly to acoustic cues embedded in speech for defining auditory events.
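The one-versus-two-events timing can be sketched as a simple rule on onset times (a hedged illustration of the reported thresholds, not a model of the neural mechanism; the ~200-ms window value is taken from the temporal-integration results discussed above):

```python
def count_events(onsets_ms, window_ms=200):
    """Group successive onsets into perceptual events: an onset that falls
    within `window_ms` of the current event's start joins that event;
    otherwise it opens a new event."""
    events = 0
    event_start = None
    for t in sorted(onsets_ms):
        if event_start is None or t - event_start >= window_ms:
            events += 1           # too far from the current event: new event
            event_start = t
    return events

count_events([0, 150])   # -> 1: the second deviant falls inside the window
count_events([0, 300])   # -> 2: a 300-ms separation exceeds the window
```

On this rule the same 150/300-ms boundary separates ‘‘topic’’ from ‘‘top pick’’ and one deviant event from two, which is the parallel the text draws.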

Within-stream segmentation also takes place within the ‘‘integrating’’ period. The standard intervening tone between the two successive deviant tones (in the low-only—standard intervening condition) influenced the integration process, as reflected by the MMN response (i.e., it affected what became integrated into one event). When a single tone intervened between the two successive deviant tones, falling within the less-than-200-ms integrating window, each of the deviants elicited its own MMN. What otherwise might have been responded to as one deviant event (two successive deviants occurring within 150 ms) was separated into two deviant events, eliciting two separate MMNs, by the presence of the intervening tone occurring between them when they were processed in the same low-tone stream. This result indicates that the integration of elements occurring within less than 200 ms operates according to the sequential organization of the unit, or auditory event, that falls within the window. The temporal proximity of the elements, on its own, is not enough for integration—this should already be evident because the phenomenon of streaming can occur when sounds are 100 ms apart—but neither is frequency proximity sufficient (the low-tone standards are near in frequency to the deviants but are not integrated just by the fact that they fall within this window). Thus, the result indicates that the integration mechanism is ‘‘searching’’ for elements that likely form a perceptual unit. The auditory system operates on the basis of the informational units falling within the window and does not simply integrate all the elements that fall within a window of less than 200 ms.

This is further confirmed with the triple deviants (in the low-only—triple deviants condition), in which three successive deviants occurring within the integrating period elicited one MMN. Elements that fall within the integrating window are integrated to form a single auditory event. These data lend credence to the notion that the integrating period operates independently of the sheer number of elements falling within the window and focuses on the formation of events out of closely occurring sound elements. The integration of elements across time occurred as a function of the role of the stimuli that fell within the integrating period (e.g., deviant vs. standard). Whereas three deviant sounds occurring successively were responded to as a single deviant event, the successive deviant-standard-deviant pattern was not integrated, even though the tones occurred within the same time period. This demonstrates that auditory context influences the integrating window in terms of auditory event formation. This mechanism could be important in speech processing when determining which phonemic units are to be integrated and which segregated, such as the cues that distinguish ‘‘topic’’ from ‘‘top pick.’’
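The context-sensitive grouping described here can be stated as a hypothetical rule (an illustration of the behavioral pattern, not the neural mechanism): within the integrating window, only an uninterrupted run of deviants merges into a single deviant event, so an intervening standard splits the run even though all tones fall inside the window:

```python
def deviant_events(sequence):
    """Count deviant events in a tone sequence that falls entirely within
    the integrating window. 'D' = deviant, 'S' = standard. Consecutive
    deviants merge into one event; a standard breaks the run, so context
    (the role of each tone), not mere temporal proximity, decides."""
    events = 0
    in_run = False
    for tone in sequence:
        if tone == "D":
            if not in_run:
                events += 1       # first deviant of a new run: new event
                in_run = True
        else:
            in_run = False        # a standard terminates the current run

    return events

deviant_events(["D", "D", "D"])   # -> 1: triple deviant integrates to one event
deviant_events(["D", "S", "D"])   # -> 2: the intervening standard splits them
```

This captures the contrast in the text: deviant-deviant-deviant yields one MMN, while deviant-standard-deviant yields two, despite identical timing.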

It remains to be determined whether the segmentation processes that operate within streams to facilitate identification of perceptual units (and which are also part of the integration processes necessary to link acoustic elements to perceptual units) differ from the procedures that are initially used to determine how many sources are in the environment. Additionally, the role of attention in these processes has yet to be established.

Context effects on MMN have now been demonstrated in a number of different studies using a wide range of paradigms (Sussman et al., 1998a, 1999a, 2001, 2002a, b; Sussman and Winkler, 2001; Winkler et al., 2003). Interestingly, MMN has also been reported to reflect categorical perception of consonants and vowels (Dehaene-Lambertz et al., 2000; Näätänen et al., 1997; Winkler et al., 1999), voice onset time (Sharma and Dorman, 2000), and length (Nenonen et al., 2003). The phonetic context effects (the shift in category boundary that is altered by surrounding speech characteristics) observed in these MMN studies of categorical perception might actually represent another form of context effect on MMN and not a speech-specific effect (Pisoni et al., 1983; Sussman et al., 2004).

B. Concluding remarks

The results of the current study demonstrate context effects on auditory event formation that occur after the segregation of sounds into distinct streams when the listener is not focused on the sound input. Moreover, integration of within-stream elements is influenced by the larger context of the within-stream sounds. The auditory system does not simply integrate all information that falls within a 150-ms window; the informational content of the elements determines what is integrated within this period and what is not. These data, taken together with previous results also showing that segregation occurs prior to other processes (e.g., Sussman et al., 1998b, 1999b; Yabe et al., 2001), provide strong evidence that segregation is an earlier, primitive process initially driven by stimulus characteristics of the acoustic input. Functionally, the integration of sequential elements into perceptual units taking place on the already segregated streams would facilitate the ability to identify within-stream sound patterns, thus making it possible to appreciate music or comprehend speech.

ACKNOWLEDGMENTS

This research was supported by the National Institutes of Health (Grant No. R01 DC04263). The author is grateful to Dr. Albert S. Bregman for his advice on experiment 3 and for his comments on an earlier version of the manuscript. The author thanks Dr. Yonathan Fishman and three anonymous reviewers for their helpful comments on an earlier version of the manuscript. The author is also grateful to Rosie Wong and Fakhar Kahn for collecting the data and for their assistance with data processing and manuscript preparation.

1) This transient form of auditory memory has been estimated to store information for a period of at least 30 s (Cowan, 2001).

Beauvois, M., and Meddis, R. (1991). ‘‘A computer model of auditory stream segregation,’’ Q. J. Exp. Psychol. A 43A, 517–541.
Bey, C., and McAdams, S. (2003). ‘‘Postrecognition of interleaved melodies as an indirect measure of auditory stream formation,’’ J. Exp. Psychol. Hum. Percept. Perform. 29(2), 267–279.
Botte, M.-C., Drake, C., Brochard, R., and McAdams, S. (1997). ‘‘Perceptual attenuation of nonfocused auditory streams,’’ Percept. Psychophys. 59, 419–425.
Bregman, A. S. (1990). Auditory Scene Analysis (MIT, Cambridge, MA).
Brochard, R., Drake, C., Botte, M.-C., and McAdams, S. (1999). ‘‘Perceptual organization of complex auditory sequences: Effect of number of simultaneous sub-sequences and frequency separation,’’ J. Exp. Psychol. Hum. Percept. Perform. 25, 1742–1759.
Brown, G. J., and Cooke, M. P. (1994). ‘‘Computational auditory scene analysis,’’ Comput. Speech Lang. 8(4), 297–336.
Carlyon, R. P., Cusack, R., Foxton, J. M., and Robertson, I. H. (2001). ‘‘Effects of attention and unilateral neglect on auditory stream segregation,’’ J. Exp. Psychol. Hum. Percept. Perform. 27(1), 115–127.
Compte, A., Brunel, N., Goldman-Rakic, P. S., and Wang, X.-J. (2000). ‘‘Synaptic mechanisms and network dynamics underlying spatial working memory in a cortical network model,’’ Cereb. Cortex 10(9), 910–923.
Cowan, N. (2001). ‘‘The magical number 4 in short-term memory: A reconsideration of mental storage capacity,’’ Behav. Brain Sci. 24(1), 87–114.
Cusack, R., and Roberts, B. (1999). ‘‘Effects of similarity in bandwidth on the auditory sequential streaming of two-tone complexes,’’ Percept. Psychophys. 28, 1281–1289.
Cusack, R., and Roberts, B. (2000). ‘‘Effects of differences in timbre on sequential grouping,’’ Percept. Psychophys. 62, 1112–1120.
Dehaene-Lambertz, G., Dupoux, E., and Gout, A. (2000). ‘‘Electrophysiological correlates of phonological processing: A cross-linguistic study,’’ J. Cogn. Neurosci. 12, 635–647.
Fay, R. R. (2000). ‘‘Spectral contrasts underlying auditory stream segregation in goldfish (Carassius auratus),’’ J. Assoc. Res. Otolaryngol. 1, 120–128.
Fishman, Y. I., Reser, D., Arezzo, J., and Steinschneider, M. (2001). ‘‘Neural correlates of auditory stream segregation in primary auditory cortex of the awake monkey,’’ Hear. Res. 151, 167–187.
Giard, M. H., Perrin, F., Pernier, J., and Bouchet, P. (1990). ‘‘Brain generators implicated in processing of auditory stimulus deviance: A topographic event-related potential study,’’ Psychophysiology 27, 627–640.
Halgren, E., Baudena, P., Clarke, J. M., Heit, G., Liegeois, C., Chauvel, P., and Musolino, A. (1995). ‘‘Intracerebral potentials to rare target and distractor auditory and visual stimuli. I. Superior temporal plane and parietal lobe,’’ Electroencephalogr. Clin. Neurophysiol. 94, 191–220.
Ha, K. S., Youn, T., Kong, S. W., Park, H. J., Ha, T. H., Kim, M. S., and Kwon, J. S. (2003). ‘‘Optimized individual mismatch negativity source localization using a realistic head model and the Talairach coordinate system,’’ Brain Topogr. 15(4), 233–238.
Hartmann, W. M., and Johnson, D. (1991). ‘‘Stream segregation and peripheral channeling,’’ Music Percept. 9, 155–184.
Hulse, S. H., MacDougall-Shackleton, S. A., and Wisniewski, A. B. (1997). ‘‘Auditory scene analysis by songbirds: Stream segregation of birdsong by European starlings (Sturnus vulgaris),’’ J. Comp. Psychol. 111, 3–13.
Iverson, P. (1995). ‘‘Auditory stream segregation by musical timbre: Effects of static and dynamic acoustic attributes,’’ J. Exp. Psychol. Hum. Percept. Perform. 21, 751–763.
Javitt, D. C., Steinschneider, M., Schroeder, C. E., and Arezzo, J. C. (1996). ‘‘Role of cortical N-methyl-D-aspartate receptors in auditory sensory memory and mismatch negativity generation: Implications for schizophrenia,’’ Proc. Natl. Acad. Sci. USA 93, 11962–11967.
Jones, D. M., Alford, D., Bridges, A., Tremblay, S., and Macken, W. J. (1999). ‘‘Organizational factors in selective attention: The interplay of acoustic distinctiveness and auditory streaming in the irrelevant sound effect,’’ J. Exp. Psychol.: Learn. Mem. Cogn. 25, 464–473.
Jones, M. R., Maser, D. J., and Kidd, G. R. (1978). ‘‘Rate and structure in memory for auditory patterns,’’ Mem. Cognit. 6, 246–258.
Kauer, J. A., Malenka, R. C., and Nicoll, R. A. (1988). ‘‘NMDA application potentiates synaptic transmission in the hippocampus,’’ Nature (London) 334, 250–252.
Kropotov, J. D., Näätänen, R., Sevostianov, A. V., Alho, K., Reinikainen, K., and Kropotova, O. V. (1995). ‘‘Mismatch negativity to auditory stimulus change recorded directly from the human temporal cortex,’’ Psychophysiology 32, 418–422.
Macken, W. J., Tremblay, S., Houghton, R. J., Nicholls, A. P., and Jones, D. M. (2003). ‘‘Does auditory streaming require attention? Evidence from attentional selectivity in short-term memory,’’ J. Exp. Psychol. Hum. Percept. Perform. 29(1), 43–51.
Massaro, D. W. (1975). Experimental Psychology and Information Processing (Rand McNally, Chicago).
Moore, B. C. J., and Gockel, H. (2002). ‘‘Factors influencing sequential stream segregation,’’ Acust. Acta Acust. 88, 320–332.
Näätänen, R. (1992). Attention and Brain Function (Erlbaum, Hillsdale, NJ).
Näätänen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., Vainio, M., Alku, P., Ilmoniemi, R. J., Luuk, A., Allik, J., Sinkkonen, J., and Alho, K. (1997). ‘‘Language-specific phoneme representations revealed by electric and magnetic brain responses,’’ Nature (London) 385, 432–434.
Nenonen, S., Shestakova, A., Huotilainen, M., and Näätänen, R. (2003). ‘‘Linguistic relevance of duration within the native language determines the accuracy of speech-sound duration processing,’’ Cognit. Brain Res. 16, 492–495.
Opitz, B., Mecklinger, A., von Cramon, D. Y., and Kruggel, F. (1999). ‘‘Combining electrophysiological and hemodynamic measures of the auditory oddball,’’ Psychophysiology 36(1), 142–147.
Pickett, J. M., and Decker, L. R. (1960). ‘‘Time factors in perception of a double consonant,’’ Lang. Speech 3, 11–17.
Pisoni, D. B., Carrell, T. D., and Gans, S. J. (1983). ‘‘Perception of the duration of rapid spectrum changes in speech and nonspeech signals,’’ Percept. Psychophys. 34, 314–322.
Pulvirenti, L. (1992). ‘‘Neural plasticity and memory: Towards an integrated view,’’ Funct. Neurol. 7(6), 481–490.
Ritter, W., Sussman, E., and Molholm, S. (2000). ‘‘Evidence that the mismatch negativity system works on the basis of objects,’’ NeuroReport 11, 61–63.
Roberts, B., Glasberg, B. R., and Moore, B. C. (2002). ‘‘Primitive stream segregation of tone sequences without differences in fundamental frequency or passband,’’ J. Acoust. Soc. Am. 112, 2074–2085.
Rose, M. M., and Moore, B. C. (2000). ‘‘Effects of frequency and level on auditory stream segregation,’’ J. Acoust. Soc. Am. 108, 1209–1214.


Rosenthal, D. F., and Okuno, H. G. (Eds.) (1998). Computational Auditory Scene Analysis (Erlbaum, Mahwah, NJ).
Sams, M., and Hari, R. (1991). ‘‘Magnetoencephalography in the study of human auditory information processing,’’ Ann. N.Y. Acad. Sci. 620, 102–117.
Scherg, M., Vajsar, J., and Picton, T. W. (1989). ‘‘A source analysis of the late human auditory evoked potentials,’’ J. Cogn. Neurosci. 1, 336–355.
Sharma, A., and Dorman, M. F. (2000). ‘‘Neurophysiologic correlates of cross-language phonetic perception,’’ J. Acoust. Soc. Am. 107, 2697–2703.
Singh, P. G., and Bregman, A. S. (1997). ‘‘The influence of different timbre attributes on the perceptual segregation of complex-tone sequences,’’ J. Acoust. Soc. Am. 102(4), 1943–1952.
Sussman, E., and Winkler, I. (2001). ‘‘Dynamic process of sensory updating in the auditory system,’’ Cognit. Brain Res. 12, 431–439.
Sussman, E., Horváth, J., and Winkler, I. (submitted). ‘‘The role of attention in the formation of auditory streams,’’ Percept. Psychophys.
Sussman, E., Ritter, W., and Vaughan, H. G., Jr. (1998a). ‘‘Stimulus predictability and the mismatch negativity system,’’ NeuroReport 9, 4167–4170.
Sussman, E., Ritter, W., and Vaughan, H. G., Jr. (1998b). ‘‘Attention affects the organization of auditory input associated with the mismatch negativity system,’’ Brain Res. 789, 130–138.
Sussman, E., Ritter, W., and Vaughan, H. G., Jr. (1999a). ‘‘An investigation of the auditory streaming effect using event-related brain potentials,’’ Psychophysiology 36, 22–34.
Sussman, E., Bregman, A. S., Wang, W., and Kahn, F. J. (in press). ‘‘Attentional modulation of electrophysiological activity in auditory cortex for unattended sounds in multi-stream auditory environments,’’ Cogn. Affect. Behav. Neurosci.
Sussman, E., Sheridan, K., Kreuzer, J., and Winkler, I. (2003). ‘‘Representation of the standard: Stimulus context effects on the process generating the mismatch negativity component of event-related brain potentials,’’ Psychophysiology 40, 465–471.
Sussman, E., Čeponienė, R., Shestakova, A., Näätänen, R., and Winkler, I. (2001). ‘‘Auditory stream segregation processes operate similarly in school-aged children as adults,’’ Hear. Res. 153(1-2), 108–114.
Sussman, E., Winkler, I., Huotilainen, M., Ritter, W., and Näätänen, R. (2002a). ‘‘Top-down effects on the initially stimulus-driven auditory organization,’’ Cognit. Brain Res. 13, 393–405.
Sussman, E., Winkler, I., Ritter, W., Alho, K., and Näätänen, R. (1999b). ‘‘Temporal integration of auditory stimulus deviance as reflected by the mismatch negativity,’’ Neurosci. Lett. 264, 161–164.
Sussman, E., Kujala, T., Halmetoja, J., Lyytinen, H., Alku, P., and Näätänen, R. (2004). ‘‘Automatic and controlled processing of acoustic and phonetic contrasts,’’ Hear. Res. 190(1-2), 128–140.
Sussman, E., Winkler, I., Kreuzer, J., Saher, M., Näätänen, R., and Ritter, W. (2002b). ‘‘Temporal integration: Intentional sound discrimination does not modify stimulus-driven processes in auditory event synthesis,’’ Clin. Neurophysiol. 113, 909–920.
Vliegen, J., and Oxenham, A. J. (1999). ‘‘Sequential stream segregation in the absence of spectral cues,’’ J. Acoust. Soc. Am. 105, 339–346.
Winkler, I., Sussman, E., Tervaniemi, M., Ritter, W., Horváth, J., and Näätänen, R. (2003). ‘‘Preattentive auditory context effects,’’ Cogn. Affect. Behav. Neurosci. 3(1), 57–77.
Winkler, I., Lehtokoski, A., Alku, P., Vainio, M., Czigler, I., Csépe, V., Aaltonen, O., Raimo, I., Alho, K., Lang, A. H., Iivonen, A., and Näätänen, R. (1999b). ‘‘Preattentive detection of vowel contrasts utilizes both phonetic and auditory memory representations,’’ Cognit. Brain Res. 7, 357–369.
Woldorff, M., Hillyard, S., Gallen, C. C., Hampson, R. S., and Bloom, F. E. (1998). ‘‘Magnetoencephalographic recordings demonstrate attentional modulation of mismatch-related neural activity in human auditory cortex,’’ Psychophysiology 35, 283–292.
Yabe, H., Winkler, I., Czigler, I., Koyama, S., Kakigi, R., Sutoh, T., Hiruma, T., and Kaneko, S. (2001). ‘‘Organizing sound sequences in the human brain: The interplay of auditory streaming and temporal integration,’’ Brain Res. 897, 222–227.
Zwislocki, J. J. (1969). ‘‘Temporal summation of loudness: An analysis,’’ J. Acoust. Soc. Am. 46, 431–440.