
Page 1

Auditory perception (4)

Corso di Principi e Modelli della Percezione

Prof. Giuseppe Boccignone

Dipartimento di Informatica, Università di Milano

[email protected]
http://boccignone.di.unimi.it/PMP_2018.html

Environmental auditory perception // Sound localization


[…] scene analysis problem. (4) How does he perceive the music's sound as perceptually organized in time, with a regular beat? Perceiving the beat pattern is the problem of metrical structure. We will consider each of these four problems in this chapter. We begin by considering localization.

Auditory Localization
After reading this sentence, close your eyes for a moment, listen, and notice what sounds you hear and where they are coming from. When I do this right now, sitting in a coffee shop, I hear the beat and vocals of a song coming from a speaker above my head and slightly behind me, a woman talking somewhere in front of me, and the "fizzy" sound of an espresso maker off to the left. There are other sounds as well, because many people are talking, but let's focus on these three for now.

Each of the sounds—the music, the talking, and the mechanical fizzing sound—are heard as coming from different locations in space. These sounds at different locations create an auditory space, which exists all around, wherever there is sound. This locating of sound sources in auditory space is called auditory localization. We can appreciate the problem the auditory system faces in determining these locations by comparing the information for location for vision and hearing. To do this, we substitute a bird in a tree and a cat on the ground in Figure 12.2 for the sounds in the coffee shop.

Visual information for the relative locations of the bird and the cat is contained in the images of the bird and the cat on the surface of the retina. The ear, however, is different. The bird's "tweet, tweet" and the cat's "meow" stimulate the cochlea based on their sound frequencies, and as we saw in Chapter 11, these frequencies cause patterns of nerve firing that result in our perception of a tone's pitch and timbre. But activation of nerve fibers in the cochlea is based on the tones' frequency components and not on where the tones are coming from. This means that two tones with the same frequency that originate in different locations will activate the same hair cells and nerve fibers in the cochlea. The auditory system must therefore use other information to determine location. The information it uses involves location cues that are

[Figure 12.1 labels: (1) sound location, (2) reflected sound, (3) woman's voice, (4) music's beat]

Figure 12.1 Coffee shop scene, which contains multiple sound sources. The most immediate sound source for the man in the middle is the voice of the woman talking to him across the table. Additional sources include speakers on the wall behind him, which are broadcasting music, and all the other people in the room who are speaking. The four problems we will consider in this chapter—(1) auditory localization, (2) sound reflection, (3) analysis of the scene into separate sound sources, and (4) musical patterns that are organized in time—are indicated in this figure.

[Photo: Bruce Goldstein]

Page 2

Environmental auditory perception // Sound localization


created by the way sound interacts with the listener’s head and ears.

There are two kinds of location cues: binaural cues, which depend on both ears, and monaural cues, which depend on just one ear. Researchers studying these cues have determined how well people can locate the position of a sound in three dimensions: the azimuth, which extends from left to right (Figure 12.3); elevation, which extends up and down; and the distance of the sound source from the listener. In this chapter, we will focus on the azimuth and elevation.
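These three coordinates are simply spherical coordinates centered on the head. As a minimal sketch (the axis convention and the function below are my own, not from the text), converting an (azimuth, elevation, distance) triple to head-centered Cartesian coordinates makes the geometry used in the rest of this section easy to compute:

```python
import numpy as np

def polar_to_cartesian(azimuth_deg, elevation_deg, distance_m):
    """Convert the (azimuth, elevation, distance) coordinates used in
    sound-localization studies to head-centered Cartesian (x, y, z).
    Convention (an assumption, not from the text): x = right, y = front,
    z = up; azimuth 0 = straight ahead, positive azimuth = to the right."""
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    x = distance_m * np.cos(el) * np.sin(az)   # left-right
    y = distance_m * np.cos(el) * np.cos(az)   # front-back
    z = distance_m * np.sin(el)                # up-down
    return x, y, z

# A source 2 m away, 45 degrees to the right, at ear level:
print(polar_to_cartesian(45, 0, 2.0))   # ~ (1.41, 1.41, 0.0)
```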

Binaural Cues for Sound Localization
Binaural cues use information reaching both ears to determine the azimuth (left–right position) of sounds. The two binaural cues are interaural time difference and interaural level difference. Both are based on a comparison of the sound signals reaching the left and right ears. Sounds that are off to the side reach one ear before the other and are louder at one ear than the other.

Interaural Time Difference The interaural time difference (ITD) is the difference between when a sound reaches the left ear and when it reaches the right ear (Figure 12.4). If the source is located directly in front of the listener, at A, the distance to each ear is the same; the sound reaches the left and right ears simultaneously, so the ITD is zero. However, if a source is located off to the side, at B, the sound reaches the right ear before it reaches the left ear. Because the ITD becomes larger as sound sources are located more to the side, the magnitude of the ITD can be used as a cue to determine a sound's location. Behavioral research, in which listeners judge sound locations as ITD is varied, indicates that ITD is an effective cue for localizing low-frequency sounds (Wightman & Kistler, 1997, 1998).
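To put numbers on this, a classic approximation is Woodworth's rigid-spherical-head formula, ITD = (a/c)(θ + sin θ) for a head of radius a, speed of sound c, and azimuth θ. The formula and the head radius are standard textbook assumptions, not values given in this chapter; a quick sketch:

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s, in air at ~20 °C
HEAD_RADIUS = 0.0875     # m, a typical adult value (assumed here)

def itd_woodworth(azimuth_deg):
    """ITD for a rigid spherical head (Woodworth's classic approximation):
    ITD = (a/c) * (theta + sin(theta)), for azimuths from 0 (straight
    ahead) to 90 degrees (directly to the side)."""
    theta = np.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + np.sin(theta))

for az in (0, 30, 60, 90):
    print(f"azimuth {az:2d} deg -> ITD = {itd_woodworth(az) * 1e6:6.1f} us")
# ITD grows from 0 us straight ahead to roughly 650 us at the side,
# consistent with the text: the farther the source is to one side,
# the larger the ITD.
```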

Interaural Level Difference The other binaural cue, interaural level difference (ILD), is based on the difference in the sound pressure level (or just “level”) of the sound reaching the two ears. A difference in level between the two ears occurs

[Figure 12.2 labels: bird ("tweet, tweet"), cat ("meow")]

Figure 12.2 Comparing location information for vision and hearing. Vision: The bird and the cat, which are located at different places, are imaged on different places on the retina. Hearing: The frequencies in the sounds from the bird and cat are spread out over the cochlea, with no regard to the animals’ locations. © Cengage Learning

Figure 12.3 The three directions used for studying sound localization: azimuth (left–right), elevation (up–down), and distance. © Cengage Learning



Page 3


Environmental auditory perception // Sound localization (azimuth)

• How do we localize sounds?

• An example: localizing a cricket

• A similar dilemma arises when judging the distance of a sound source

• Two ears: the critical factor for sound localization

• Timing differences

• Level (loudness) differences

Page 4

Environmental auditory perception // Sound localization: ITD


because the head is a barrier that creates an acoustic shadow, reducing the intensity of sounds that reach the far ear. This reduction of intensity at the far ear occurs for high-frequency sounds, as shown in Figure 12.5a, but not for low-frequency sounds, as shown in Figure 12.5b.

We can understand why an ILD occurs for high frequencies but not for low frequencies by drawing an analogy between sound waves and water waves. Consider, for example, a situation in which small ripples in the water are approaching the boat in Figure 12.5c. Because the ripples are small compared to the boat, they bounce off the side of the boat and go no further. Now imagine the same ripples approaching the cattails in Figure 12.5d. Because the distance between the ripples is large compared to the stems of the cattails, the ripples are hardly disturbed and continue on their way. These two examples illustrate that an object has a large effect on the wave if it is larger than the distance between the waves, but has a small effect if it is smaller than the distance between the waves. When we apply this principle to sound waves

Figure 12.4 The principle behind interaural time difference (ITD). The tone directly in front of the listener, at A, reaches the left and right ears at the same time. However, when the tone is off to the side, at B, it reaches the listener’s right ear before it reaches the left ear. © Cengage Learning

[Figure 12.4/12.5 labels: source positions A and B; (a) 6,000 Hz, acoustic shadow; (b) 200 Hz; (c) spacing small compared to object; (d) spacing large compared to object]

Figure 12.5 Why interaural level difference (ILD) occurs for high frequencies but not for low frequencies. (a) Person listening to a high-frequency sound; (b) person listening to a low-frequency sound. (c) When the spacing between waves is smaller than the size of the object, illustrated here by water ripples that are smaller than the boat, the waves are stopped by the object. This occurs for the high-frequency sound waves in (a) and causes the sound intensity to be lower on the far side of the listener’s head. (d) When the spacing between waves is larger than the size of the object, as occurs for the water ripples and the narrow stalks of the cattails, the object does not interfere with the waves. This occurs for the low-frequency sound waves in (b), so the sound intensity on the far side of the head is not affected. © Cengage Learning
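The boat-and-cattails analogy boils down to comparing the sound's wavelength (λ = c/f) with the size of the head. A quick check, using an assumed round number for head diameter, shows why the 6,000 Hz tone in the figure casts an acoustic shadow while the 200 Hz tone does not:

```python
SPEED_OF_SOUND = 343.0   # m/s
HEAD_DIAMETER = 0.18     # m, a rough adult value (assumed)

for freq_hz in (200, 6000):
    wavelength = SPEED_OF_SOUND / freq_hz   # lambda = c / f
    casts_shadow = wavelength < HEAD_DIAMETER
    print(f"{freq_hz:5d} Hz: wavelength {wavelength:.3f} m, "
          f"acoustic shadow: {casts_shadow}")
#   200 Hz: wavelength 1.715 m (much larger than the head)  -> no shadow
#  6000 Hz: wavelength 0.057 m (much smaller than the head) -> shadow,
#           hence an ILD at the far ear
```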

Environmental auditory perception // Sound localization: ITD

• How do we localize sounds?

• Interaural time difference (ITD): the difference in arrival time (lead or lag) of a sound at one ear relative to when it arrives at the other ear

Page 5

Environmental auditory perception // Sound localization: ITD

azimuth = angle in the horizontal plane (relative to the head)

Environmental auditory perception // Sound localization: ITD


[…] (neurons 1, 2, 3) or the right ear (neurons 9, 8, 7), but not both, and they do not fire. But when the signals both reach neuron 5 together, that neuron fires (Figure 12.11b). This neuron and the others in this circuit are called coincidence detectors, because they only fire when both signals arrive at the neuron simultaneously. The firing of neuron 5 indicates that ITD = 0.

If the sound comes from the right, similar events occur, but the signal from the right ear has a head start, as shown in Figure 12.11c. These signals reach neuron 3 simultaneously (Figure 12.11d), so this neuron fires. This neuron, therefore, detects ITDs that occur when the sound is coming from a specific location on the right. The other neurons in the circuit fire to locations corresponding to other ITDs.
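A minimal simulation of this delay-line idea, with detectors numbered 1 to 9 as in Figure 12.11 (the per-neuron delay step, and therefore the exact ITD each detector prefers, are illustrative assumptions):

```python
import numpy as np

N = 9                                    # coincidence detectors, numbered 1-9
step = 100e-6                            # extra axonal delay per neuron (s); illustrative
left_delay = step * np.arange(N)         # left-ear signal enters at neuron 1
right_delay = step * np.arange(N)[::-1]  # right-ear signal enters at neuron 9

def winning_neuron(itd):
    """Which coincidence detector fires for a given ITD
    (itd > 0 means the sound reached the right ear first)."""
    left_start = max(itd, 0.0)           # the lagging ear's signal starts late
    right_start = max(-itd, 0.0)
    mismatch = np.abs((left_start + left_delay) - (right_start + right_delay))
    return int(np.argmin(mismatch)) + 1  # 1-based numbering, as in the figure

print(winning_neuron(0.0))       # -> 5: sound straight ahead, ITD = 0
print(winning_neuron(400e-6))    # -> 3: sound from the right, as in Figure 12.11d
```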

Broadly Tuned ITD Neurons
Recent research on the gerbil indicates that localization can also be based on neurons that are broadly tuned, as shown in Figure 12.12a (McAlpine, 2005). According to this idea, there are neurons in the gerbil's right hemisphere that respond best when sound is coming from the left and neurons in the left hemisphere that respond when sound is coming from the right. The location of a sound is indicated by the ratio of responding of these two types of broadly tuned neurons. For example, a sound from the left would cause the pattern of response shown in the left pair of bars in Figure 12.12b; sounds straight ahead, by the middle pair of bars; and sounds to the right, by the far right bars.
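A sketch of this ratio code, modeling the two hemispheres as mirror-image broadly tuned curves (the sigmoid shape and its slope constant are invented for illustration; only the ratio logic comes from the text):

```python
import numpy as np

def hemisphere_rates(itd_us):
    """Broadly tuned responses of the two hemispheres to an ITD in
    microseconds (itd_us > 0: right ear leads, i.e., sound from the right)."""
    right_hemisphere = 1.0 / (1.0 + np.exp(itd_us / 200.0))    # prefers sound from the left
    left_hemisphere = 1.0 / (1.0 + np.exp(-itd_us / 200.0))    # prefers sound from the right
    return right_hemisphere, left_hemisphere

for itd, label in [(-300, "sound from the left"),
                   (0, "straight ahead"),
                   (300, "sound from the right")]:
    r, l = hemisphere_rates(itd)
    print(f"{label:20s}: right-hemisphere {r:.2f}, left-hemisphere {l:.2f}")
# Location is carried by the RATIO of the two rates, not by any single neuron.
```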

This type of coding resembles the distributed coding model we described in Chapter 2, in which information in the nervous system is based on the pattern of neural responding. This is, in fact, how the visual system signals different wavelengths of light, as we saw when we discussed color vision in Chapter 9, in which wavelengths are signaled by the pattern of response of three different cone pigments (Figure 9.10).

We have seen that there is evidence for both narrowly tuned ITD neurons and broadly tuned ITD neurons. Both types of neurons can potentially provide information regarding the location of low-frequency sounds. Exactly which of these mechanisms, or perhaps a combination of the two, works in different animals is being studied by auditory researchers. In addition to determining that the firing of single neurons can provide information for localization, researchers have also determined that there are specific areas of the cortex that are involved in auditory localization. This research is described in Chapter 11 (see "What and Where Streams for Hearing," page 281).

TEST YOURSELF 12.1

1. How is auditory space described in terms of three coordinates?

2. How well can people localize sounds that are in front, to the side, and in back?

3. What is the basic difference between determining the location of a sound source and determining the location of a visual object?

[Figure 12.10 axes: firing rate vs. interaural time difference (left ear first – 0 – right ear first)]

Figure 12.10 ❚ ITD tuning curves for six neurons that each respond to a narrow range of ITDs. The neurons on the left respond when sound reaches the left ear first. The ones on the right respond when sound reaches the right ear first. Neurons such as these have been recorded from the barn owl and other animals. Adapted from McAlpine, 2005.

[Photo: © Niall Benvie/CORBIS]

[Figure 12.11 labels: panels (a)–(d); "Sound from straight ahead", "Sound from the right"; left-ear and right-ear axons converging on neural coincidence detectors numbered 1–9]

Figure 12.11 ❚ How the Jeffress circuit operates. Axons transmit signals from the left ear (blue) and right ear (red) to neurons, indicated by circles. (a) Sound in front: signals start in left and right channels simultaneously. (b) Signals meet at neuron 5, causing it to fire. (c) Sound to the right: signal starts in the right channel first. (d) Signals meet at neuron 3, causing it to fire.

The Jeffress detector model

Page 6


Environmental auditory perception // Sound localization: physiology of the ITD

4. Describe the binaural cues for localization. Indicate the frequencies and directions relative to the listener for which the cues are effective.

5. Describe the monaural cue for localization.

6. How is auditory space represented physiologically in single neurons? Describe the two different types of neural coding that have been proposed.

Perceptually Organizing Sounds in the Environment
So far we have been describing how single tones are localized in space. But we rarely hear just a single tone (unless you are a subject in a hearing experiment!). Our experience usually involves hearing a number of sounds simultaneously. This poses a problem for the auditory system: How can it separate one sound from another?

Consider, for example, a situation in which you are listening to music in the old-fashioned way, with "stereo" turned off so all of the music is coming from a single speaker. By doing this, you have eliminated the location information usually supplied by binaural cues, so the sound of all of the instruments appears to be coming from the speaker in front of you. But even though you have eliminated information about location, you can still make out the vocalist, the guitar, and the keyboard. "Well, of course," you might think. "After all, each of the instruments makes different sounds."

This is a good example of a situation in which our perceptual system enables us to effortlessly solve a perceptual problem that is actually extremely complex. We can appreciate why this is a complex problem by considering how the sounds coming from the loudspeaker affect vibration of the basilar membrane and therefore activation of the auditory nerve fibers.

Figure 12.13 shows the sound stimuli created by the vocalist and two instruments and the output of the loudspeaker. The problem for our auditory system is that although each sound source produces its own signal, all of the signals are combined when they are broadcast by the loudspeaker and enter the listener's ear. Each of the frequencies in this signal causes the basilar membrane to vibrate, but just as in the case of the bird and the cat in Figure 12.3, in which there was no information on the cochlea for the locations of the two sounds, there is also no information on the cochlea about which vibration is created by which instrument. We now consider how the auditory system solves this problem.
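The problem can be stated in a few lines of code: each source produces its own waveform, but the only signal that reaches the ear is their sum (the three frequencies below are arbitrary stand-ins for the vocalist, guitar, and keyboard):

```python
import numpy as np

sample_rate = 44100                            # samples per second
t = np.arange(0, 0.01, 1.0 / sample_rate)      # 10 ms of signal

vocal = 0.8 * np.sin(2 * np.pi * 220 * t)      # toy "vocalist"
guitar = 0.5 * np.sin(2 * np.pi * 330 * t)     # toy "guitar"
keyboard = 0.3 * np.sin(2 * np.pi * 523 * t)   # toy "keyboard"

# The loudspeaker broadcasts only the sum. The cochlea receives this single
# waveform and must somehow attribute each frequency component to its source.
speaker_output = vocal + guitar + keyboard
```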

Auditory Scene Analysis
A problem similar to the one above occurs when you are talking to a friend at a noisy party. Even though the sounds produced by your friend's conversation are mixed together on your cochlea with the sounds produced by all of the other people's conversations, plus perhaps music, the sound of the refrigerator door slamming, and glasses clinking, you are somehow able to separate what your friend is saying from all of the other sounds. The array of sound sources in the environment is called the auditory scene, and the


[Figure 12.12 labels: interaural time difference axis (left ear first – 0 – right ear first); right-hemisphere and left-hemisphere neuron curves; L/R response bars for stimulus lines 1, 2, 3]
Figure 12.12 ❚ (a) ITD tuning curves for broadly tuned neurons. The left curve represents the tuning of neurons in the right hemisphere; the right curve is the tuning of neurons in the left hemisphere; (b) patterns of response of the broadly tuned neurons for stimuli coming from the left (ITD indicated by line 1), in front (ITD indicated by line 2), and from the right (ITD indicated by line 3). Neurons such as this have been recorded from the gerbil. (Adapted from McAlpine, 2005.)

[Photo: © DK Limited/CORBIS]

Page 7

Environmental auditory perception // Sound localization: physiology of the ITD

• How is the ITD detected?

• The medial superior olive (MSO): the first site where the inputs from the two ears converge

• ITD detectors form connections with the inputs coming from the two ears within the first months of life

Superior olive: convergence of the inputs from the two ears

Environmental auditory perception // Sound localization: physiology of the ITD


If the sound source is directly in front of the listener, the sound reaches the left and right ears simultaneously, and signals from the left and right ears start out together, as shown in Figure 12.12a. As each signal travels along its axon, it stimulates each neuron along the axon in turn. At the beginning of the journey, neurons receive signals from only the left ear (neurons 1, 2, 3) or the right ear (neurons 9, 8, 7), but not both, and they do not fire. But when the signals both reach neuron 5 together, that neuron fires (Figure 12.12b). This neuron and the others in this circuit are called coincidence detectors, because they only fire when both signals coincide by arriving at the neuron simultaneously. The firing of neuron 5 indicates that ITD = 0.

If the sound comes from the right, similar events occur, but the signal from the right ear has a head start, as shown in Figure 12.12c, and both signals reach neuron 3 simultaneously (Figure 12.12d), so this neuron fires. This neuron, therefore, detects ITDs that occur when the sound is coming from a specific location on the right. The other neurons in the circuit fire to locations corresponding to other ITDs.

The Jeffress model therefore proposes a circuit that involves "ITD detectors," and it also proposes that there are a series of these detectors, each tuned to respond best to a specific ITD. According to this idea, the ITD will be indicated by which ITD neuron is firing. This has been called a "place code" because ITD is indicated by the place (which neuron) where the activity occurs.
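Reading out such a place code is just a matter of asking which neuron fires most. A toy sketch (the Gaussian tuning curves, their widths, and the preferred-ITD spacing are illustrative assumptions):

```python
import numpy as np

preferred_itds = np.linspace(-400, 400, 9)   # us; one preferred ITD per neuron

def population_response(true_itd_us, tuning_width_us=50.0):
    """Firing of each narrowly tuned neuron to a stimulus ITD (toy Gaussians)."""
    return np.exp(-((true_itd_us - preferred_itds) ** 2)
                  / (2.0 * tuning_width_us ** 2))

def place_code_readout(rates):
    """Place code: the stimulus ITD is signaled by WHICH neuron fires most."""
    return preferred_itds[int(np.argmax(rates))]

rates = population_response(true_itd_us=180.0)
print(place_code_readout(rates))   # -> 200.0, the nearest neuron's preferred ITD
```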

One way to describe the properties of ITD neurons is to measure ITD tuning curves, which plot the neuron's firing rate against the ITD. Recording from neurons in the brainstem of the barn owl, which has excellent auditory localization abilities, has revealed narrow tuning curves that respond best to specific ITDs, like the ones in Figure 12.13 (Carr & Konishi, 1990; McAlpine, 2005). The neurons associated with the curves on the left (blue) fire when the sound reaches the

[Figure 12.11 labels: A1, core area, belt area, parabelt area; A = anterior, P = posterior]

Figure 12.11 The three main auditory areas in the monkey cortex: the core area, which contains the primary auditory receiving area (A1); the belt area; and the parabelt area. P indicates the posterior end of the belt area, and A indicates the anterior end of the belt area. Signals, indicated by the arrows, travel from core to belt to parabelt. The dark lines indicate where the temporal lobe was pulled back to show areas that would not be visible from the surface. From Kaas, J. H., Hackett, T. A., & Tramo, M. J. (1999). Auditory processing in primate cerebral cortex. Current Opinion in Neurobiology, 9, 164–170.

Figure 12.12 How the circuit proposed by Jeffress operates. Axons transmit signals from the left ear (blue) and the right ear (red) to neurons, indicated by circles. (a) Sound in front. Signals start in left and right channels simultaneously. (b) Signals meet at neuron 5, causing it to fire. (c) Sound to the right. Signal starts in the right channel first. (d) Signals meet at neuron 3, causing it to fire. Adapted from Plack, C. J. (2005). The sense of hearing. New York: Erlbaum.

[Figure 12.12 labels: panels (a)–(d); "Sound from straight ahead", "Sound from the right"; left-ear and right-ear axons converging on neural coincidence detectors numbered 1–9]

[Figure 12.13 axes: firing rate vs. interaural time difference (left ear first – 0 – right ear first)]

Figure 12.13 ITD tuning curves for six neurons that each respond to a narrow range of ITDs. The neurons on the left respond when sound reaches the left ear first. The ones on the right respond when sound reaches the right ear first. Neurons such as these have been recorded from the barn owl and other animals. However, when we consider mammals, another story emerges. Adapted from McAlpine, D., & Grothe, B. (2003). Sound localization and delay lines: Do mammals fit the model? Trends in Neurosciences, 26, 347–350.

[Photo: Niall Benvie/CORBIS]

Page 8

Environmental auditory perception // Sound localization: physiology of the ITD


left ear first, and the ones on the right (red) fire when sound reaches the right ear first. These are the tuning curves that are predicted by the Jeffress model, because each neuron responds best to a specific ITD and the response drops off rapidly for other ITDs. However, the situation is different for mammals.

Broad ITD Tuning Curves in Mammals
The results of research in which ITD tuning curves are recorded from mammals may appear, at first glance, to support the Jeffress model. For example, Figure 12.14a shows an ITD tuning curve of a neuron in the gerbil's superior olivary nucleus (Pecka et al., 2008). This curve has a peak in the middle and drops off on either side. However, when we compare the gerbil curve to the curve for the barn owl in Figure 12.14b, we can see from the ITD scales on the horizontal axis that the gerbil curve is much broader than the owl curve. In fact, the gerbil curve is so broad that it extends far outside the range of ITDs that are actually involved in sound localization, indicated by the light bar (also see Siveke et al., 2006).

This broadness of response to different locations also occurs in the auditory cortex of the monkey. Figure 12.15 shows the responses of a neuron in a monkey's left auditory cortex to sounds located at different positions around the monkey's head. This neuron fires best to sounds on the monkey's right side and is broadly tuned, so even though it responds best when the sound is coming from about 60 degrees (indicated by the star), it also responds strongly to other locations (Recanzone et al., 2011; Woods et al., 2006).

Because of the broadness of the ITD curves in mammals, it has been proposed that coding for localization is based on broadly tuned neurons like the ones shown in Figure 12.16 (McAlpine, 2005; Grothe, 2010). According to this idea, there are broadly tuned neurons in the right hemisphere that respond when sound is coming from the left and broadly tuned

[Figure 12.14 axes: firing rate vs. ITD (μs); (a) gerbil, –400 to +400 μs, range ~800 μs; (b) owl, –40 to +40 μs, range ~80 μs]

Figure 12.14 (a) ITD tuning curve for a neuron in the gerbil superior olivary nucleus. (b) ITD tuning curve for a neuron in the barn owl’s inferior colliculus. The “range” indicator below each curve indicates that the gerbil curve is much broader than the owl curve. The gerbil curve is, in fact, broader than the range of ITDs that typically occur in the environment. This range is indicated by the light bar (between the dashed lines). © Cengage Learning 2014

[Figure 12.15 labels: azimuth positions from –90° to 180°, best response near 60° (star); raster axes: trial vs. time]

Figure 12.15 Responses recorded from a neuron in the left auditory cortex of the monkey to sounds originating at different places around the head. The monkey's position is indicated by the circle in the middle. The firing of a single cortical neuron to a sound presented at different locations around the monkey's head is shown by the records at each location. Greater firing is indicated by a greater density of dots. This neuron responds to sounds coming from a number of locations on the right. From Recanzone, G. H., Engle, J. R., & Juarez-Salinas, D. L. (2011). Spatial and temporal processing of single auditory cortical neurons and populations of neurons in the macaque monkey. Hearing Research, 271, 115–122, Figure 4. With permission from Elsevier. Based on data from Woods et al. (2006).

neurons in the left hemisphere that respond when sound is coming from the right. The location of a sound is indicated by the ratio of responding of these two types of broadly tuned neurons. For example, a sound from the left would cause the pattern of response shown in the left pair of bars in Figure 12.16b; a sound located straight ahead, by the middle pair of bars; and a sound to the right, by the far right bars.

This type of coding resembles the distributed coding we described in Chapter 3, in which information in the nervous system is based on the pattern of neural responding. This is, in fact, how the visual system signals different wavelengths of light, as we saw when we discussed color vision in Chapter 9, in which wavelengths are signaled by the pattern of response of three different cone pigments (Figure 9.11).

neuron in the left auditory cortex (monkey), responding to sounds from the right


neurons respond to localized regions of space


Page 9

Environmental auditory perception // Sound localization: ILD

• How do we localize sounds?

• Interaural level difference (ILD):

• The difference in intensity received at one ear relative to the intensity received at the other ear for the same acoustic stimulus

Environmental auditory perception // Sound localization: ILD


reaching the two ears. A difference in level between the two ears occurs because the head creates a barrier that reduces the intensity of sounds that reach the far ear. This reduction of intensity at the far ear occurs for high-frequency sounds, but not for low-frequency sounds.

We can understand why an ILD occurs for high frequencies but not for low frequencies by drawing an analogy between sound waves and water waves. Consider, for example, a situation in which small ripples in the water are approaching the boat in Figure 12.5a. Because the ripples are small compared to the boat, they bounce off the side of the boat and go no further. Now imagine the same ripples approaching the cattails in Figure 12.5b. Because the distance between the ripples is large compared to the cattails, the ripples are hardly disturbed and continue on their way. These two examples illustrate that an object can have a large effect on the wave if it is larger than the distance between the waves, but has a small effect if its size is smaller than the distance between the waves.


Figure 12.4 ❚ The principle behind interaural time difference (ITD). The tone directly in front of the listener, at A, reaches the left and the right ears at the same time. However, when the tone is off to the side, at B, it reaches the listener’s right ear before it reaches the left ear.

[Figure 12.5 labels: panels (a)–(d); 6,000 Hz; 200 Hz; acoustic shadow; spacing small/large compared to object]

Figure 12.5 ❚ Why interaural level difference (ILD) occurs for high frequencies but not for low frequencies. (a) When water ripples are small compared to an object, such as this boat, they are stopped by the object. (b) The same ripples are large compared to the cattails, so they are unaffected by the cattails. (c) The spacing between high-frequency sound waves is small compared to the head. The head interferes with the sound waves, creating an acoustic shadow on the other side of the head. (d) The spacing between low-frequency sound waves is large compared to the person’s head, so the sound is unaffected by the head.

• How do we localize sounds?

• Interaural level difference (ILD): works well at high frequencies

Page 10


Environmental auditory perception // Sound localization: ILD

• Interaural level difference (ILD):

• The difference in intensity received at one ear relative to the intensity received at the other ear for the same acoustic stimulus

• Sounds are more intense at the ear closer to the sound source

• ILD is greatest at 90 degrees and zero at 0 and 180 degrees

• ILD generally correlates with the angle of the sound source, but the correlation is not as robust as it is for ITDs

• It is most important at high frequencies.

Page 11

Environmental auditory perception // Sound localization: ILD

When we apply this principle to sound waves interacting with a listener's head, we find that high-frequency sound waves (which are small compared to the size of the head) are disrupted by the head (Figure 12.5c), but that low-frequency waves are not (Figure 12.5d). This disruption of high-frequency sound waves creates a decrease in sound intensity on the far side of the head, called the acoustic shadow (Figure 12.5c).

This effect of frequency on the interaural level difference has been measured by using small microphones to record the intensity of the sound reaching each ear in response to a sound source located at different positions relative to the head (Figure 12.6). The results show that the level is affected only slightly by changes in location for low frequencies, but that the level is greatly affected by location for higher frequencies.

Using Binaural Cues for Perceiving Azimuth Locations When we consider ITD and ILD together, we see that they complement each other. ITD provides information about the location of low-frequency sounds, and ILD provides information about the location of high-frequency sounds.
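This division of labor, often called the duplex theory of localization, can be summarized as a frequency-dependent choice of cue. The roughly 1,500 Hz crossover used below is a commonly cited approximation, not a figure from this text:

```python
def dominant_azimuth_cue(frequency_hz, crossover_hz=1500.0):
    """Which binaural cue carries the most azimuth information at a given
    frequency (the crossover value is an assumed, commonly cited figure)."""
    return "ITD (timing)" if frequency_hz < crossover_hz else "ILD (level)"

print(dominant_azimuth_cue(200))    # low frequency  -> ITD (timing)
print(dominant_azimuth_cue(6000))   # high frequency -> ILD (level)
```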

ITD and ILD provide information that enables people to judge location along the azimuth coordinate, but provide ambiguous information about the elevation of a sound source. You can demonstrate why this is so by considering a sound source located directly in front of your face at arm's length, which would be equidistant from your left and right ears, so ITD and ILD would be zero. If you now increase the sound source's elevation by moving it straight up so it is above your head, it is still equidistant from the two ears, so both ITD and ILD are still zero.

Thus, the ITD and ILD can be the same at a number of different elevations, and therefore can't reliably indicate the elevation of the sound source. Similar ambiguous information is provided when the sound source is off to the side. These places of ambiguity are illustrated by the cone of confusion shown in Figure 12.7. All points on this cone have the same ILD and ITD. For example, points A and B would result in the same ILD and ITD because they are both the same distance from the left ear and from the right ear. Similar situations occur for other points on the cone.
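The ambiguity is easy to verify numerically. Treating the ears as two points and ignoring the head itself (a deliberate simplification), any source on the median plane gives ITD = 0 whatever its elevation:

```python
import numpy as np

SPEED_OF_SOUND = 343.0                    # m/s
left_ear = np.array([-0.09, 0.0, 0.0])    # ears ~18 cm apart (assumed)
right_ear = np.array([0.09, 0.0, 0.0])

def itd_from_position(source_xyz):
    """ITD as a simple path-length difference between the two ears."""
    d_left = np.linalg.norm(source_xyz - left_ear)
    d_right = np.linalg.norm(source_xyz - right_ear)
    return (d_left - d_right) / SPEED_OF_SOUND

print(itd_from_position(np.array([0.0, 1.0, 0.0])))   # 1 m straight ahead -> 0.0
print(itd_from_position(np.array([0.0, 0.0, 1.0])))   # 1 m overhead       -> 0.0
# Same ITD (and, by symmetry, same ILD) at two very different elevations.
```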

The ambiguous nature of the information provided by ITD and ILD at different elevations means that another source of information is needed to locate sounds along the elevation coordinate. This information is provided by a monaural cue—a cue that depends on information from only one ear.

Monaural Cue for Localization
The primary monaural cue for localization is called a spectral cue, because the information for localization is contained in differences in the distribution (or spectrum) of frequencies that reach the ear from different locations. These differences are caused by the fact that before the sound stimulus enters the auditory canal, it is reflected from the head and within the various folds of the pinnae. The effect of this interaction with the head and pinnae has been measured by placing small microphones inside a listener's ears and comparing frequencies from sounds that are coming from different directions.

This effect is illustrated in Figure 12.8, which shows the frequencies picked up by the microphone when a broadband sound (one containing many frequencies) is presented at elevations of 15 degrees above the head and 15 degrees below the head. Sounds coming from these two locations would result in the same ITD and ILD, because they are the same distance from the left and right ears, but differences in the way the sounds bounce around within the pinna create different patterns of frequencies for the two locations (King et al., 2001).
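A toy version of the spectral cue: the same broadband source is filtered by an elevation-dependent "pinna" notch, so the spectrum at the eardrum differs between elevations even though ITD and ILD are identical. The notch frequencies, depth, and width below are invented for illustration:

```python
import numpy as np

freqs = np.linspace(3000, 10000, 8)   # Hz, roughly the band shown in Figure 12.8

def pinna_gain_db(freqs_hz, notch_hz, depth_db=8.0, width_hz=800.0):
    """Toy elevation-dependent pinna filter: a single Gaussian spectral notch."""
    return -depth_db * np.exp(-((freqs_hz - notch_hz) ** 2)
                              / (2.0 * width_hz ** 2))

spectrum_up = pinna_gain_db(freqs, notch_hz=6000.0)     # e.g., +15 deg elevation
spectrum_down = pinna_gain_db(freqs, notch_hz=8000.0)   # e.g., -15 deg elevation

# Identical source, identical binaural cues, but two different spectra at the
# eardrum -- the monaural pattern a listener can learn to map to elevation.
print(np.round(spectrum_up, 1))
print(np.round(spectrum_down, 1))
```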


[Figure 12.6 axes: interaural level difference (dB), 0–10, vs. frequency (Hz), 20–5,000; curves for source angles θ = 10°, 45°, 90°]

Figure 12.6 ❚ The three curves indicate interaural level difference (ILD) as a function of frequency for three different sound source locations. Note that the ILD is greater for locations farther to the side and is greater for all three locations at higher frequencies. (Adapted from Hartmann, 1999.)


Interaural level differences for tones of different frequencies presented at different positions

Page 12

Environmental auditory perception // Sound localization: physiology of the ILD

• Lateral superior olive (LSO): contains neurons that are sensitive to the difference in intensity between the two ears

• Excitatory connections to the LSO come from the ipsilateral ear

• Inhibitory connections to the LSO come from the contralateral ear (see the sketch below)
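A minimal sketch of this excitation-inhibition arrangement (the linear subtraction and the rectification are common textbook simplifications, not claims from the slide):

```python
def lso_response(ipsi_level_db, contra_level_db, gain=1.0):
    """Toy LSO unit: excitation from the ipsilateral ear minus inhibition
    from the contralateral ear. The drive is roughly the ILD in dB;
    firing rates are rectified because they cannot be negative."""
    drive = gain * (ipsi_level_db - contra_level_db)
    return max(drive, 0.0)

# Sound on the right: a right-LSO unit is driven, a left-LSO unit is suppressed.
print(lso_response(ipsi_level_db=65, contra_level_db=58))   # right LSO -> 7.0
print(lso_response(ipsi_level_db=58, contra_level_db=65))   # left LSO  -> 0.0
```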

Environmental auditory perception // Sound localization: ITD vs. ILD

ITD piu’ importante per le frequenze

basse

ILD piu’ importante per le frequenze

alte

Page 13

Environmental auditory perception // Sound localization: vertical (elevation)



The primary monaural cue for localization is called a spectral cue, because the information for localization is contained in differences in the distribution (or spectrum) of frequencies that reach each ear from different locations. These differences are caused by the fact that before the sound stimulus enters the auditory canal, it is reflected from the head and within the various folds of the pinnae (Figure 12.8a). The effect of this interaction with the head and pinnae has been measured by placing small microphones inside a listener's ears and comparing frequencies from sounds that are coming from different directions.

This effect is illustrated in Figure 12.8b, which shows the frequencies picked up by the microphone when a broadband sound (one containing many frequencies) is presented at elevations of 15 degrees above the head and 15 degrees below the head. Sounds coming from these two locations would result in the same ITD and ILD because they are the same distance from the left and right ears, but differences in the way the sounds bounce around within the pinna create different patterns of frequencies for the two locations (King et al., 2001). The importance of the pinna for determining elevation has been demonstrated by showing that smoothing out the nooks and crannies of the pinnae with molding compound makes it difficult to locate sounds along the elevation coordinate (Gardner & Gardner, 1973).
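The logic of the spectral cue can be caricatured as template matching: the pinna stamps each elevation with a characteristic spectral shape, and the listener picks the stored shape that best matches the incoming spectrum. The sketch below is one such toy model; the "templates" are invented for illustration and would in reality come from ear-canal measurements like those in Figure 12.8b.

```python
import numpy as np

freqs = np.linspace(3e3, 10e3, 50)        # the 3-10 kHz band shown in Figure 12.8b
templates = {                             # made-up direction-specific spectral shapes (dB)
    +15: -np.abs(freqs - 7e3) / 1e3,      # pretend notch pattern for +15 deg elevation
    -15: -np.abs(freqs - 5e3) / 1e3,      # pretend notch pattern for -15 deg elevation
}

def estimate_elevation(spectrum_db):
    """Return the elevation whose stored spectral template correlates best."""
    return max(templates,
               key=lambda el: np.corrcoef(spectrum_db, templates[el])[0, 1])

observed = templates[-15] + 0.1 * np.random.default_rng(0).standard_normal(freqs.size)
print(estimate_elevation(observed))       # -> -15
```

On this view, a mold that reshapes the pinna effectively swaps in new templates, which is why localization along the elevation coordinate breaks down until the listener relearns them, as in the Hofman study described next.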

The idea that localization can be affected by using a mold to change the inside contours of the pinnae was also demonstrated by Paul Hofman and coworkers (1998). They determined how localization changes when the mold is worn for several weeks, and then what happens when the mold is removed. The results for one listener’s localization performance measured before the mold was inserted are shown in Figure 12.9a. Sounds were presented at positions indicated by the intersections of the black grid. Average localization performance is indicated by the blue grid. The overlap between the two grids indicates that localization was fairly accurate.

After measuring initial performance, Hofman fitted his listeners with molds that altered the shape of the pinnae and therefore changed the spectral cue. Figure 12.9b shows that localization was disrupted immediately after the molds were inserted.


Figure 12.7 The “cone of confusion.” There are many pairs of points on this cone that have the same left-ear distance and right-ear distance and so result in the same ILD and ITD. There are also other cones in addition to this one. © Cengage Learning


Figure 12.8 (a) Pinna showing sound bouncing around in nooks and crannies. (b) Frequency spectra recorded by a small microphone inside the listener's right ear for the same broadband sound coming from two different locations. The difference in the pattern when the sound is 15 degrees above the head (blue curve) and 15 degrees below the head (red curve) is caused by the way different frequencies bounce around within the pinna when entering it from different angles. Adapted from Plack, C. J. (2005). The sense of hearing, Figure 9.11. New York: Psychology Press. Ear photo by Bruce Goldstein.

cone of confusion

Page 14: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Sound localization: vertical

• Monaural cues


Environmental acoustic perception // Sound localization: active perception

• Potential problems with using ITD and ILD cues for sound localization (points on the cone of confusion produce identical ITDs and ILDs)

• These can be overcome by using active perception:

• turning the head (see the sketch below)
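A small sketch (illustrative assumptions throughout) of why a head turn helps: a source directly ahead and one directly behind both produce ITD = 0, but rotating the head shifts their ITDs in opposite directions, breaking the front-back ambiguity of the cone of confusion.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def itd(source, head_turn_deg, head_radius=0.0875):
    """ITD (s) for point ears after rotating the head by head_turn_deg."""
    a = math.radians(head_turn_deg)
    left_ear = (-head_radius * math.cos(a), -head_radius * math.sin(a))
    right_ear = (head_radius * math.cos(a), head_radius * math.sin(a))
    return (math.dist(source, left_ear) - math.dist(source, right_ear)) / SPEED_OF_SOUND

front, back = (0.0, 2.0), (0.0, -2.0)
for src in (front, back):
    print(itd(src, 0.0), itd(src, 10.0))
# Both sources give ITD = 0 with the head still, but opposite-signed
# ITDs (about +/- 0.09 ms) after a 10-degree turn: the ambiguity is resolved.
```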

Page 15: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Sound localization: distance


Environmental acoustic perception // Perceiving sound distance

• How do we estimate the distance of a sound source?

• The relative intensity of the sound (loudness)

• Inverse square law: as the distance of the sound source increases, sound intensity decreases with the square of the distance (see the worked example below)

• Spectral components of the sounds:

• Higher frequencies lose energy more rapidly than lower frequencies as sounds travel through space (d > 1000 m)

• Example: thunder

• Reverberant energy

• Relative amount of direct energy (nearby sources) vs. reflected energy (distant sources)
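A quick worked example of the inverse square law (a sketch, not from the slides): since intensity falls as 1/r^2, the level change in decibels between two distances is 10·log10((r1/r2)^2).

```python
import math

def level_change_db(r_near, r_far):
    """Level change implied by the inverse square law (intensity ~ 1/r^2)."""
    return 10 * math.log10((r_near / r_far) ** 2)

print(level_change_db(1.0, 2.0))   # -6.0 dB: doubling the distance
print(level_change_db(1.0, 10.0))  # -20.0 dB: ten times the distance
```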

Page 16: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Perceiving sound distance


If the sound source is directly in front of the listener, the sound reaches the left and right ears simultaneously, and signals from the left and right ears start out together, as shown in Figure 12.12a. As each signal travels along its axon, it stimulates each neuron along the axon in turn. At the beginning of the journey, neurons receive signals from only the left ear (neurons 1, 2, 3) or the right ear (neurons 9, 8, 7), but not both, and they do not fire. But when the signals both reach neuron 5 together, that neuron fires (Figure 12.12b). This neuron and the others in this circuit are called coincidence detectors, because they fire only when both signals coincide by arriving at the neuron simultaneously. The firing of neuron 5 indicates that ITD = 0.

If the sound comes from the right, similar events occur, but the signal from the right ear has a head start, as shown in Figure 12.12c, and both signals reach neuron 3 simultaneously (Figure 12.12d), so this neuron fires. This neuron, therefore, detects ITDs that occur when the sound is coming from a specific location on the right. The other neurons in the circuit fire to locations corresponding to other ITDs.

The Jeffress model therefore proposes a circuit that involves "ITD detectors," and it also proposes that there are a series of these detectors, each tuned to respond best to a specific ITD. According to this idea, the ITD will be indicated by which ITD neuron is firing. This has been called a "place code" because ITD is indicated by the place (which neuron) where the activity occurs.
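The place-code idea can be sketched computationally: give each model "coincidence detector" its own internal delay, and read the ITD off whichever detector responds most. The sketch below (Python, with made-up signals) implements that argmax over lags; it is a toy version of the Jeffress circuit, not a physiological model.

```python
import numpy as np

def jeffress_place_code(left, right, max_lag):
    """One coincidence detector per candidate lag; the winner encodes the ITD.

    Each detector multiplies-and-sums the two ear signals at its own
    internal delay (in samples). A positive result means the left ear
    lagged, i.e. the sound came from the right.
    """
    def coincidence(lag):
        if lag >= 0:
            return float(np.dot(left[lag:], right[:len(right) - lag]))
        return float(np.dot(right[-lag:], left[:len(left) + lag]))
    return max(range(-max_lag, max_lag + 1), key=coincidence)

rng = np.random.default_rng(0)
sound = rng.standard_normal(2000)
left_ear = np.concatenate([np.zeros(3), sound])[:2000]  # left ear hears it 3 samples late
print(jeffress_place_code(left_ear, sound, max_lag=8))  # -> 3 (source on the right)
```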

One way to describe the properties of ITD neurons is to measure ITD tuning curves, which plot the neuron's firing rate against the ITD. Recording from neurons in the brainstem of the barn owl, which has excellent auditory localization abilities, has revealed narrow tuning curves that respond best to specific ITDs, like the ones in Figure 12.13 (Carr & Konishi, 1990; McAlpine, 2005). The neurons associated with the curves on the left (blue) fire when the sound reaches the left ear first.


Figure 12.11 The three main auditory areas in the monkey cortex: the core area, which contains the primary auditory receiving area (A1); the belt area; and the parabelt area. P indicates the posterior end of the belt area, and A indicates the anterior end. Signals, indicated by the arrows, travel from core to belt to parabelt. The dark lines indicate where the temporal lobe was pulled back to show areas that would not be visible from the surface. From Kaas, J. H., Hackett, T. A., & Tramo, M. J. (1999). Auditory processing in primate cerebral cortex. Current Opinion in Neurobiology, 9, 164–170.

Figure 12.12 How the circuit proposed by Jeffress operates. Axons transmit signals from the left ear (blue) and the right ear (red) to neurons, indicated by circles. (a) Sound in front. Signals start in left and right channels simultaneously. (b) Signals meet at neuron 5, causing it to fire. (c) Sound to the right. Signal starts in the right channel first. (d) Signals meet at neuron 3, causing it to fire. Adapted from Plack, C. J. (2005). The sense of hearing. New York: Erlbaum.


Figure 12.13 ITD tuning curves for six neurons that each respond to a narrow range of ITDs. The neurons on the left respond when sound reaches the left ear first. The ones on the right respond when sound reaches the right ear first. Neurons such as these have been recorded from the barn owl and other animals. However, when we consider mammals, another story emerges. Adapted from McAlpine, D., & Grothe, B. (2003). Sound localization and delay lines: Do mammals fit the model? Trends in Neurosciences, 26, 347–350.


pB: spatial tuning

aB: frequency tuning

simple sounds

complex integrative region

Environmental acoustic perception // Perceiving sound distance


for vision (see page 84), are called the what and where pathways for audition (Kaas & Hackett, 1999). The blue arrow in Figure 12.17 indicates the what pathway, which starts in the front (anterior) part of the core and belt (indicated by the "A" in Figure 12.11) and extends to the prefrontal cortex. The what pathway is responsible for identifying sounds. The red arrow in Figure 12.17 indicates the where pathway, which starts in the rear (posterior) part of the core and belt and extends to the prefrontal cortex. This is the pathway associated with locating sounds.

The branching of the auditory system into these two pathways or streams is indicated by the difference between how information in the posterior and anterior areas of the belt is processed. We have seen that neurons in the posterior belt have better spatial tuning than neurons in A1. Research has also shown that while monkey A1 neurons are activated by simple sounds such as pure tones, neurons in the anterior area of the belt respond to more complex sounds, such as monkey calls (vocalizations recorded from monkeys in the jungle; Rauschecker & Tian, 2000). Thus, the posterior belt is associated with spatial tuning, and the anterior belt is associated with identifying different types of sounds. This difference between posterior and anterior areas of the belt represents the difference between where and what auditory pathways.

Additional evidence for what and where auditory pathways is provided by Stephen Lomber and Shveta Malhotra (2008), who showed that temporarily deactivating a cat's anterior auditory areas by cooling the cortex disrupts the cat's ability to tell the difference between two patterns of sounds, but does not affect the cat's ability to localize sounds (Figure 12.18a). Conversely, deactivating the cat's posterior auditory areas disrupts the cat's ability to localize sounds, without affecting the cat's ability to tell the difference between different patterns of sounds (Figure 12.18b). If the design of this experiment seems familiar, it is because it is the same as the design of Ungerleider and Mishkin's (1982) experiment that demonstrated what and where visual pathways in the monkey (compare Figure 12.18 to Figure 4.13). In both experiments, lesioning one area (for the vision experiment) or deactivating one area (for the hearing experiment) eliminated a what function, and lesioning or deactivating another area eliminated a where function.

Cases of human brain damage also support the what/where idea (Clarke et al., 2002). For example, Figure 12.19a shows the areas of the cortex that are damaged in J.G., a 45-year-old man with temporal lobe damage caused by a head injury, and E.S., a 64-year-old woman with parietal and frontal lobe damage caused by a stroke. Figure 12.19b shows that J.G. can locate sounds, but his recognition is poor, whereas E.S. can recognize sounds, but her ability to locate them is poor. Thus, J.G.'s what stream is damaged, and E.S.'s where stream is damaged. Other researchers have also provided evidence for auditory what and where pathways by using brain scanning to show that what and where tasks activate different brain areas in humans (Alain et al., 2001, 2009; De Santis et al., 2007; Wissinger et al., 2001).

We can summarize where information about auditory localization is processed in the cortex as follows: Lesion and cooling studies indicate that A1 is important for localization. However, additional research indicates that processing information about location also occurs in the belt area and then continues farther in the where processing stream, which extends from the temporal lobe to the prefrontal area in the frontal lobe.


Figure 12.17 Auditory what and where pathways. The blue arrow from the anterior core and belt is the what pathway. The red arrow from the posterior core and belt is the where pathway. Adapted from Poremba, A., Saunders, R. C., Crane, A. M., Cook, M., Sokoloff, L., & Mishkin, M. (2003). Functional mapping of the primate auditory system. Science, 299, 568–572.


Figure 12.18 Results of Lomber and Malhotra's (2008) experiment. (a) When the anterior (what) auditory area of the cat was deactivated by presenting a small cooling probe within the purple area, the cat could not identify sounds but could locate them. (b) When the posterior (where) auditory area was deactivated by presenting a cooling probe within the green area, the cat could not locate sounds but could identify them. © Cengage Learning 2014

sound localization

identification of "auditory objects"

Page 17: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Perceiving sound distance

• What happens in ecological (natural) situations?

• An acoustic environment can be a very complex place

• Multiple sound sources

• How does the auditory system distinguish among these different sources?

• Source segregation, or auditory scene analysis

Environmental acoustic perception // Auditory scene analysis

Page 18: Corso di Principi e Modelli della Percezione Prof ...

• The "cocktail party" effect:

• we manage to attend to one conversation among many (Colin Cherry, 1953)

• we can use spatial, temporal, and spectral cues to separate the streams, but we cannot attend to more than one stream at a time

Environmental acoustic perception // Auditory scene analysis

• Source segregation, or auditory scene analysis

• Possible strategies:

• Spatial separation between sounds

• Separation based on the sounds' spectra or on their temporal qualities

• Auditory stream segregation: the perceptual organization of a complex acoustic signal into distinct acoustic events that are perceived as separate auditory streams

Environmental acoustic perception // Auditory scene analysis

Page 19: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Auditory scene analysis


• Auditory stream segregation: the perceptual organization of a complex acoustic signal into distinct acoustic events that are perceived as separate auditory streams

Environmental acoustic perception // Auditory scene analysis

Bach's "Toccata and Fugue"

Page 20: Corso di Principi e Modelli della Percezione Prof ...

Environmental acoustic perception // Auditory scene analysis

(See Miller & Heise, 1950, for an early demonstration of auditory stream segregation.)

This grouping of tones into streams by similarity of pitch is also demonstrated by an experiment done by Bregman and Alexander Rudnicky (1975). The listener is first presented with two standard tones, X and Y (Figure 12.16a). When these tones are presented alone, it is easy to perceive their order (XY or YX). However, when these tones are sandwiched between two distractor (D) tones (Figure 12.16b), it becomes very hard to judge their order. The name distractor tones is well chosen: they distract the listener, making it difficult to judge the order of tones X and Y. But the distracting effect of the D tones can be eliminated by adding a series of captor (C) tones (Figure 12.16c).


Figure 12.14 ❚ Four measures of a composition by J. S. Bach (Choral Prelude on Jesus Christus unser Heiland, 1739). When played rapidly, the upper notes become perceptually grouped, and the lower notes become perceptually grouped, a phenomenon called auditory stream segregation.


Figure 12.15 ❚ (a) When high and low tones are alternated slowly, auditory stream segregation does not occur, so the listener perceives alternating high and low tones. (b) Faster alternation results in segregation into high and low streams.
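The rate dependence in Figure 12.15 can be caricatured with a simple grouping rule (entirely illustrative thresholds, not Bregman's model): a tone joins an existing stream only if its pitch jump from that stream's last tone is small relative to the time elapsed, so large jumps are tolerated at slow rates but force a split at fast rates.

```python
def segregate(tones, tolerance_semitones_per_s=60.0):
    """Toy stream segregation: tones are (time_s, pitch_semitones) pairs."""
    streams = []
    for t, p in tones:
        best = None
        for s in streams:
            last_t, last_p = s[-1]
            # Join only if the pitch jump is small for the elapsed time.
            if abs(p - last_p) <= tolerance_semitones_per_s * (t - last_t):
                if best is None or abs(p - last_p) < abs(p - best[-1][1]):
                    best = s
        if best is not None:
            best.append((t, p))
        else:
            streams.append([(t, p)])
    return streams

hi_lo = lambda period: [(i * period, 12 * (i % 2)) for i in range(8)]
print(len(segregate(hi_lo(0.40))))  # slow alternation -> 1 stream (Hi-Lo-Hi-Lo...)
print(len(segregate(hi_lo(0.08))))  # fast alternation -> 2 streams (Hi... and Lo...)
```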


Figure 12.16 ❚ Bregman and Rudnicky’s (1975) experiment. (a) The standard tones X and Y have different pitches. (b) The distractor (D) tones group with X and Y, making it difficult to judge the order of X and Y. (c) The addition of captor (C) tones with the same pitch as the distractor tones causes the distractor tones to form a separate stream (grouping by similarity), making it easier to judge the order of tones X and Y. (Based on Bregman & Rudnicky, 1975.)

• Auditory stream segregation: the perceptual organization of a complex acoustic signal into distinct acoustic events that are perceived as separate auditory streams

Environmental acoustic perception // Auditory scene analysis

These tones work as "captors" because they have the same pitch as the distractors, so they capture the distractors and form a stream that separates the distractors from tones X and Y. The result is that X and Y are perceived as one stream and the distractors as another, making it much easier to perceive the order of X and Y.

Figure 12.17 provides another demonstration of grouping by similarity. Figure 12.17a shows two streams of sound, one a series of similar repeating notes (red), and the other a scale that goes up (blue). Figure 12.17b shows how this stimulus is perceived if the tones are presented fairly rapidly. At first the two streams are separated, so listeners simultaneously perceive the same note repeating and a scale. However, when the frequencies of the two stimuli become similar, something interesting happens. Grouping by similarity of frequency occurs, and perception changes to a back-and-forth "galloping" between the tones of the two streams. Then, as the scale continues upward and the frequencies become more separated, the two sequences are again perceived as separated.

A final example of how similarity of pitch causes grouping is an effect called the scale illusion, or melodic channeling. Diana Deutsch (1975, 1996) demonstrated this effect by presenting two sequences of notes simultaneously through earphones, one to the right ear and one to the left (Figure 12.18a). Notice that the notes presented to each ear jump up and down and do not create a scale. However, Deutsch's listeners perceived smooth sequences of notes in each ear, with the higher notes in the right ear and the lower ones in the left ear (Figure 12.18b). Even though each ear received both high and low notes, grouping by similarity of pitch caused listeners to group the higher notes in the right ear (which started with a high note) and the lower notes in the left ear (which started with a low note).

The scale illusion highlights an important property of perceptual grouping. Most of the time, the principles of auditory grouping help us to accurately interpret what is happening in the environment. It is most effective to perceive similar sounds as coming from the same source, because this is what usually happens in the environment. In Deutsch's experiment, the perceptual system applies the principle of grouping by similarity to the artificial stimuli presented through earphones and makes the mistake of assigning similar pitches to the same ear. But most of the time, when psychologists aren't controlling the stimuli, sounds with similar frequencies tend to be produced by the same sound source, so the auditory system usually uses pitch to correctly determine where sounds are coming from.

Proximity in Time  We have already seen that sounds that stop and start at different times tend to be produced by different sources. If you are listening to one instrument playing and then another one joins in later, you know that two sources are present because of the cue of onset time. Another time cue is based on the fact that sounds that occur in rapid progression tend to be produced by the same source. We can illustrate the importance of timing in stream segregation by returning to our examples of grouping by similarity. Before stream segregation by similarity of timbre or pitch can occur, tones with similar timbres or frequencies have to occur close together in time. If the tones are too far apart in time, stream segregation does not occur.


Figure 12.17 ❚ (a) Two sequences of stimuli: a series of similar notes (red) and a scale (blue). (b) Perception of these stimuli: Separate streams are perceived when they are far apart in frequency, but when the frequencies are in the same range, the tones appear to jump back and forth between stimuli.


Figure 12.18 ❚ (a) These stimuli were presented to a listener’s right ear (red) and left ear (blue) in Deutsch’s (1975) scale illusion experiment. Notice how the notes presented to each ear jump up and down. (b) What the listener hears: Although the notes in each ear jump up and down, the listener perceives a smooth sequence of notes in each ear. This effect is called the scale illusion, or melodic channeling. (Adapted from Deutsch, 1975.)

scale illusion