Psychoacoustics - University of...

Page 1: Psychoacoustics - University of Ljubljanamafija.fmf.uni-lj.si/seminar/files/2013_2014/Psychoacoustics.pdf · listener which occurs as a result of sound perception. Psychoacoustics

Psychoacoustics

Author: Nejc Rosenstein

Advisor: Simon Širca

8. 1. 2014

Abstract

We introduce psychoacoustics as a branch of physics which explores the link between the physical properties of sound and the listener's perception. We describe experiments which measure threshold, localization and frequency selectivity. We introduce simple physical models of the perception of sound in the human pinna and cochlea. The results of experiments in psychoacoustics, which are in agreement with the predictions of both models, are presented.

Contents

1 Introduction
2 Thresholds
3 Frequency selectivity
3.1 Physical model
3.2 Auditory filters
4 Localization
4.1 Physical model of pinna
4.2 Psychoacoustical investigation
4.3 Experimentation using headphones
5 Conclusion

1 Introduction

Two major disciplines dominate research on the human auditory system: auditory physiology, which focuses on the functions of biological systems that affect the hearing process, and psychoacoustics [1]. The latter deals with the relation between sound, i.e. the physical stimulus, and the sensation which occurs in the listener as a result of sound perception. Psychoacoustics therefore encompasses research on thresholds, frequency analysis and masking, perception of pitch, timbre and loudness, analysis of the auditory scene, temporal processing and other properties of the auditory system. Psychophysical experiments involve listeners as test subjects, who usually have to make some judgment about the sound they detected and produce a certain response. Those experiments have to be planned carefully, as much effort is needed to perform them properly, especially when scientists aim to reproduce a certain sound phenomenon by means of some device, for example headphones.

2 Thresholds

The sound pressure level (SPL) is defined as $L = 20 \log_{10}(p/p_0)$, expressed in decibels (dB), where $p$ is the measured sound pressure and $p_0$ is the reference pressure, chosen to lie close to the absolute threshold, i.e. the lowest sound level which can be detected by humans [2]. Since it is impossible to determine the exact value of the threshold, it is usually defined as the sound level at which the probability of detection equals a certain value, which is determined experimentally. In a common experiment where test subjects have to tell whether they perceived a sound or not (two-choice task), that probability usually equals 75% [2]. The threshold is frequency-dependent and differs depending on whether the listener is using one or both ears (Fig. 1). Absolute thresholds are also important in psychoacoustical experiments where quiet but still detectable signals need to be produced. One such experiment is described in section 3.2.
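The definition of the sound pressure level can be illustrated with a short numerical sketch. It assumes the conventional reference pressure of 20 µPa for sound in air, a value that the text above does not state explicitly; the function names are chosen here for illustration only.

```python
import math

# Assumed reference pressure for airborne sound (not given in the text).
P_REF = 20e-6  # Pa

def sound_pressure_level(p, p_ref=P_REF):
    """Return the level L = 20 log10(p / p_ref) in dB for an RMS pressure p (Pa)."""
    return 20.0 * math.log10(p / p_ref)

def pressure_from_level(level_db, p_ref=P_REF):
    """Invert the definition: RMS pressure (Pa) that corresponds to a level in dB."""
    return p_ref * 10.0 ** (level_db / 20.0)

if __name__ == "__main__":
    for p in (20e-6, 2e-3, 1.0):
        print(f"p = {p:g} Pa -> L = {sound_pressure_level(p):.1f} dB SPL")
```

Because the scale is logarithmic, every tenfold increase in pressure adds 20 dB, so a pressure equal to the reference lies at 0 dB.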

Figure 1: [Left] Measurements of absolute thresholds for binaural (both-ears) and monaural (one-ear) listening. The shape of the ear and other physiological factors affect our hearing in many ways. Monaural listening therefore leads to a different perception of loudness and to different frequencies in the detected spectrum, which results in a different frequency dependence of the threshold. [Right] The difference between the measurement at the eardrum and the measurement at the position of the center of the listener's head once the listener is removed from that position. We see that the ear shape affects the detected frequency spectrum. The role of ear shape in auditory perception is discussed further in section 4. Pictures are taken from [3].

3 Frequency selectivity

The probability of detecting a sound from a certain source can decrease dramatically if the listener is exposed to other sounds at the same time; this phenomenon is called masking [2]. While the most obvious example of masking is the inability to detect a quiet sound in the presence of a loud one, masking can also occur if two sound signals have similar or identical frequency components. Discoveries in the field of physiology shed some light on this phenomenon; in 1961, the Nobel Prize in Physiology or Medicine was awarded to Georg von Békésy [4] for the discovery of the mechanical properties of the cochlea that provide an explanation of human frequency analysis and help to explain the masking of sounds with similar frequencies.

3.1 Physical model

By developing a new method to dissect the human ear, Békésy could perform experiments on a partly intact human cochlea (Fig. 2). He found that each part of the basilar membrane corresponds to a certain frequency; high frequencies of sound stimulate the parts of the membrane which are close to the outer ear, and low frequencies stimulate a response of the membrane at the end of the cochlea. We are not going to present the physiological structure of the inner ear at this point; instead we will focus on a basic mechanical model, which was used by Békésy himself when he planned his physiological observations [5].

Figure 2: [Left] A cross-section of the human cochlea [6]. [Right] A schematic representation of the cochlea's shape [7]. The inside is divided into two fluid compartments which are separated by a cochlear partition (CP), where the basilar membrane (BM) is located.

The physical model we will present here is based on a very simplified structure of the cochlea, and its sketch is included in Fig. 3.

Figure 3: [Top] Simplified model of the cochlea interior, proposed by Békésy [5]. [Bottom] The coordinate system for modeling the unrolled cochlea [7].

The interior of the cochlea is divided into two separate compartments which are filled with fluid and are separated by a wall. The large opening in the separating wall is covered by the basilar membrane, a membrane with known mass, stiffness and damping coefficient (all these properties were measured by Békésy). Vibrations enter the cochlea through the oval window, which is covered by a membrane; the round window, located in the other compartment, is also covered by a membrane, and its purpose is to allow the movement of the incompressible fluid in the cochlea. Even though Békésy included another opening which connects both compartments (the helicotrema), the exchange of fluid between them is neglected in the fluid model.

Even though the Reynolds number of the flow inside both compartments is low, we can treat the flow inside the cochlea as inviscid to first order of approximation [5]. If the pulses of sound are short, we may assume that the fluid flow inside the cochlea is irrotational [5]; its velocity can therefore be expressed as $\mathbf{v} = \nabla\Phi$, where $\Phi$ is the velocity potential. Since the compartments are separated, the velocity potentials in them differ, and we denote them by $\Phi_1$ (upper compartment) and $\Phi_2$ (lower compartment). The equations below are written for the upper compartment. The fluid is assumed to be incompressible, therefore we write:

$$\nabla^2 \Phi_1 = 0 \tag{1}$$

Since irrotational flow is assumed, Bernoulli's law applies [8]. We can neglect the term which contains the square of the fluid velocity, since the velocities are low. In terms of the velocity potential, Bernoulli's equation is therefore written as:

$$p_1 + \rho \frac{\partial \Phi_1}{\partial t} = \text{const.} \tag{2}$$

where $\rho$ denotes the density of the fluid and $p_1$ the pressure. The velocity of the fluid in the direction perpendicular to the membrane is expressed as $\partial\Phi_1/\partial z$, and it equals the velocity of the membrane:

$$\frac{\partial \eta}{\partial t} = \frac{\partial \Phi_1}{\partial z}. \tag{3}$$

The three equations above describe the motion of the fluid in the first, upper compartment. The same set is then written for the second compartment, which is physically separated from the first one. The next equation connects the difference of the fluid pressures on both sides of the membrane with the force on the membrane due to its displacement $\eta(x, t)$:

$$m'(x)\frac{\partial^2 \eta}{\partial t^2} + \beta'(x)\frac{\partial \eta}{\partial t} + \kappa'(x)\,\eta = p_2(x, 0, t) - p_1(x, 0, t). \tag{4}$$

The values of the membrane mass per unit area $m'$ and the functions for the damping coefficient per unit area $\beta'$ and the stiffness per unit area $\kappa'$ that were used to calculate the results were chosen to represent the actual properties of the basilar membrane. In addition to the above equations, two sets of equations similar to (3) and (4) are needed to describe the displacement of the membrane covering the round window, $\xi_r$, and the displacement of the oval window membrane, $\xi_o$ (see Fig. 3). The last equation,

$$\xi_o = F, \tag{5}$$

states that the displacement $\xi_o$ of the membrane covering the oval window is equal to the incoming signal $F$, which is transferred mechanically to the membrane. We are not going to describe the methods required to solve the model here and will instead focus only on the results. The calculated membrane positions along the length of the cochlea are presented in Fig. 4. Despite many assumptions and simplifications, the calculated movement of the basilar membrane possesses the most important property of Békésy's observations: the position of the largest excitation on the basilar membrane is frequency-dependent, as shown in Fig. 4. Békésy also discovered that the membrane excitations only propagate up to a certain distance, which is frequency-dependent. The higher the input frequency, the shorter that distance becomes. The membrane movements calculated by the physical model possess that attribute as well (see Fig. 4).
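The frequency-to-place behaviour can be made plausible with a much cruder picture than the full model. The sketch below ignores the fluid coupling entirely and treats each membrane section as the damped oscillator on the left-hand side of Eq. (4), so it is not the model of [5]. The exponentially decaying stiffness profile and all numerical values are hypothetical, chosen only so that the resonances span the audible range.

```python
import math

# Hypothetical, illustrative parameters (not taken from the text or from [5]).
MASS = 0.5            # membrane mass per unit area m', kg/m^2
STIFFNESS_0 = 1.0e10  # stiffness per unit area at x = 0, N/m^3
DECAY = 400.0         # assumed exponential decay rate of the stiffness, 1/m
LENGTH = 0.035        # assumed length of the membrane, m

def stiffness(x):
    """Assumed stiffness profile kappa'(x) = kappa_0 * exp(-DECAY * x)."""
    return STIFFNESS_0 * math.exp(-DECAY * x)

def local_resonance(x):
    """Undamped resonant frequency (Hz) of the membrane section at position x (m)."""
    return math.sqrt(stiffness(x) / MASS) / (2.0 * math.pi)

def resonant_place(freq):
    """Position x (m) at which the local resonance equals freq (Hz)."""
    omega = 2.0 * math.pi * freq
    return math.log(STIFFNESS_0 / (MASS * omega ** 2)) / DECAY

if __name__ == "__main__":
    for f in (8000.0, 2000.0, 500.0, 125.0):
        print(f"{f:7.0f} Hz -> x = {1000.0 * resonant_place(f):5.1f} mm")
```

In this simplified picture a stiffness that decreases along the membrane is enough to place high frequencies near the oval window and low frequencies towards the far end, which is the qualitative trend described above.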

Figure 4: [Top] The curves represent the shape of the membrane at different times. The wave envelope is drawn with a dashed line, and it shows that the displacement of the membrane has an absolute maximum at some point. [Bottom] Wave envelopes of membrane shapes. The maximum displacement appears at a frequency-dependent position along the length of the membrane (different curves were calculated using input signals with different frequencies) [5].

3.2 Auditory filters

The discovery of the way in which the basilar membrane acts as a frequency analyzer led to the suggestion that the entire peripheral auditory system (which consists of the outer, the middle and the inner ear) acts as a bank of bandpass filters with overlapping passbands [3]. Since different sections of the basilar membrane correspond to different frequencies, those sections act as auditory filters centered around those frequencies. The passbands overlap because an excitation at a certain point of the membrane also causes the movement of neighbouring points, i.e. filters.

The shape of auditory filters is determined through psychophysical experiments rather than by physical models. One of the experiments in which the auditory filter shape was determined was carried out in the following way. The listener had to detect a sound signal (the "testing signal") with a fixed level in the presence of another, background sound signal (the "masker signal"). Both signals were sinusoidal with fixed frequencies and came from the same source. Researchers experimented with different levels of the masker signal and determined the minimal masker level at which the listener could no longer detect the testing signal. The experiment was repeated many times, each time with a different masker frequency. The minimal masker level was lowest when the difference between the frequencies of the two signals was also small. The results are represented in the form of so-called psychophysical tuning curves (see Fig. 5).

Figure 5: Psychophysical tuning curves [3] show the level of the masker signal which is needed to successfully mask a signal that is fixed at a low level (represented by dots). The dashed line represents the absolute threshold of a signal at a given frequency. Solid lines connect the measurements of the minimal masking signal. The resulting shapes are rough estimates of the shape of auditory filters for different frequencies.

The experiments were carried out at low level, because researchers wanted to ensure that the excitation of the membrane would be as low as possible [2]. If a sound with a certain frequency causes a movement of the membrane which is too strong, the excitation also triggers neighbouring auditory filters on the membrane; this is called off-frequency listening [1].

In order to find the accurate shape of the auditory filter at a certain frequency, off-frequency listening must be reduced as much as possible. This can be achieved by notched-noise experiments [1]. In these experiments, the listener is exposed to a pure tone signal with added bands of noise above and below the tone frequency (Fig. 6). Both the pure tone and the noise come from the same source (either headphones or loudspeakers). With this technique, the observed filter is exposed to the least amount of noise compared to the neighbouring filters, which therefore do not contribute much to the detection of the tone because of the increased masking. The minimum audible level is a function of the spectral gap between the edges of each of the noise bands and the tone frequency. The shape of the auditory filter obtained by the described method is presented in Fig. 6.
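How the threshold depends on the notch width can be sketched numerically under two assumptions that are not part of the text above: that the tone threshold is proportional to the noise power passing through the filter, and that the filter has a rounded-exponential shape $W(g) = (1 + pg)e^{-pg}$, where $g$ is the frequency deviation divided by the centre frequency. The slope parameter `p`, the noise bandwidth and the function names are hypothetical.

```python
import math

def roex_weight(g, p=25.0):
    """Assumed rounded-exponential filter weight at normalised deviation g."""
    return (1.0 + p * g) * math.exp(-p * g)

def noise_through_filter(notch, band=0.4, p=25.0, steps=2000):
    """Relative noise power passed by the filter for a given notch half-width.

    notch and band are fractions of the centre frequency; the two noise bands
    span [notch, notch + band] on either side of the tone.
    """
    dg = band / steps
    # Midpoint rule over one noise band.
    one_side = sum(roex_weight(notch + (i + 0.5) * dg, p) for i in range(steps)) * dg
    return 2.0 * one_side  # symmetric filter, two identical noise bands

def threshold_shift_db(notch, **kwargs):
    """Predicted change of the tone threshold (dB) relative to the no-notch case."""
    ratio = noise_through_filter(notch, **kwargs) / noise_through_filter(0.0, **kwargs)
    return 10.0 * math.log10(ratio)

if __name__ == "__main__":
    for notch in (0.0, 0.05, 0.1, 0.2, 0.3):
        print(f"notch = {notch:.2f} -> threshold shift = {threshold_shift_db(notch):6.1f} dB")
```

In an actual experiment the procedure runs in the opposite direction: thresholds are measured for several notch widths, and the filter shape is the one whose predicted curve fits them best.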

Figure 6: Notched-noise technique [1]. Grey areas represent bands of noise. The auditory filter is centered at the signal frequency.

4 Localization

Another important area of psychoacoustic research is sound localization, i.e. determining the position of the sound source [2]. The sound from the source usually reaches one ear before the other (except when the sound arrives from directly in front of or behind the listener). If the sound wave is sinusoidal and its frequency is low, the difference in the time of arrival of the sound at the two ears is proportional to the phase difference, which can be used to determine the direction from which the sound is coming. This, however, is not true for frequencies above 1500 Hz, because in that case the wavelength of the sound becomes comparable to or smaller than the dimensions of the head, and hence the brain cannot determine the phase difference correctly.

At high frequencies, another means of localization is more suitable. Since the head of the listener represents an obstacle for high-frequency waves, they bend and diffract around the head. That causes small differences in sound intensity between the two ears, from which the human auditory system can determine the angle from which the sound is coming. Such a combination of localization mechanisms at low and high frequencies is called the duplex theory [2]. However, it has several flaws: for example, a single non-periodic click can easily be localised using only information about interaural time delays, regardless of the frequencies it contains. Furthermore, localization is also possible when the listener is using only one ear (monaural listening).
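The 1500 Hz limit can be checked with a rough calculation. The sketch below assumes a speed of sound of 343 m/s and a head diameter of 0.16 m (twice the 8 cm radius used in section 4.3); it also assumes that a phase difference becomes ambiguous once it exceeds half a cycle, because the brain can then no longer tell which ear is leading. The function names are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air
HEAD_DIAMETER = 0.16    # m, assumed

def wavelength(freq):
    """Wavelength (m) of a tone with frequency freq (Hz)."""
    return SPEED_OF_SOUND / freq

def interaural_phase(freq, itd):
    """Phase difference (rad) of a pure tone for an interaural delay itd (s)."""
    return 2.0 * math.pi * freq * itd

def phase_is_unambiguous(freq, itd):
    """True while the interaural phase difference stays below half a cycle."""
    return interaural_phase(freq, itd) < math.pi

if __name__ == "__main__":
    for f in (250.0, 1500.0, 4000.0):
        ratio = wavelength(f) / HEAD_DIAMETER
        print(f"{f:6.0f} Hz: wavelength = {wavelength(f):.3f} m, "
              f"{ratio:.1f} head diameters, "
              f"unambiguous at 400 us delay: {phase_is_unambiguous(f, 400e-6)}")
```

At 1500 Hz the wavelength is only slightly larger than the assumed head diameter, which is where phase cues are expected to start failing.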

4.1 Physical model of pinna

Since the duplex theory does not provide a satisfactory explanation of sound localization, the physical properties of the human body must also be taken into consideration. The outer ear, i.e. the pinna, plays a major role in localization. Its function is to transform the incoming sound into a signal which contains information about the direction from which the sound is coming [9]. That information is then extracted from the signal in the neural system. Incoming sound is reflected at different parts of the pinna, and because of that the eardrum detects not only the original signal but also reflected signals, which arrive at the eardrum with delays $\tau_n$. We assume that the incoming signal arrives at the eardrum via $M$ different paths due to reflections from different parts of the pinna. Each reflected (delayed) signal is attenuated after the reflection occurs. We denote the attenuation coefficient by $a_n < 1$, and the original sound signal is represented by the function $F$. We will write the equations in the notation of the Laplace transformation. The signal which arrives at the eardrum is [9]

$$H(s) = F(s)\,T(s) \tag{6}$$

where

$$T(s) = \sum_{n=1}^{M} a_n e^{-s\tau_n} \tag{7}$$

is the transfer function of the pinna. The sound signal $F$ is transformed into the signal $H$ because of time delays, which depend on the direction from which the signal is coming. The transfer function $T(s)$ has an inverse, and according to research in the field of physiology, such an inversion could be performed in the neural system. Batteau [9] also proposed a so-called attention transformation function, which could theoretically also be performed by the neural system in the inner ear:

$$R(s) = \sum_{n=1}^{M} a_n e^{-s(\tau_M - \tau_n)}. \tag{8}$$

As a result of the above transformation, the following signal is produced:

$$P(s) = R(s)H(s) = F(s)\,e^{-s\tau_M}\left[\sum_{n=1}^{M} a_n^2 + \sum_{k=1}^{M}\sum_{\substack{j=1\\ j\neq k}}^{M} a_k a_j e^{-s(\tau_k-\tau_j)}\right]. \tag{9}$$

The above result indicates that the processed signal has a different amplitude and a slightly altered spectrum in comparison to the original signal [9]. The received signal is also delayed by the maximum delay time $\tau_M$. According to Batteau's theory, the brain could determine the direction of the incoming sound from the changes in the spectrum and intensity.

It has to be mentioned that not all delays are caused by reflections from the pinna; some are caused by reverberation from other objects, such as walls. Reflections from the environment do not necessarily cause trouble when it comes to localization. It is easy to treat them separately, because the time delays caused by the pinna are short (2 to 300 µs), while delays due to reverberation in the environment are significantly longer (usually more than 10 ms). That does not mean, however, that reflections from the environment play no role in localization. Because of the small dimensions of the pinna, sounds with low frequencies, whose wavelengths are much larger than the pinna, cannot be localised only by analyzing the short delays caused by reflections from the pinna.
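The way a set of delays shapes the spectrum can be illustrated by evaluating the pinna transfer function $T(s)$ on the imaginary axis, $s = 2\pi i f$. The sketch below is only an illustration: the two attenuation coefficients and delays are hypothetical, chosen to lie within the 2 to 300 µs range quoted above, and the function names are not taken from [9].

```python
import cmath
import math

def pinna_transfer(freq, gains, delays):
    """Evaluate T(s) = sum of a_n * exp(-s * tau_n) at s = 2*pi*i*freq."""
    s = 2j * math.pi * freq
    return sum(a * cmath.exp(-s * tau) for a, tau in zip(gains, delays))

def magnitude_db(freq, gains, delays):
    """Magnitude of the transfer function in dB at frequency freq (Hz)."""
    return 20.0 * math.log10(abs(pinna_transfer(freq, gains, delays)))

if __name__ == "__main__":
    # Hypothetical example: two reflection paths whose delays differ by 80 us.
    gains = [0.9, 0.5]
    delays = [20e-6, 100e-6]
    for f in (1000.0, 6250.0, 12500.0):
        print(f"{f:7.0f} Hz: |T| = {magnitude_db(f, gains, delays):6.2f} dB")
```

With these values the two paths cancel most strongly where their delay difference equals half a period and reinforce each other where it equals a full period. A different direction of incidence would change the delays and shift these notches and peaks, which is the kind of spectral cue the theory relies on.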

4.2 Psychoacoustical investigation

We have shown that information about localization can be extracted from interaural time delays, interaural intensity differences and from the delays in the pinna, but nothing has been said yet about our actual ability to localize sounds. The following psychoacoustic experiment was carried out in order to find out how good we are at determining the directions from which sounds are coming [1].

Researchers decided to look for minimum audible angles, i.e. angles at which the listeners could no longer determine whether two sounds with the same frequency were coming from the same direction or not. Two loudspeakers were used in the experiment: the first loudspeaker was placed in front of the listener's head, and the second one was placed at the same distance from the center of the head, but at a different angle. The researchers then moved the second speaker around, and the listener had to determine whether the two tones were coming from the same direction or not. The signals used were pure tones, and the experiment was repeated at different frequencies. Minimum audible angles were determined for each frequency, and the results are shown in Fig. 7.

Figure 7: Minimum audible angles at different frequencies [1]. White dots are measurements taken when the first loudspeaker was placed in front of the listener, and black dots represent measurements when the first loudspeaker was at a 30° angle.

The minimum audible angle is smallest at low frequencies, i.e. at wavelengths which are much larger than the dimensions of the human head [1]. In that case the information needed for localization is most likely provided to the brain by interaural time differences. In 1956, Klumpp and Eady found that the smallest interaural time difference which can be detected by a human is approximately 10 µs [1]. Since we know the diameter of the human head and the speed of sound in air, we can estimate the minimum detectable change in direction, which equals roughly 1°. This value matches the experimental results at low frequency shown in Fig. 7, which indicates that the auditory system really processes information about interaural delays. However, localization is also possible when the listener can only use one ear (monaural listening), as shown in Fig. 8, even though neither interaural time differences nor phase differences are available in that case. We can therefore conclude that the reflections of sound in the pinna, described in section 4.1, really play a major role in localization.
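The estimate of roughly 1° can be reproduced with a simple path-difference model, which is an assumption made here rather than the method used in [1]: for a source at a small angle θ from straight ahead, the extra path to the far ear is about d sin θ, so the interaural delay is (d/c) sin θ. The head diameter of 0.16 m and the speed of sound of 343 m/s are likewise assumed values.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air
HEAD_DIAMETER = 0.16    # m, assumed

def min_audible_angle_deg(min_itd, diameter=HEAD_DIAMETER, c=SPEED_OF_SOUND):
    """Smallest detectable angle (degrees) for a just-detectable delay min_itd (s).

    Uses the assumed path-difference model itd = (diameter / c) * sin(theta).
    """
    return math.degrees(math.asin(c * min_itd / diameter))

if __name__ == "__main__":
    angle = min_audible_angle_deg(10e-6)
    print(f"Minimum audible angle for a 10 us delay: {angle:.2f} degrees")
```

With these inputs the result comes out slightly above one degree. The exact number depends on the assumed head size and on the diffraction model, so only the order of magnitude should be compared with Fig. 7.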

Figure 8: Histograms show the results of an experiment in which listeners had to determine the direction from which the sound was coming [9]. When listeners were using only one ear [Right], the results were visibly worse than in the case of binaural listening [Left].

4.3 Experimentation using headphones

So far, we have only discussed experiments which were carried out using loudspeakers. When researchers tried to reproduce external sounds with headphones, the listeners often reported that the voices appeared inside, instead of outside, their heads [2]. In order to investigate what caused the brain to determine that sounds originated from inside the head, careful calibration of the headphones had to be performed. The goal was to produce a signal at the eardrum with the headphones which would have the same amplitude and phase as the signal at the eardrum due to a tone played from a loudspeaker set at a certain angle. Experiments were carried out in which the researchers reproduced signals in headphones as if they were coming from different directions. The same sounds were then also played through loudspeakers placed at the position which corresponded to the position determined by the sound played through headphones. Amplitudes and phases had to be corrected for that purpose, and the calibration was confirmed as successful once listeners could no longer distinguish between loudspeaker and headphone signals [10]. The sound coming from the loudspeaker is defined as the real signal, and the one that comes from the headphones is defined as the virtual signal.

After it was ensured that headphones could satisfactorily reproduce the signal coming from loudspeakers, researchers began to investigate which property of the sound caused externalization, i.e. the ability of the auditory system to determine that a sound does not originate inside the head. They focused on both main aspects of sound localization described at the beginning of section 4: time and level differences [7]. In this section, we will only focus on one specific experiment, which investigated the effect of interaural time differences (ITD) on externalization.

The loudspeaker was positioned at an angle of 37° with respect to the direction the listener was facing. The headphones were calibrated to resemble the sound coming from the same angle. If the frequency of the sound is low, we can calculate the ITD from the low-frequency limit of diffraction around a sphere [7]:

$$\mathrm{ITD} = \frac{3r}{c}\sin\theta. \tag{10}$$

In the above equation, $c$ is the speed of sound in air and $r$ is the radius of the head (which is approximated by a sphere). For a head with a radius of 8 cm we get ITD ≈ 420 µs, since θ = 37°. After calibration on the test subjects, however, it turned out that the optimal interaural time difference was ITD ≈ 375 µs. The experimental sessions were divided into three stages: calibration, training and trial. During the training phase, listeners were exposed to a sequence of four signals: real, virtual, real, virtual. The purpose of this stage of the experiment was to learn to distinguish real signals from virtual signals; the listeners could repeat the sequence as many times as they wanted in order to learn and remember the difference. The next stage consisted of 20 trials in which the listeners had to decide whether the signal they were exposed to was real or virtual. The experiment was repeated at different ITDs and at different frequencies. Figure 9 shows the results for the experiment with sounds with fundamental frequency ν = 125 Hz.
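The value obtained from Eq. (10) is easy to reproduce. The sketch below uses the radius and angle given in the text and assumes a speed of sound of 343 m/s, which the text does not specify; the function name is illustrative.

```python
import math

def itd_low_frequency(theta_deg, radius=0.08, c=343.0):
    """Interaural time difference (s) from Eq. (10): ITD = (3 r / c) sin(theta).

    radius is the head radius in metres; c is an assumed speed of sound in m/s.
    """
    return 3.0 * radius / c * math.sin(math.radians(theta_deg))

if __name__ == "__main__":
    itd = itd_low_frequency(37.0)
    print(f"ITD at 37 degrees: {itd * 1e6:.0f} us")
```

The result agrees with the roughly 420 µs quoted above. The calibrated optimum of about 375 µs is noticeably smaller, which suggests that the spherical-head approximation overestimates the delay for the actual listeners.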

5 Conclusion

We have presented some basic experiments which were at one point or another crucial for progress in the field of psychophysics. Both of the physical models included in this paper were introduced in the 1970s, and both have since been developed further by different authors. In one of the recent articles [7], the authors report that they have successfully implemented a cochlea model which is still based on the same equations we presented in section 3.1, but which also takes the curvature of the cochlea into account. It turns out that the consequence of the curvature of the cochlea (see Fig. 2) is an increased perception of low frequencies [7].

Psychoacoustic experiments have continued to be refined up to the present day, and psychoacoustical models are also useful in the commercial sector. Psychophysical models are used in software such as lossy signal converters (for example MP3 converters), in the planning and design of noise reduction systems and high-end audio systems, as well as in the loudspeaker and headphone industry.

Figure 9: Fractions of correct identifications at different ITDs [10]. At ITD = 525 µs and ITD = 300 µs, listeners almost never identified real sources as virtual sources or vice versa. In contrast, at ITDs close to the optimal time difference, test subjects often made a wrong decision, which means that the virtual signal was such a good reproduction of the real signal that the two were too hard to distinguish. It is interesting that for ITD = 525 µs all listeners reported that the virtual signal was coming from the right side of the loudspeaker. On the other hand, at ITD = 300 µs, all listeners claimed that the sound originated from inside their head.

References

[1] Plack, C. J., The Sense of Hearing (Lawrence Erlbaum Associates Inc, Mahwah, 2005).

[2] Rossing, T., Springer Handbook of Acoustics (Springer Science+Business Media, New York, 2007).

[3] Moore, B. C. J., An Introduction to the Psychology of Hearing (Emerald Group Publishing Limited, Bingley, 2012).

[4] http://sl.wikipedia.org/wiki/Georg_von_Békésy (Cited on 5. January, 2014).

[5] M. B. Lesser and D. A. Berkley, J. Fluid Mech. 51, 497-512 (1972).

[6] http://www.ifd.mavt.ethz.ch/research/group_lk/projects/cochlear_mechanics/Enz_Brit_Cochlea_cross_section.gif (Cited on 5. January, 2014).

[7] H. Chai, D. Manoussaki and R. Chadwick, Phys. Rev. Lett. 96, 088701 (2006).

[8] Morrison, F. A., An Introduction to Fluid Mechanics (Cambridge University Press, New York, 2013).

[9] D. W. Batteau, Proc. Roy. Soc. B. 168, 158-180 (1976).

[10] W. M. Hartmann and A. Wittenberg, J. Acoust. Soc. Am. 99, 3678-3688 (1996).