Sound Perception The ear, the brain & psychoacoustics.


Plan

• About sound...
• How does the ear work?
• Absolute thresholds of hearing
• Auditory masking
• Sound spatialisation
• Summary

ABOUT SOUND
Some definitions, and reminders about the nature of sound

Sound

• "Sound is a mechanical wave that is an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be heard, or the sensation stimulated in organs of hearing by such vibrations” – Wikipedia.

Sound (cont’d)

• Sound is a pattern of compression and rarefaction of the air
– Record it using microphones
– Perceive it with our ears
– Generate it by speaking or using speakers

• Energy per m2 decreases with the square of distance...
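The inverse-square relationship above can be sketched numerically. This is a minimal illustration that assumes an idealised point source radiating equally in all directions with no reflections or absorption; the function names and the 1 W source power are hypothetical choices made for the example.

```python
import math


def intensity_at(power_w: float, distance_m: float) -> float:
    """Intensity in W/m^2 at distance_m from an idealised point source.

    Assumption: the power spreads evenly over a sphere of area 4*pi*r^2.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return power_w / (4.0 * math.pi * distance_m ** 2)


def level_change_db(d1_m: float, d2_m: float) -> float:
    """Change in intensity level (dB) when moving from d1_m to d2_m."""
    return 10.0 * math.log10(intensity_at(1.0, d2_m) / intensity_at(1.0, d1_m))


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0, 8.0):
        print(f"{d:4.1f} m: {intensity_at(1.0, d):.5f} W/m^2, "
              f"{level_change_db(1.0, d):+.1f} dB re 1 m")
```

Under these assumptions, each doubling of distance divides the intensity by four, which is a drop of about 6 dB.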

Sound is a waveform

• Sound is a waveform

• It can be reflected when hitting a non-transmissive surface

• If the surface is flat, it is reflected in a coherent way

• Otherwise, the reflection depends on frequency and surface texture

Sound proof studio wall, for absorbing high frequencies

Effect of weather

• Because sound is carried by air compression and decompression, sound travelling can be affected by temperature

• Eg. Air temperature near the ground is cooler

• Eg. Wind
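One way to see why temperature matters is to look at how the speed of sound changes with it. The sketch below uses the standard ideal-gas approximation for dry air, which is not given in the slides; the constant 331.3 m/s (speed at 0°C) and the function name are assumptions of this example.

```python
import math

SPEED_AT_0C_M_S = 331.3  # assumed speed of sound in dry air at 0 degrees C


def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound (m/s) in dry air at temp_c degrees Celsius.

    Ideal-gas approximation: speed scales with the square root of the
    absolute temperature.
    """
    kelvin = temp_c + 273.15
    if kelvin <= 0:
        raise ValueError("temperature below absolute zero")
    return SPEED_AT_0C_M_S * math.sqrt(kelvin / 273.15)


if __name__ == "__main__":
    for t in (0.0, 10.0, 20.0, 30.0):
        print(f"{t:5.1f} C -> {speed_of_sound(t):6.1f} m/s")
```

Because sound travels more slowly in cooler air, a layer of cool air near the ground bends the wavefront back toward the ground, so sound tends to carry further in those conditions.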

THE EAR

How does the ear work? How do we perceive sound? How does it relate to sound synthesis techniques?

The ear

pinna

Outer ear

• The outer ear is composed of:
– The pinna (visible part)
– Auditory canal (meatus)
– Tympanic membrane (eardrum)

• The pinna
– significantly modifies incoming sound (esp. high frequencies)
– is important for sound localization


(Outer ear in bats)

• Bats rely heavily on sonar for localization, navigation and hunting.

• Generate high pitched ultrasounds and listen for echoes.

• Highly refined sound perception & localisation (accurate enough to catch a bug in flight!)

The middle ear

• Sounds make the eardrum vibrate

• Vibrations are transmitted through the middle ear by 3 bones:
– Malleus (hammer)
– Incus (anvil)
– Stapes (stirrup)

• Stapes is connected to a membrane called the oval window

• Transmits sounds to the inner ear.

[Figure: the middle ear, with labels: tensor tympani, auditory canal, outer ear, tympanic membrane, malleus (hammer), incus (anvil), stapes (stirrup), Eustachian tube, inner ear]

The middle ear (function?)

• Efficient transfer of sound from the air to the fluid in the cochlea (inner ear)
– Otherwise, sound would mostly be reflected.
– The oval window’s resistance (acoustic impedance) is higher than that of air
– ... and also higher than the eardrum’s.
– ... but its surface is smaller!
– The middle ear is an efficient transfer mechanism (like the gearing of a bicycle), esp. within 500Hz-4kHz
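The "smaller surface" argument can be made concrete with a rough calculation: the force collected over the eardrum is concentrated onto the much smaller oval window, and the ossicles add a small lever advantage. The areas and lever ratio below are commonly quoted textbook approximations, not values from these slides, so treat them and the function names as assumptions.

```python
import math

# Assumed, commonly quoted approximate values (NOT taken from the slides).
EARDRUM_AREA_MM2 = 55.0       # effective vibrating area of the eardrum
OVAL_WINDOW_AREA_MM2 = 3.2    # area of the stapes footplate on the oval window
OSSICLE_LEVER_RATIO = 1.3     # mechanical advantage of the ossicular chain


def pressure_gain(eardrum_mm2: float = EARDRUM_AREA_MM2,
                  window_mm2: float = OVAL_WINDOW_AREA_MM2,
                  lever: float = OSSICLE_LEVER_RATIO) -> float:
    """Pressure at the oval window divided by pressure at the eardrum."""
    return (eardrum_mm2 / window_mm2) * lever


def pressure_gain_db(**kwargs: float) -> float:
    """The same gain expressed in decibels (20*log10 for a pressure ratio)."""
    return 20.0 * math.log10(pressure_gain(**kwargs))


if __name__ == "__main__":
    print(f"pressure gain: x{pressure_gain():.1f} ({pressure_gain_db():.1f} dB)")
```

With these assumed numbers the gain comes out at roughly 27 dB, which gives a sense of how much of the air-to-fluid mismatch the middle ear can recover; the real gain varies with frequency.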

The middle ear (function - cont’d)

• “Barany (1938) suggested that middle ear reduces transmission of bone-conducted sound to the cochlea” Moore (2003)
– Internally generated sounds: chewing, flow of air, blood, creaking of joints...
– These would cause masking...
– The middle ear only transmits differential movements between the ossicles and the skull (when the skull vibrates, the stapes vibrates in sync and does not transmit)!
– Note: birds & reptiles usually swallow food whole...

The middle ear: Acoustic reflex

• Muscles in the middle ear (tensor tympani, stapedius) can contract, pulling the stapes away from the oval window and therefore drastically reducing transmission.

• The ear reacts to loud sound by contracting the ossicle muscles and attenuating the sounds that follow.

• Usually needs 70-90dB sounds to trigger.

• Protects the ear against loud sounds BUT it is slow: it doesn’t help against sudden loud noises!

The inner ear

• The inner ear is also called the labyrinth

• Vestibular system:
– Balance
– Spatial orientation
– Horizontal, posterior and superior canals...

• Cochlea is used for hearing...

[Figure: the inner ear, with labels: posterior canal, superior canal, horizontal canal, utricle, saccule, vestibule, cochlea]

The Cochlea

• Shaped like a spiral (no functional reason – space economy?)

• Filled with incompressible fluids

• Rigid, bony walls

• Transmits sound pressure without loss!

• Divided along its length by two membranes
– Basilar membrane
– Reissner’s membrane

Cochlea (cont’d)

Basilar membrane

• When the oval window moves
– The round window moves in the opposite direction
– The basilar membrane (BM) moves too
– Waves propagate along the BM

Basilar membrane (cont’d)

• Mechanical properties of the BM vary along its length:
– Narrow & stiff at the base
– Wider & less stiff at the apex

• The position of the vibration peak depends on frequency!
– High frequency: near the base
– Low frequency: near the apex


• BM acts as an (imperfect) Fourier analyser!

• The frequency that gives the best response at a point of the BM is called the Characteristic Frequency (CF) for this point.

• In response to a steady frequency, all points vibrate at the same frequency, but some points vibrate with greater amplitude.
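The "Fourier analyser" idea can be illustrated with a toy model in which each probe frequency stands in for one place on the BM with its own characteristic frequency. This is only a signal-processing analogy, not a model of cochlear mechanics; the sample rate, probe spacing, threshold and function names are assumptions of the sketch, and the two test tones are the tuning-fork frequencies used later in the masking section.

```python
import cmath
import math


def probe_magnitude(samples, sample_rate, freq_hz):
    """Response of one 'place': correlation with a sinusoid at freq_hz."""
    acc = 0j
    for n, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
    return abs(acc) / len(samples)


def dominant_frequencies(samples, sample_rate, probes, rel_threshold=0.5):
    """Probe frequencies that are local maxima above rel_threshold * peak."""
    mags = [probe_magnitude(samples, sample_rate, f) for f in probes]
    peak = max(mags)
    found = []
    for i in range(1, len(mags) - 1):
        is_local_max = mags[i] > mags[i - 1] and mags[i] > mags[i + 1]
        if is_local_max and mags[i] >= rel_threshold * peak:
            found.append(probes[i])
    return found


SAMPLE_RATE = 4000            # Hz, assumed for the example
N_SAMPLES = 1000              # 0.25 s of signal
PROBES = list(range(100, 601, 2))  # one probe every 2 Hz

# Two simultaneous pure tones: C (262 Hz) and A (440 Hz).
signal = [math.sin(2 * math.pi * 262 * n / SAMPLE_RATE)
          + math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
          for n in range(N_SAMPLES)]

if __name__ == "__main__":
    print(dominant_frequencies(signal, SAMPLE_RATE, PROBES))
```

Every probe responds a little to both tones, but the response is largest at the probes matching the tone frequencies, which mirrors the statement that all points vibrate while some vibrate with greater amplitude.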

Organ of Corti

Organ of Corti & Hair cells

• Between the BM & the tectorial membrane are hair cells, which form part of the Organ of Corti
– Inner hair cells (about 3,500 cells, about 40 hairs each)
– Tunnel of Corti
– Outer hair cells (about 12,000 cells, about 140 hairs each)

• (Hairs are called stereocilia)

Stereocilia

Stereocilia (cont’d)

• Transforms mechanical movements into neural activity

• Stereocilia are joined by fine links (“tip links”)

• Deflection of the stereocilia applies tension to those links

• opens “transduction channels”

• flow of potassium ions, voltage alteration, etc.

Inner hair cells

• Each inner hair cell is connected to about 20 neurons

• Most (all?) information is transferred by inner hair cells.

Outer hair cells

• What about the outer hair cells?

• They actively influence the mechanics of the cochlea
– High sensitivity
– Sharp tuning
– Evidenced by experiments with drugs that affect outer hair cells’ performance.

• Control from above? 1,800 efferent nerve fibers!

• Hearing is not a passive phenomenon: even the early stages are influenced by higher brain areas!

Otoacoustic emissions

• Experiments by Kemp (1978): if a click is sounded next to the ear, it is possible to detect a sound coming out of the ear (using a microphone sealed into the ear canal)
– Reflections?
– Not only! Sound can be heard with delays from 5 to 60ms (Kemp echoes).
– Relative level is greater at low input levels (the emission grows by about 3dB for each 10dB of input)
– May be stronger than the actual input!

• Disappears even with moderate cochlear pathologies

Neurons in the auditory nerve

• Approximately 30,000 neurons in each auditory nerve (left, right)

• Study this using fine tipped micro-electrodes to record voltage in single cells

• Most neurons fire spontaneously (0-150Hz)

• Most neurons are tuned to specific frequencies.

• Phase locking: spikes occur at a specific phase of the stimulating waveform, giving temporal regularity.

Auditory pathways

ABSOLUTE THRESHOLDS
Some comments on the absolute limits of hearing

Minimum Audible Pressure

• An absolute threshold is the minimum detectable level of a sound, in the absence of other external sounds.

• Depends on set-up: important to define precisely how the intensity is measured
– Probe microphone (ideally close to the eardrum)

• Usually using headphones.

• Threshold is called the Minimum Audible Pressure (MAP)

Minimum Audible Field

• Alternatively, loudspeakers in a large anechoic chamber (walls, floor, ceiling are highly sound absorbing)
– Measurement made after the subject is removed, at the point occupied by the center of the listener’s head
– Minimum Audible Field (MAF)

MAF vs. MAP

• MAF is binaural, MAP is monaural

• MAF factors in the head, pinna & meatus effects (broad resonance)

• Thresholds increase rapidly at very high and very low frequencies: this reflects the transmission characteristics of the middle ear!

Absolute Thresholds

• Depends on people: individuals may vary by up to 20dB and have “normal” hearing.

• Highest audible frequency depends on age: kids up to 20kHz, adults about 15kHz

Hearing loss

• MAP can be used for generating Audiograms and evaluating hearing loss.

• Two types of hearing loss:
– Conductive: middle ear problems reducing sound transmission to the cochlea (e.g. infection: otitis media, bone growth around stapes or oval window, wax in the ear canal)
  • Result: elevation in absolute threshold
  • Help: hearing aid, surgery
– Sensorineural: defects in the cochlea, the auditory nerve or higher centers in the brain.
  • Extent of the loss increases with frequency (esp. in the elderly)
  • Makes it hard to understand speech, esp. in noisy environments.
  • Generally, no surgery is possible.

Conductive hearing loss

• Middle ear problems reducing sound transmission to the cochlea
– e.g. infection (otitis media), bone growth around stapes or oval window, wax in the ear canal
– Result: elevation in absolute threshold
– Help: hearing aid, surgery

Sensorineural hearing loss

• Due to defects in:
– cochlea
– auditory nerve
– higher centers in the brain.

• Extent of the loss increases with frequency (esp. in the elderly)

• Makes it hard to understand speech, esp. in noisy environments.

• Generally, no surgery is possible.

UK data, by age group, using frequencies 0.5, 1, 2 and 4kHz:
– 61-71: 51% with loss > 20dB, 30% > 40dB
– 71-80: 74% > 20dB, 30% > 40dB

Using 4, 6 and 8kHz:
– 71-80: 98% > 20dB, 81% > 40dB

Temporal effect in Absolute Threshold

• Absolute thresholds of sounds depend on duration (Exner, 1876)
– For sounds > 500ms, no effect
– For sounds < 200ms, the minimal detectable intensity increases as duration decreases

• The ear appears to integrate the stimulus energy over time
– In practice: $(I - I_L) \cdot t = I_L \cdot \tau$
  • $I$: threshold intensity for a sound of duration $t$
  • $I_L$: threshold intensity for a long sound (>500ms)
  • $\tau$: constant representing the integration time of the auditory system
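Reading the relation above as (I - I_L) × t = I_L × τ, where τ is the integration-time constant, the threshold for a short sound is I = I_L × (1 + τ/t). The sketch below turns that into a threshold elevation in dB. The value τ = 200 ms is purely an illustrative assumption, as are the function and constant names.

```python
import math

ASSUMED_TAU_S = 0.2  # illustrative integration time constant, not from the slides


def threshold_elevation_db(duration_s: float, tau_s: float = ASSUMED_TAU_S) -> float:
    """Threshold for a sound of duration_s, in dB above the long-sound threshold.

    From (I - I_L) * t = I_L * tau  =>  I / I_L = 1 + tau / t.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return 10.0 * math.log10(1.0 + tau_s / duration_s)


if __name__ == "__main__":
    for ms in (10, 20, 50, 100, 200, 500, 1000):
        print(f"{ms:5d} ms: +{threshold_elevation_db(ms / 1000.0):.1f} dB")
```

With this model the elevation becomes small for long sounds and grows steadily as the sound gets shorter, which matches the qualitative behaviour listed above.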

AUDITORY MASKING
Limits of the human hearing, how one sound can hide another...

Auditory masking

• The human auditory system has a limited capacity to resolve sinusoidal components of complex sounds
– E.g. if we listen to two tuning forks, one tuned at C (262Hz) and the other at A (440Hz), we hear two separate tones, each with its own pitch.
– Yet, one sound can be obscured, or rendered inaudible, by other sounds (music from a car radio may mask the car’s engine – or conversely!)

Auditory Masking (definition)

• Definition: “1. The process by which the threshold of audibility for one sound is raised by the presence of another (masking) sound; 2. The amount by which the threshold of audibility of a sound is raised by the presence of another (masking) sound. The unit customarily used is the decibel.” American Standards Association

Auditory masking

• A sound is more easily masked by another having a similar frequency.
– Limitations of the basilar membrane
– Limits of frequency selectivity

• Masking is very dependent on time:
– Simultaneous presentation of the sounds
– Forward masking
– Backward masking

The Critical Band

• Fletcher (1940) suggested the auditory system works as a bank of bandpass filters, with overlapping passbands, based on the BM.

• When detecting a sound in a noisy background, a listener is assumed to make use of the filter with the closest center frequency.

• Threshold is determined by the amount of noise passing through this filter

• “power spectrum” view on masking (Patterson & Moore, 1986)

Critical Band (cont’d)

• Fletcher’s idea:
– Only a narrow band of frequencies surrounding the tone contributes to masking the tone
– When the noise just masks the tone, the power of the tone divided by the power of the noise inside that band is a constant K

• Assuming rectangular bandpass filters (not true, but convenient!), we have:
– $P / (W \cdot N_0) = K$
– Where $W$ is the bandwidth, $N_0$ is the noise power per unit bandwidth, and $P$ is the tone power.
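The rectangular-filter relation P/(W·N0) = K can be rearranged to estimate the bandwidth W from a masked threshold measurement. The sketch below works in decibels, with N0 expressed as a spectrum level (dB per Hz). The choice K = 1 and the example levels (58 dB tone, 40 dB/Hz noise) are assumptions for illustration, and the function names are hypothetical.

```python
import math


def critical_bandwidth_hz(tone_threshold_db: float,
                          noise_spectrum_level_db: float,
                          k: float = 1.0) -> float:
    """Bandwidth W (Hz) from P / (W * N0) = K, with levels given in dB.

    k = 1.0 is an assumed value for the constant K.
    """
    p_over_n0 = 10.0 ** ((tone_threshold_db - noise_spectrum_level_db) / 10.0)
    return p_over_n0 / k


def masked_threshold_db(noise_spectrum_level_db: float,
                        bandwidth_hz: float,
                        k: float = 1.0) -> float:
    """Inverse relation: tone level (dB) just masked by noise in the band."""
    return noise_spectrum_level_db + 10.0 * math.log10(k * bandwidth_hz)


if __name__ == "__main__":
    w = critical_bandwidth_hz(58.0, 40.0)
    print(f"estimated bandwidth: {w:.1f} Hz")
    print(f"round trip threshold: {masked_threshold_db(40.0, w):.1f} dB")
```

In this model only the noise falling inside the band W counts toward masking, so widening the noise beyond W would not raise the threshold any further.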

Off-frequency Listening

Shape of the Auditory Filter

• The theory is sound, but the rectangular bandpass assumption is wrong

• Measuring psychophysical auditory tuning curves

• Notched-noise method to remove off-frequency components


Auditory masking curves

• Auditory masking curves show how much masking occurs at which frequencies

• Useful for efficient compression: no need to encode a frequency if we can’t hear it!
– E.g. MP3

Contralateral masking

• Another form of masking is when the signal is presented to one ear, and the noise is presented to the other.

• This is called contralateral masking

• When both sound and noise are presented to the same ear, this is called ipsilateral masking

Temporal Masking

• This occurs when the masking and masked signals are not simultaneous
– If the masking sound precedes the masked sound, it is called forward masking.
– If the masking sound follows the masked sound, it is called backward masking.
– Masking effectiveness attenuates exponentially from the onset and offset of the masker
  • Onset attenuation ~20ms.
  • Offset attenuation ~100ms.

• Note: different from the ear’s acoustic reflex (which reduces the ear’s sensitivity after a loud sound)
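The exponential fall-off of temporal masking can be sketched as a simple decay curve. This example treats the ~20ms (before masker onset, i.e. backward masking) and ~100ms (after masker offset, i.e. forward masking) figures as exponential time constants, which is only one possible reading of the slide; the function name and the normalisation to 1.0 at zero gap are also assumptions.

```python
import math

# Assumed interpretation: the quoted durations are exponential time constants.
BACKWARD_TAU_S = 0.020  # masked sound ends before the masker starts
FORWARD_TAU_S = 0.100   # masked sound starts after the masker stops


def residual_masking(gap_s: float, direction: str) -> float:
    """Relative masking effectiveness (1.0 = no gap) for a given time gap."""
    if gap_s < 0:
        raise ValueError("gap must be non-negative")
    if direction == "forward":
        tau = FORWARD_TAU_S
    elif direction == "backward":
        tau = BACKWARD_TAU_S
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    return math.exp(-gap_s / tau)


if __name__ == "__main__":
    for gap_ms in (0, 10, 20, 50, 100, 200):
        gap = gap_ms / 1000.0
        print(f"{gap_ms:4d} ms  forward={residual_masking(gap, 'forward'):.2f}"
              f"  backward={residual_masking(gap, 'backward'):.2f}")
```

Under this reading, forward masking lingers roughly five times longer than backward masking for the same gap.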

PERCEPTION OF LOUDNESS
Psychoacoustic perception of loudness, versus sound pressure.

Loudness

• Fletcher & Munson (1933)
– Subjects listen to pure tones
  • Various frequencies
  • Amplitude increased in 10dB steps

• Robinson & Dadson (1956)
– More accurate
– Basis for standard ISO-226

• Perceived loudness (phons)
– N phons = loudness of a 1kHz tone at N dB SPL

• British Standard BS ISO 226 (2003) (source: Wikipedia)
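Since the phon scale is anchored to dB SPL at 1kHz, it helps to recall how dB SPL relates to sound pressure. The sketch below uses the conventional reference pressure of 20 µPa, which is not stated in the slides; the function names are hypothetical. It does not model the equal-loudness contours themselves, which require the tabulated ISO 226 data.

```python
import math

P_REF_PA = 20e-6  # conventional reference pressure for dB SPL in air (assumed)


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def pressure_from_spl(level_db: float) -> float:
    """RMS pressure in pascals for a given level in dB SPL."""
    return P_REF_PA * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    for p in (20e-6, 2e-3, 0.2, 1.0):
        print(f"{p:10.6f} Pa -> {spl_db(p):6.1f} dB SPL "
              "(= loudness level in phons for a 1kHz tone)")
```

For a 1kHz pure tone the value returned by `spl_db` is also its loudness level in phons; at other frequencies the two differ, which is what the equal-loudness contours describe.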

SOUND SPATIALIZATION
Sound localization in space and stereo hearing...

Sound Spatialization

• Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides.

• Humans can discern interaural time differences of 10 microseconds or less

Cues for Localization

• Interaural time difference
– The sound will reach the ears at different times, depending on its location in space
  • Phase delay at low frequencies
  • Group delay at high frequencies

• Interaural level differences
– The sound will be louder in one ear compared to the other.

• Distance can be estimated from
– Spectrum: high frequencies are attenuated more quickly
– Lower loudness
– Movement: parallax depends on distance
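To get a feel for the size of interaural time differences, the sketch below uses Woodworth's spherical-head approximation, ITD = (a/c)(θ + sin θ), for a distant source at azimuth θ. This formula is not from the slides, and the head radius (8.75 cm) and speed of sound (343 m/s) are assumed typical values; the function name is hypothetical.

```python
import math

HEAD_RADIUS_M = 0.0875     # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 C


def itd_seconds(azimuth_deg: float) -> float:
    """Interaural time difference for a distant source (spherical-head model).

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch only covers azimuths from 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 1, 15, 45, 90):
        print(f"{az:3d} deg -> {itd_seconds(az) * 1e6:6.1f} microseconds")
```

Under these assumptions a source 1 degree off-centre produces an ITD of roughly 9 microseconds, which is consistent with the figures quoted earlier (about 1 degree of accuracy in front, about 10 microseconds of time resolution), and a source directly to the side gives roughly 650 microseconds.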

Cues for Localization

• In addition, the pinnae modify the spectra of incoming sounds in a way that depends on the angle of incidence of the sound to the head

• Head+pinna form a direction-dependent filter

• Measured by comparing the spectrum of the sound source vs. the spectrum reaching the eardrum: Head Related Transfer Function (HRTF)

• High frequencies (>6kHz) interact especially strongly with the pinna.

SYNESTHESIA
An unusual case....

Synesthesia

• Synesthesia: perceiving one sense as another, e.g. sound as colors.

• Prevalence: unknown, could be as high as 1 in 23 (Simner J, Mulvenna C, Sagiv N, et al. (2006))

• This is believed to have a neurological basis (fMRI evidence)

• Famous synesthetes include David Hockney, who perceives music as color, shape, and configuration, and who uses these perceptions when painting opera stage sets – but not while creating his other artworks

http://en.wikipedia.org/wiki/David_Hockney


Synesthesia

• Some facts:
– Synesthesia is involuntary and automatic
– Synesthetic perceptions are spatially extended (sense of location)
– Synesthetic percepts are consistent and generic
– Synesthesia is highly memorable
– Synesthesia is laden with affect

Richard Cytowic (2002,2003,2009)

http://en.wikipedia.org/wiki/Richard_Cytowic


Plan

• About sound...
• How does the ear work?
• Absolute thresholds of hearing
• Auditory masking
• Sound spatialization
• Summary

Additional Reading

• Brian C.J. Moore (2003) An Introduction to the Psychology of Hearing, Academic Press.
