Sound Perception The ear, the brain & psychoacoustics.
Uploaded by: tanya-greenly
Plan
• About sound...
• How does the ear work?
• Absolute thresholds of hearing
• Auditory masking
• Sound spatialisation
• Summary
ABOUT SOUND
Some definitions, and reminders about the nature of sound
Sound
• "Sound is a mechanical wave that is an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be heard, or the sensation stimulated in organs of hearing by such vibrations" – Wikipedia.
Sound (cont’d)
• Sound is a pattern of compression and rarefaction of the air
– Record it using microphones
– Perceive it with our ears
– Generate it by speaking or using speakers
• Energy per m² decreases with the square of the distance from the source
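The inverse-square relation above can be sketched numerically. This is a minimal illustration, assuming an idealised point source radiating equally in all directions with no reflections or absorption; the function names are invented for this example.

```python
import math

def intensity_at(power_w: float, distance_m: float) -> float:
    """Intensity (W/m^2) at distance_m from an idealised point source.

    The source power is spread over a sphere of area 4*pi*r^2.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return power_w / (4.0 * math.pi * distance_m ** 2)

def level_change_db(d1_m: float, d2_m: float) -> float:
    """Change in level (dB) when moving from d1_m to d2_m away from the source."""
    return 10.0 * math.log10(intensity_at(1.0, d2_m) / intensity_at(1.0, d1_m))

# Doubling the distance quarters the intensity, i.e. a drop of about 6 dB.
drop_per_doubling = level_change_db(1.0, 2.0)
```

Under these assumptions every doubling of distance costs about 6 dB, whatever the starting distance.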
Sound is a waveform
• Sound is a waveform
• Can be reflected when hitting a non-transmissive surface
• If the surface is flat, it is reflected coherently
• Otherwise, reflection depends on frequency and surface texture
[Figure: soundproofed studio wall, for absorbing high frequencies]
Effect of weather
• Because sound is carried by compression and decompression of the air, its propagation can be affected by the weather
• E.g. temperature: air near the ground is cooler
• E.g. wind
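As a rough illustration of the temperature effect, the speed of sound in air can be estimated from temperature. This sketch uses the standard ideal-gas approximation for dry air, which is not taken from the slides; the function name is made up for this example.

```python
import math

def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius.

    Ideal-gas approximation: c = 331.3 * sqrt(1 + T / 273.15).
    """
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)

# Sound travels more slowly in cooler air than in warmer air.
c_cool = speed_of_sound(5.0)
c_warm = speed_of_sound(25.0)
```

Because the speed differs between layers of air at different temperatures, sound paths can bend as they cross those layers.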
THE EAR
How does the ear work? How do we perceive sound? How does it relate to sound synthesis techniques?
The ear
[Figure: the ear, with the pinna labelled]
Outer ear
• The outer ear is composed of:
– The pinna (visible part)
– Auditory canal (meatus)
– Tympanic membrane (eardrum)
• The pinna
– significantly modifies incoming sound (esp. high frequencies)
– is important for sound localization
[Figure: sound arriving at the pinna]
(Outer ear in bats)
• Bats rely heavily on sonar for localization, navigation and hunting.
• Generate high pitched ultrasounds and listen for echoes.
• Highly refined sound perception & localisation (accurate enough to catch a bug in flight!)
The middle ear
• Sounds make the eardrum vibrate
• Vibrations are transmitted through the middle ear by 3 bones:
– Malleus (hammer)
– Incus (anvil)
– Stapes (stirrup)
• Stapes is connected to a membrane called the oval window
• Transmits sounds to the inner ear.
[Figure: the middle ear, labelling the outer ear, auditory canal, tympanic membrane, malleus (hammer), incus (anvil), stapes (stirrup), tensor tympani, Eustachian tube and inner ear]
The middle ear (function?)
• Efficient transfer of sound from the air to the fluid in the cochlea (inner ear)
– Otherwise, sound would mostly be reflected.
– The oval window's resistance is higher than that of air
– ... and also higher than the eardrum's.
– ... but its surface is smaller!
– The middle ear is an efficient transfer mechanism (like the gears of a bicycle), esp. within 500 Hz-4 kHz
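The "smaller surface" argument can be made concrete: the same force collected on a large eardrum and delivered to a small oval window gives a higher pressure. The sketch below is only an illustration; the areas and lever ratio are assumed, textbook-order values that do not come from the slides, and the function names are hypothetical.

```python
import math

def middle_ear_pressure_gain(eardrum_area_mm2: float,
                             oval_window_area_mm2: float,
                             lever_ratio: float = 1.0) -> float:
    """Pressure ratio (oval window / eardrum) for an idealised middle ear.

    Pressure = force / area, so concentrating the force on a smaller
    surface multiplies the pressure by the area ratio; the ossicles add
    a further mechanical lever factor.
    """
    return (eardrum_area_mm2 / oval_window_area_mm2) * lever_ratio

def gain_db(pressure_ratio: float) -> float:
    """Express a pressure ratio in decibels."""
    return 20.0 * math.log10(pressure_ratio)

# Illustrative, assumed values (not from the slides):
# effective eardrum area 55 mm^2, oval window 3.2 mm^2, lever ratio 1.3.
example_gain_db = gain_db(middle_ear_pressure_gain(55.0, 3.2, 1.3))
```

With these assumed numbers the gain comes out at roughly 27 dB. The real gain varies with frequency, which this frequency-independent model ignores.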
The middle ear (function - cont’d)
• "Barany (1938) suggested that middle ear reduces transmission of bone-conducted sound to the cochlea" Moore (2003)
– Internally generated sounds: chewing, flow of air, blood, creaking of joints...
– Would cause masking...
– The middle ear only transmits differential movements between the ossicles and the skull (when the skull vibrates, the stapes vibrates in sync and does not transmit)!
– Note: birds & reptiles usually swallow food whole...
The middle ear: acoustic reflex
• Muscles in the middle ear (tensor tympani) can contract to pull the stapes away from the oval window, and therefore drastically reduce transmission.
• The ear reacts to loud sound by contracting the muscles attached to the ossicles, attenuating the sounds that follow.
• Usually needs 70-90 dB sounds to trigger.
• Protects the ear against loud sounds BUT is slow: it doesn't help against sudden loud noises!
The inner ear
• The inner ear is also called the labyrinth
• Vestibular system:
– Balance
– Spatial orientation
– Horizontal, posterior and superior canals...
• Cochlea is used for hearing...
[Figure: the inner ear, labelling the posterior, superior and horizontal canals, utricle, saccule, vestibule and cochlea]
The Cochlea
• Shaped like a spiral (no functional reason – space economy?)
• Filled with incompressible fluids
• Rigid, bony walls
• Transmits sound pressure without loss!
• Divided along its length by two membranes
– Basilar membrane
– Reissner's membrane
Cochlea (cont’d)
Basilar membrane
• When the oval window moves
– The round window moves in the opposite direction
– The basilar membrane moves too
– Waves propagate along the BM
Basilar membrane (cont’d)
• Mechanical properties of the BM vary along its length:
– Narrow & stiff at the base
– Wider & less stiff at the apex
• The position of the peak depends on frequency!
– High frequency: near the base
– Low frequency: near the apex
Basilar membrane (cont’d)
• The BM acts as an (imperfect) Fourier analyser!
• The frequency that gives the best response at a point of the BM is called the Characteristic Frequency (CF) for this point.
• In response to a steady tone, all points vibrate at the same frequency, but some points vibrate with greater amplitude.
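One way to picture the BM as a frequency analyser is a small bank of detectors, each tuned to its own characteristic frequency, all driven by the same tone. The sketch below uses the Goertzel recurrence as a stand-in for each tuned point. It is an analogy under simplifying assumptions, not a model of cochlear mechanics, and the channel frequencies are arbitrary choices.

```python
import math

def goertzel_power(samples, sample_rate, freq_hz):
    """Power of `samples` at freq_hz, computed with the Goertzel recurrence."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def best_channel(samples, sample_rate, channel_freqs):
    """Return the channel frequency that responds most strongly."""
    powers = [goertzel_power(samples, sample_rate, f) for f in channel_freqs]
    return channel_freqs[powers.index(max(powers))]

SAMPLE_RATE = 8000  # Hz, assumed for this example
# A steady 440 Hz tone lasting 0.1 s.
tone = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(800)]
# Hypothetical "places" along the membrane, each with its own CF.
channels = [110, 220, 440, 880, 1760]
strongest = best_channel(tone, SAMPLE_RATE, channels)
```

Every channel sees the same input, but the one whose CF matches the tone responds most strongly. The real BM is "imperfect" in that neighbouring places also respond, which this idealised bank does not reproduce.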
Organ of Corti
Organ of Corti & Hair cells
• Between the BM & the tectorial membrane are hair cells, which form the Organ of Corti
– Inner hair cells (3,500 cells – 40 hairs each)
– Tunnel of Corti
– Outer hair cells (12,000 cells – 140 hairs each)
• (Hairs are called stereocilia)
Stereocilia
Stereocilia (cont’d)
• Transform mechanical movements into neural activity
• Stereocilia are joined by fine links ("tip links")
• Deflection of the stereocilia applies tension to those links
• This opens "transduction channels"
• Flow of potassium ions, voltage alteration, etc.
Inner hair cells
• Each inner hair cell is connected to ~20 neurons
• Most (all?) information is transferred by inner hair cells.
Outer hair cells
• What about the outer hair cells?
• Actively influence the mechanics of the cochlea
– High sensitivity
– Sharp tuning
– Evidenced by experiments with drugs that affect outer hair cells' performance.
• Control from above? 1,800 efferent nerve fibers!
• Hearing is not a passive phenomenon: even the early stages are influenced by higher brain areas!
Otoacoustic emissions
• Experiments by Kemp (1978): if a click is sounded next to the ear, it is possible to detect a sound coming out of the ear (using a microphone sealed into the ear canal)
– Reflections?
– Not only! Sound can be detected with delays from 5 to 60 ms (Kemp echoes).
– Relative level is greater at low input levels (the emission grows by 3 dB for each 10 dB of input)
– May be stronger than the actual input!
• Disappear even with moderate cochlear pathologies
Neurons in the auditory nerve
• Approximately 30,000 neurons in each auditory nerve (left, right)
• Studied using fine-tipped micro-electrodes to record voltage in single cells
• Most neurons fire spontaneously (0-150 spikes/s)
• Most neurons are tuned to specific frequencies.
• Phase locking: spikes occur at a specific phase of the stimulating waveform, giving temporal regularity.
Auditory pathways
ABSOLUTE THRESHOLDS
Some comments on the absolute limits of hearing
Minimum Audible Pressure
• An absolute threshold is the minimum detectable level of a sound, in the absence of other external sounds.
• Depends on set-up: important to define precisely how the intensity is measured
– Probe microphone (ideally close to the eardrum)
• Usually using headphones.
• Threshold is called the Minimum Audible Pressure (MAP)
Minimum Audible Field
• Alternatively, loudspeakers in a large anechoic chamber (walls, floor, ceiling are highly sound absorbing)
– Measurement made after the subject is removed, at the point occupied by the center of the listener's head
– Minimum Audible Field (MAF)
MAF vs. MAP
• MAF is binaural, MAP is monaural
• MAF factors in the head, pinna & meatus effects (broad resonance)
• Thresholds increase rapidly at very high and very low frequencies, reflecting the transmission characteristics of the middle ear!
Absolute Thresholds
• Depends on people: individuals may vary by up to 20dB and have “normal” hearing.
• Highest audible frequency depends on age: kids up to 20kHz, adults about 15kHz
Hearing loss
• MAP can be used for generating Audiograms and evaluating hearing loss.
• Two types of hearing loss:
– Conductive: middle ear problems reducing sound transmission to the cochlea (e.g. infection: otitis media, bone growth around stapes or oval window, wax in the ear canal), causing an elevation in absolute threshold
  • Help: hearing aid, surgery
– Sensorineural: defects in the cochlea, the auditory nerve or higher centers in the brain.
  • Extent of the loss increases with frequency (esp. in the elderly)
  • Makes it hard to understand speech, esp. in noisy environments.
  • Generally, no surgery is possible.
UK data, using frequencies 0.5, 1, 2 and 4 kHz:
– Ages 61-71: 51% with loss > 20 dB, 30% > 40 dB
– Ages 71-80: 74% > 20 dB, 30% > 40 dB
Using 4, 6 and 8 kHz:
– Ages 71-80: 98% > 20 dB, 81% > 40 dB
Temporal effect in Absolute Threshold
• Absolute thresholds of sounds depend on duration (Exner, 1876)
– For sounds > 500 ms, no effect
– For sounds < 200 ms, the minimal sound intensity increases as duration decreases
• The ear appears to integrate a stimulus's energy over time
– In practice: $(I - I_L)\,t = I_L\,\tau$
  • $I$: threshold intensity for a sound of duration $t$
  • $I_L$: threshold intensity for a long sound (> 500 ms)
  • $\tau$: a constant, the integration time of the auditory system
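Writing τ for the integration-time constant, the relation (I − IL)·t = IL·τ rearranges to I / IL = 1 + τ / t, which gives the threshold elevation for a short sound directly. The sketch below evaluates it in decibels. The default τ of 200 ms is an assumed, illustrative value rather than a figure from the slides, and the function name is made up for this example.

```python
import math

def threshold_elevation_db(duration_s: float, tau_s: float = 0.2) -> float:
    """Threshold elevation (dB) of a sound of duration_s relative to a long sound.

    From (I - IL) * t = IL * tau  =>  I / IL = 1 + tau / t.
    tau_s is an assumed integration-time constant (illustrative only).
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return 10.0 * math.log10(1.0 + tau_s / duration_s)

# Shorter sounds need more intensity to be detected.
elevations = {t: threshold_elevation_db(t) for t in (0.02, 0.2, 2.0)}
```

When the duration equals τ the threshold sits about 3 dB above the long-sound threshold, and for very long sounds the elevation tends to zero.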
AUDITORY MASKING
Limits of human hearing, how one sound can hide another...
Auditory masking
• The human auditory system has a limited capacity to resolve sinusoidal components of complex sounds
– E.g., if we listen to two tuning forks, one tuned at C (262 Hz) and the other at A (440 Hz), we hear two separate tones, each with its own pitch.
– Yet, one sound can be obscured, or rendered inaudible, by other sounds (music from a car radio may mask the car's engine – or conversely!)
Auditory Masking (definition)
• Definition: "1. The process by which the threshold of audibility for one sound is raised by the presence of another (masking) sound; 2. The amount by which the threshold of audibility of a sound is raised by the presence of another (masking) sound. The unit customarily used is the decibel." American Standards Association
Auditory masking
• A sound is more easily masked by another having a similar frequency.
– Limitations of the basilar membrane
– Limits of frequency selectivity
• Masking is very dependent on time:
– Simultaneous presentation of the sounds
– Forward masking
– Backward masking
The Critical Band
• Fletcher (1940) suggested the auditory system works as a bank of bandpass filters, with overlapping passbands, based on the BM.
• When detecting a sound in a noisy background, a listener is assumed to make use of the filter with the closest center frequency.
• Threshold is determined by the amount of noise passing through this filter
• “power spectrum” view on masking (Patterson & Moore, 1986)
Critical Band (cont'd)
• Fletcher's idea:
• Only a narrow band of frequencies surrounding the tone contributes to masking the tone
• When the noise just masks the tone, the power of the tone divided by the power of the noise inside that band is a constant K
• Assuming rectangular bandpass filters (not true, but convenient!), we have:
– $P / (W N_0) = K$
– where $W$ is the bandwidth, $N_0$ is the noise power density (power per Hz), and $P$ is the tone power.
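Rearranged, Fletcher's relation gives the bandwidth as W = P / (K·N0), so a masked threshold measured in broadband noise yields an estimate of W. The sketch below works in decibels and treats N0 as noise power per Hz. K defaults to 1 purely for illustration, and the function name and the numbers in the example are hypothetical.

```python
import math

def critical_bandwidth_hz(tone_level_db: float,
                          noise_density_db_per_hz: float,
                          k: float = 1.0) -> float:
    """Estimate W (Hz) from W = P / (K * N0).

    tone_level_db: tone power at masked threshold, in dB.
    noise_density_db_per_hz: noise power per Hz, in dB on the same reference.
    k: the constant K (assumed to be 1 here for illustration).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    ratio_db = tone_level_db - noise_density_db_per_hz
    return 10.0 ** (ratio_db / 10.0) / k

# Hypothetical measurement: the tone is just masked when it sits 20 dB
# above the noise density, which corresponds to a 100 Hz wide band for K = 1.
w_example = critical_bandwidth_hz(60.0, 40.0)
```

Because the filters are not really rectangular, a figure obtained this way is better read as an equivalent bandwidth than as a literal filter width.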
Off-frequency Listening
Shape of the Auditory Filter
• The theory is sound, but the rectangular bandpass assumption is wrong
• The shape is estimated by measuring psychophysical auditory tuning curves
• Notched noise method to remove off-frequency components
Auditory masking curves
• Auditory masking curves show how much masking occurs at which frequencies
• Useful for efficient compression: no need to encode a frequency if we can't hear it!
– E.g., MP3
Contralateral masking
• Another form of masking is when the signal is presented to one ear, and the noise is presented to the other.
• This is called contralateral masking
• When both signal and noise are presented to the same ear, this is called ipsilateral masking
Temporal Masking
• This occurs when the masking and masked signals are not simultaneous
– If the masking sound precedes the masked sound, it is called forward masking.
– If the masking sound follows the masked sound, it is called backward masking
– Masking effectiveness attenuates exponentially from the onset and offset of the masker
  • Onset attenuation ~20 ms.
  • Offset attenuation ~100 ms.
• Note: different from the ear's acoustic reflex (which reduces the ear's sensitivity after a loud sound)
PERCEPTION OF LOUDNESS
Psychoacoustic perception of loudness, versus sound pressure.
Loudness
• Fletcher & Munson (1933)
– Subjects listen to pure tones
  • Various frequencies
  • Amplitude increased in 10 dB steps
• Robinson & Dadson (1956) – more accurate
– Basis for standard ISO 226
• Perceived loudness (phons)
– 1 phon = 1 dB SPL @ 1 kHz
• British Standard BS ISO 226 (2003) (source: Wikipedia)
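Since the phon scale is anchored to dB SPL at 1 kHz, it helps to see how dB SPL is obtained from sound pressure. The sketch below uses the standard 20 µPa reference pressure; the function names are made up for this example, and the conversion to phons is only valid for a 1 kHz tone (other frequencies need the equal-loudness contours).

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL, in pascals

def spl_db(pressure_pa: float) -> float:
    """Sound pressure level (dB SPL) for an RMS pressure in pascals."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)

def phons_at_1khz(pressure_pa: float) -> float:
    """Loudness level in phons of a 1 kHz tone: equal to its dB SPL by definition."""
    return spl_db(pressure_pa)

# A tenfold increase in pressure adds 20 dB.
step = spl_db(0.2) - spl_db(0.02)
```

An RMS pressure of 1 Pa corresponds to roughly 94 dB SPL on this scale.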
SOUND SPATIALIZATION
Sound localization in space and stereo hearing...
Sound Spatialization
• Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides.
• Humans can discern interaural time differences of 10 microseconds or less
Cues for Localization
• Interaural time difference
– The sound will reach the ears at different times, depending on its location in space
  • Phase delay at low frequencies
  • Group delay at high frequencies
• Interaural level differences
– The sound will be louder in one ear compared to the other.
• Distance can be estimated from
– Spectrum: high frequencies are attenuated more quickly
– Lower loudness
– Movement: parallax depends on distance
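To get a feel for the size of interaural time differences, the sketch below uses Woodworth's spherical-head approximation, ITD = (r / c)·(θ + sin θ). This model is brought in for illustration and is not part of the slides; the head radius and speed of sound are assumed values, and the formula is meant for azimuths between 0° (straight ahead) and 90° (directly to one side).

```python
import math

def woodworth_itd_s(azimuth_deg: float,
                    head_radius_m: float = 0.0875,
                    speed_of_sound: float = 343.0) -> float:
    """Interaural time difference (s) for a distant source, spherical-head model.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    head_radius_m and speed_of_sound are assumed, illustrative values.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

itd_side_us = woodworth_itd_s(90.0) * 1e6      # largest ITD, in microseconds
itd_one_degree_us = woodworth_itd_s(1.0) * 1e6  # ITD for a 1 degree offset
```

Under these assumptions a source 1° off centre produces an ITD just under 10 microseconds, which is consistent with the slides pairing a frontal accuracy of about 1° with a sensitivity to time differences of about 10 microseconds.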
Cues for Localization
• In addition, the pinnae modify the spectra of incoming sounds in a way that depends on the angle of incidence of the sound to the head
• Head+pinna form a direction-dependent filter
• Measured by comparing the spectrum of the sound source vs. the spectrum reaching the eardrum: Head Related Transfer Function (HRTF)
• High frequencies (>6kHz) interact especially strongly with the pinna.
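The comparison of spectra described above can be sketched as a per-frequency gain. This simplified example works on magnitude spectra only (a full HRTF also carries phase), and the function name and sample values are hypothetical.

```python
import math

def hrtf_gain_db(source_magnitudes, eardrum_magnitudes):
    """Per-frequency gain (dB) of the eardrum spectrum relative to the source.

    Both inputs are magnitude spectra sampled at the same frequencies.
    """
    if len(source_magnitudes) != len(eardrum_magnitudes):
        raise ValueError("spectra must have the same length")
    return [20.0 * math.log10(ear / src)
            for src, ear in zip(source_magnitudes, eardrum_magnitudes)]

# Hypothetical magnitudes at three frequencies for one direction of incidence.
source = [1.0, 1.0, 1.0]
eardrum = [1.0, 2.0, 0.5]
gains = hrtf_gain_db(source, eardrum)
```

Repeating the measurement for many directions of incidence gives one such curve per direction, which is what makes the head and pinna a direction-dependent filter.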
SYNESTHESIA
An unusual case....
Synesthesia
• Synesthesia: perceiving one sense as another, e.g. sound as colors.
• Prevalence: unknown, could be as high as 1 in 23 (Simner J, Mulvenna C, Sagiv N, et al. (2006))
• This is believed to have a neurological basis (fMRI evidence)
• Famous synesthetes include David Hockney, who perceives music as color, shape, and configuration, and who uses these perceptions when painting opera stage sets – but not while creating his other artworks
Synesthesia
• Some facts:
– Synesthesia is involuntary and automatic
– Synesthetic perceptions are spatially extended (sense of location)
– Synesthetic percepts are consistent and generic
– Synesthesia is highly memorable
– Synesthesia is laden with affect
Richard Cytowic (2002, 2003, 2009)
Additional Reading
• Brian C.J. Moore (2003) An Introduction to the Psychology of Hearing, Academic Press.