Challenges and solutions for realistic room simulation
ASA June, 2002
Durand R. Begault
Human Factors Research and Technology Division
NASA Ames Research Center
Moffett Field, California
Background
• Perception of the acoustical environmental context
• Basic technique of auralization (virtual room synthesis)
Challenges for improved simulation
• Determining acceptable data reduction in measurement and synthesis (echo thresholds)
• Simulation of low frequency energy, coupled spaces
• Accounting for localization error due to reproduction method
• Accurate simulation of source directivity
Spatial hearing fundamentally involves perception of the location of a sound source at a point in space (azimuth, elevation, distance).
[Figure: listener with the spatial hearing dimensions labeled: distance, elevation, azimuth]
Spatial (and non-spatial) hearing simultaneously reveals information about the environmental context (EC) of a sound source.
• Reverberance, echoes, and surface characteristics reveal the dimension and extent of a space via cognitive associations
• Interactive cues: EC’s effect on speech effort, intelligibility
• Perceived sound source width and distance are affected by EC (“concert hall” musical acoustic factors: intimacy, envelopment)
• Simultaneous sources: ambient sound & background noises (inside & outside) contribute to EC perception
[Figure: listener within an environmental context; labels: distance, elevation, azimuth, image size]
Auralization: virtual simulation of sound sources within an environmental context, using loudspeakers or headphones.
Useful applications for auralization technology include:
• Psychoacoustic investigations of spatial hearing in realistic acoustical environments (e.g., echo thresholds)
• Acoustical engineering (listening to a virtual room as a part of the design phase)
• Improved virtual environment simulation and 3-D sound reproduction
• Increased realism for entertainment (music reproduction)
• Evaluation of sound quality for industrial applications
Basic technique for auralization
• Modeling (pre-synthesis) phase: either measure or predict the room impulse response (data reduction is usually necessary)
• Translate the impulse response to listener-referenced virtual source locations
• Apply virtual acoustic techniques (HRTF filtering) to derive the binaural impulse response
• Synthesis phase: convolve “neutral” (anechoic) signals with the binaural impulse response of the room (data reduction is usually necessary for rendering)
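The synthesis phase can be sketched as a direct convolution of an anechoic signal with a left/right binaural room impulse response pair. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `convolve` and `auralize` and the toy impulse responses are hypothetical, and a real-time system would more likely use FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def auralize(anechoic, brir_left, brir_right):
    """Convolve a mono anechoic signal with a binaural room IR pair."""
    return convolve(anechoic, brir_left), convolve(anechoic, brir_right)


# Toy data (hypothetical): direct sound plus one delayed reflection per ear.
anechoic = [1.0, 0.5]
brir_left = [1.0, 0.0, 0.25]
brir_right = [0.5, 0.0, 0.5]
left, right = auralize(anechoic, brir_left, brir_right)
```

The cost of the direct form grows with the product of signal length and impulse response length, which is one reason data reduction of the impulse response matters for rendering.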
• Recent study indicates that echo thresholds in real and virtual environments are similar
• Engineering rule of thumb: early reflections should be inaudible when their level relative to the direct sound is below -21 dB at a 3 ms delay and below -30 dB at 15-30 ms delays (add 5 dB for speech stimuli).
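The rule of thumb suggests a simple data-reduction pass over a list of early reflections. The sketch below is an illustration only: the function names are hypothetical, the linear interpolation between the 3 ms and 15 ms points is an assumption (the rule gives only the two anchor values), and "add 5 dB for speech" is read here as raising both limits by 5 dB.

```python
def inaudibility_limit_db(delay_ms, speech=False):
    """Level (dB re direct sound) below which a reflection is taken as inaudible.

    Returns None outside 3-30 ms, where the rule of thumb gives no value.
    Interpolation between 3 ms and 15 ms is an assumption of this sketch.
    """
    if delay_ms < 3.0 or delay_ms > 30.0:
        return None
    if delay_ms >= 15.0:
        limit = -30.0
    else:
        limit = -21.0 + (delay_ms - 3.0) * (-30.0 + 21.0) / (15.0 - 3.0)
    return limit + 5.0 if speech else limit


def prune_reflections(reflections, speech=False):
    """Keep (delay_ms, level_db) reflections that are not below the limit."""
    kept = []
    for delay_ms, level_db in reflections:
        limit = inaudibility_limit_db(delay_ms, speech)
        if limit is None or level_db >= limit:
            kept.append((delay_ms, level_db))
    return kept


# Hypothetical reflection list: (delay in ms, level in dB re direct sound).
reflections = [(3.0, -25.0), (3.0, -18.0), (20.0, -28.0), (20.0, -35.0), (50.0, -40.0)]
kept = prune_reflections(reflections)
```

Reflections outside the 3-30 ms range are kept unchanged, since the rule says nothing about them.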
Although thresholds for reverberation are relatively low, background noise (e.g., NC 35) can mask a significant portion of reverberant energy (particularly the reverberant decay).
[Figure: One-third octave band noise criteria (NC) curves, NC 5 to NC 65, versus one-third octave band center frequency (31.5 Hz to 8 kHz); overlays: normal speech, relative threshold for reverberation, approximate threshold of hearing for continuous noise]
[Figure: Reverberation threshold (speech) re 60 dB SPL, versus octave-band center frequency (250, 500, 1000, 2000 Hz; fbw = full bandwidth), for small, medium, and large rooms]
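A rough way to see how much of a reverberant decay a noise floor can hide is to idealize the decay as linear in dB (60 dB over the reverberation time) and find when it reaches the noise level. This is a simplified sketch, not a masking model: the function name and the example levels are hypothetical, and a band level of 35 dB SPL is used only as a stand-in for a noise floor, not as the definition of NC 35.

```python
def audible_decay_time(rt60_s, reverb_start_db, noise_floor_db):
    """Time (s) for an idealized linear-in-dB decay to reach the noise floor."""
    if rt60_s <= 0:
        raise ValueError("rt60_s must be positive")
    headroom_db = reverb_start_db - noise_floor_db
    if headroom_db <= 0:
        return 0.0  # decay starts at or below the noise floor
    return rt60_s * headroom_db / 60.0


# Hypothetical values: 2 s reverberation time, decay starting at 60 dB SPL,
# noise floor of 35 dB SPL in the band of interest.
t = audible_decay_time(2.0, 60.0, 35.0)
```

Under these assumptions only the first part of the 2 s decay lies above the noise, which is the kind of margin that can relax the length of the rendered impulse response.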
• Creating virtual spatial sound environments: playback
Historically, the use of acoustical absorption to ‘neutralize’ acoustical characteristics of rooms began ca. 1930s
[Figures: sound stage & control room; acoustical wall/ceiling panels; recording for film]
The realism of environmental context simulations has increased in tandem with the “neutralization” of rooms via absorption
• Echo chambers
• Reverberation plates
• Spring reverberators
• ...
• Real-time convolution for head-tracked auralization
Reverberation using echo chamber, NBC, 1930s
Auditorium simulators using actual sound sources
University of Göttingen, 1965.
Technical University of Denmark, Lyngby, 1992.
[Figure: Localization error for headphone stimuli (azimuth). Unsigned azimuth error (degrees, 0 to 30) for anechoic speech, showing individual differences, and mean values for different reverberation treatments: anechoic, early reflections, full auralization]
Head-tracked systems increase realism of 3D audio simulations in general (minimization of reversals)
Auditory localization can be influenced or biased by cognitive references and visual capture
• Creating virtual spatial sound environments: measurement
Spatial room impulse responses can be obtained from an existing room by…
1. Calculating the response from a room model (ray tracing, image modeling)
2. Recording the binaural impulse response (a binaural recording using a dummy head)
3. Recording a directional impulse response for post-processing of directional information
Challenges:
-adequate yet reasonable sampling of sources & receivers
-directivity data for sound sources limited; sound source movement usually unaccounted for
-modeling low frequency behavior
Calculate the impulse response from a room model
(using ray tracing, image modeling)
Ray tracing, while imperfect, can be an efficient means for finding the location of disturbing echoes
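To illustrate the image modeling approach (not ray tracing), the sketch below computes the direct path and the six first-order image sources of a rectangular "shoebox" room, then lists arrival delays and amplitudes. It is a minimal example under assumptions: the room shape, the single broadband `reflection_gain`, the 1/distance spreading law, and all names are hypothetical simplifications, and the source and receiver must not coincide.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def first_order_images(room, source):
    """Mirror the source in each of the six walls of a shoebox room.

    room is (Lx, Ly, Lz) with walls at 0 and L on each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def reflection_paths(room, source, receiver, reflection_gain=0.8):
    """Return sorted (delay_s, amplitude) for direct sound and first-order images."""
    paths = []
    positions = [tuple(source)] + first_order_images(room, source)
    for i, pos in enumerate(positions):
        dist = math.dist(pos, receiver)
        gain = 1.0 if i == 0 else reflection_gain
        paths.append((dist / SPEED_OF_SOUND, gain / dist))
    return sorted(paths)


# Hypothetical 10 m x 8 m x 3 m room.
paths = reflection_paths((10.0, 8.0, 3.0), (2.0, 3.0, 1.5), (7.0, 3.0, 1.5))
```

A late, strong entry in `paths` points back to the wall that produced it, which is how such a model can help locate a disturbing echo.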
directional horn
Challenges:
-localization error due to non-individualized HRTF
-deriving spatial information from a 2-channel source
-multiple measurements for each receiver position required
Measure the binaural room impulse response
via a dummy head recording
Directional mic room impulse response.
Essert (1996) used a B-format output from a SoundField MKV microphone to obtain one omnidirectional (W) and three dipole IRs (oriented left-right X, back-front Y, and down-up Z, respectively).
The omnidirectional response reveals the arrival time of significant early reflections;
Cross-correlations between the monopole and dipole responses indicate reflection direction of arrival.
Individualized HRTFs can be applied during the synthesis phase to significant early reflections
Intensity measurements have also been investigated in the literature.
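The cross-correlation idea for the directional microphone responses can be sketched as follows: within the time window of one reflection, the zero-lag correlation of the omnidirectional channel with each dipole channel is proportional to a direction cosine. This is an illustrative sketch, not necessarily the procedure used by Essert; the function name is hypothetical, and it assumes each dipole's positive lobe points along its positive axis and that a single reflection dominates the window.

```python
import math


def direction_of_arrival(w, x, y, z):
    """Estimate a unit direction vector from zero-lag W-dipole correlations.

    w, x, y, z hold the samples of one reflection's time window.
    """
    cx = sum(a * b for a, b in zip(w, x))
    cy = sum(a * b for a, b in zip(w, y))
    cz = sum(a * b for a, b in zip(w, z))
    norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    if norm == 0.0:
        return None  # no directional energy in this window
    return (cx / norm, cy / norm, cz / norm)


# Synthetic check: a pulse arriving along a known direction u, with each
# dipole channel modeled as the omni signal scaled by a direction cosine.
u = (0.6, 0.0, 0.8)
w = [0.0, 1.0, -0.5, 0.25]
x = [u[0] * s for s in w]
y = [u[1] * s for s in w]
z = [u[2] * s for s in w]
estimate = direction_of_arrival(w, x, y, z)
```

The estimated direction can then select the HRTF pair applied to that reflection during synthesis.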
• Simulation of coupled spaces, low frequency energy
Reverberation can cause spatial movement of reverberant sound within coupled spaces
-examples: theater; organ music in a gothic cathedral
-difficult to model
[Figure: coupled spaces in a theater; labels: sound source; undamped stagehouse, long reverberant decay; absorptive seating area, short reverberant decay]
Measurement of “moving reverberation”:
Grace Cathedral, San Francisco
-starter pistol sound source at 5 locations
-7 measurement microphones simultaneously recorded with synch pulse
-Superior to binaural measurement since location/time can be tracked
[Figure: Grace Cathedral measurement layout; dimension label: 310’]
[Figure: Sound source at altar, 63 Hz octave band, 20 ms envelope. Relative level (dB, -80 to -40) versus time (s), 0.5 to 1 s, at seven microphone positions: upper apse, lower apse, upper altar, lower altar, upper nave, lower nave, vestibule; annotations: upper nave, upper apse]
[Figure: Sound source at altar, 63 Hz octave band, 20 ms envelope. Relative level (dB, -80 to -40) versus time (s), 0 to 0.5 s, at the same seven microphone positions; annotations: upper-lower altar, vestibule]
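A 20 ms envelope of the kind plotted for each microphone can be sketched as a windowed level calculation. The example below is an assumption about the processing, not a description of the original analysis: it uses RMS level in consecutive non-overlapping windows, omits the 63 Hz octave-band filtering, and the function name and `floor_db` parameter are hypothetical.

```python
import math


def envelope_db(samples, sample_rate, window_s=0.020, floor_db=-120.0):
    """RMS level in dB for consecutive non-overlapping windows."""
    n = max(1, int(round(window_s * sample_rate)))
    levels = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        mean_square = sum(v * v for v in block) / n
        if mean_square > 0.0:
            levels.append(max(floor_db, 10.0 * math.log10(mean_square)))
        else:
            levels.append(floor_db)  # silent window
    return levels


# Toy signal at a 1 kHz sample rate: three 20 ms segments of decreasing level.
samples = [1.0] * 20 + [0.1] * 20 + [0.0] * 20
levels = envelope_db(samples, 1000)
```

Computing the same envelope for each synchronized microphone and comparing levels window by window is one way to track where the reverberant energy sits over time.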
Overlap of frequency sensitivities for haptic and acoustic stimuli
“Haptic” includes both kinesthetic (muscle) and touch (cutaneous) sensations.
– Touch: 0 – 1 kHz
– Kinesthesia: 0 – 30 Hz
– Audition: 0.02 – 20 kHz; overlap from 0.02 – 1 kHz
Very low-frequency vibration can also be sensed via vestibular, visceral, and visual sensory systems
Transmission paths from a “felt-heard” source to a receiver
-Airborne pressure waves (sonic & infrasonic)
  -transfer of acoustic wave to mechanical vibration possible
  -simultaneous auditory-haptic stimuli possible (touch; bone conduction; visceral)
-Ground / structure-borne vibration
  -“Full-body” vibration from 8–80 Hz (displacement threshold 1.5 × 10^-6 m)
  -Haptic sensation via kinesthetic, touch sensors
  -Transfer of vibration into acoustic excitation
-Perception of haptic and acoustical stimuli influenced by presence of coordinated visual stimuli
[Figure: block diagram; labels: vibration source (internal, external); ground, structure response; airborne sound; hearing, feeling, seeing; walls, windows, objects; chairs, tables, floor; walls, windows, plants; 3-D audio display; head-mounted visual display; expectation, inter-modal coordination, identification, experience-adaptation; response: qualitative assessment, performance metric]
Why simulate, and manipulate, both haptic and acoustical phenomena simultaneously?
-Improved “realism” and “immersion” in virtual environments (experiments; entertainment; training)
Simulations of musical performance (as player or listener) in particular seem incomplete without tactile cues
-Product quality evaluation (e.g., automotive industry)
-Performance enhancement for tasks in VE (e.g., transportation simulators)
Thresholds for vibrotactile perception, ball of thumb
-peak at 250 Hz corresponds to an acceleration of 0.24 m/s²
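Vibration thresholds are quoted here both as displacement and as acceleration. For sinusoidal vibration the two are related by a = (2πf)² d, which the sketch below applies to the 250 Hz value. The function names are hypothetical, and the conversion assumes a pure sinusoid with both quantities expressed the same way (both peak or both RMS), which the slide does not specify.

```python
import math


def displacement_from_acceleration(accel_m_s2, freq_hz):
    """Displacement amplitude (m) of a sinusoid with the given acceleration amplitude."""
    omega = 2.0 * math.pi * freq_hz
    return accel_m_s2 / (omega * omega)


def acceleration_from_displacement(disp_m, freq_hz):
    """Acceleration amplitude (m/s^2) of a sinusoid with the given displacement amplitude."""
    omega = 2.0 * math.pi * freq_hz
    return disp_m * omega * omega


# 0.24 m/s^2 at 250 Hz, the vibrotactile value quoted above.
d_250 = displacement_from_acceleration(0.24, 250.0)
```

Under these assumptions, 0.24 m/s² at 250 Hz corresponds to a displacement of roughly 0.1 micrometer.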
Workplace sound and vibration at a desk
Summary
• Challenges for improved simulation include determining acceptable data reduction in measurement and synthesis (echo thresholds)
• Data reduction based on echo thresholds and reverberant masking can relax demands on auralization synthesis computation (important for real-time rendering that includes head tracking).
• Simulation of low frequency effects, modal prediction, vibration, and source directivity requires further research efforts