On the manifolds of spatial hearing
Vikas C. Raykar and Ramani Duraiswami
University of Maryland, College Park
NIPS 2006 Workshop on Novel Applications of Dimensionality Reduction
December 9, 2006

Human spatial hearing
How are humans able to judge the direction of a sound source?
Why do we have two ears?
Why is the pinna shaped the way it is?

Plan of the talk
Human spatial hearing
Perceptual manifolds
Exploratory studies
Applications

How do humans localize a sound source?
Primary cues
Interaural Time Difference (ITD)
Interaural Level Difference (ILD)
These explain localization only in the horizontal plane.
All points on one half of a hyperboloid of revolution have the same ITD and ILD (the cone of confusion).

Other cues
Pinna shape gives elevation cues at higher frequencies.
Torso and head give elevation cues at lower frequencies.
[Figure: source, head, left ear, right ear]

An intricate system to model completely: the head, torso, and pinna.

Head-Related Transfer Function (HRTF)
Spectral filtering caused by the head, torso, and pinna.
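
Because this filtering is linear, applying a transfer function in the frequency domain is equivalent to convolving the source with the corresponding impulse response in the time domain. The sketch below checks that equivalence numerically for one ear. The 4-tap `hrir` is a made-up toy response, not a measured one, and the helper name `render_one_ear` is hypothetical.

```python
import numpy as np

def render_one_ear(source, hrir):
    """Filter a mono source with one ear's impulse response (time domain)."""
    return np.convolve(source, hrir)

# Toy data (assumptions for illustration, not measured responses).
source = np.array([1.0, 0.5, -0.25, 0.0, 0.75])
hrir = np.array([0.0, 1.0, 0.5, 0.25])    # a delayed, decaying pulse

# Time-domain rendering.
y_time = render_one_ear(source, hrir)

# Frequency-domain rendering: multiply the spectra, then invert.
n = len(source) + len(hrir) - 1            # length of the linear convolution
hrtf = np.fft.rfft(hrir, n)                # transfer function of the toy filter
y_freq = np.fft.irfft(np.fft.rfft(source, n) * hrtf, n)

print(np.allclose(y_time, y_freq))
```

A binaural renderer would presumably repeat this with a separate impulse response for each ear and play the two outputs over headphones.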
HRIR: head-related impulse response.
The HRIR can be measured experimentally for all elevations and azimuths.
Convolve the source signal with the measured HRIR to create virtual audio.
[Figure: sample HRIR and HRTF for a source directly in front of your right ear]

CIPIC Database: public-domain HRIR database
HRIRs sampled at 1250 points around the head
45 subjects
Anthropometry measurements
V. Ralph Algazi, Richard O. Duda, Dennis M. Thompson, and Carlos Avendano, "The CIPIC HRTF database," in WASPAA '01 (2001 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics), Mohonk Mountain House, New Paltz, NY, Oct. 2001.

Interaural polar coordinate system
Azimuth
Elevation
Implications of the origin not being exactly at the center of the head.

Manifold representation
An HRIR of N samples can be considered as a point in N-dimensional space.
As the elevation is varied smoothly, the points essentially trace out a one-dimensional manifold in the N-dimensional space.
If we can unfold this low-dimensional manifold, we have a good perceptual representation of the signal.
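
To make the idea of unfolding concrete, here is a minimal NumPy-only sketch of Isomap applied to a synthetic one-parameter family of signals. The data are an assumption for illustration: a Gaussian pulse whose position shifts smoothly with a parameter `t` stands in for HRIRs that vary with elevation. The `isomap` helper is a hypothetical, simplified implementation (nearest-neighbour graph, Floyd-Warshall shortest paths, classical MDS), not the code used for the talk.

```python
import numpy as np

def isomap(X, n_neighbors=2, n_components=1):
    """Minimal Isomap: kNN graph -> graph geodesics -> classical MDS.

    X holds one point per row.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between the rows of X.
    E = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Keep only edges to each point's nearest neighbours (symmetrised).
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(E, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        G[i, nearest[i]] = E[i, nearest[i]]
        G[nearest[i], i] = E[i, nearest[i]]
    # Floyd-Warshall shortest paths approximate geodesic distances.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Synthetic stand-in for HRIRs: 50 signals of 200 samples each.
n_samples, n_points = 200, 50
t = np.linspace(0.0, 1.0, n_points)        # plays the role of elevation
centers = 60.0 + 80.0 * t                  # pulse position moves with t
grid = np.arange(n_samples)
X = np.exp(-0.5 * ((grid[None, :] - centers[:, None]) / 10.0) ** 2)

# Unfold the curve into one dimension and check the ordering.
y = isomap(X, n_neighbors=2, n_components=1)[:, 0]
steps = np.diff(y)
print(bool(np.all(steps > 0) or np.all(steps < 0)))
```

With measured HRIRs, the rows of `X` would be the responses ordered by elevation. Whether the first embedded component then comes out monotonic is an empirical question that this toy example does not answer.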

Perceptual manifolds
Exploratory studies using PCA, LLE, Isomap, and MVU
A few applications: perceptual distance metric, interpolation, customization

Our data matrix: elevation manifold
Points in a d-dimensional space (HRTF: d = 257; HRIR: d = 200).
Subject 10. We will concentrate only on azimuth zero to begin with.
Data matrix size: 200 x 50 using the HRIR, 257 x 50 using the HRTF.
Use the HRIR.
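
The data matrix layout can be sketched as follows, with one column per measurement. Several things here are assumptions: random numbers stand in for the measured HRIRs of one subject, the FFT length of 512 is a guess that happens to yield 257 frequency bins (the slides do not say how the HRTF was computed), and the `pca` helper is a hypothetical minimal implementation via the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for one subject's HRIRs at azimuth zero, one column per
# measurement. Random numbers stand in for measured responses.
n_taps, n_columns = 200, 50
hrir = rng.standard_normal((n_taps, n_columns))          # 200 x 50

# Assumed FFT length of 512, which gives 512 // 2 + 1 = 257 frequency bins.
hrtf_mag = np.abs(np.fft.rfft(hrir, n=512, axis=0))      # 257 x 50

def pca(data, n_components):
    """PCA of a (dimension x samples) matrix via the SVD."""
    centered = data - data.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Coordinates of each sample along the leading principal directions.
    scores = (s[:n_components, None] * Vt[:n_components]).T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

scores, explained = pca(hrir, n_components=2)
print(hrir.shape, hrtf_mag.shape, scores.shape)
```

Each row of `scores` is the low-dimensional embedding of one column of the data matrix. With random placeholder data the embedding has no structure, so this only illustrates the shapes involved.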

Dimensionality reduction methods
We used the following four methods:
Principal Component Analysis (PCA)
Locally Linear Embedding (LLE)
Isomap
Maximum Variance Unfolding (MVU)

We expect
The manifold to have an intrinsic dimensionality of 1.
The first embedded component to be monotonic with elevation.

[Optional slides: HRTF elevation manifold using PCA, Isomap (K=3), Isomap (K=2), LLE (K=3), LLE (K=2), and MVU]
[Figures: HRIR elevation manifold using PCA, Isomap, LLE, and MVU]

Complete manifold
Azimuth -45:5:45, elevation -45:5:230

We expect
The manifold to have an intrinsic dimensionality of 2.
The first two embedded components to show a grid-like structure.

[Optional slides: complete manifold using PCA, LLE (K=4), LMVU (K=4), and Isomap (K=4)]
[Figures: HRIR manifold using PCA and Isomap]

Data representation: manifold properties
LLE, MVU: numerical problems

Problem 1: Interpolation
HRTFs are generally measured on a finite sampling grid of elevation and azimuth.
A smooth virtual audio system needs interpolated HRTFs.
HRTF measurement is a tedious and time-consuming process: it normally takes an hour, and the subject must remain immobile.

Some preliminary results

Problem 2: Distance metric
How to compare any two given HRTFs?
Perceptually inspired metric
Psychoacoustical tests
Squared log-magnitude error
It is tough to decide what aspects of a given signal are perceptually relevant.
Use geodesic distance.

How do we compare any two given HRIRs, i.e. how do we formulate a distance metric in the space of HRIRs? The distance metric has to be perceptually inspired. The ultimate justification, however, is to do psychoacoustical tests. In the absence of any good perceptual error metric, the most commonly used one is the squared log-magnitude error of the spectrum of the HRIRs. It is tough to decide what aspects of a given signal are perceptually relevant. For our case of all HRIRs for different elevation angles, the obvious perceptual information to be extracted is the elevation of the source. A natural measure of distance would be the distance on the extracted one-dimensional manifold.

Distance on the manifold
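
The two distances discussed here can be sketched side by side. Everything specific in the code is an assumption for illustration: the function names are hypothetical, the log-magnitude error is taken in dB and averaged over the bins of a 512-point FFT, and the distance on the manifold is approximated by summing Euclidean distances between consecutive samples, which presumes the HRIRs are ordered by elevation and sampled densely. The toy data are shifted Gaussian pulses, not measured HRIRs.

```python
import numpy as np

def log_magnitude_error(h1, h2, n_fft=512, eps=1e-12):
    """Mean squared difference of the log-magnitude spectra, in dB^2."""
    m1 = 20.0 * np.log10(np.abs(np.fft.rfft(h1, n_fft)) + eps)
    m2 = 20.0 * np.log10(np.abs(np.fft.rfft(h2, n_fft)) + eps)
    return float(np.mean((m1 - m2) ** 2))

def manifold_distance(X, i, j):
    """Distance from sample i to sample j along the sampled curve.

    X holds one response per row, ordered by elevation. The path length
    is the sum of Euclidean distances between consecutive rows.
    """
    lo, hi = sorted((i, j))
    steps = np.linalg.norm(np.diff(X[lo:hi + 1], axis=0), axis=1)
    return float(steps.sum())

# Toy curve: a pulse whose position shifts smoothly with "elevation".
grid = np.arange(200)
centers = np.linspace(60.0, 140.0, 50)
X = np.exp(-0.5 * ((grid[None, :] - centers[:, None]) / 10.0) ** 2)

straight = float(np.linalg.norm(X[0] - X[-1]))   # chord through ambient space
along = manifold_distance(X, 0, 49)              # path along the manifold
print(along > straight, log_magnitude_error(X[0], X[0]))
```

The chord cuts through the ambient space, while the path length follows the curve, so the second grows steadily with the separation between the two samples along the manifold.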

Problem 3: Customization
If an HRTF measured for one person is used for a different person, elevation perception is very poor.
Each person's ear shape and anatomy are unique.
Each person's localization capabilities are tuned to the shape of their ears and anatomy.
This is a big bottleneck for the commercialization of spatial audio.

Style vs. content

Anthropometric measurements
Can we relate the anthropometric measurements to some characteristics (?) of the manifold?

Problem 4: Microphone calibration

Thank you! Questions?