Comparison of Spherical Harmonics based 3D-HRTF Functional...

Comparison of Spherical Harmonics based3D-HRTF Functional Models

Rodney A. KennedyResearch School of Engineering,

The Australian National University,Canberra, ACT 0200, Australia

Email: [email protected]

Wen ZhangResearch School of Engineering,

The Australian National University,Canberra, ACT 0200, AustraliaEmail: [email protected]

Thushara D. AbhayapalaResearch School of Engineering,

The Australian National University,Canberra, ACT 0200, Australia

Email: [email protected]

Abstract—The modeling performance of three models for the3D Head Related Transfer Function (HRTF) are compared. Oneof these models appeared recently in the literature whilst theother two models are novel. All models belong to the class offunctional models whereby the 3D-HRTF is expressed as anexpansion in terms of basis functions, which are functions ofazimuth, elevation, radial distance and frequency. The expan-sion coefficients capture the 3D-HRTF individualization. Themodels differ in the choice of basis functions and the degreeof orthogonality that is possibly given the constraint that foreach frequency the HRTF needs to satisfy the Helmholtz waveequation. One model introduced in this paper is designed toprovide a functional representation that is orthonormal on asphere at some nominal radius and approximately so around thatnominal radius. This model is shown to be superior to the othertwo in being able to reconstruct most efficiently the 3D-HRTFderived from a spherical head 3D-HRTF model. For all caseswe show that there is a unified technique to estimate expansioncoefficients from measurements taken on a sphere of arbitraryradius.

Index Terms—HRTF, 3D-HRTF, 2D-HRTF, Head RelatedTransfer Functions, Functional Modeling, Series Expansion, Di-mensionality, Helmholtz Equation.

I. INTRODUCTION

A. Background

Physical models of pinnae, head and torso naturally leadto models of the Head Related Transfer Function (HRTF)which are functional in form, meaning they are functionsof continuous spatial and frequency variables, including: thedelay line model [1], the concha model [2], the sphericalhead model [3], the boundary element model [4], [5], and thesnowman model [6], [7] as well as filter-band models: FIRmodel [8] and IIR model [9]), and the combined snowmanfilter model [10].

More mathematically motivated, but still physically consis-tent models, include [11], where the HRTFs are expressed interms of surface spherical harmonics, which are a completeset of continuous functions which are orthonormal on the

Rodney A. Kennedy was supported under the Australian Research Council’sDiscovery Projects funding scheme (Project No. DP1094350).

Thushara D. Abhayapala and Wen Zhang were supported under theAustralian Research Council’s Discovery Projects funding scheme (ProjectNo. DP110103369.

x ⌘ x(r, ✓,') 2 R3

✓

'

•

•

r ⌘ |x|

bx ⌘ b

x(✓,') 2 S2•

••

x

y

z

S2

Fig. 1. Coordinate system showing a point in 3D space x, a unit vector bx,

which lies on the 2-sphere, S2, and the spherical polar coordinate system with✓ 2 [0,⇡] the co-latitude, ' 2 [0, 2⇡) the longitude and r ⌘ |x| the radius.

sphere [12]. Using spherical harmonics analysis, a precondi-tioned fast multipole accelerated boundary element methodfor HRTF computation was investigated in [13], where themodel was derived by solving the wave equation based onHelmholtz reciprocity principle. Finally, a spherical harmonicbased continuous functional HRTF model was developed in[14], which separated the frequency components from thespatial components and can realize both near-field and far-field HRTFs.

The situation where a model is given as a function ofthe three spatial variables, typically azimuth, elevation andradial distance, is referred to as a 3D-HRTF. A 3D-HRTF isalways a function of frequency so there are generally fourvariables in play. When an HRTF is not given as a function ofradial distance it is called a 2D-HRTF given it has two spatialvariables plus it has a frequency variable.

B. Physical Model

In notation, see Fig. 1, we write x 2 R3 ⌘ x(r, ✓,'),

bx ⌘ b

x(✓,') = (cos' sin ✓, sin' sin ✓, cos ✓) =x

|x| 2 S2,

978-1-4799-1319-0/13/$31.00 ©2013 Crown

where r ⌘ |x| is the magnitude or 3D Euclidean distance, andthe 2-sphere is given by S2 , {x 2 R3

: |x| = 1}.A spherical harmonics based 3D-HRTF functional model is

an expansion in terms of signal independent basis functionsand where the coefficients hold the signal information, thatis, the HRTF individualization. In the expansion the angularportion uses the spherical harmonics, but the radial and fre-quency portions are constrained to have the 3D-HRTF fulfillthe Helmholtz or reduced wave equation [14]–[16]

�

2u+ k2u = 0

where u(x) is a function of space x 2 R3 and

k , 2⇡/� = 2⇡f/c

is the wavenumber, which is proportional to the frequency fand where c is the speed of sound assumed independent offrequency.

The Helmholz equation based broadband 3D-HRTF modelpresented in [14] is given by

H(r, bx; k) =X

n,m

⌘m[s]n (r; k)Y m

n (

bx), r > s, (1)

where s is an effective head radius, bx ⌘ {✓,'} 2 S2 is theangular part (co-latitude and longitude) and r ⌘ |x| is theradial part in a spherical polar coordinate system, Y m

n (

bx) is

the spherical harmonic of degree n 2 {0, 1, . . .} and orderm 2 {�`, . . . , `},

X

n,m

,1X

n=0

nX

m=�n

,

and

⌘m[s]n (r; k) , ik jn(ks)

Z

S2⇢s(by; k)Y m

n (

by) ds(by)h(1)

n (kr),

(2)where ⇢s(by; k) is a virtual source distribution at the effectivehead radius r = s, ds(by) , sin ✓ d✓ d', jn(·) is the sphericalBessel function of order n, and h(1)

n (·) is the spherical Hankelfunction of the first kind and of order n. It is assumed thatk 2 [0, ku] where ku is the upper wavenumber assumed finite.Finally, the spherical harmonics are given by [16]

Y mn (✓,') ,

s2n+ 1

4⇡

(n� |m|)!(n+ |m|)!P

mn (cos ✓)eim'

where the associated Legendre functions are

Pmn (z) , 1

2

nn!(1� z2)m/2 dn+m

dzn+m(z2 � 1)

n,

m 2 {0, 1, . . . , n}.

These definitions, for the spherical harmonics and associatedLegendre functions, are not the most common in the literaturebut chosen to be consistent with the work in [14] from whichthis paper derives. They differ only in a trivial way from themost common definitions [12, pp.198-201].

C. Observations• The wave nature is captured by noting that there are

effective terms of the form�h(1)

n (kr)Y mn (

bx)

(3)

which individually are the so-called radiating solutions tothe Helmholtz equation [16, Theorem 2.10]. We note thatthe frequency and radius are coupled because of the radialportion is a function of product kr. The use of the sphericalHankel function of the first kind h(1)

n (·) can be replaced bythe spherical Hankel function of the second kind h(2)

n (·)depending on the definition of an outward traveling wave;here we stick to the convention used in [14].

• The 3D-HRTF H(r, bx; k) is evidently completely deter-mined by ⇢s(by; k) whose domain dimension is smaller thanthat of the 3D-HRTF. This is explained by the observationthat H(r, bx; k) cannot be arbitrary and must satisfy theHelmholtz equation for each frequency k.

• By similar arguments knowing H(r1

, bx; k), where r1

> s isfixed, is sufficient to compute H(r, bx; k) for all r > s. Thisis the theoretical basis of the sufficiency to perform HRTFmeasurements at a single fixed distance to determine, inprinciple, the full 3D-HRTF.

• If one then has H(r1

, bx; k), or equivalently ⌘m[s]n (r

1

; k),then from the expressions (1) and (2) we can deduce

H(r2

, bx; k) =X

n,m

⌘m[s]n (r

2

; k)Y mn (

bx)

=

X

n,m

h(1)

n (kr2

)

h(1)

n (kr1

)

⌘m[s]n (r

1

; k)Y mn (

bx)

=

X

n,m

h(1)

n (kr2

)

h(1)

n (kr1

)

Z

S2H(r

1

, by; k)Y mn (

by) ds(by)Y m

n (

bx).

So if measurements are taken at fixed radius r1

> s to giveus H(r

1

, bx; k) then we can compute H(r2

, bx; k) for anyother radius r

2

> s [17].

II. ORTHOGONALITY AND BASIS EXPANSIONS

A. Rationale and Defining Criteria

Expansion of signals in terms of orthogonal functions isan effective representation because the orthogonal functionsare independent of the signal (they form the basis functions)and the coefficients depend only on the signal [12]. Usingbasis functions, even when they are not orthogonal, yieldscoefficients that form a sequence, or if they are finite they forma vector, and they present a more accessible way to processand store the signal. For example, comparing signals in suchmodels reduces to comparing sequences or vectors, rather thanfunctions.

B. Orthogonality and Orthonormality

To be rigorous before we can talk about orthogonality weneed to specify the inner product. There are three natural innerproduct depending of the function domains under study. We

also give expressions assuming the functions are complex-valued and these are easily modified in the real-valued case,that is, when we chose orthogonal functions which are real-valued.

For functions of frequency k on the interval [0, ku] the innerproduct is unweighted such that

hf, gi ,Z ku

0

f(k)g(k) dk. (4)

For angular functions on the sphere S2 the inner product isgiven by

hf, gi ,Z

S2f(bx)g(bx) ds(bx) (5)

where ds(bx) is the area element on the sphere. These twoinner products combine naturally for functions on S2⇥ [0, ku]to yield

hf, gi ,Z ku

0

Z

S2f(bx; k)g(bx; k) ds(bx) dk (6)

Induced norms are defined through kfk , hf, fi1/2.As usual f and g are orthogonal if hf, gi = 0 and if in

addition they satisfy kfk = 1 and kgk = 1 they are calledorthonormal. Generally, it is clear from the context which ofthe three inner products, (4), (5) or (6), is in use so we do notreflect this in the notation.

III. MODEL VARIANTS

The physical model in (1) and (2) does not satisfy theproperties of an orthogonal expansion outlined above becausethe signal part is not in the form of a sequence of coefficientsbut takes the functional form ⇢s(by; k). What we need to dois find different models whose representations and coefficientparameterizations are equivalent to (1), where the signal isonly present in the coefficients and the expansion involvesfunctions that are orthonormal in some domain.

A. Model 1 – Source Distribution ExpansionRather than attempting an orthonormal expansion for (1),

which is complicated by the Helmholtz constraint that the3D-HRTF cannot be arbitrary, we can instead attempt torepresent the 2D source distribution, ⇢s(bx; k) which lies on asphere of radius s. In comparison to the 3D-HRTF, ⇢s(bx; k) isunconstrained and finding an orthonormal function expansionis straightforward.

Firstly, directly in (2), we can define

↵m[s]n (k) ,

Z

S2⇢s(by; k)Y m

n (

by) ds(by), (7)

which for fixed k are the spherical harmonic coefficients of⇢s(· ; k). This implies (2) can be written

⌘m[s]n (r; k) = ik jn(ks)h

(1)

n (kr)↵m[s]n (k). (8)

However, although we have implicitly dealt with the angularportion by relying on the orthonormality of the sphericalharmonics, ↵m[s]

n (k) remains a function of k for each n,m.

Secondly, we can introduce complete orthonormal func-tions,1

�'q(k)

1q=1

,

defined on the interval k 2 [0, ku] to expand

↵m[s]n (k) =

X

q

�mn;q'q(k), (9)

whereX

q

,1X

q=1

,

with Fourier coefficients, using (4),

�mn;q , h↵m

n ,'qi.

The calculation of these coefficients using ↵m[s]n (k) is practi-

cally infeasible (as we do not have access to the virtual sourcedistribution) but can be done via the measurements of the 3D-HRTF at a fixed radius (as well show later).

The source distribution can then be synthesized by theinverse of (7)

⇢s(bx; k) =X

n,m

↵m[s]n (k)Y m

n (

bx)

=

X

n,m;q

�mn;q 'q(k)Y

mn (

bx),

whereX

n,m;q

,X

q

1X

n=0

nX

m=�n

,

the {Y mn (

bx)'q(k)} are orthonormal on S2 ⇥ [0, ku], and

thereby the full 3D-HRTF, (1) and (2), can be synthesizedthrough

H(r, bx; k) =X

n,m;q

�mn;q⇥

ik jn(ks)h(1)

n (kr)'q(k)Ymn (

bx), r > s, (10)

where�ik jn(ks)h

(1)

n (kr)'q(k)Ymn (

bx)

for fixed r, is not an orthogonal on S2 ⇥ [0, ku].To summarize, the coefficients, �m

n;q , parametrize the sourcedistribution referenced to a sphere of radius s (being theeffective head radius) and are sufficient to compute the 3D-HRTF. But when interpreted in terms of the 3D-HRTF theeffective basis is not orthonormal and so this representation isnot ideal.

1Different sets of orthonormal functions can be used for different values ofm and n, but here we use a single common set. The choice of different sets oforthonormal functions for different degrees n (but not different order m) wasa feature of the model considered in [14] and used in our later investigations.

B. Model 2 – Radiating Solution Expansion

The second model is essentially that presented in [14] butvaried slightly. The rationale for the model is that it is anexpansion explicitly in terms of the radiating solutions (3) ofthe Helmholtz equation

H(r, bx; k) =X

n,m

�mn (k)h(1)

n (kr)Y mn (

bx), r > s, (11)

where, to be consistent with (1) we have

�mn (k) , ik ↵m[s]

n (k)jn(ks),

and where ↵m[s]n (k) was defined in (7). This time we ex-

pand �mn (k) by introducing complete orthonormal functions2,

{'q(k)}, defined on the interval k 2 [0, ku], as follows

�mn (k) =

X

q

⇣mn;q'q(k), (12)

with Fourier coefficients

⇣mn;q , h�mn ,'qi. (13)

In this model the full 3D-HRTF, equivalent to (1) and (2), canbe synthesized through

H(r, bx; k) =X

n,m;q

⇣mn;q⇥

h(1)

n (kr)'q(k)Ymn (

bx), r > s, (14)

where �h(1)

n (kr)'q(k)Ymn (

bx)

(for fixed r) are not orthogonal on S2 ⇥ [0, ku]. That is,orthogonality in k is damaged by the presence of h(1)

n (kr).

C. Model 3 – Nominal Radius Expansion

1) 2D-HRTF Expansion: For the previous 3D-HRTF mod-els they are expansions in terms of functions which are notorthogonal (for fixed r). The presence of the h(1)

n (kr) termmakes separation of the frequency part and radial part difficultand its retention ultimately destroys frequency orthogonality.When the functional part lacks orthogonality then it is fun-damentally inefficient. The third model does not completelyalleviate the difficulties induced by the h(1)

n (kr) term but hasthe advantage of being locally orthonormal in a sense to bedeveloped next.

Consider the 3D-HRTF for a fixed radius, H(r0

, bx; k), thatis, r

0

> s. As it is a 2D-HRTF, there is no restriction inprinciple that needs to be placed on it, unlike the Helmholtzconstrained 3D-HRTF, and as a general function on S2⇥[0, ku]it is expressible in the form

H(r0

, bx; k) =X

n,m;q

⇢m[r0

]

n;q 'q(k)Ymn (

bx) (15)

2There is no need to introduce new notation for these functions as the sameones can be used as for model 1.

where {Y mn (

bx)'q(k)} are orthonormal using (6), and

⇢m[r0

]

n;q ,⌦H(r

0

, · ; ·), Y mn 'q

↵

=

Z ku

0

Z

S2H(r

0

, by; k)Y mn (

by) ds(by)'q(k) dk (16)

are the Fourier coefficients.2) 3D-HRTF Expansion: We can introduce a radial de-

pendence to extend the above 2D-HRTF model to make itequivalent to the 3D-HRTF model (14) (and thereby the (1)and (10) models). That is, we find the expression for the 3D-HRTF expression in terms of the 2D-HRTF coefficients (16)referenced to radius r

0

.For this 2D-HRTF we can infer from Model 2, (14) where

we use the coefficients from (13), and using inner product (5),that

⌦H(r

0

, · ; k), Y mn

↵=

X

q

⇣mn;qh(1)

n (kr0

)'q(k)

whereas, in comparison, (15), we have⌦H(r

0

, · ; k), Y mn

↵=

X

q

⇢m[r0

]

n;q 'q(k).

Their equality impliesX

q

⇣mn;q'q(k) =1

h(1)

n (kr0

)

X

q

⇢m[r0

]

n;q 'q(k),

and, guided by synthesis equation (14), we multiply byh(1)

n (kr) to obtain

h(1)

n (kr)X

q

⇣mn;q'q(k) =h(1)

n (kr)

h(1)

n (kr0

)

X

q

⇢m[r0

]

n;q 'q(k),

which relates the Model 2 and Model 3 coefficients. So thefull 3D-HRTF model (14), but now expressed in terms of theradius r

0

Model 3 2D-HRTF coefficients (16), can be written

H(r, bx; k) =X

n,m;q

⇢m[r0

]

n;q ⇥

h(1)

n (kr)

h(1)

n (kr0

)

'q(k)Ymn (

bx), r > s, (17)

which reduces to (15) when r = r0

. The coefficients {⇢m[r0

]

n;q }indeed hold the signal information of the full 3D-HRTF.

When r = r0

the functional part reduces to Y mn (

bx)'q(k)

which is orthonormal on S2⇥ [0, ku]. For other values of r thefunctional part has a ratio of spherical Bessel functions distort-ing the frequency k portion and destroying the orthogonality(and orthonormality). However, for kr close to kr

0

there isnear orthogonality given that the Taylor series is

h(1)

n (z) = h(1)

n (z0

) + (z � z0

)

@

@zh(1)

n (z) + · · ·

and, therefore, with z = kr and z0

= kr0

,

h(1)

n (kr)

h(1)

n (kr0

)

= 1 +O(kr � kr0

), as kr ! kr0

.

The constant associated with the correction term grows withn, following similar analysis given in [18].

TABLE IMODEL PARAMETERS IN THE GENERAL FORM (18)

Model Expansion Coefficients Weighting◆mn;q gn(k, r)

1 Source Distribution (10) �mn;q ik jn(ks)h

(1)n (kr)

2 Radiating Solution (14) ⇣mn;q h(1)n (kr)

3 Nominal Radius (17) ⇢m[r0

]n;q h(1)

n (kr)/h(1)n (kr0)

IV. COMPUTING THE COEFFICIENTS FOR THE MODELSFROM MEASUREMENTS ON A SPHERE

A. Recovering the Coefficients through Orthonormality

For a given HRTF we show how to determine the coef-ficients in the three models (10), (14) and (17). Sphericalharmonics based HRTF functional models are particularlywell-catered for measurements taken on a sphere at a fixedradius. The three models are sufficiently similar that they canall be written in the form

H(r, bx; k) =X

n,m;q

◆mn;q gn(k, r)'q(k)Ymn (

bx), (18)

where the coefficients and functions corresponding to the threemodels are given in Table I.

Following the methodology in [14] we assume we have theHRTF available at some radius r = r

1

and then the task toestimate {◆mn;q} from such H(r

1

, bx; k) is possible via

◆mn;q =

Z ku

0

1

gn(k, r1)⇥

Z

S2H(r

1

, bx; k)Y mn (

bx) ds(bx)'q(k) dk. (19)

Such analysis reveals that we can exploit the orthonormalityof both the spherical harmonics and the {'q(k)} by freezingthe radius in this case to r = r

1

.

B. Nominal Radius Expansion Example

In the case of the nominal radius expansion, or Model 3,there is no need to identify the measurement, which is at aradius r = r

1

, with the nominal radius r = r0

about whichwe have local orthonormality. Indeed, from Table I, we have

⇢m[r0

]

n;q =

Z ku

0

h(1)

n (kr0

)

h(1)

n (kr1

)

⇥Z

S2H(r

1

, bx; k)Y mn (

bx) ds(bx)'q(k) dk. (20)

This has the interpretation that we acquire the measurementas radius r

1

and determine the coefficients in the nominalradius expansion where there is a localized orthonormalityabout radius r

0

6= r1

. Of course, (20) reduces to (16) whenr1

= r0

which presents the simplest case.

C. Recovering the Coefficients from Incomplete Measurements

The integral transforms in (19) need to be typically com-puted from discrete measurements on the sphere and in fre-quency or time [19]. There are a multiplicity of methods andapproaches and the focus of this paper is not to develop newtechniques. However, there is one aspect of HRTF measure-ments which needs some care and that is the measurements arerarely over the whole sphere and similarly may not be madeat all frequencies. We call these incomplete measurements andwe give general pointers how to treat this problem by reducingthe distortion from missing measurements.

The estimation of coefficients given incomplete measure-ments can be approached from a generalization of the Pa-poulis algorithm [20]–[22] or conjugate gradient method [23].Alternatively the estimation can be done by the least squaresmethod given in [17], [24] and exploited in [14].

V. SPHERICAL HEAD MODEL BASED EVALUATION

We use the spherical head HRTF as an example to evaluatethe efficiency of the three proposed models. The closed formHRTF for a spherical head is given by [3], [14]

Hs(r, bx; k) = � 4⇡

ks2re�ikr

X

n,m

h(1)

n (kr)

h0(1)n (ks)

Y mn (

be)Y m

n (

bx),

where be 2 S2 is the vector to the measurement point on the

surface of the sphere, or the assumed ear position, and s isthe spherical head radius.

Substituting the above spherical head model analytical ex-pression into the three proposed models leads to the corre-sponding coefficients written as

X

q

◆mn;q 'q(k) = �4⇡ re�ikrY mn (

be)

ks2 h0(1)n (ks)

⇥

8>>>>><

>>>>>:

1

ikjn(ks)Model 1

1 Model 2

h(1)

n (kr0

) Model 3

.

Following [14], for the spatial part we use the sphericalharmonic degree n truncation [25], [26]

N(s; k) = deks/2e,

where e = 2.71818 . . ., k is the wavenumber and s is theeffective head radius. That is, n = 0, 1, . . . , N(s; k). In [14],s is set as a function of k in the sense that below a thresholdfrequency is set at value 0.2m because the torso comes intoplay and above the value 0.09m is used for the head onlyeffect. In the spherical head case the s is fixed at 0.09m.Despite its complications, because the truncation is commonto the three models, it provides a fair basis for comparison.

We apply the spherical Fourier Bessel series as the completeorthonormal functions on [0, ku] [14], that is,

'[n]q (k) = jn(Z

(n)q k/ku),

0 10 20 30 40 50 60 70 80 90 100 110 120 13010�3

10�2

10�1

100

expansion order Q

cumulative

variance

�

Q

Model 1Model 2Model 2

Fig. 2. Comparison of the cumulative variance of the expansion coefficientsfor the three models. This shows Model 3 can use less terms in the sphericalFourier Bessel series to model the reference spherical head model data.

where we have chosen a different series to apply for each n,and define the cumulative variance of the expansion coeffi-cients as the evaluation metric

�Q =

E�PQ

q=1

|◆mn;q|2

E�PQ

max

q=1

|◆mn;q|2 .

E{·} represents the expectation operator performed on differ-ent spherical harmonic expansion orders (n,m) and Q

max

de-notes the maximum expansion order (the number of frequencysamples).

Fig. 2 shows the cumulative variance of the decomposedcoefficients of the three models. In this example, the HRTFsare obtained at r

0

= 1.2 m and extrapolated to r = 1.5m. Model 3 demonstrates the fastest convergence in termsof representing the HRTF spectra within a bandwidth of20 kHz, which corroborates an assertion that it is the mostefficient representation among the three models proposed inthis paper. We need 45 spherical Fourier Bessel series in theModel 3 (including more than 93% cumulative variance) torepresent the 20 kHz HRTF spectra (Q

max

= 128 frequencysamples). This means that the model can achieve around 65%data reduction. Fig. 3 plots the spherical head model andreconstructed HRTFs at co-latitude ✓ = 90

� and longitude' = 80

� using the Model 3. It is clear that the Model 3extrapolated response closely matches the analytic sphericalhead data.

VI. DISCUSSION AND CONCLUSIONS

Spherical harmonics based 3D-HRTF functional modelshave been developed in this work. The angular portion ofthese basis functions is chosen for pragmatic reasons to usethe spherical harmonics. The frequency and radial portionsof these basis functions are complicated because they areconstrained by the requirement to satisfy the Helmholtz waveequation. Ultimately this means that the 3D-HRTF which

0 2 4 6 8 10 12 14 16 18 20

�20

�10

0

frequency (kHz)

magnitude(dB)

analyticextrapolated

0 2 4 6 8 10 12 14 16 18 200

20

40

60

frequency (kHz)

phase(radians)

analyticextrapolated

Fig. 3. Example of the spherical head HRTF reconstructed using the proposedModel 3. The spherical head model generated synthetic data at r0 = 1.2 mand used to determine the Model 3 parameters. Then Model 3 was used toextrapolate to r = 1.5 m for (✓ = 90�, ' = 80�) and shown marked with⇥. This was compared with the r = 1.5 m analytical spherical head modelreference data shown as a solid line.

is defined on the four dimensional domain — two angulardimensions, one radial dimension and one frequency dimen-sion — is not easily amenable to an expansion where thebasis functions are fully orthonormal across all four variables.This means there is a trade-off between what portions of thefunctional expansion are made orthonormal (in fewer thanthe four variables) and the impact on the efficiency of therepresentation caused by the non-orthogonality.

Three expansions were shown which are all exact in thesense that with an infinite number of terms they equal the phys-ically based 3D-HRTF model which satisfies the Helmholtzequation. However, each of the three models correspond todifferent choices of basis functions. Previously one of thesemodels has appeared in the literature, Model 2, and cor-responds to using the radiating solutions to the Helmholtzequation as the basis functions [14], [16]. Two new modelswere introduced in this work. The first, Model 1, considers anorthonormal representation of the virtual source distributionon a sphere but the orthonormality is not carried over to the3D-HRTF. However, it does have a unambiguous physicalinterpretation. The second novel model, Model 3, seeks tofind a representation that is orthonormal on a sphere at somenominal radius of interest (and approximately so around thatnominal radius). For all three cases we showed that there is aunified technique to estimate the expansion coefficients frommeasurements taken on a sphere of arbitrary radius.

To compare the models we employed the analytic sphericalhead model and obtained closed-form expressions for thecoefficients in each of the expansions. Our numerical inves-tigations corroborated our assertions that the most efficientmodel — approximating the HRTF with the fewest number of

truncated terms — should correspond to the one with the bestorthogonality which was Model 3. Further work is requiredto fully compared the models presented and determine theirrelative effectiveness through analysis.

Finally, in terms of open problems, we note that the radialnormalization developed in [14] can be incorporated into eachmodel and this may help in finding a superior model. Alsofrom an abstract perspective the Helmholtz equation can beconsidered as a linear subspace constraint and there shouldexist an intrinsic orthonormal functional 3D-HRTF expansionin some three-dimensional domain. However, the exact formof this intrinsic expansion currently eludes us or its existenceneeds to be falsified.

REFERENCES

[1] D. W. Batteau, “The role of the pinna in human localization,” Proc. R.Soc. Lond. B, vol. 168, no. 1011, pp. 158–180, Aug. 1967.

[2] E. A. Lopez-Poveda and R. Meddis, “A physical model of sounddiffraction and reflections in the human concha,” J. Acoust. Soc. Am.,vol. 100, no. 5, pp. 3248–3259, Nov. 1996.

[3] R. O. Duda and W. L. Martens, “Range dependence of the responseof a spherical head model,” J. Acoust. Soc. Am., vol. 105, no. 5, pp.3048–3058, Nov. 1998.

[4] B. F. G. Katz, “Boundary element method calculation of individual head-related transfer function. I. rigid model calculation,” J. Acoust. Soc. Am.,vol. 110, no. 5, pp. 2440–2448, Nov. 2001.

[5] B. F. G. Katz, “Boundary element method calculation of individual head-related transfer function. II. impedance effects and comparisons to realmeasurements,” J. Acoust. Soc. Am., vol. 110, no. 5, pp. 2449–2455,Nov. 2001.

[6] N. A. Gumerov, R. Duraiswami, and Z. Tang, “Numerical study of theinfluence of the torso on the HRTF,” in Proc. 2002 IEEE Int. Conf.Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, Orlando,FL, May 2002, pp. 1965–1968.

[7] V. R. Algazi, R. O. Duda, R. Duraiswami, N. A. Gumerov, andZ. Tang, “Approximating the head-related transfer function using simplegeometric models of the head and torso,” J. Acoust. Soc. Am., vol. 112,no. 5, pp. 2053–2064, Nov. 2002.

[8] F. L. Wightman and D. J. Kistler, “Headphone simulation of free-fieldlistening. I: Stimulus synthesis,” J. Acoust. Soc. Am., vol. 85, no. 2, pp.858–867, Feb. 1989.

[9] A. Kulkarni and H. S. Colburn, “Infinite-impulse-response models ofthe head-related transfer function,” J. Acoust. Soc. Am., vol. 115, no. 4,pp. 1717–1728, Apr. 2004.

[10] V. R. Algazi, R. D. Duda, and D. M. Thompson, “The use of head-and-torso models for improved spatial sound synthesis,” in Proc. 113th AudioEngineering Society Conv., Los Angeles, CA, Oct. 2002, pp. 1–18.

[11] M. J. Evans, J. A. S. Angus, and A. I. Tew, “Analyzing head-relatedtransfer function measurements using surface spherical harmonics,” J.Acoust. Soc. Am., vol. 104, no. 4, p. 2400, Oct. 1998.

[12] R. A. Kennedy and P. Sadeghi, Hilbert Space Methods in SignalProcessing. Cambridge, UK: Cambridge University Press, Mar. 2013.

[13] N. A. Gumerov, A. E. O’Donovan, R. Duraiswami, and D. N. Zotkin,“Computation of the head-related transfer function via the fast multi-pole accelerated boundary element method and its spherical harmonicrepresentation,” J. Acoust. Soc. Am., vol. 127, no. 1, pp. 370–386, Jan.2010.

[14] W. Zhang, T. D. Abhayapala, R. A. Kennedy, and R. Duraiswami,“Insights into head related transfer function: Spatial dimensionality andcontinuous representation,” J. Acoust. Soc. Am., vol. 127, no. 4, pp.2347–2357, Apr. 2010.

[15] W. Zhang, T. D. Abhayapala, R. A. Kennedy, and R. Duraiswami,“Modal expansion of HRTFs: Continuous representation in frequency-range-angle,” in Proc. IEEE Int. Conf. Acoustics, Speech, and SignalProcessing, ICASSP’2009, Taipei, Taiwan, Apr. 2009, pp. 285–288.

[16] D. Colton and R. Kress, Inverse Acoustic and Electromagnetic ScatteringTheory, 3rd ed. New York, NY: Springer, 2013.

[17] M. Pollow, K.-V. Nguyen, O. Warusfel, T. Carpentier, M. Muller-Trapet, M. Vorlander, and M. Noisternig, “Calculation of head-relatedtransfer functions for arbitrary field points using spherical harmonicsdecomposition,” ACTA Acust. United Ac., vol. 98, no. 1, pp. 72–82,2012.

[18] R. A. Kennedy, D. B. Ward, and T. D. Abhayapala, “Nearfield beam-forming using radial reciprocity,” IEEE Trans. Signal Process., vol. 47,no. 1, pp. 33–40, Jan. 1999.

[19] W. Zhang, M. Zhang, R. A. Kennedy, and T. D. Abhayapala, “Onhigh resolution head-related transfer function measurements: An efficientsampling scheme,” IEEE Trans. Audio Speech Language Process.,vol. 20, no. 2, pp. 575–584, Feb. 2012.

[20] A. Papoulis, “A new algorithm in spectral analysis and band-limitedextrapolation,” IEEE Trans. Circuits Syst., vol. CAS-22, no. 9, pp. 735–742, Sep. 1975.

[21] W. Zhang, R. A. Kennedy, and T. D. Abhayapala, “Iterative extrapolationalgorithm for data reconstruction over sphere,” in Proc. IEEE Int. Conf.Acoustics, Speech, and Signal Processing, ICASSP’2008, Las Vegas,Nevada, Mar. 2008, pp. 3733–3736.

[22] R. A. Kennedy, W. Zhang, and T. D. Abhayapala, “Spherical harmonicanalysis and model-limited extrapolation on the sphere: Integral equationformulation,” in Proc. International Conference on Signal Processingand Communication Systems, ICSPCS’2008, Gold Coast, Australia, Dec.2008, p. 6.

[23] Z. Khalid, R. A. Kennedy, S. Durrani, and P. Sadeghi, “Conjugategradient algorithm for extrapolation of sampled bandlimited signals onthe 2-sphere,” in Proc. IEEE Int. Conf. Acoustics, Speech, and SignalProcessing, ICASSP’2012, Kyoto, Japan, Mar. 2012, pp. 3825–3828.

[24] D. N. Zotkin, R. Duraiswami, and N. A. Gumerov, “Regularized HRTFfitting using spherical harmonics,” in Proc. of 2009 IEEE Workshop onApplications of Signal Processing to Audio and Acoustics, New Paltz,NY, October 2009, pp. 257–260.

[25] H. M. Jones, R. A. Kennedy, and T. D. Abhayapala, “On dimensionalityof multipath fields: Spatial extent and richness,” in Proc. IEEE Int.Conf. Acoustics, Speech, and Signal Processing, ICASSP’2002, vol. 3,Orlando, FL, May 2002, pp. 2837–2840.

[26] R. A. Kennedy, P. Sadeghi, T. D. Abhayapala, and H. M. Jones, “Intrinsiclimits of dimensionality and richness in random multipath fields,” IEEETrans. Signal Process., vol. 55, no. 6, pp. 2542–2556, Jun. 2007.

Comparison of Spherical Harmonics based 3D-HRTF Functional...

Documents

Transcript of Comparison of Spherical Harmonics based 3D-HRTF Functional...