On the Preprocessing and Postprocessing of HRTF Individualization Based on Sparse Representation of...

On the Preprocessing and Postprocessing of HRTF Individualization Based on

Sparse Representation of Anthropometric Features

Jianjun HE, Woon-Seng Gan, and Ee-Leng Tan

April 2015

Jhe007@e.ntu.edu.sg

Digital Signal Processing Lab,

School of Electrical and Electronic Engineering,

Nanyang Technological University, Singapore

1. Head-related transfer functions (HRTFs) are highly individualized;

2. HRTFs are closely related to Anthropometry (torso, head, pinna);

3. Anthropometry can be used for HRTF individualization.

IndividualizationAnthropometry

of a new person

HRTF of the

new person

Anthropometry database

HRTF database

In this paper, we aim to answer:

1. Whether the preprocessing and postprocessing methods affect the performance of HRTF individualization?

2. If so, what is the best preprocessing and postprocessing combination?

3. And, how good is it?

CIPIC Anthropometric data (35 subjects)

3V. R. Algazi, R. O. Duda, D. M. Thompson, and C. Avendano, “The CIPIC HRTF database,” in Proc. IEEE WASPAA, New Paltz, NY, Oct. 2001.

Methodology

IndividualizationA1 H1?

Anthropometry database A : S subjects * 1 set of Anthropometry (F features)

Anthropometry of a new person A1: 1 subjects * 1 set of Anthropometry

HRTF database H : S subjects * 1 set of HRTF (D directions * K points)

HRTF of a new person H1 ： 1 subjects * 1 set of HRTF

P. Bilinski, J. Ahrens, M. R. P. Thomas, I. Tashev, and J. C. Plata, “HRTF magnitude synthesis via sparse representation of anthropometric features,” in Proc. IEEE ICASSP, Florence, Italy,

pp. 4501-4505, May 2014.

Methodology

Preprocessing of Anthropometry i

1. Direct

2. Min-max normalization

3. Standard score

4. Standard deviation normalization

-min , 2

max min

1,-mean , 3

, 4std

f ff fi

2,..., .

Preprocessing of HRTF m

1. Magnitude

2. Log magnitude

3. Power

, , 11,2,..., ;

, 20 log , , 2 1, 2,...,

d k md D

d k d k mk K

Sparse representation j

1. Direct

2. Nonnegative

i ji j l

,, , , 20

ˆ , 10 , 2 .

i j l m

d ki j l m

i j l m

arg min ,

arg min , s.t. 0.

i i i i i

i i i i i i

A A Aw

A A A Aw

w A w A w

w A w A w w

ii i AwA A Postprocessing of Anthropometry l

1. Direct

2. Normalized

In total, we have variants of methods! 484× 3× 2× 2 =

PreprocessingH

PreprocessingA

HRTF of the new person

HRTF databasePreprocessing

SynthesisSparse

representation

Anthropometry of a new person wA

Postprocessing A

wH(i,j,l)

, , ,1

ˆ i j l mH

PreprocessingA

Anthropometry of a new person

Sparse representation

wA(i,j)

PreprocessingH

HRTF database

Postprocessing A

wH(i,j,l)

HRTF of the new personSynthesis

, , ,1

ˆ i j l mH

Evaluation

, , , ,

2, , , ,

101 1 1

Spectral distortion SD

ˆ ,1 1 120log dB

i j l m n

i j l m nS D Ks

s d ktest s

S D K d k

• CIPIC HRTF database;

• Cross validation technique to selection the regularization

parameter;

• Stest = 35 test cases, all 1250 directions, and full frequency

range.

Performance varies among different preprocessing and postprocessing methods!

Sparse representation PostA PreH

Direct Min-max Standard score

Standard deviation

Direct

Mag 6.37 6.57 81.00 6.23

Log mag 6.40 6.50 21.61 6.17

Power 6.56 6.60 78.94 6.46

Normalized

Mag 6.36 6.35 15.97 6.25

Log mag 6.37 6.26 8.89 6.17

Power 6.60 6.77 25.21 6.52

Nonnegative

Direct

Mag 6.32 6.32 6.47 6.23

Log mag 6.38 6.47 6.79 6.17

Power 6.52 6.37 6.55 6.46

Normalized

Mag 6.31 6.26 6.10 6.25

Log mag 6.35 6.20 5.86 6.17

Power 6.53 6.54 6.54 6.52

1 2 3 45.6

6.8(a)

Preprocessing method for A1 2 3 4

6.8(b)

6.8(c)

6.8(d)

Preprocessing method for A

Mag Log mag Power

1 2 3 45.6

6.8(a)

6.8(b)

6.8(c)

6.8(d)

Preprocessing method for A

Mag Log mag Power

Results

Direct sparse representation

1. PreA: standard deviation best, standard score worst;

2. PreH: log mag best, power worst;

3. PostA, PostH: minimal effect for good PreA, PreH.

Nonnegative sparse representation

1. Better than corresponding direct sparse representation (especially for

standard score);

2. Trend in PreA/PreH not obvious;

3. Normalized PostA can improve the performance (especially for standard

score).

(a) Direct sparse; Direct PostA

(b) Direct sparse; Normalized PostA

(c) Nonnegative sparse; Direct PostA

(d) Nonnegative sparse; Normalized PostA

Method Specification SD (dB)

Single bestSelect one single set of HRTF

with the corresponding closest anthropometry

Tashev et alMin-max PreA

Magnitude PreHDirect sparse representation No reported postprocessing

Our bestStandard score PreALog magnitude PreH

Nonnegative sparse representationNormalized PostA

Lower boundLinear regression based HRTF

individualization 5.12

Comparison

opt 2 21

Conclusions

1. Introduced preprocessing and postprocessing in HRTF individualization based on sparse

representation of anthropometric features.

2. Investigated 48 variants of preprocessing and postprocessing methods, and found

a) Preprocessing and postprocessing methods do affect the performance of HRTF individualization, though the effects

differ in different combinations;

b) Adding nonnegative constraints in sparse representation improves the performance;

c) The best combination for HRTF individualization is

< standard score + log magnitude + nonnegative + normalized >.

3. Established the lower bound for this type of HRTF individualization and verified that “our best”

combination outperforms existing approaches and is quite close to the lower bound.

4. Future work: subjective evaluation of HRTF individualization.

References

[1] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia, Cambridge, MA: Academic Press, 1994.

[2] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, The MIT Press, revised edition, 1996.

[3] H. Møller, “Fundamentals of Binaural Technology,” Applied Acoustics, vol. 36, 171-218, 1992.

[4] W. G. Gardner, and K. D. Martin, “HRTF Measurements of a KEMAR,” J. Acoust. Soc. Amer., vol., vol. 97, pp. 3907-3908, 1995. See also http://www.sound.media.mit.edu/KEMAR.html.

[5] H. Møller, M. F. Sørensen, D. Hammershøi, and C. B. Jensen, “Head-Related Transfer Functions of Human Subjects,” J. Aud. Eng. Soc., vol. 43, pp. 300-321, 1995.

[6] V. R. Algazi, R. O. Duda, D. M. Thompson, and C. Avendano, “The CIPIC HRTF database,” in Proc. IEEE WASPAA, New Paltz, NY, USA, Oct. 2001.

[7] E. M. Wenzel, M. Arruda, D. J. Kistler, and F. L. Wightman, “Localization Using Non-individualized Head-Related Transfer Functions,” J. Acoust. Soc. Amer., vol. 94, pp. 111-123, 1993.

[8] S. Xu, Z. Li, and G. Salvendy, “Individualization of Head-related transfer function for three-dimensional virtual auditory display: a review,” in R. Shumaker (Ed.): Virtual Reality, HCII 2007,

LNCS 4563, pp. 397–407, 2007.

[9] K. Sunder, J. He, E. L. Tan, and W. S. Gan, “Natural sound rendering for headphones,” IEEE Signal Processing Magazine, vol. 32, no.2, pp. 100-113, Mar. 2015.

[26] P. Bilinski, J. Ahrens, M. R. P. Thomas, I. Tashev, and J. C. Plata, “HRTF magnitude synthesis via sparse representation of anthropometric features,” in Proc. IEEE ICASSP, Florence, Italy, pp.

4501-4505, May 2014.

[28] S. J. Kim, K. Koh, M. Lusig, S. Boyd, and D. Gorinevsky, “An interior-point method for large-scale l1-regularized least squares,” J. Selected topics in signal processing, vol. 1, no. 4, pp. 606-

617, Dec. 2007.

[30] J. Breebaart, F. Nater, and A. Kohlrausch, “Spectral and spatial parameter resolution requirements for parametric, filter-bank-based HRTF processing,” J. Audio Eng. Soc., vol. 58, no. 3, pp. 126-

140, Mar. 2010.

Acknowledgement

THIS WORK IS SUPPORTED BY THE SINGAPORE MINISTRY OF EDUCATION ACADEMIC RESEARCH FUND

TIER-2, UNDER RESEARCH GRANT MOE2010-T2-2-040.

On the Preprocessing and Postprocessing of HRTF Individualization Based on

Sparse Representation of Anthropometric Features

Jianjun HE

Jhe007@e.ntu.edu.sg

Nanyang Technological University, Singapore

Thank you!

On the Preprocessing and Postprocessing of HRTF Individualization Based on Sparse Representation of...

Documents

Transcript of On the Preprocessing and Postprocessing of HRTF Individualization Based on Sparse Representation of...

ANSYS Postprocessing

Introductory FLUENT Trainingdl.ptecgroup.ir/.../fluent12-lecture11-post.pdf · · 2015-03-22Introductory FLUENT Training . Postprocessing 11-2 ... (UDF). Postprocessing 11-3 ANSYS,

Grid Postprocessing Cfd

Minimum Representation and Reconstruction of HRTF using ...

Postprocessing for skin detection

OwnSurround HRTF Service for Professionals

Import ANSYS database Solve Postprocessing

Pinna Morphological Parameters Influencing HRTF Setspro.3dsoundlabs.com/wp-content/uploads/DAFX17_Pinna_Morphologic… · representation is then used for the synthesis of his HRTF

Comparison of different EPS postprocessing methods

PT postprocessing Fieldextra

Statistical Postprocessing of LM Weather Parameters

Eric Schwenker, An EM-Derived Approach to Blind HRTF ...cs229.stanford.edu/proj2014/Eric Schwenker, An EM-Derived Approa… · An EM-Derived Approach to Blind HRTF Estimation Eric

Results Viewer Workshop 13D Postprocessing. Workshop Supplement November 1, 2002 Inventory #001756 W13-2 13D. Postprocessing Results Viewer Description.

Pre- & Postprocessing in QGIS BASEmesh 2

Cfd Grid Postprocessing 2013

Specialized Techniques for Postprocessing and ... · Specialized Techniques for Postprocessing and Visualization in COMSOL Multiphysics COMSOL, COMSOL Multiphysics, Capture the Concept,

Torlon Processing and Postprocessing

OpenFOAM postprocessing and advanced running options

Pre- & Postprocessing in QGIS

Statistical Postprocessing of Surface Weather Parameters