Department of Electronic and Information Engineering Ngan Cheuk Yin
The Hong Kong Polytechnic University, Hung Hom Page 1 of 74
April, 2002
Abstract
In this project, I use Head-Related Transfer Functions (HRTFs)
to illustrate how particular sounds in 3-D space can be synthesized
from a monaural sound. It comprises a study of HRTFs and an
explanation of creating a 3-D audio environment.
I employed HRTF measurements from the MIT Media Lab together with
my own Matlab code to generate “spatialized” sounds. To enhance my design, I
produced a simple graphical interface for users to activate the
HRTF operations. In the GUI window, one simply clicks the mouse and the
processing is performed in the background, so that the HRTF sounds
appear to come from different directions.
Acknowledgements
I thank those who encouraged me to begin, continue, finish, and
cherish my time as a graduate student here at The Hong Kong Polytechnic
University. I offer my deepest thanks:
To my supervisor, Dr. Mak Man Wai of the Department of Electronic and
Information Engineering, for his guidance, patience
and wisdom over the past years. Without his support, none of this
would have been possible.
To all of my teachers, past and present, in and out of school, for
caring for me and training me.
Finally, thanks to my parents for their incessant love, support,
affection, and encouragement. Thank you very much!
Contents
Abstract
Acknowledgements
Contents
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Motivation
1.2 Objective of the work
1.3 Scope of thesis
1.4 Details of relevant theory
1.4.1 HRTFs - Head Related Transfer Functions
1.4.2 Overview of the human ear
Chapter 2 Project Implementation
2.1 Review of MIT HRTF measurements
2.1.1 Measurement Data
2.2 Review of past work
2.3 Introduction of proposed work
2.4 System Architecture and Requirements
2.4.1 Hardware
2.4.2 Software
Chapter 3 Methodology
3.1 Matlab Scripts in Detail
3.1.1 Interface codes
3.1.1.1 hrtf.m
3.1.1.2 verti_sur.m
3.1.1.3 hori_sur.m
3.1.2 HRTF codes for verti_final.m and hori_final.m
3.2 Input Parameters
3.3 Output Parameters
Chapter 4 HRTF Results
4.1 Comparison of HRTF sound with common stereo sound
4.2 Wave statistics under different conditions
Chapter 5 User Manual of 3D Simulation
5.1 Setting the input and output file names
5.2 Setting Parameters of Horizontal Surround
5.3 Setting Parameters of Vertical Surround
5.4 Processing the parameters
5.5 Closing the window
Chapter 6 Conclusion
6.1 Problems with HRTF-based synthesis of spatial audio
6.2 Headphones
6.3 Future Work
Appendices
Homepage of my project
Pseudo-code of Matlab scripts
References
List of Tables
Table 2.1 Number of measurements and azimuth increment at each elevation
Table 3.1 The Matlab scripts in my project and their objectives
Table 3.2 Sub-functions in hrtf.m and their objectives
Table 3.3 Sub-functions in verti_sur.m and their objectives
Table 3.4 Sub-functions in hori_sur.m and their objectives
List of Figures
Figure 1.1 Distribution of sound sources in the measurements
Figure 3.1 hrtf.fig in GUIDE
Figure 3.2 verti_sur.fig in GUIDE
Figure 3.3 hori_sur.fig in GUIDE
Figure 3.4 Flow chart of hori_final.m and verti_final.m
Figure 3.5 Pseudo-code of verti_final.m and hori_final.m
Figure 3.6 Pseudo-code of cir_mon.m and cirsft.m
Figure 3.7 Illustration of a sound sample and its prediction error, where the prediction error is large at the end of the section
Figure 3.8 Block convolution using the overlap-save method: (a) input signal x(n) divided into overlapping sections, overlap Nh-1 = 2; (b) impulse response h(n); (c) output y(n) using direct convolution; (d) output y1(n) for block circular convolution of x1(n) and h(n); (e) output y2(n); (f) output y3(n); (g) output y4(n); (h) sequential concatenation of the block outputs after discarding the first two samples of each block, which is equivalent to the direct convolution result ("|" represents concatenation)
Figure 3.9 Overlap-save method in this program
Figure 4.1 Graph of GloryBe.wav and its spectrogram
Figure 4.2 Graph of GloryBe.wav after HRTF processing and its spectrogram
Figure 4.3 Simulation of stereo sound of GloryBe.wav by another application and its spectrogram
Figure 4.4 3D surface of GloryBe.wav in the frequency domain
Figure 4.5 Flow and graph of GloryBe.wav HRTF sound moving from the front to the back
Figure 4.6 Flow and graph of GloryBe.wav HRTF sound moving in a full circle
Figure 4.7 Spectrogram of the wave in Figure 4.6; the left side is at elevation 0° and the right side at elevation 50°
Figure 4.8 Flow and graph of GloryBe.wav HRTF sound moving in the front from -40° to 90°
Figure 4.9 Flow and graph of GloryBe.wav HRTF sound moving on the right from -40° to 90°
Figure 4.10 Flow and graph of GloryBe.wav HRTF sound moving in the back from -40° to 90°
Figure 4.11 Flow and graph of GloryBe.wav HRTF sound moving on the left from -40° to 90°
Figure 4.12 Flow and graph of GloryBe.wav HRTF sound moving from the front to the back over the top of the head
Figure 4.13 Flow and graph of GloryBe.wav HRTF sound moving from the right to the left over the top of the head
Figure 5.1 Matlab 6.0 start-up page
Figure 5.2 Initial “HRTF based surround sound” window
Figure 5.3 Pop-up window for choosing the output file name
Figure 5.4 “HRTF based surround sound” window after choosing the input and output file names
Figure 5.5 Initial window of “Setting Horizontal Surround Sound Parameters”
Figure 5.6 Window of “Setting Horizontal Surround Sound Parameters” after setting the parameters
Figure 5.7 Initial window of “Setting Vertical Surround Sound Parameters”
Figure 5.8 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Specify azimuth angle” is chosen
Figure 5.9 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Semi_Circle azimuth angle” is chosen
Figure 5.10 Message box reporting completion of processing
Figure 5.11 Message box asking the user to confirm the close operation
Chapter 1: Introduction
1.1 Motivation
Directional hearing refers to the sensory ability that enables humans to
determine the location of a sound source in both azimuth (left/right) and elevation
(up/down). Sounds that give this sensation of direction are called “spatialized sounds”
[1]. The ability of humans to perceive the spatial location of sound in space is truly
remarkable. Not only does the auditory system localize sounds with extraordinary
accuracy, but the spatial hearing mechanisms are also exceptionally robust under the
most extreme listening conditions. To gain better control of spatial sound in various
applications, researchers from many fields have sought to understand how the human
auditory system accomplishes spatial hearing. Recently, advances in computational
power and acoustic measurement techniques have made it possible to
empirically measure, analyze, and synthesize the spectral cues of spatial hearing.
These spectral cues are captured by Head-Related Transfer Functions (HRTFs) [2].
Loosely speaking, HRTFs are filters that describe the acoustic filtering which the head,
torso, and external ear (pinna) perform on a sound; they can be used to simulate the
illusion of spatial sound over headphones. HRTF data sets presumably contain all of
the sonic cues affecting the perception of spatial location for a specific individual.
1.2 Objective of the work
I have completed a program that transforms a monaural sound into a
stereo one. A headphone listener can perceive the processed sound at any specified
location in space, such as front, back, right or left. I use head-related transfer
functions (HRTFs) to impart sound localization cues; the apparent direction of
incidence of the signal can be manipulated with them.
The HRTF measurements I used were produced by Bill Gardner and Keith Martin at
the MIT Media Lab [4] using a KEMAR (Knowles Electronics Manikin for
Auditory Research) dummy-head microphone [5]. The data is available for public
use. The tools and database for HRTF processing are described in detail below.
1.3 Scope of thesis
This thesis is structured as follows. The rest of Chapter 1 reviews relevant
theory, including the Head-Related Transfer Function (HRTF), the interaural transfer
function (ITF) and the interaural time difference (ITD) [6], and gives an overview of
the human ear. Chapter 2 covers the project implementation. First, I review the MIT
HRTF measurements and explain how the MIT laboratory obtained them.
Then, I summarize the work completed last year and state the system architecture
and requirements of my program. Chapter 3 describes the methodology: I explain the
architecture of the Matlab scripts in greater detail, for example by drawing the program
flow of hrtf.m, verti_sur.m, hori_sur.m, verti_final.m and hori_final.m.
In Chapter 4, I present some processed waveforms and spectrograms of
GloryBe.wav and point out some characteristics of HRTF-based sound.
Chapter 5 gives a user manual for the 3D simulation, and Chapter 6 provides
conclusions and future directions.
1.4 Details of relevant theory
1.4.1 HRTF's - Head Related Transfer Functions
The transformation of a sound wave from a source to the ear is normally
described by a transfer function called the head-related transfer function (HRTF). The
HRTF is a function of frequency and of the location of the source with respect to the
head; the source location is specified by its azimuth and elevation angles.
A complete set of HRTF measurements consists of many filters
that describe a spherical map of the possible sound sources. It contains information
about frequency-dependent sound delays and intensity differences between the ears.
When a signal is sent through an HRTF filter and then played through headphones,
the listener receives the impression of a sound source at the corresponding location.

Figure 1.1 Distribution of sound sources in the measurements
In the figure above, each point represents a sound source. The distance of each
source from the center of the head is constant; only the azimuth and elevation of the
source change. The differences in delay and intensity between the ears are greatest
for changes in azimuth and smaller for changes in elevation. These differences are
also frequency dependent.
Consider a sound source located at azimuth angle θ with respect to the
head. Let S(ω) be the Fourier transform of the source signal, and let HX(ω, θ) and
HY(ω, θ) be the head-related transfer functions for the left ear and right ear
respectively. The Fourier transforms of the left-ear and right-ear signals are then

X(ω, θ) = HX(ω, θ) S(ω)
Y(ω, θ) = HY(ω, θ) S(ω)

Next, we define

F(ω, θ) = HY(ω, θ) / HX(ω, θ)

F(ω, θ) is known as the interaural transfer function (ITF) [6]. The interaural
transfer function captures the important binaural cues. The interaural time difference
(ITD) [6] is captured in the phase of the ITF; more specifically, the
derivative of arg(F(ω, θ)) with respect to ω gives the ITD. Note that a
frequency-dependent HRTF makes the interaural time (phase) difference depend on
frequency. The ITF can be estimated by taking the ratio of the Fourier
transforms of the signals received at the left ear and the right ear:

F(ω, θ) = Y(ω, θ) / X(ω, θ)
It is important to note that F(ω, θ) is independent of the source spectrum and
could thus be used to find the direction of any wideband source. This observation can
be exploited to solve the sound localization problem using a priori information.
Suppose the actual interaural transfer function of the head, F(ω, θ), is known a priori;
this information is obtained from a training process. Later, in order to estimate the
direction of an unknown source signal, one can estimate the interaural transfer
function of the head from the received signals using F(ω, θ) = Y(ω, θ)/X(ω, θ) and
compare it with the known functions. The value of θ whose known interaural transfer
function best matches the estimated one gives the direction of the source.
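The estimation procedure above can be sketched as follows. This is a minimal illustration in Python/NumPy rather than the project's MATLAB code; the function name and the toy signals are my own.

```python
import numpy as np

def estimate_itf(left, right, n_fft=128, eps=1e-12):
    """Estimate the interaural transfer function F(w) = Y(w)/X(w)
    from the left-ear and right-ear time-domain signals."""
    X = np.fft.rfft(left, n_fft)    # left-ear spectrum X(w)
    Y = np.fft.rfft(right, n_fft)   # right-ear spectrum Y(w)
    return Y / (X + eps)            # the ratio cancels the source spectrum S(w)

# Toy check: if the right-ear signal is a pure 3-sample delay of the
# left-ear signal, arg(F) = -3w, so the phase slope recovers the ITD.
left = np.zeros(64); left[0] = 1.0    # impulse = wideband source
right = np.zeros(64); right[3] = 1.0  # same impulse, delayed 3 samples
F = estimate_itf(left, right)
w = np.fft.rfftfreq(128) * 2 * np.pi  # angular frequency per bin
itd = -np.polyfit(w, np.unwrap(np.angle(F)), 1)[0]
print(round(itd, 3))   # 3.0 samples of interaural delay
```

As the check shows, fitting a line to the unwrapped ITF phase recovers the interaural delay regardless of the source spectrum.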
HRTFs are created from ITD, IID and pinna measurements associated with a
certain azimuth and elevation, and possibly distance. HRTF parameters can be
measured using either real human heads or dummy heads: tiny microphones are
inserted into the ear canals, and a recording is made of a sound source from many
different azimuths and elevations. Each measurement represents one sound source
direction.
HRTFs provide insight into localization cues. For example, inspection of HRTFs for
sounds coming directly from the right at different elevations would show the effects of
pinna filtering. The notches in the filtered frequency response would change in
number and frequency position with different elevations as direct and indirect sounds
combine in the ear. This particular data would help reveal how we detect elevation.
In addition, HRTF processing can be used to trick the ear. For example, a sound
from a stationary speaker at an arbitrary elevation can be processed by a series of
HRTFs; as its frequency response is gradually changed, the ear and brain perceive a
changing elevation from the stationary source. Similarly, complex surround sound
fields can be synthesized from stereo speakers, but the ear isn't always entirely fooled.
The realism of the created sound field will suffer if the listener moves from the "sweet
spot" between the speakers (headphones solve this problem) because the delicate
balance of cues will be upset. Moreover, the folds of each listener's pinna are different,
so generic HRTFs do not exactly match our own psychoacoustic expectations. PC
sound cards use cross-talk cancellation techniques to improve the perceived channel
separation and clean up the 3D sound image over two speakers, but headphones,
with their superior real channel separation, deliver a better two-channel experience.
This type of two-channel sound field rendering is called binaural rendering.
The human ear's purpose in hearing is to convert sound waves into nerve
impulses, which are then perceived and interpreted by the brain as sound. The human
ear can perceive sounds in the range of 20 to 20,000 Hz. This section gives a basic
overview of the ear: how the ear receives sound, how the ears communicate with the
brain, and finally some human factors. Understanding how the ear works is the key to
successfully implementing 3D sound in VR systems.
1.4.2 Overview of the human ear
The human ear is made of three distinct areas: the outer ear, the middle ear,
and the inner ear. The outer ear channels sound waves through the ear canal to the
eardrum.
The outer ear (pinna) provides important auditory cues captured in HRTF
generation. The eardrum is a thin membrane stretched across the inner end of the
canal. HRTF generation uses small microphones embedded in a real ear canal to help
capture the real acoustic environment. Air pressure changes in the ear canal cause
the thin membrane to vibrate. These vibrations are transmitted to three small bones
called "ossicles", located in the air-filled middle ear, which conduct the vibrations
across the middle ear to another thin membrane called the oval window. Embedding
a small microphone in the ear canal alters these vibrations, so this method of HRTF
measurement cannot simulate the acoustic environment of the human ear exactly:
the microphone only approximates the real thing, because the frequency response,
position, and the reflection and refraction of sound waves are all changed. The
fundamental problem is trying to make measurements without affecting the
measurements. The oval window separates the middle ear from the fluid-filled inner
ear. The effect that the fluid-filled inner ear has on sound transmission is not modeled
at all in HRTF generation; in fact, HRTF generation basically attempts to measure the
effects of external structures and the immediate result inside the ear canal. Everything
that happens from the middle ear onward is not modeled at all.
The "cochlea" in the inner ear is the most important component of hearing. It
contains the organ of "Corti", which sits on an extremely sensitive membrane called
the "basilar membrane". Whenever the basilar membrane vibrates, small sensory hair
cells inside the organ of Corti are bent, which stimulates the sending of nerve impulses
to the brain.
Chapter 2 Project Implementation
2.1 Review of MIT HRTF measurements
For this project, I used HRTF measurements compiled by researchers at the
MIT Media Lab. They have made three types of data set publicly
available. The "compact data set" is a set of impulse responses that have
been preprocessed to compensate for the recording equipment's response and other
factors, and is ready to be used directly. The "full data set" contains the raw responses
as recorded. I prefer the compact data set for several reasons: the
data is equalized to compensate for the non-uniform response of the Optimum Pro 7
speaker, and its FIR filters have 128 taps rather than the 512 taps of the full data set.
Finally, the diffuse-field data set compensates for the recording equipment's response
in a way that emphasizes the direction-dependent differences.
The MIT researchers used a manikin, named KEMAR, in their experiments. They
set up microphones in its ear canals and played sounds from different locations using
a speaker about 1.4 meters from the head. Responses from a total of 710 different
locations were recorded at a sampling frequency of 44.1 kHz, in an anechoic room.
Utilizing the symmetry of the head, the KEMAR is set up with two
different pinnae: the left pinna is a normal one, while the
right pinna is slightly larger. The microphones in the KEMAR's
ears also pick up the ear canal resonance of the manikin's ears during recording,
so when these HRTFs are used to generate sound, the listener will hear the
KEMAR ear canal resonance in addition to his own. Besides that,
as mentioned earlier, the full data set contains the recording system's response too. A
possible way to eliminate the measurement system's response, as well as the effect of
ear canal resonance, is to normalize the measurements with respect to an average
across all directions (called diffuse-field equalization). Since neither the measurement
system response nor the ear canal response changes as a function of sound direction,
they are factored out of the data. To find the diffuse-field data, the magnitude-squared
responses of all measurements are averaged, which gives the power average across
all directions. My project uses the diffuse-field data: the pre-computed inverse
diffuse-field response is applied to the sound before playback. In general, sounds
synthesized using diffuse-field data can be localized better. The option is provided so
that users can evaluate its effect themselves.
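The diffuse-field averaging described above can be sketched as follows. This is an illustrative Python/NumPy version, not the project's MATLAB code; the (n_directions, n_taps) array layout is an assumption made for the sketch.

```python
import numpy as np

def diffuse_field_response(hrirs, n_fft=512):
    """Power-average the magnitude responses of a set of HRIRs across
    all measured directions: sqrt(mean(|H|^2)) per frequency bin."""
    mags_sq = np.abs(np.fft.rfft(hrirs, n_fft, axis=1)) ** 2
    return np.sqrt(mags_sq.mean(axis=0))

def diffuse_field_equalize(H_mag, df_mag, eps=1e-12):
    """Divide out the direction-independent (diffuse-field) component,
    e.g. the measurement system and ear-canal responses."""
    return H_mag / (df_mag + eps)

# Sanity check: if every direction had the same response h, the
# diffuse-field average would be exactly |H(w)|, and equalizing any
# one response against it would give all ones.
h = np.array([1.0, 0.5, -0.25, 0.125])
stack = np.tile(h, (5, 1))     # five identical "directions"
df = diffuse_field_response(stack)
eq = diffuse_field_equalize(np.abs(np.fft.rfft(h, 512)), df)
print(np.allclose(df, np.abs(np.fft.rfft(h, 512))))  # True
```

Because the averaged component is direction-independent, dividing it out leaves only the direction-dependent part of each measurement, which is the point of diffuse-field equalization.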
2.1.1 Measurements Data
The spherical space around the KEMAR is sampled at elevations from -40
degrees (40 degrees below the horizontal plane) to +90 degrees (directly overhead).
At each elevation, a full 360 degrees of azimuth is sampled in equal sized increments.
The increment sizes are chosen to maintain approximately 5-degree great-circle
increments. The table below shows the number of samples and azimuth increment at
each elevation (all angles in degrees). A total of 710 locations are sampled.
elevation angle    no. of measurements    azimuth increment
-40                56                     6.43
-30                60                     6
-20                72                     5
-10                72                     5
  0                72                     5
 10                72                     5
 20                72                     5
 30                60                     6
 40                56                     6.43
 50                45                     8
 60                36                     10
 70                24                     15
 80                12                     30
 90                 1                     n/a
Table 2.1: Number of measurements and azimuth increment at each elevation
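The increments in Table 2.1 are simply 360° divided by the number of azimuth samples at that elevation, and the sample counts sum to the 710 locations stated above. A quick check (my own illustration, not part of the project code):

```python
# Azimuth sample counts at each elevation, taken from Table 2.1.
samples = {-40: 56, -30: 60, -20: 72, -10: 72, 0: 72, 10: 72, 20: 72,
           30: 60, 40: 56, 50: 45, 60: 36, 70: 24, 80: 12}

# Increment = 360 / number of samples (e.g. 360/56 = 6.43 at -40 degrees).
increments = {elev: round(360 / n, 2) for elev, n in samples.items()}
print(increments[-40], increments[50])   # 6.43 8.0

# Total locations: all rows plus the single sample directly overhead (+90).
total = sum(samples.values()) + 1
print(total)   # 710
```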
Each HRTF measurement yields an impulse response at a 44.1 kHz sampling rate;
most of this data is irrelevant. The impulse responses are stored as 16-bit signed
integers, with the most significant byte stored at the low address (i.e. Motorola 68000
byte order). The HRTF data is stored in directories by elevation. Each directory name
has the format ``elevEE'', where EE is the elevation angle. Within each directory, each
filename has the format ``HEEeAAAa.dat'', where EE is the elevation angle of the
source in degrees, from -40 to 90, and AAA is the azimuth of the source in degrees,
from 0 to 355. Elevation and azimuth angles indicate the location of the source relative
to the KEMAR: elevation 0, azimuth 0 is directly in front of the KEMAR;
elevation 90 is directly above; elevation 0, azimuth 90 is directly to the
right, and so on. For example, the file ``H-20e270a.dat'' is the right ear
response, with the source 20 degrees below the horizontal plane and 90 degrees to
the left of the head. Note that three digits are always given for the azimuth so that the
files appear in sorted order in each directory. To select a pair of HRTF responses, I
recommend using symmetrical responses obtained from one of the KEMAR ears.
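The naming convention and byte order just described can be handled as in the sketch below. This is Python rather than the project's MATLAB readhrtf.m/hrtfpath.m, and the assumption that left- and right-ear samples are interleaved in the compact-set files should be verified against the actual data.

```python
import struct

def hrtf_path(elev, azim):
    """Build the 'elevEE/HEEeAAAa.dat' path for a source at the given
    elevation and azimuth in degrees (azimuth always three digits)."""
    return "elev%d/H%de%03da.dat" % (elev, elev, azim)

def read_hrir(path):
    """Read 16-bit signed big-endian samples (Motorola 68000 byte
    order) and split them into left and right channels, assuming the
    two ears' samples are interleaved."""
    with open(path, "rb") as f:
        raw = f.read()
    n = len(raw) // 2
    samples = struct.unpack(">%dh" % n, raw[:2 * n])
    return samples[0::2], samples[1::2]   # (left, right)

print(hrtf_path(-20, 270))   # elev-20/H-20e270a.dat
print(hrtf_path(0, 5))       # elev0/H0e005a.dat
```

The `>` in the struct format string forces big-endian interpretation, which matters on little-endian PCs.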
2.2 Review of past work
I completed stage one of the project last year, writing custom scripts that
directly convolve an input file (in *.wav format) with a selected head-related transfer
function (HRTF) to simulate a source incident at a particular azimuth and
elevation. A stereo output sound is produced for binaural presentation to the subject.
HRTF processing and signal mixing are implemented in one MATLAB [3] script,
Stereo.m.
Stereo.m is invoked from the MATLAB command line, optionally with
parameter values in the function call. If no parameters are provided,
the script sets most parameters to default values (e.g., the default azimuth and
elevation angles are both 0 degrees). Stereo.m opens a file requester asking the user
to specify the input wave file. All wave files contained in the same directory as the
specified file are then processed, and the results are saved in a subdirectory
named HRTF. If not specified in the function call, the first time a stereo input file is
encountered the user is asked to select which side to process (the input
signal x(n) must be monophonic).
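The core operation of Stereo.m — convolving a mono signal with a left/right HRIR pair to produce a binaural stereo signal — can be sketched as follows. This is a Python/NumPy stand-in with toy HRIRs, not the original MATLAB script.

```python
import numpy as np

def spatialize(mono, hrir_l, hrir_r):
    """Convolve a monophonic signal with a left/right HRIR pair and
    return an (n, 2) stereo array, normalized to avoid clipping."""
    left = np.convolve(mono, hrir_l)
    right = np.convolve(mono, hrir_r)
    out = np.stack([left, right], axis=1)
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out

# Toy HRIR pair: the right ear hears a delayed, attenuated copy --
# a crude stand-in for a source on the listener's left.
hrir_l = np.array([1.0, 0.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.5, 0.0])
mono = np.array([1.0, -1.0, 0.5])
stereo = spatialize(mono, hrir_l, hrir_r)
print(stereo.shape)   # (6, 2): len(mono) + len(hrir) - 1 samples, 2 channels
```

A real pair of measured HRIRs would simply replace the toy arrays; the delay and level differences between the two channels are what create the localization cues.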
2.3 Introduction of proposed work
This semester, I added more functions and a graphical user interface to the
program. The GUI enhances the user-friendliness of the project. The user sets the
input directory, input wave file, output wave file, and the surround surface
(horizontal or vertical HRTF). When the user presses the horizontal or vertical button,
a pop-up window asks for the range of azimuth and elevation values. When these
pop-up windows are closed, the parameters are passed back to hrtf.m through
callbacks. The user then presses the “Process” button, which calls either hori_final.m
or verti_final.m to perform the HRTF simulation. After trimming the ends, the sound is
written to the output file.
The usual way to play sound data in MATLAB is the sound(vector, fs) function
[3]. I opted instead to use wavplay(), a freely available Matlab function that is better
suited to playing sounds on PCs. An added advantage of wavplay() is that it waits for
the sound port to be released if it is already busy. The user can play the sound by
pressing “Playback” in the GUI window, or with other application tools by just clicking
the file name.
2.4 System Architecture and Requirements
2.4.1 Hardware
Pentium, Pentium Pro, Pentium II, Pentium III, Pentium IV based personal
computer
64 MB RAM minimum, 128 MB RAM recommended
Microsoft Windows supported graphics accelerator card and sound card
Headphones, Microphone
2.4.2 Software
Microsoft Windows 95, Windows 98, Windows ME, Windows NT 4.0 (with SP5 or
SP6a), Windows 2000, Windows XP
Matlab 6
Chapter 3 Methodology
3.1 Matlab Scripts in Details
There are thirteen Matlab scripts in my project. Some are used in the interface,
while the others are used in the HRTF convolution. The following table states their
names and objectives:
Scripts Name Objectives
hrtf.m This script calls up the main GUI window
to retrieve data from the user and transfer
the data to other sub-functions.
verti_sur.m This script calls up the vertical HRTF
window when the user presses the “Vertical
Surround” button in the main window. It
retrieves data such as the range of
azimuth or elevation angles, and returns
the data to hrtf.m.
hori_sur.m This script calls up the horizontal HRTF
window after the user presses the “Horizontal
Surround” button in the main window. It
retrieves data such as the range of
azimuth or elevation angles, and returns
them to hrtf.m.
modaldlg.m This script calls up the “closing confirm
window” after the user presses the “Close”
button in the main window.
verti_final.m This script is called by hrtf.m after the user
has pressed the “Process” button and
chosen the vertical HRTF. It also calls
other scripts to activate the HRTF
processing.
hori_final.m This script is called by hrtf.m after a user
has pressed the “Process” button and
chosen the horizontal HRTF. It also calls
the other scripts to activate HRTF-
processing.
readhrtf.m This script returns HRTF measurements as
two columns of 128 rows. The first column
is the left channel and the second column is
the right channel.
hrtfpath.m This script returns the pathname of an HRTF
data file to readhrtf.m.
group_file.m This script groups a sound clip into
stereo channels, left and right.
par_ser.m This script divides the mono sound clip into
parts (blocks) of clips.
half_circle.m This script is called by hori_final.m when
the user chooses 180° surrounding.
cir_con.m This script performs the circular
convolution.
cirsft.m This script is called by cir_con.m.
Table 3.1 The Matlab Scripts in my project and their objectives
3.1.1 Interface codes
3.1.1.1 hrtf.m
The hrtf.m script contains sub-functions that launch and control the GUI and its callbacks.
Each callback refers to a relevant object stored in hrtf.fig. It implements the
HRTF synthesis by retrieving data from the user and transferring it to the other scripts
for processing. The pseudo-code of hrtf.m is attached in the Appendix.
Figure 3.1 of hrtf.fig in GUIDE
The following are descriptions of the sub-functions in hrtf.m:
Sub-function Name Link with which object in
hrtf.fig
Function
listbox1_Callback “Listed File” list box It uses a list box to display
the files in a directory. When the user double-clicks
a list item, one of the following happens:
If the item is a file, the GUI opens the file appropriately
for its file type.
If the item is a directory, the GUI reads the contents of
that directory into the list box.
If the item is a single dot (.), the GUI updates the display
of the current directory.
If the item is two dots (..), the GUI changes to the
directory one level up and populates the list box with
the contents of that directory.
load_listbox X This sub-function retrieves
the path of a directory and takes the handles structure
from listbox1_Callback as its input arguments.
selectedfile_Callback “Input File Name”
edit box
This function displays
the input file name.
outfile_Callback “Output File Name”
edit box
This function displays
the output file name.
savefile_name_Callback “Choose Output File
Name”
push button
This function prompts a
dialog box for the user to
enter the output file name.
hori_button_Callback “Horizontal Surround”
push button
This function calls
hori_sur.m and returns its
parameters when the button
is pressed.
verti_button_Callback “Vertical Surround“
push button
This function calls
verti_sur.m and returns its
parameters when the button
is pressed.
initial_button_Callback “Reset” push button This function initializes
all the states set by
hori_button_Callback and
verti_button_Callback.
processing_Callback “Process” push button This function generates a
call to process the HRTF
synthesis.
playback_Callback “Play” push button This function plays the
HRTF sound by calling the
wavplay function.
close_Callback “Close” push button This function calls
modaldlg.m.
figure1_CloseRequestFcn X This function calls
close_Callback.
Table 3.2 sub-functions in hrtf.m of my project and their objectives
3.1.1.2 verti_sur.m
It is another script that contains functions to launch and control the GUI and
its callbacks. The pseudo-code is in the Appendix.
Figure 3.2 of verti_sur.fig in GUIDE
There are two special functions, “uiwait” and “uiresume”, which stop and
resume MATLAB program execution. In the dialog, a uicontrol callback calls
uiresume and destroys the dialog box. The “uiwait” function is a convenient wrapper
for the waitfor command, typically used in conjunction with a dialog box. It
blocks the execution of the M-file that created the dialog until the
user responds to the dialog box. When used in conjunction with a modal dialog,
“uiwait” causes the MATLAB program to wait before returning execution to the Close
button callback.
The following are descriptions of the sub-functions of verti_sur.m:
Sub-function Name Link with which object
in verti_sur.fig
Function
verti_azim_radio_Callback
“Specify azimuth
angle” radio button
This function calls another
function, mutual_exclude,
which deselects
verti_semi_radio_Callback,
the other member of the
group of two radio buttons.
azim_angle_popup_Callback
“Specify azimuth
angle” popup menu
This function prompts a
list of azimuth angles when
the user presses the arrow.
verti_semi_radio_Callback
“Semi_Circle azimuth
angle” radio button
This function calls another
function, mutual_exclude,
which deselects
verti_azim_radio_Callback,
the other member of the
group of two radio buttons.
sc_azim_angle_popup_Callback
“Semi_Circle azimuth
angle” popup menu
This function prompts a
list of azimuth angles when
the user presses the arrow.
verti_reset_button_Callback “Reset” push button This function resets all
handle objects in
“verti_sur.fig”.
mutual_exclude X This function ensures that
all other buttons in the
group are deselected when
one of them is selected and
active.
verti_close_button_Callback “Close” push button This function resumes
program execution and
then runs delete(fig).
load_picture X This function displays
the path of the
HRTF sound.
Show_pic_Callback “UpdatePic” push
button
This function calls the
load_picture function.
Table 3.3 sub-functions in verti_sur.m of my project and their objectives
3.1.1.3 hori_sur.m
Figure 3.3 of hori_sur.fig in GUIDE
The following are descriptions of the sub-functions in hori_sur.m:
Sub-function Name Link with which object in
hori_sur.fig
Function
hori_end_azim_Callback “End Azimuth Degree”
popup menu
This function displays a list
of end azimuth angles when
the user presses the arrow.
hori_elevslider_Callback “hori_elevslider” slider This function uses a slider
to specify the elevation angle,
since sliders allow the
selection of continuous
values within a specified
range.
hori_elev_Callback “Selected Elevation Degree“
edit box
This function displays the
selected elevation degree.
hori_close_button_Callback “Close” push button This function resumes the
program execution of
figure1 and then runs
delete(fig).
load_picture X This function displays
the path of the
HRTF sound.
show_Pic_Callback “UpdatePic” push button This function calls the
load_picture function.
Table 3.4 sub-functions in hori_sur.m of my project and their objectives
3.1.2 HRTF codes for verti_final.m and hori_final.m
These two scripts take a monophonic input signal x(n) from a wave file
and convolve it with the appropriate pairs of HRTFs to produce the resulting signal,
presented binaurally. Each checks and receives parameters from hrtf.m, then lists the
measurements used in the processing and convolves them with the input signal.
Figure 3.4 Flow chart of hori_final.m and verti_final.m. The flow is: Start; check the
parameters received from the interface windows; check the number of measurements in
the chosen directory; determine and set the processing angles; read the sound into a
double array; ensure the sound frequency is 44.1 kHz; set J = 1; call the sub-function
cir_con for circular convolution; if J is not greater than the number of measurements,
set J = J + 1 and repeat the convolution; otherwise, write the output HRTF sound; End.
Pseudo-Code of verti_final.m %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_final(verti_choice,verti_val,filename,outfile_name)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Check parameters (verti_choice & verti_val) and then,
Set (azim_choice, azimangle, verti_gp_index, choose_index)
Set azim_dir=sprintf('%s', azim_choice)
Set directory to the location of the MIT-lab-contributed data files
Set the wave files in the selected directory into the matrix (filelist)
Set num_file = Number of files in the selected directory
%%
LOOP FOR i= 1 to num_file by 1
Set len=length(filelist(i).name)
Set elevangle(i)=str2num(filelist(i).name(2:4))
END OF FOR LOOP
Read the wave file into [input,fs,nbits] using wavread
Set newlength=num_file*fix(length(input)/num_file)
Set input=input(1:newlength)
Set zero padding number(n_z)=128-1
Set n_c=(newlength+(n_z*num_file))/num_file
Set x1=par_ser(input',128,n_c,num_file)
[m,n]=size(x1)
LOOP FOR k=1 to m by 1
Call the subfunction hrtf = readhrtf( elevangle(k), azimangle,choose_index )
Set left_rep=hrtf(:,1)
Set right_rep=hrtf(:,2)
Set len_imp=length(left_rep)
%% Do block circular convolution
Set x2=x1(k,:)
Call the subfunction left_blk_con(k,:)=cir_con(n_c,x2,left_rep')
Call the subfunction right_blk_con(k,:)=cir_con(n_c,x2,right_rep')
Set left_blk_lap(k,:)=left_blk_con(k,n_z+1:n_c)
Set right_blk_lap(k,:)=right_blk_con(k,n_z+1:n_c)
END OF FOR LOOP
Call the subfunction group_file
Retrieve data from group_file
Create the output wave file
Pseudo-code of hori_final.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction hori_final(start_angle,end_angle,elevangle,filename,outfile_name)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Set elevangle=10*(round(elevangle/10))
Set choose_index=1
Set elev_dir=sprintf('%s%d','elev', elevangle)
Set the MIT-lab-contributed data files of the selected directory into the matrix (filelist)
IF start_angle=0 & end_angle=360, then
Set num_file=Number of files in the selected directory
ELSEIF start_angle=0 & end_angle=180
Call Subfunction [num_file]= half_circle(elevangle,filelist);
ELSE
Error_Message='Wrong number of input arguments','Input Argument Error'
ENDIF start_angle=0 & end_angle=360
LOOP FOR i= 1 to num_file by 1
Set len=length(filelist(i).name)
Set azimangle(i)=str2num(filelist(i).name(len-7:len-5))
END OF FOR LOOP
Read the wave file into [input,fs,nbits] using wavread
Set newlength=num_file*fix(length(input)/num_file)
Set input=input(1:newlength)
Set zero padding number(n_z)=128-1
Set n_c=(newlength+(n_z*num_file))/num_file
Set x1=par_ser(input',128,n_c,num_file)
[m,n]=size(x1)
LOOP FOR k=1 to m by 1
Call the subfunction hrtf = readhrtf( elevangle, azimangle(k),choose_index )
Set left_rep=hrtf(:,1) and right_rep=hrtf(:,2)
Set len_imp=length(left_rep)
Set x2=x1(k,:)
Call the subfunction left_blk_con(k,:)=cir_con(n_c,x2,left_rep')
Call the subfunction right_blk_con(k,:)=cir_con(n_c,x2,right_rep')
Set left_blk_lap(k,:)=left_blk_con(k,n_z+1:n_c)
Set right_blk_lap(k,:)=right_blk_con(k,n_z+1:n_c)
END OF FOR LOOP
Call the subfunction group_file to retrieve data to create hrtf wave file
Figure 3.5 Pseudo-codes of verti_final.m and hori_final.m
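The flow that these two pseudo-codes share can be sketched as follows. This is an illustrative Python sketch, not the thesis code: the mono input is cut into equal blocks (the role of par_ser.m), each block is convolved with the left and right impulse responses for its angle, and the per-block results are concatenated into the two channels. The short filters below stand in for the 128-tap MIT measurements and are purely hypothetical.

```python
def par_ser(signal, num_blocks):
    """Split a mono signal into num_blocks equal parts (role of par_ser.m)."""
    n = num_blocks * (len(signal) // num_blocks)   # trim, like newlength above
    blk = n // num_blocks
    return [signal[i * blk:(i + 1) * blk] for i in range(num_blocks)]

def convolve(x, h):
    """Direct linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def spatialize(signal, hrtf_pairs):
    """Convolve each block with its angle's (left, right) filter pair
    and concatenate the per-block results into two channels."""
    blocks = par_ser(signal, len(hrtf_pairs))
    left, right = [], []
    for block, (hl, hr) in zip(blocks, hrtf_pairs):
        left += convolve(block, hl)
        right += convolve(block, hr)
    return left, right
```

The thesis scripts additionally use circular rather than linear convolution per block and overlap the block boundaries; that refinement is described in the next subsection.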
In these two scripts, I call another script (cir_con.m) to do the circular convolution.
The cir_con.m script calls cirsft.m to cut the matrix to a suitable length and size.
Figure 3.6 Pseudo-codes of cir_con.m and cirsft.m
The convolution in cir_con.m uses the overlap-save method, which prevents the
prediction error at the beginning of each section of the clip. Processing the data by direct
Pseudo-Code of cir_con.m
%%%%%%%%%%%%%%%%%%%%%
Subfunction y=cir_con(n_c,x_in,cir_in)
%%%%%%%%%%%%%%%%%%%%%
Set x_inn=[x_in(1:length(x_in)) zeros(1,n_c-length(x_in))]
Set imp_in=[cir_in(1:length(cir_in)) zeros(1,n_c-length(cir_in))]
FlipMatrix(imp_in) to imp_cir1
Set m from 1 to n_c by 1
LOOP FOR k =1 to length of m by 1
Call subfunction imp_cir2=cirsft(m(k),n_c,imp_cir1)
Set y(k)=imp_cir2*x_inn'
END LOOP
Pseudo-Code of cirsft.m
%%%%%%%%%%%%%%%%%
Subfunction xx=cirsft(mm,nn,in1)
%%%%%%%%%%%%%%%%%
Set in2=in1(nn-rem(mm,nn)+1:nn)
Set xx=[in2 in1(1:nn-rem(mm,nn))]
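For illustration, the pair cirsft.m / cir_con.m can be mirrored in a few lines of Python. This is a hedged sketch of the same computation, not a transcription of the Matlab files: the filter is zero-padded to n_c points and time-reversed, and each output sample is the inner product of the input with a circular shift of the reversed filter, which is exactly an n_c-point circular convolution.

```python
def cirsft(m, seq):
    """Circularly shift seq to the right by m positions (role of cirsft.m)."""
    n = len(seq)
    m %= n
    return seq[n - m:] + seq[:n - m]

def cir_con(n_c, x_in, cir_in):
    """n_c-point circular convolution of x_in and cir_in (role of cir_con.m)."""
    x = list(x_in) + [0] * (n_c - len(x_in))      # zero-pad the input
    h = list(cir_in) + [0] * (n_c - len(cir_in))  # zero-pad the filter
    h_rev = h[::-1]                               # time-reverse the filter
    # y[k] is the inner product of x with the k-step circular shift of h_rev
    return [sum(a * b for a, b in zip(cirsft(k, h_rev), x))
            for k in range(1, n_c + 1)]
```

When n_c is at least len(x_in) + len(cir_in) - 1, this reduces to ordinary linear convolution; for example, cir_con(4, [1, 1], [1, 2]) gives [1, 3, 2, 0].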
convolution (the conv function in Matlab) would make the prediction error larger, because
the zero-valued signal is being predicted from at least some non-zero previous
samples. The conv function does not use the previous samples; it assumes zero values at
the beginning of each section of the clip. If the prediction error is too large, the difference
between the first sample of the current section and the last sample of the previous section
is very large, and a “tick-tack” sound appears at the junctions of sections after grouping. [8]
To minimize the difference between sections, the overlap-save method of circular
convolution is chosen. It requires that the input blocks overlap. The input blocks are then
circularly convolved with the impulse response. Because of the overlap redundancy at
the input, the circular artifact in the output (the first Nh-1 samples) can simply be
discarded. The following figure illustrates the overlap-save method. [9],[10]
Figure 3.7 Illustration of a sound sample and its prediction error, which is largest at the end of each section.
Figure 3.8 Block convolution using the overlap-save method. (a) input signal x(n) divided into
overlapping sections, overlap is Nh-1=2, (b) impulse response h(n), (c) output y(n) using direct
convolution, (d) output y1(n) for block circular convolution of x1(n) and h(n), (e) output y2(n), (f) output
y3(n), (g) output y4(n), and (h) sequential concatenation of block outputs after discarding the first two
samples of each block, which is equivalent to the direct convolution result. "|" represents
concatenation.
The following conditions are used in the overlap-save convolution:
The total number of blocks is K, the number of measurements in the specified directory.
The length of the FIR filter (measurements) is M (M = 128).
The length of one block of data is L (L > M). It is determined by
L = (length of input signal) / (number of measurements)
Each time, a block of data of length N = L + M - 1 is filtered.
The steps of this method are:
(1) Calculate the N-point DFT of x(n) and multiply it with the N-point DFT of h(n).
(2) Calculate the N-point IDFT and discard the first M-1 output samples. The last L samples are the desired filter output.
(3) Append the last M-1 samples of the block to the beginning of the new block of signal.
(4) Go to (1) until the end of the signal is reached.
Figure 3.9 Overlap-save method in this program: the signal is divided into blocks of length L, the
last M-1 samples of each block are saved and prepended to the next (the first block is prepended
with M-1 zeros), and the first M-1 samples of each output block are discarded.
The reason for discarding the first (M-1) output samples is that they are the
result of circular convolution; linear convolution starts at the Mth sample. For the
same reason, the last M-1 samples of each processed block are appended to the
beginning of the new block.
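The four steps above can be sketched in Python. This is a hedged, illustrative version: it uses a direct O(N²) circular convolution in place of the DFT/IDFT pair for readability (the result is identical), and all names are mine, not the thesis code's.

```python
def overlap_save(x, h, L):
    """Filter x with FIR h by overlap-save block convolution.

    Each length-N block (N = L + M - 1) carries the last M-1 samples of the
    previous block; after the N-point circular convolution, the first M-1
    output samples (the circular artifact) are discarded.
    """
    M = len(h)
    N = L + M - 1
    hp = list(h) + [0] * (N - M)          # filter zero-padded to N points
    x = [0] * (M - 1) + list(x)           # M-1 leading zeros for the first block
    y = []
    for start in range(0, len(x) - (M - 1), L):
        block = x[start:start + N]
        block += [0] * (N - len(block))   # zero-pad the final short block
        # N-point circular convolution (stands in for DFT * DFT, then IDFT)
        yc = [sum(block[m] * hp[(n - m) % N] for m in range(N))
              for n in range(N)]
        y.extend(yc[M - 1:])              # keep only the last L samples
    return y
```

Concatenating the kept samples reproduces the direct linear convolution, up to trailing zeros from padding the last block; for example, the first five samples of overlap_save([1, 2, 3, 4], [1, 1], 3) are [1, 3, 5, 7, 4].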
3.2 Input Parameters
The input to the system is a 16-bit mono wave file. This is also known as “CD
quality.” The wave data is in Pulse Code Modulation (PCM) format; each sample of
the signal is represented as a 16-bit signed integer. Provided functions open and read
the data, and the data format is assumed to be re-sampled at the beginning of the
function. The file is read into a dynamically allocated memory buffer.
3.3 Output Parameters
The processed file is a 16-bit stereo wave sound with the same sampling rate as
the input sound. The maximum supported HRTF length is 128 samples.
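For readers who want to reproduce the input handling outside Matlab, a minimal stdlib-only Python sketch of reading a 16-bit mono PCM wave file (the format described above) might look like this. The function name and checks are mine; the thesis itself uses Matlab's wavread instead.

```python
import struct
import wave

def read_mono_wave(path):
    """Read a 16-bit mono PCM wave file into signed-integer samples."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expected 16-bit samples"
        assert wf.getnchannels() == 1, "expected a mono file"
        fs = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # each sample is a little-endian 16-bit signed integer
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return samples, fs
```

The returned list plays the role of the dynamically allocated buffer described above, and fs carries the sampling rate needed for playback.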
Chapter 4 HRTF Results
The following figure is a monaural sound called “GloryBe.wav”. There are two
louder parts. Its frequency content is concentrated at 500-700 Hz in the spectrogram.
The waveform statistics:
Mono Sound
Min Sample Value: -31684
Max Sample Value: 32572
Peak Amplitude: -.05 dB
Possibly Clipped: 0
DC Offset: -.031
Minimum RMS Power: -52.83 dB
Maximum RMS Power: -7.27 dB
Average RMS Power: -18.51 dB
Total RMS Power: -15.76 dB
Using RMS Window of 50 ms
Figure 4.1 Graph of GloryBe.wav and this spectrogram
4.1 Comparison of the HRTF sound with common stereo sound
The following wave is created by HRTF synthesis; the sound comes from the
right-hand side of the origin.
The waveform statistics:
Left Right
Min Sample Value: -16053 -31226
Max Sample Value: 15432 28548
Peak Amplitude: -6.2 dB -.42 dB
Possibly Clipped: 0 0
DC Offset: .006 .006
Minimum RMS Power: -62.78 dB -59.29 dB
Maximum RMS Power: -13.57 dB -7.76 dB
Average RMS Power: -24.76 dB -19.29 dB
Total RMS Power: -22.04 dB -16.44 dB
Using RMS Window of 50 ms
Figure 4.2 Graph of GloryBe.wav after HRTF processing and its spectrogram
The following wave is created by an application (Cool Edit) that converts a
monaural sound to a stereo sound. This stereo wave has waveform statistics similar to
those of the HRTF sound:
Left Right
Min Sample Value: -15955 -31910
Max Sample Value: 16383 32767
Peak Amplitude: -6.02 dB 0 dB
Possibly Clipped: 0 7
DC Offset: -.016 -.031
Minimum RMS Power: -59.17 dB -53.16 dB
Maximum RMS Power: -13.28 dB -7.26 dB
Average RMS Power: -24.52 dB -18.5 dB
Total RMS Power: -21.78 dB -15.76 dB
Using RMS Window of 50 ms
Figure 4.3 Simulation of stereo sound of GloryBe.wav by one application and its spectrogram
Although these two graphs seem to have similar waveforms and spectrograms, they
can be distinguished by listening with headphones. The HRTF-based sound comes from the
right of the origin, but the stereo sound created by the other application has no explicit direction.
From the spectrograms, I conclude that the frequencies of the sound
wave in both cases were between 500-1000 Hz, where the green or red colour was. This is
similar to the original mono sound.
Figure 4.4 3D surface of the GloryBe.wav in frequency expression
4.2 Wave Statistics from Different Conditions
I use “GloryBe.wav” for the analysis.
a) The waveform statistics and wave structure of the
HRTF sound flowing from the front to the back:
Left Right
Min Sample Value: -14171 -32768
Max Sample Value: 14284 32767
Peak Amplitude: -7.21 dB 0 dB
Possibly Clipped: 0 90
DC Offset: .009 .008
Minimum RMS Power: -56.95 dB -51.47 dB
Maximum RMS Power: -12.58 dB -3.84 dB
Average RMS Power: -25.54 dB -16.22 dB
Total RMS Power: -23.4 dB -13.05 dB
Using RMS Window of 50 ms
Figure 4.5
b) The waveform statistics and wave structure of the HRTF sound flowing in a cycle (Figure 4.6):
Left Right
Min Sample Value: -16696 -32768
Max Sample Value: 15023 32767
Peak Amplitude: -5.86 dB 0 dB
Possibly Clipped 0 7
DC Offset: .009 .009
Minimum RMS Power: -54 dB -57.62 dB
Maximum RMS Power: -12.88 dB -4.19 dB
Average RMS Power: -24.09 dB -18.26 dB
Total RMS Power: -22.56 dB -15.03 dB
Using RMS Window of 50 ms
Figure 4.7 Spectrograms of the wave in Figure 4.6; the top is at elevation 50°
and the bottom is at elevation 0°
From these two spectrograms, one point can be stated: the energies
of the two channels were equivalent when the elevation was 90°, and larger energy
occurred as the elevation tended to 0°.
c) The waveform statistics and wave structure of HRTF-sound flowing in the front from -40° to 90°
Left Right
Min Sample Value: -29707 -29707
Max Sample Value: 31153 31153
Peak Amplitude: -.44 dB -.44 dB
Possibly Clipped: 0 0
DC Offset: .01 .01
Minimum RMS Power: -56.08 dB -56.08 dB
Maximum RMS Power: -7.17 dB -7.17 dB
Average RMS Power: -19.31 dB -19.31 dB
Total RMS Power: -16.45 dB -16.45 dB
Using RMS Window of 50 ms
Figure 4.8
d) The waveform statistics and wave structure of HRTF-sound flowing in the right from -40° to 90°
Left Right
Min Sample Value: -13013 -32569
Max Sample Value: 16976 32767
Peak Amplitude: -5.71 dB 0 dB
Possibly Clipped: 0 12
DC Offset: .009 .009
Minimum RMS Power: -57.34 dB -53.16 dB
Maximum RMS Power: -13.14 dB -5.26 dB
Average RMS Power: -25.78 dB -16.97 dB
Total RMS Power: -22.98 dB -14.16 dB
Using RMS Window of 50 ms
Figure 4.9
e) The waveform statistics and wave structure of the HRTF sound flowing in the back from -40° to 90°
Left Right
Min Sample Value: -23260 -23260
Max Sample Value: 25273 25273
Peak Amplitude: -2.26 dB -2.26 dB
Possibly Clipped: 0 0
DC Offset: .009 .009
Minimum RMS Power: -56.22 dB -56.22 dB
Maximum RMS Power: -7.82 dB -7.82 dB
Average RMS Power: -20.56 dB -20.57 dB
Total RMS Power: -17.92 dB -17.92 dB
Using RMS Window of 50 ms
Figure 4.10
f) The waveform statistics and wave structure
of HRTF-sound flowing in the left from -40° to 90°
Left Right
Min Sample Value: -32569 -13013
Max Sample Value: 32767 16976
Peak Amplitude: 0 dB -5.71 dB
Possibly Clipped: 12 0
DC Offset: .009 .009
Minimum RMS Power: -53.16 dB -57.34 dB
Maximum RMS Power: -5.26 dB -13.14 dB
Average RMS Power: -16.97 dB -25.78 dB
Total RMS Power: -14.16 dB -22.98 dB
Using RMS Window of 50 ms
Figure 4.11
g) The waveform statistics and wave structure of the HRTF sound flowing from the front to the back over the top of the head
Left Right
Min Sample Value: -23934 -23934
Max Sample Value: 27461 27461
Peak Amplitude: -1.53 dB -1.53 dB
Possibly Clipped: 0 0
DC Offset: .01 .01
Minimum RMS Power: -56.25 dB -56.25 dB
Maximum RMS Power: -7.69 dB -7.69 dB
Average RMS Power: -19.92 dB -19.93 dB
Total RMS Power: -17.1 dB -17.11 dB
Using RMS Window of 50 ms
Figure 4.12
h) The waveform statistics and wave structure
of the HRTF sound flowing from the right to the left over the top of the head
Left Right
Min Sample Value: -19262 -32493
Max Sample Value: 19764 32767
Peak Amplitude: -4.39 dB 0 dB
Possibly Clipped: 0 5
DC Offset: .01 .009
Minimum RMS Power: -54.88 dB -57.37 dB
Maximum RMS Power: -10.1 dB -5.05 dB
Average RMS Power: -22.68 dB -18.16 dB
Total RMS Power: -20.63 dB -14.92 dB
Using RMS Window of 50 ms
Figure 4.13
From the above figures, I can conclude that:
The amplitudes of the HRTF wave depend on the direction of flow of the wave. If the wave
flows on the left side, the left channel has a larger amplitude than the right
channel, and vice versa.
If the HRTF sound comes from the front or the back, the left and right channels have
the same wave amplitude.
Sounds processed to sound as though they originate from the front of a listener
may actually sound as though they originate from behind the listener.
The synthesis of sounds with non-zero elevations behaves similarly.
Chapter 5 User Manual of the 3D Simulation
All the scripts must run in Matlab version 6.0 or later. First, the user must open Matlab. In this example, the working directory is “d:\hrtf”.
To run the GUI script, the user should type “HRTF”, and a window then pops up.
Figure 5.1 Matlab 6.0 first page
Figure 5.2 Initial “HRTF based surround sound “ window
The “HRTF based Surround Sound Window” is designed with GUIDE. The names and descriptions of the items in this window are as follows:
1) It shows the current working directory. In this case, the path is “D:\hrtf”.
2) This list box shows all directories and files located in the working directory.
3) This shows the input file name when the user double-clicks a file name in the list box.
4) This button determines the output file name; the user inputs the file name through a prompt window.
5) This button is used to prompt another window, “Setting Horizontal Surround Sound Parameters”.
6) This button is used to prompt another window, “Setting Vertical Surround Sound Parameters”.
7) This button resets all parameters stored in memory, so the user can input choices again.
8) This button processes all parameters and outputs an HRTF-based sound.
9) This button plays the HRTF-based sound.
10) This button calls up a closing window when pressed.
5.1 Setting the input and output file name:
To choose the input file, the user only needs to press the file in the list box. If the user presses the “Choose Output File Name” button, a pop-up window asks the user to choose the output file name. The default name is “3D.wav”.
Figure 5.3 Pop-up window for choosing the output file name
After choosing the input and output names, the HRTF-based surround sound window appears as in the following figure. The input file name is “GloryBe.wav” and the output file name is “3D.wav”.
5.2 Setting Parameters of Horizontal Surround
After pressing the “Horizontal Surround” button, the following window is called:
Figure 5.4 “HRTF based surround sound” after choosing the input and output file names
Figure 5.5 Initial window of “Setting Horizontal Surround Sound Parameters”
The names and descriptions of the items in this window are as follows:
1, 2) These parts set the starting and ending azimuth angles. The starting angle is 0° and the ending angle is 180° or 360°.
3) The slider starts at -40° and ends at 90°. Each click of the bar increases the angle by 10°.
4) This part is updated by the above slider.
5) This area holds a picture that describes the path of the sound flow.
6) When the user presses the “Route of Wave” button, it updates the above picture.
7) The user presses this “Close” button to close this window and return the parameters to the “HRTF-based Surround Sound Window”.
If the user chooses an azimuth angle starting at 0° and ending at 360° and an elevation angle of 30°, and presses the “Route of Wave” button, the following window is displayed:
5.3 Setting Parameters of Vertical Surround
After pressing the “Vertical Surround” button, the following window is called:
Figure 5.6 Window of “Setting Horizontal Surround Sound Parameters” after setting parameters
Figure 5.7 Initial window of “Setting Vertical Surround Sound Parameters”
The names and descriptions of the items in this window are as follows:
1, 2) This pair of radio buttons selects which set of parameters should be set.
3) This pop-up menu sets the azimuth angle to 0°, 90°, 180°, or 270°.
4) This pop-up menu sets the azimuth angle from 0° to 180° or from 90° to 270°.
5) This area holds a picture that describes the path of the sound flow.
6) When the user presses the “Route of Wave” button, it updates the above picture.
7) This button resets all parameters stored in memory, so the user can input choices again.
8) The user presses this “Close” button to close this window and return the parameters to the “HRTF-based Surround Sound Window”.
If the user chooses the “Specify azimuth angle” radio button with an azimuth angle of 90° and presses the “Route of Wave” button, the following window is displayed:
If the user chooses the “Semi_Circle azimuth angle” radio button with an azimuth angle starting at 90° and ending at 270° and presses the “Route of Wave” button, the following window is displayed:
Figure 5.8 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Specify azimuth angle” is chosen
Figure 5.9 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Semi_Circle azimuth angle” is chosen
5.4 Processing the parameters
When the user presses the “Process” button, it calls the other scripts to do the HRTF processing. After that, it prompts a dialog:
5.5 Closing the window
Press the “Close” button to quit the program. An “Are you sure you want to close” box pops up. If the user clicks “Yes”, all windows close; otherwise, no action takes place.
Figure 5.10 Message box reporting the completion of processing
Figure 5.11 Message box asking the user to choose the close operation
Chapter 6 Conclusion
6.1 Problems with HRTF-based synthesis of spatial audio
Although it is simple to synthesize spatial audio with HRTFs, several problems arise. It is often reported that spatially-synthesized sounds lack externalization: sounds spatialized near the median plane (0° azimuth angle) sound as though they existed "inside" the head instead of "outside" it. Sounds are also perceived as though they originated either from the front of the listener or from behind (the so-called "front-back" confusions) [11]. Synthesis of sounds with non-zero elevations is difficult. Moreover, since everyone has a unique set of HRTFs, a listener who hears a sound spatialized with a "generalized" HRTF set may not perceive the sound in the intended spatial location [12]. In addition to these problems of sound quality, HRTF-based sound synthesis faces several computational challenges as well.
Many researchers believe that the solutions to the above problems involve a deeper understanding of the perceptual structure of HRTF data. By investigating the structure of HRTFs, researchers single out features such as the peaks and dips in the magnitude responses and the impulse responses, and examine specific spatial parameters such as azimuth, elevation, and distance. Future spatial audio synthesis algorithms can exploit this perceptual information to solve the problems in existing systems.
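As an illustration of this kind of feature extraction, a minimal Python sketch (hypothetical, not part of the project's MATLAB code) can locate the peaks and dips in a magnitude response:

```python
import numpy as np

def peaks_and_dips(mag_db):
    """Return the indices of local maxima (peaks) and local
    minima (dips) in a magnitude response given in dB."""
    m = np.asarray(mag_db, dtype=float)
    interior = np.arange(1, len(m) - 1)
    peaks = interior[(m[1:-1] > m[:-2]) & (m[1:-1] > m[2:])]
    dips = interior[(m[1:-1] < m[:-2]) & (m[1:-1] < m[2:])]
    return peaks, dips

# Toy magnitude response: peaks at indices 1 and 6, a dip at index 4.
p, d = peaks_and_dips([0, 3, 1, -2, -5, -1, 4, 2])
```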
Auditory localization [17] is still not fully understood, and thus developers cannot make effective price/performance decisions in the design of spatial audio systems. Furthermore, developers are often at a loss to explain why systems do not perform effectively.
6.2 Headphones
Most existing spatial audio systems require headphones; one needs to wear headgear to listen to 3-D sounds. Headphones are necessary because they fix the geometric relationship between the physical sound sources (the headphone drivers) and the ears. They also eliminate crosstalk between the binaural signals.
6.3 Future Work
A problem with real-time applications is time delay: the higher the quality, the longer the impulse response. To solve it, some authors propose hybrid (time-domain/frequency-domain) convolution [18]. In the current implementation of the program, processing a 30-second sound clip takes approximately an hour. A careful re-implementation in C++, possibly on dedicated hardware, could probably reduce this time.
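The cost comes from convolving the input with long impulse responses sample by sample. As a rough illustration (in Python rather than the project's MATLAB, and not the hybrid low-latency scheme of [18] itself), frequency-domain convolution computes the same result in O(N log N):

```python
import numpy as np

def fft_convolve(signal, impulse_response):
    """Linear convolution computed via the FFT."""
    n = len(signal) + len(impulse_response) - 1
    nfft = 1 << (n - 1).bit_length()   # zero-pad to the next power of two
    spectrum = np.fft.rfft(signal, nfft) * np.fft.rfft(impulse_response, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]

# Identical result to direct time-domain convolution, but far cheaper
# when the impulse response is long:
x = np.random.randn(44100)   # one second of audio at 44.1 kHz
h = np.random.randn(512)     # e.g. a 512-tap binaural impulse response
y = fft_convolve(x, h)
```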
In order to achieve greater realism of simulated sound trajectories, a finer resolution than that of the available HRTF measurements is preferable. This can be obtained through linear interpolation of the available data.
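A minimal sketch of such interpolation, assuming two HRIRs measured at neighbouring azimuths (Python for illustration; the array values are made up):

```python
import numpy as np

def interp_hrir(hrir_a, hrir_b, angle_a, angle_b, angle):
    """Linearly interpolate between two impulse responses
    measured at angle_a and angle_b (angle_a < angle < angle_b)."""
    w = (angle - angle_a) / (angle_b - angle_a)
    return (1.0 - w) * np.asarray(hrir_a) + w * np.asarray(hrir_b)

# Synthesize a response for 32.5 deg from measurements at 30 and 35 deg.
h30 = np.array([1.0, 0.5, 0.25])
h35 = np.array([0.8, 0.6, 0.20])
h_mid = interp_hrir(h30, h35, 30.0, 35.0, 32.5)
```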
Acoustic environment modeling refers to combining 3-D spatial location cues with distance, motion, and ambience cues to create a complete simulation of an acoustic scene. By simulating the acoustical interactions of the natural world, we can achieve stunningly realistic recreations with 3-D positional control [16]. The Doppler shift should be taken into account if we deal with large distance variations [15], because the pitch of a moving source changes. Finally, reverberation in an uncontrolled synthetic environment should consider the frequency-dependent effects of the medium and the reflecting surfaces [13], [14], [15].
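For the Doppler shift, the perceived pitch of a source moving radially at speed v toward a stationary listener is f' = f·c/(c − v). A small illustrative helper (hypothetical, not part of the project code):

```python
def doppler_pitch(f_source, v_radial, c=343.0):
    """Perceived frequency for a source moving radially at v_radial m/s
    (positive = toward the listener), listener stationary, speed of
    sound c in m/s."""
    return f_source * c / (c - v_radial)

# A 440 Hz source approaching at 20 m/s is heard sharper (~467 Hz),
# and flatter (~416 Hz) while receding:
f_toward = doppler_pitch(440.0, 20.0)
f_away = doppler_pitch(440.0, -20.0)
```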
Appendices
Homepage of my project
The web site address of this project is http://hrtf.bravepages.com/. The web page contains some 3-D sounds produced with the HRTF measurements.
The user can also obtain the source code and measurements on another page. A button on the homepage leads to http://hrtf.bravepages.com/source_code.htm
A prompt window asks for a password; the password is "fyp2002".
Pseudo-code of MATLAB scripts

Pseudo-code of hrtf.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
MAINFUNCTION hrtf %% HRTF Application M-file for hrtf.fig
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF nargin <= 1 , Then
IF nargin == 0, Then
Set initial_dir equals to current directory
ELSEIF nargin == 1 & exist(varargin{1},'dir')
initial_dir = varargin{1};
ELSE
Error_Message='Input argument must be a valid directory'
Stop the program
ENDIF nargin == 0
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data
Call SUBFUNCTION load_listbox(initial_dir)
Return figure handle as first output argument
IF nargout > 0, Then
varargout{1} = fig
ENDIF nargout > 0
ELSEIF ischar(varargin{1})
%% INVOKE NAMED SUBFUNCTION OR CALLBACK
%%
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%%%%
SUBFUNCTION listbox1_Callback
%%%%%%%%%%%%%%%%%%%
IF value of 'SelectionType' in figure1 is 'open'
Get the index value from listbox1 FUNCTION
Get the file name from listbox1 FUNCTION
Set filename = file_list{index_selected}
IF value of sorted_index(index_selected) is 1
Change directory to (filename);
Call Subfunction load_listbox(pwd,handles)
ELSE
[path,name,ext,ver] = fileparts(filename)
Set 'String' parameter of selectedfile equals to [path,name,ext,ver]
ENDIF value of sorted_index(index_selected) is 1
ENDIF value of 'SelectionType' in figure1 is 'open'
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION load_listbox(dir_path)
%%%%%%%%%%%%%%%%%%%%%%%%%%
Add (dir_path) directory to MATLAB's current search path
Change directory to (dir_path)
Set dir_struct = dir(dir_path)
Set [sorted_names,sorted_index] = sortrows({dir_struct.name}')
Set file_names = sorted_names
Set is_dir = [dir_struct.isdir]
Set sorted_index = [sorted_index]
Stores the variable data in the figure's application data.
Update the current directory name of selected file
%%%%%%%%%%%%%%%%%%%
SUBFUNCTION selectedfile_Callback
%%%%%%%%%%%%%%%%%%%%
Wait for update
%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION outfile_Callback
%%%%%%%%%%%%%%%%%%%%%%
Wait for update
%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION savefile_name_Callback
%%%%%%%%%%%%%%%%%%%%%%
Set the output file name, the default name is 3D.wav
%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION hori_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set the verti_button callback inactive
Call Subfunction [val,elevangle]=hori_sur
IF val=1
start_angle=0
END_angle=180
ELSE
start_angle=0
end_angle=360
ENDIF val=1
Set choice =1
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION verti_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set the hori_button callback inactive
pos_size = get(handles.figure1,'Position')
Call Subfunction [verti_choice,verti_val] = verti_sur
Set choice=2
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION varargout = initial_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Initialize the hori_button callback
Initialize the verti_button callback
%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION processing_Callback
%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set index_selected equals to the 'Value' stored in listbox1
Set file_list equals to the 'String' stored in listbox1
Set filename = file_list{index_selected}
Set outfile_name equals to the 'String' stored in outfile
%%
IF (choice==1)
Call SUBFUNCTION hori_final(start_angle,end_angle,elevangle,filename,outfile_name)
ENDIF choice==1
%%
IF (choice==2)
Call SUBFUNCTION verti_final(verti_choice,verti_val,filename,outfile_name)
ENDIF
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION playback_Callback
%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set outfile_name equals to the 'String' stored in outfile
Read the wave file
Play the processed wave file
%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION figure1_CloseRequestFcn
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Call SUBFUNCTION close_Callback
%%%%%%%%%%%%%%%%%%
SUBFUNCTION close_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Call subfunction modaldlg
CASE OF (output from subfunction modaldlg)
'no','cancel' : no action
'yes' : delete the frame(handles.figure1)
ENDCASE
Pseudo-code of verti_sur.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [verti_choice,verti_val] = verti_sur(varargin)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data.
Set OrigImageAxes parameters
Set OriginalImage parameters
Position figure
Wait for callbacks to run and window to be dismissed
IF the 'Value' in verti_azim_radio not equal to 0
Set verti_choice=1
Set verti_val= the 'Value' in azim_angle_popup
ENDIF
IF the 'Value' in verti_semi_azim_radio not equal to 0
Set verti_choice=2
Set verti_val= the 'Value' in sc_azim_angle_popup
ENDIF
Delete the figure
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
TRY %% executed until an error occurs
IF (nargout=0)
FEVAL -- Function evaluation
Set [varargout{1:nargout}] = feval(varargin{:})
ELSE
feval(varargin{:})
ENDIF
CATCH
Display(lasterr)
END
ENDIF
%%%%%%%%%%%%%%%%%
Subfunction load_picture(NewVal)
%%%%%%%%%%%%%%%%%
Subfunction called
IF NewVal==1, then
Set pic_name='verti_0.bmp'
ELSEIF NewVal==2
Set pic_name='verti_90.bmp'
ELSEIF NewVal==3
Set pic_name='verti_180.bmp'
ELSEIF NewVal==4
Set pic_name='verti_270.bmp'
ELSEIF NewVal==5
Set pic_name='verti_semi_0.bmp'
ELSEIF NewVal==6
Set pic_name='verti_semi_90.bmp'
ENDIF
SET I = imshow(pic_name)
Set 'Cdata' in OriginalImage to be I
%%%%%%%%%%%%%%%%%%
Subfunction verti_azim_radio_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Set relevant parameters to be active
Set irrelevant parameters to be inactive
Set off= verti_semi_radio
Call subfunction mutual_exclude(off)
Set NewVal equals to the 'Value' of azim_angle_popup
Stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%%
Subfunction azim_angle_popup_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_s_azim_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%% %%%%%%%%%%%%%%%%%
Subfunction verti_s_elev_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_s_elev_range_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_semi_radio_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
Set relevant parameters to be active
Set irrelevant parameters to be inactive
Set off= verti_azim_radio
Call subfunction mutual_exclude(off)
Set NewVal equals to the 'Value' of sc_azim_angle_popup + 4
Stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%%%%
Subfunction sc_azim_angle_popup_Callback
%%%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_azim_text_Callback
%% %%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_elev_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_elev_range_text_Callback
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_reset_button_Callback
%% %%%%%%%%%%%%%%%%%
Set relevant parameters to be active and irrelevant parameters to be inactive
Set off= verti_semi_radio & verti_azim_radio
Call subfunction mutual_exclude(off), stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%
Subfunction mutual_exclude(off)
%% %%%%%%%%%%%%%%%%
Subfunction called
Set the 'Value' in off=0
%%%%%%%%%%%%%%%%%%%
Subfunction verti_close_button_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
Resume MATLAB program execution of figure1
% %%%%%%%%%%%%%%%%%%
Subfunction Show_pic_Callback
% %%%%%%%%%%%%%%%%%%
IF the 'Value' in verti_azim_radio not equals to 0
NewVal=the 'Value' in azim_angle_popup
ENDIF
IF the 'Value' in verti_semi_radio not equals to 0
NewVal=the 'Value' in sc_azim_angle_popup + 4
ENDIF
Call the subfunction load_picture(NewVal), stores the variable data in the figure's application data
Pseudo-code of hori_sur.m
%%%%%%%%%%%%%%%%%%%%%%
Subfunction [val,elevangle]=hori_sur(varargin)
%%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data.
Set OrigImageAxes parameters
Set OriginalImage parameters
Position figure
Wait for callbacks to run and window to be dismissed:
Set val equals to the 'Value' in hori_end_azim
Set elevangle equals to the 'String' in hori_elev
delete the figure
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%%%
Subfunction load_picture(NewPic)
%%%%%%%%%%%%%%%%%%
Subfunction called
IF NewPic=1, then
Set pic_name='hori_180.bmp'
ELSEIF NewPic=2, then
Set pic_name='hori_360.bmp'
END
Set I= imshow(pic_name)
Set 'Cdata' in OriginalImage to be I
%%%%%%%%%%%%%%%%%%
Subfunction hori_end_azim_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%
Subfunction hori_elevslider_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Set the initial slider value and range
Get the new value for the elev angle from the slider
Set new value to nearest number
Set the 'String' in hori_elev to the new value
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%
Subfunction hori_elev_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction hori_close_button_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
Resumes the M-file execution
%%%%%%%%%%%%%%%%%%
Subfunction Show_Pic_Callback
%%%%%%%%%%%%%%%%%%
IF the 'Value' of hori_end_azim=1
Set NewPic=1
ELSEIF the 'Value' of hori_end_azim=2
Set NewPic=2
ENDIF
Call the subfunction load_picture(NewPic)
Stores the variable data in the figure's application data
Pseudo-code of modaldlg.m
%%%%%%%%%%%%%%%%%%%%%
Subfunction answer = modaldlg(varargin)
%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data
Position figure
Wait for callbacks to run and window to be dismissed
IF ~ishandle(fig), then
Set answer = 'cancel'
ELSE
Returns a structure containing the handles of the objects in a figure
Delete the figure
ENDIF
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%
Subfunction noButton_Callback
%%%%%%%%%%%%%%%%
Subfunction called
Set answer = 'no'
Stores the variable data in the figure's application data
Resume MATLAB program execution of figure1
%%%%%%%%%%%%%%%%%
Subfunction yesButton_Callback
%%%%%%%%%%%%%%%%%
Subfunction called
Set answer = 'yes'
Stores the variable data in the figure's application data
Resume MATLAB program execution of figure1
Pseudo-Code of readhrtf.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [y] = readhrtf(elevangle,azim,choose_index)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Set ext = '.wav'
IF (choose_index=1) ,then
Call subfunction pathname = hrtfpath(pwd,filesep,'horizontal','H',ext,elevangle,azim,choose_index)
Read (pathname) wave file into [y,fs,nbits]
IF (fs ~= 44100 | nbits ~= 16) ,then
Error message='Incorrect wave file format. Expected 16 bit samples at 44.1 kHz sampling rate.'
ENDIF
ENDIF
IF ((choose_index==2) | (choose_index==3)) , then
Call subfunction pathname = hrtfpath(pwd,filesep,'vertical','H',ext,elevangle,azim,choose_index)
Read (pathname) wave file into [y,fs,nbits]
IF (fs ~= 44100 | nbits ~= 16), then
Error message='Incorrect wave file format. Expected 16 bit samples at 44.1 kHz sampling rate.'
ENDIF
ENDIF
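The format check that readhrtf.m performs can be sketched in Python with the standard-library wave module (illustration only; the project itself reads the files in MATLAB):

```python
import wave

def check_hrtf_wav(path):
    """Verify a wave file is 16-bit / 44.1 kHz, as readhrtf.m expects,
    and return (sample_rate, bits_per_sample)."""
    with wave.open(path, 'rb') as w:
        fs = w.getframerate()
        nbits = 8 * w.getsampwidth()
    if fs != 44100 or nbits != 16:
        raise ValueError('Incorrect wave file format. '
                         'Expected 16 bit samples at 44.1 kHz sampling rate.')
    return fs, nbits
```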
Pseudo-Code of hrtfpath.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [s] = hrtfpath(root,dir_ch,subdir,select,ext,elev,azim,choose_index)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF (choose_index=1), then
Set s = sprintf('%s%s%s%selev%d%s%s%de%03da%s', root,dir_ch,subdir,dir_ch,round(elev),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
IF (choose_index=2), then
Set s = sprintf('%s%s%s%sazim%d%s%s%03de%03da%s',root,dir_ch,subdir,dir_ch,round(azim),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
IF (choose_index=3), then
Set s = sprintf('%s%s%s%sazimsemi%d%s%s%03de%03da%s',root,dir_ch,subdir,dir_ch,round(azim),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
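The sprintf calls above assemble file names in the naming convention of the MIT KEMAR measurement set (e.g. elev0/H0e045a.wav). The horizontal case (choose_index 1) can be sketched in Python as (illustrative only):

```python
import os

def hrtf_path(root, subdir, select, elev, azim, ext='.wav'):
    """Rebuild the horizontal-case path of hrtfpath.m:
    <root>/<subdir>/elev<E>/<select><E>e<AAA>a<ext>"""
    e, a = round(elev), round(azim)
    return os.path.join(root, subdir, 'elev%d' % e,
                        '%s%de%03da%s' % (select, e, a, ext))
```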
Pseudo-Code of groupfile.m
%%%%%%%%%%%%
Subfunction group_file
%%%%%%%%%%%%
Input Parameters :
left_blk_lap,right_blk_lap,num_file,choice,start_angle,end_angle,elevangle,verti_gp_index
Output Parameters : first_part,second_part
Check parameter choice
IF choice==1 or choice==2, then do horizontal HRTF or vertical HRTF respectively
LOOP FOR i= 1 to num_file by 1
Set first_part, second_part= horizontal concatenation of left_blk_lap(i,:) and right_blk_lap(i,:)
END OF FOR LOOP
ENDIF
Pseudo-Code of half_circle.m
%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [num_file]= half_circle(elevangle,filelist)
%%%%%%%%%%%%%%%%%%%%%%%%
Check the elevation angle (-40° to 90°) and set num_file according to the particular elevation angle
Pseudo-Code of par_ser.m
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction out_ser=par_ser(input,a,n,b_num)
%%%%%%%%%%%%%%%%%%%%%%%
Set l_r=length(input), z_n=a-1, L=n-z_n
LOOP FOR k=1 to b_num by 1
IF k=1, then
Set r_discard(k,:)=[zeros(1,z_n) input(1:L)]
ELSEIF k=b_num
Set r_end=input((b_num-1)*L+1-z_n:length(input))
Set r_discard(k,:)=[r_end zeros(1,n-length(r_end))]
ELSE
Set r_discard(k,:)=input((k-1)*L+1:(k-1)*L+n)
ENDIF
END LOOP
Set out_ser=r_discard
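par_ser.m partitions the input into overlapping blocks of the kind used in overlap-save fast convolution. The same partitioning can be sketched in Python (an illustration, with a taken as the filter length and the middle-block boundaries following the standard overlap-save layout):

```python
import numpy as np

def par_ser(x, a, n, b_num):
    """Split x into b_num blocks of length n for overlap-save
    convolution with a length-a filter (overlap = a - 1 samples)."""
    z_n = a - 1          # samples of overlap carried between blocks
    L = n - z_n          # new samples consumed per block
    x = np.asarray(x, dtype=float)
    blocks = np.zeros((b_num, n))
    for k in range(b_num):
        if k == 0:
            blocks[k, z_n:] = x[:L]          # prepend a-1 zeros
        elif k == b_num - 1:
            tail = x[k * L - z_n:]           # last (possibly short) block
            blocks[k, :len(tail)] = tail     # zero-pad the end
        else:
            blocks[k] = x[k * L - z_n : k * L - z_n + n]
    return blocks
```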
References
[1] Blauert, J., Spatial Hearing, MIT Press, Cambridge, MA (1983)
[2] Yoichi Haneda, Shoji Makino, Yutaka Kaneda, Nobuhiko Kitawaki, "Common-Acoustical-Pole and Zero Modeling of Head-Related Transfer Functions", IEEE Transactions on Speech and Audio Processing, vol. 7, no. 2, Mar 1999
[3] MATLAB description, http://www.mathworks.com/products/matlab/
[4] HRTF Measurements of a KEMAR Dummy-Head Microphone, http://xenia.media.mit.edu/~kdm///hrtf.html
[5] KEMAR (Knowles Electronic Manikin for Acoustic Research), http://www.parmly.luc.edu/parmly/behav_psych_resrch.html
[6] Kendall, Gary S. (1995), "A 3-D sound primer: Directional hearing and stereo reproduction", Computer Music Journal, 19(4, Winter), 23-46
[7] Gardner, W.G., K.D. Martin (1995), "HRTF measurements of a KEMAR", J. Acoust. Soc. Am., 97(6), pp. 3907-3908
[8] Linear Prediction Analysis, http://www.en.polyu.edu.hk/~mwmak/notes/BEng_SP/linear_prediction_analysis.ppt
[9] S.K. Mitra, "Digital Signal Processing: A Computer-based Approach", McGraw-Hill, 1998
[10] Wataru Mayeda, "Digital Signal Processing", Prentice Hall Inc., 1993
[11] Wightman, F.L., & Kistler, D.J. (1992), "A model of HRTFs based on principal component analysis and minimum-phase reconstruction", Journal of the Acoustical Society of America, 91(3), 1637-1647
[12] Wightman, F.L., & Kistler, D.J. (1989), "Headphone simulation of free-field listening I: Stimulus synthesis", Journal of the Acoustical Society of America, 85(2), 858-867
[13] Gardner, W.G. (1992), "The Virtual Acoustic Room", Master's thesis, Dept. of Media Arts and Sciences, MIT
[14] Gardner, W.G. (1998), "Reverberation Algorithms", in Applications of Digital Signal Processing to Audio and Acoustics, Kahrs, M., and K. Brandenburg, Eds., Kluwer Academic, Norwell, MA
[15] Gardner, W.G. (1999), "3D Audio and Acoustic Environment Modeling"
[16] Begault, D.R. (1994), "3-D Sound for Virtual Reality and Multimedia", Academic Press, Cambridge, MA
[17] Barbara G. Shinn-Cunningham, Nathaniel I. Durlach, Richard M. Held, Massachusetts Institute of Technology, Cambridge (March 1998), "Adapting to supernormal auditory localization cues"
[18] Gardner, W.G., "Efficient Convolution without Input-Output Delay", presented at the 97th Convention of the Audio Engineering Society, San Francisco, Pre-print 3897, 1994