Department of Electronic and Information Engineering Ngan Cheuk Yin
The Hong Kong Polytechnic University, Hung Hom Page 1 of 74
April, 2002
Abstract
In this project, I use Head-Related Transfer Functions (HRTFs)
to illustrate how particular sounds in 3-D space can be synthesized
from a monaural sound. It comprises a study of HRTFs and an
explanation of creating a 3-D audio environment.
I employed HRTF measurements from the MIT Media Lab together with
my own Matlab code to generate “spatialized” sounds. To enhance my design, I
produced a simple graphical interface for users to activate the
HRTF operations. In the GUI window, one simply clicks the mouse and the
processing is performed in the background, so that the HRTF sounds
appear to come from different directions.
Acknowledgements
I thank those who encouraged me to begin, continue, finish, and
cherish my time as a graduate student here at The Hong Kong Polytechnic
University. I offer my deepest thanks:
To my supervisor, Dr. Mak Man Wai of the Department of Electronic and
Information Engineering, for his guidance, patience
and wisdom over the past years. Without his support, none of this
would have been possible.
To all of my teachers, past and present, in and out of school, for
caring for me and training me.
Finally, thanks to my parents for their incessant love, support,
affection, and encouragement. Thank you very much!
Contents
Abstract
Acknowledgements
Contents
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Motivation
1.2 Objective of the work
1.3 Scope of thesis
1.4 Details of relevant theory
1.4.1 HRTFs - Head Related Transfer Functions
1.4.2 Overview of the human ear
Chapter 2 Project Implementation
2.1 Review of MIT HRTF measurements
2.1.1 Measurement Data
2.2 Review of past work
2.3 Introduction of proposed work
2.4 System Architecture and Requirements
2.4.1 Hardware
2.4.2 Software
Chapter 3 Methodology
3.1 Matlab Scripts in Detail
3.1.1 Interface codes
3.1.1.1 hrtf.m
3.1.1.2 verti_sur.m
3.1.1.3 hori_sur.m
3.1.2 HRTF codes for verti_final.m and hori_final.m
3.2 Input Parameters
3.3 Output Parameters
Chapter 4 HRTF Results
4.1 Comparison of HRTF sound with common stereo sound
4.2 Wave statistics under different conditions
Chapter 5 User Manual of 3D Simulation
5.1 Setting the input and output file names
5.2 Setting Parameters of Horizontal Surround
5.3 Setting Parameters of Vertical Surround
5.4 Processing the parameters
5.5 Closing the window
Chapter 6 Conclusion
6.1 Problems with HRTF-based synthesis of spatial audio
6.2 Headphones
6.3 Future Work
Appendices
Homepage of my project
Pseudo-code of Matlab scripts
References
List of Tables
Table 2.1 Number of measurements and azimuth increment at each elevation
Table 3.1 The Matlab scripts in my project and their objectives
Table 3.2 Sub-functions in hrtf.m and their objectives
Table 3.3 Sub-functions in verti_sur.m and their objectives
Table 3.4 Sub-functions in hori_sur.m and their objectives
List of Figures
Figure 1.1 Distribution of sound sources in the measurements
Figure 3.1 hrtf.fig in GUIDE
Figure 3.2 verti_sur.fig in GUIDE
Figure 3.3 hori_sur.fig in GUIDE
Figure 3.4 Flow chart of hori_final.m and verti_final.m
Figure 3.5 Pseudo-code of verti_final.m and hori_final.m
Figure 3.6 Pseudo-code of cir_mon.m and cirsft.m
Figure 3.7 Illustration of a sound sample and its prediction error, where the prediction error is large at the end of the section
Figure 3.8 Block convolution using the overlap-save method: (a) input signal x(n) divided into overlapping sections, overlap Nh-1 = 2; (b) impulse response h(n); (c) output y(n) using direct convolution; (d) output y1(n) for block circular convolution of x1(n) and h(n); (e) output y2(n); (f) output y3(n); (g) output y4(n); (h) sequential concatenation of the block outputs after discarding the first two samples of each block, which is equivalent to the direct convolution result ("|" represents concatenation)
Figure 3.9 Overlap-save method in this program
Figure 4.1 Graph of GloryBe.wav and its spectrogram
Figure 4.2 Graph of GloryBe.wav after HRTF processing and its spectrogram
Figure 4.3 Simulation of stereo sound of GloryBe.wav by another application and its spectrogram
Figure 4.4 3D surface of GloryBe.wav in the frequency domain
Figure 4.5 Flow and graph of GloryBe.wav HRTF sound moving from the front to the back
Figure 4.6 Flow and graph of GloryBe.wav HRTF sound moving in a full circle
Figure 4.7 Spectrogram of the wave in Figure 4.6; the left side is at elevation 0° and the right side at elevation 50°
Figure 4.8 Flow and graph of GloryBe.wav HRTF sound moving in the front from -40° to 90°
Figure 4.9 Flow and graph of GloryBe.wav HRTF sound moving on the right from -40° to 90°
Figure 4.10 Flow and graph of GloryBe.wav HRTF sound moving in the back from -40° to 90°
Figure 4.11 Flow and graph of GloryBe.wav HRTF sound moving on the left from -40° to 90°
Figure 4.12 Flow and graph of GloryBe.wav HRTF sound moving from the front to the back over the top of the head
Figure 4.13 Flow and graph of GloryBe.wav HRTF sound moving from the right to the left over the top of the head
Figure 5.1 Matlab 6.0 start-up page
Figure 5.2 Initial “HRTF based surround sound” window
Figure 5.3 Pop-up window for choosing the output file name
Figure 5.4 “HRTF based surround sound” window after choosing the input and output file names
Figure 5.5 Initial window of “Setting Horizontal Surround Sound Parameters”
Figure 5.6 Window of “Setting Horizontal Surround Sound Parameters” after setting the parameters
Figure 5.7 Initial window of “Setting Vertical Surround Sound Parameters”
Figure 5.8 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Specify azimuth angle” is chosen
Figure 5.9 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Semi_Circle azimuth angle” is chosen
Figure 5.10 Message box reporting completion of processing
Figure 5.11 Message box asking the user to confirm the close operation
Chapter 1: Introduction
1.1 Motivation
Directional hearing refers to the sensory ability that enables humans to
determine the location of a sound source in both azimuth (left/right) and elevation
(up/down). Sounds that give this sensation of direction are called “spatialized sounds”
[1]. The ability of humans to perceive the spatial location of sound in space is truly
remarkable. Not only does the auditory system localize sounds with extraordinary
accuracy, but the spatial hearing mechanisms are also exceptionally robust under the
most extreme listening conditions. To gain better control of spatial sound in various
applications, researchers from many fields have sought to understand how the human
auditory system accomplishes spatial hearing. Recently, advances in computational
power and acoustic measurement techniques have made it possible to
empirically measure, analyze, and synthesize the spectral cues of spatial hearing.
These spectral cues are captured by Head-Related Transfer Functions (HRTFs) [2].
Loosely speaking, HRTFs are filters that describe the acoustic filtering which the head,
torso, and external ear (pinna) perform on a sound; they can be used to simulate the
illusion of spatial sound over headphones. HRTF data sets presumably contain all of
the sonic cues affecting the perception of spatial location for a specific individual.
1.2 Objective of the work
I have completed a program that transforms a monaural sound into a
stereo one. A headphone listener can perceive the processed sound at any specified
location in space, such as front, back, right or left. I use head-related transfer
functions (HRTFs) to impart sound localization cues; the apparent direction of
incidence of the signal can be manipulated with them.
The HRTF measurements I used were produced by Bill Gardner and Keith Martin at
the MIT Media Lab [4] using a KEMAR (Knowles Electronics Manikin for
Auditory Research) dummy-head microphone [5]. The data is available for public
use. The tools and database for HRTF processing are described in detail below.
1.3 Scope of thesis
This thesis is structured as follows. The rest of Chapter 1 reviews relevant
theory, including the Head-Related Transfer Function (HRTF), the interaural transfer
function (ITF) and the interaural time difference (ITD) [6], and gives an overview of
the human ear. Chapter 2 covers the project implementation. First, I review the MIT
HRTF measurements and explain how the MIT laboratory obtained them.
Then, I summarize the work completed last year and state the system architecture
and requirements of my program. Chapter 3 describes the methodology: I explain the
architecture of the Matlab scripts in greater detail, for example by drawing the program
flow of hrtf.m, verti_sur.m, hori_sur.m, verti_final.m and hori_final.m.
In Chapter 4, I present some processed waveforms and spectrograms of
GloryBe.wav and point out some characteristics of HRTF-based sound.
Chapter 5 gives a user manual for the 3D simulation, and Chapter 6 provides
conclusions and future directions.
1.4 Details of relevant theory
1.4.1 HRTF's - Head Related Transfer Functions
The transformation of a sound wave from a source to the ear is normally
described by a transfer function called the head-related transfer function (HRTF). The
HRTF is a function of frequency and of the location of the source with respect to the
head; the source location is specified by its azimuth and elevation angles.
A complete set of HRTF measurements consists of many filters
that describe a spherical map of the possible sound sources. It contains information
about frequency-dependent sound delays and intensity differences between the ears.
When a signal is sent through an HRTF filter and then played through headphones,
the listener receives the impression of a sound source at the corresponding location.

Figure 1.1 Distribution of sound sources in the measurements
In the figure above, each point represents a sound source. The distance of each
source from the center of the head is constant; only the azimuth and elevation of the
source change. The differences in delay and intensity between the ears are greatest
for changes in azimuth and smaller for changes in elevation. These differences are
also frequency dependent.
Consider a sound source located at azimuth angle θ with respect to the
head. Let S(ω) be the Fourier transform of the source signal, and let HX(ω, θ) and
HY(ω, θ) be the head-related transfer functions for the left ear and right ear
respectively. The Fourier transforms of the left-ear and right-ear signals are then

X(ω, θ) = HX(ω, θ) S(ω)
Y(ω, θ) = HY(ω, θ) S(ω)

Next, we define

F(ω, θ) = HY(ω, θ) / HX(ω, θ)

F(ω, θ) is known as the interaural transfer function (ITF) [6]. The interaural
transfer function captures the important binaural cues. The interaural time difference
(ITD) [6] is captured in the phase of the ITF; more specifically, the
derivative of arg(F(ω, θ)) with respect to ω gives the ITD. Note that a
frequency-dependent HRTF makes the interaural time (phase) difference depend on
frequency. The ITF can be estimated by taking the ratio of the Fourier
transforms of the signals received at the left ear and the right ear:

F(ω, θ) = Y(ω, θ) / X(ω, θ)
It is important to note that F(ω, θ) is independent of the source spectrum and
could thus be used to find the direction of any wideband source. This observation can
be exploited to solve the sound localization problem using a priori information.
Suppose the actual interaural transfer function of the head, F(ω, θ), is known a priori;
this information is obtained from a training process. Later, in order to estimate the
direction of an unknown source signal, one can estimate the interaural transfer
function of the head from the received signals using F(ω, θ) = Y(ω, θ)/X(ω, θ) and
compare it with the known functions. The value of θ whose known interaural transfer
function best matches the estimated one gives the direction of the source.
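The estimation procedure above can be sketched as follows. This is a minimal illustration in Python/NumPy rather than the project's MATLAB code; the function name and the toy signals are my own.

```python
import numpy as np

def estimate_itf(left, right, n_fft=128, eps=1e-12):
    """Estimate the interaural transfer function F(w) = Y(w)/X(w)
    from the left-ear and right-ear time-domain signals."""
    X = np.fft.rfft(left, n_fft)    # left-ear spectrum X(w)
    Y = np.fft.rfft(right, n_fft)   # right-ear spectrum Y(w)
    return Y / (X + eps)            # the ratio cancels the source spectrum S(w)

# Toy check: if the right-ear signal is a pure 3-sample delay of the
# left-ear signal, arg(F) = -3w, so the phase slope recovers the ITD.
left = np.zeros(64); left[0] = 1.0    # impulse = wideband source
right = np.zeros(64); right[3] = 1.0  # same impulse, delayed 3 samples
F = estimate_itf(left, right)
w = np.fft.rfftfreq(128) * 2 * np.pi  # angular frequency per bin
itd = -np.polyfit(w, np.unwrap(np.angle(F)), 1)[0]
print(round(itd, 3))   # 3.0 samples of interaural delay
```

As the check shows, fitting a line to the unwrapped ITF phase recovers the interaural delay regardless of the source spectrum.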
HRTFs are created from ITD, IID and pinna measurements associated with a
certain azimuth and elevation, and possibly distance. HRTF parameters can be
measured using either real human heads or dummy heads: tiny microphones are
inserted into the ear canals, and a recording is made of a sound source from many
different azimuths and elevations. Each measurement represents one sound source
direction.
HRTFs provide insight into localization cues. For example, inspection of HRTFs for
sounds coming directly from the right at different elevations would show the effects of
pinna filtering. The notches in the filtered frequency response would change in
number and frequency position with different elevations as direct and indirect sounds
combine in the ear. This particular data would help reveal how we detect elevation.
In addition, HRTF processing can be used to trick the ear. For example, a sound
from a stationary speaker at an arbitrary elevation can be processed by a series of
HRTFs; as its frequency response is gradually changed, the ear and brain perceive a
changing elevation from the stationary source. Similarly, complex surround sound
fields can be synthesized from stereo speakers, but the ear isn't always entirely fooled.
The realism of the created sound field will suffer if the listener moves from the "sweet
spot" between the speakers (headphones solve this problem) because the delicate
balance of cues will be upset. Moreover, the folds of each listener's pinna are different,
so generic HRTFs do not exactly match our own psychoacoustic expectations. PC
sound cards use cross-talk cancellation techniques to improve the perceived channel
separation and clean up the 3D sound image over two speakers, but headphones,
with their superior real channel separation, deliver a better two-channel experience.
This type of two-channel sound field rendering is called binaural rendering.
The human ear's purpose in hearing is to convert sound waves into nerve
impulses, which are then perceived and interpreted by the brain as sound. The human
ear can perceive sounds in the range of 20 to 20,000 Hz. This section gives a basic
overview of the ear: how the ear receives sound, how the ears communicate with the
brain, and finally some human factors. Understanding how the ear works is the key to
successfully implementing 3D sound in VR systems.
1.4.2 Overview of the human ear
The human ear is made of three distinct areas: the outer ear, the middle ear,
and the inner ear. The outer ear channels sound waves through the ear canal to the
eardrum.
The outer ear (pinna) provides important auditory cues captured in HRTF
generation. The eardrum is a thin membrane stretched across the inner end of the
canal. HRTF generation uses small microphones embedded in a real ear canal to help
capture the real acoustic environment. Air pressure changes in the ear canal cause
the thin membrane to vibrate. These vibrations are transmitted to three small bones
called "ossicles", located in the air-filled middle ear, which conduct the vibrations
across the middle ear to another thin membrane called the oval window. Embedding
a small microphone in the ear canal alters these vibrations, so this method of HRTF
measurement cannot simulate the acoustic environment of the human ear exactly:
the microphone only approximates the real thing, because the frequency response,
position, and the reflection and refraction of sound waves are all changed. The
fundamental problem is trying to make measurements without affecting the
measurements. The oval window separates the middle ear from the fluid-filled inner
ear. The effect that the fluid-filled inner ear has on sound transmission is not modeled
at all in HRTF generation; in fact, HRTF generation basically attempts to measure the
effects of external structures and the immediate result inside the ear canal. Everything
that happens from the middle ear onward is not modeled at all.
The "cochlea" in the inner ear is the most important component of hearing. It
contains the organ of "Corti", which sits on an extremely sensitive membrane called
the "basilar membrane". Whenever the basilar membrane vibrates, small sensory hair
cells inside the organ of Corti are bent, which stimulates the sending of nerve impulses
to the brain.
Chapter 2 Project Implementation
2.1 Review of MIT HRTF measurements
For this project, I used HRTF measurements compiled by researchers at the
MIT Media Lab. They have made three types of data set publicly
available. The "compact data set" is a set of impulse responses that have
been preprocessed to compensate for the recording equipment's response and other
factors, and is ready to be used directly. The "full data set" contains the raw responses
as recorded. I prefer the compact data set for several reasons: the
data is equalized to compensate for the non-uniform response of the Optimum Pro 7
speaker, and its FIR filters have 128 taps rather than the 512 taps of the full data set.
Finally, the diffuse-field data set compensates for the recording equipment's response
in a way that emphasizes the direction-dependent differences.
The MIT researchers used a manikin, named KEMAR, in their experiments. They
set up microphones in its ear canals and played sounds from different locations using
a speaker about 1.4 meters from the head. Responses from a total of 710 different
locations were recorded at a sampling frequency of 44.1 kHz, in an anechoic room.
Utilizing the symmetry of the head, the KEMAR is set up with two
different pinnae: the left pinna is a normal one, while the
right pinna is slightly larger. The microphones in the KEMAR's
ears also pick up the ear canal resonance of the manikin's ears during recording,
so when these HRTFs are used to generate sound, the listener will hear the
KEMAR ear canal resonance in addition to his own. Besides that,
as mentioned earlier, the full data set contains the recording system's response too. A
possible way to eliminate the measurement system's response, as well as the effect of
ear canal resonance, is to normalize the measurements with respect to an average
across all directions (called diffuse-field equalization). Since neither the measurement
system response nor the ear canal response changes as a function of sound direction,
they are factored out of the data. To find the diffuse-field data, the magnitude-squared
responses of all measurements are averaged, which gives the power average across
all directions. My project uses the diffuse-field data: the pre-computed inverse
diffuse-field response is applied to the sound before playback. In general, sounds
synthesized using diffuse-field data can be localized better. The option is provided so
that users can evaluate its effect themselves.
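The diffuse-field averaging described above can be sketched as follows. This is an illustrative Python/NumPy version, not the project's MATLAB code; the (n_directions, n_taps) array layout is an assumption made for the sketch.

```python
import numpy as np

def diffuse_field_response(hrirs, n_fft=512):
    """Power-average the magnitude responses of a set of HRIRs across
    all measured directions: sqrt(mean(|H|^2)) per frequency bin."""
    mags_sq = np.abs(np.fft.rfft(hrirs, n_fft, axis=1)) ** 2
    return np.sqrt(mags_sq.mean(axis=0))

def diffuse_field_equalize(H_mag, df_mag, eps=1e-12):
    """Divide out the direction-independent (diffuse-field) component,
    e.g. the measurement system and ear-canal responses."""
    return H_mag / (df_mag + eps)

# Sanity check: if every direction had the same response h, the
# diffuse-field average would be exactly |H(w)|, and equalizing any
# one response against it would give all ones.
h = np.array([1.0, 0.5, -0.25, 0.125])
stack = np.tile(h, (5, 1))     # five identical "directions"
df = diffuse_field_response(stack)
eq = diffuse_field_equalize(np.abs(np.fft.rfft(h, 512)), df)
print(np.allclose(df, np.abs(np.fft.rfft(h, 512))))  # True
```

Because the averaged component is direction-independent, dividing it out leaves only the direction-dependent part of each measurement, which is the point of diffuse-field equalization.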
2.1.1 Measurements Data
The spherical space around the KEMAR is sampled at elevations from -40
degrees (40 degrees below the horizontal plane) to +90 degrees (directly overhead).
At each elevation, a full 360 degrees of azimuth is sampled in equal sized increments.
The increment sizes are chosen to maintain approximately 5-degree great-circle
increments. The table below shows the number of samples and azimuth increment at
each elevation (all angles in degrees). A total of 710 locations are sampled.
elevation angle    no. of measurements    azimuth increment
-40                56                     6.43
-30                60                     6
-20                72                     5
-10                72                     5
  0                72                     5
 10                72                     5
 20                72                     5
 30                60                     6
 40                56                     6.43
 50                45                     8
 60                36                     10
 70                24                     15
 80                12                     30
 90                 1                     n/a
Table 2.1: Number of measurements and azimuth increment at each elevation
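The increments in Table 2.1 are simply 360° divided by the number of azimuth samples at that elevation, and the sample counts sum to the 710 locations stated above. A quick check (my own illustration, not part of the project code):

```python
# Azimuth sample counts at each elevation, taken from Table 2.1.
samples = {-40: 56, -30: 60, -20: 72, -10: 72, 0: 72, 10: 72, 20: 72,
           30: 60, 40: 56, 50: 45, 60: 36, 70: 24, 80: 12}

# Increment = 360 / number of samples (e.g. 360/56 = 6.43 at -40 degrees).
increments = {elev: round(360 / n, 2) for elev, n in samples.items()}
print(increments[-40], increments[50])   # 6.43 8.0

# Total locations: all rows plus the single sample directly overhead (+90).
total = sum(samples.values()) + 1
print(total)   # 710
```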
Each HRTF measurement yields an impulse response at a 44.1 kHz sampling rate;
most of this data is irrelevant. The impulse responses are stored as 16-bit signed
integers, with the most significant byte stored at the low address (i.e. Motorola 68000
byte order). The HRTF data is stored in directories by elevation. Each directory name
has the format ``elevEE'', where EE is the elevation angle. Within each directory, each
filename has the format ``HEEeAAAa.dat'', where EE is the elevation angle of the
source in degrees, from -40 to 90, and AAA is the azimuth of the source in degrees,
from 0 to 355. Elevation and azimuth angles indicate the location of the source relative
to the KEMAR: elevation 0, azimuth 0 is directly in front of the KEMAR;
elevation 90 is directly above; elevation 0, azimuth 90 is directly to the
right, and so on. For example, the file ``H-20e270a.dat'' is the right ear
response, with the source 20 degrees below the horizontal plane and 90 degrees to
the left of the head. Note that three digits are always given for the azimuth so that the
files appear in sorted order in each directory. To select a pair of HRTF responses, I
recommend using symmetrical responses obtained from one of the KEMAR ears.
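The naming convention and byte order just described can be handled as in the sketch below. This is Python rather than the project's MATLAB readhrtf.m/hrtfpath.m, and the assumption that left- and right-ear samples are interleaved in the compact-set files should be verified against the actual data.

```python
import struct

def hrtf_path(elev, azim):
    """Build the 'elevEE/HEEeAAAa.dat' path for a source at the given
    elevation and azimuth in degrees (azimuth always three digits)."""
    return "elev%d/H%de%03da.dat" % (elev, elev, azim)

def read_hrir(path):
    """Read 16-bit signed big-endian samples (Motorola 68000 byte
    order) and split them into left and right channels, assuming the
    two ears' samples are interleaved."""
    with open(path, "rb") as f:
        raw = f.read()
    n = len(raw) // 2
    samples = struct.unpack(">%dh" % n, raw[:2 * n])
    return samples[0::2], samples[1::2]   # (left, right)

print(hrtf_path(-20, 270))   # elev-20/H-20e270a.dat
print(hrtf_path(0, 5))       # elev0/H0e005a.dat
```

The `>` in the struct format string forces big-endian interpretation, which matters on little-endian PCs.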
2.2 Review of past work
I completed stage one of the project last year, writing custom scripts that
directly convolve an input file (in *.wav format) with a selected head-related transfer
function (HRTF) to simulate a source incident at a particular azimuth and
elevation. A stereo output sound is produced for binaural presentation to the subject.
HRTF processing and signal mixing are implemented in one MATLAB [3] script,
Stereo.m.
Stereo.m is invoked from the MATLAB command line, optionally with
parameter values in the function call. If no parameters are provided,
the script sets most parameters to default values (e.g., the default azimuth and
elevation angles are both 0 degrees). Stereo.m opens a file requester asking the user
to specify the input wave file. All wave files contained in the same directory as the
specified file are then processed, and the results are saved in a subdirectory
named HRTF. If not specified in the function call, the first time a stereo input file is
encountered the user is asked to select which side to process (the input
signal x(n) must be monophonic).
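The core operation of Stereo.m — convolving a mono signal with a left/right HRIR pair to produce a binaural stereo signal — can be sketched as follows. This is a Python/NumPy stand-in with toy HRIRs, not the original MATLAB script.

```python
import numpy as np

def spatialize(mono, hrir_l, hrir_r):
    """Convolve a monophonic signal with a left/right HRIR pair and
    return an (n, 2) stereo array, normalized to avoid clipping."""
    left = np.convolve(mono, hrir_l)
    right = np.convolve(mono, hrir_r)
    out = np.stack([left, right], axis=1)
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out

# Toy HRIR pair: the right ear hears a delayed, attenuated copy --
# a crude stand-in for a source on the listener's left.
hrir_l = np.array([1.0, 0.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.5, 0.0])
mono = np.array([1.0, -1.0, 0.5])
stereo = spatialize(mono, hrir_l, hrir_r)
print(stereo.shape)   # (6, 2): len(mono) + len(hrir) - 1 samples, 2 channels
```

A real pair of measured HRIRs would simply replace the toy arrays; the delay and level differences between the two channels are what create the localization cues.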
2.3 Introduction of proposed work
This semester, I added more functions and a graphical user interface to the
program. The GUI enhances the user-friendliness of the project. The user sets the
input directory, input wave file, output wave file, and the surround surface
(horizontal or vertical HRTF). When the user presses the horizontal or vertical button,
a pop-up window asks for the range of azimuth and elevation values. When these
pop-up windows are closed, the parameters are passed back to hrtf.m through
callbacks. The user then presses the “Process” button, which calls either hori_final.m
or verti_final.m to perform the HRTF simulation. After trimming the ends, the sound is
written to the output file.
The usual way to play sound data in MATLAB is the sound(vector, fs) function
[3]. I opted instead to use wavplay(), a freely available Matlab function that is better
suited to playing sounds on PCs. An added advantage of wavplay() is that it waits for
the sound port to be released if it is already busy. The user can play the sound by
pressing “Playback” in the GUI window, or with other application tools by just clicking
the file name.
2.4 System Architecture and Requirements
2.4.1 Hardware
Pentium, Pentium Pro, Pentium II, Pentium III, Pentium IV based personal
computer
64 MB RAM minimum, 128 MB RAM recommended
Microsoft Windows supported graphics accelerator card and sound card
Headphones, Microphone
2.4.2 Software
Microsoft Windows 95, Windows 98, Windows ME, Windows NT 4.0 (with SP5 or
SP6a), Windows 2000, Windows XP
Matlab 6
Chapter 3 Methodology
3.1 Matlab Scripts in Details
There are thirteen Matlab scripts in my project. Some are used in the interface,
while the others are used in the HRTF convolution. The following table states their
names and objectives:
Scripts Name Objectives
hrtf.m This script calls up the main GUI window
to retrieve data from the user and transfer
the data to other sub-functions.
verti_sur.m This script calls up the vertical HRTF
window when the user presses the “Vertical
Surround” button in the main window. It
retrieves data such as the range of
azimuth or elevation angles, and returns
the data to hrtf.m.
hori_sur.m This script calls up the horizontal HRTF
window after the user presses the “Horizontal
Surround” button in the main window. It
retrieves data such as the range of
azimuth or elevation angles, and returns
them to hrtf.m.
modaldlg.m This script calls up the “closing confirm
window” after the user presses the “Close”
button in the main window.
verti_final.m This script is called by hrtf.m after the user
has pressed the “Process” button and
chosen the vertical HRTF. It also calls
other scripts to activate the HRTF
processing.
hori_final.m This script is called by hrtf.m after a user
has pressed the “Process” button and
chosen the horizontal HRTF. It also calls
the other scripts to activate HRTF-
processing.
readhrtf.m This script returns HRTF measurements as
two columns of 128 rows. The first column
is the left channel and the second column is
the right channel.
hrtfpath.m This script returns the pathname of an HRTF
data file to readhrtf.m.
group_file.m This script groups a sound clip into
stereo channels, left and right.
par_ser.m This script divides the mono sound clip into
parts (blocks) of clips.
half_circle.m This script is called by hori_final.m when
the user chooses 180° surrounding.
cir_con.m This script performs the circular
convolution.
cirsft.m This script is called by cir_con.m.
Table 3.1 The Matlab Scripts in my project and their objectives
3.1.1 Interface codes
3.1.1.1 hrtf.m
The hrtf.m script contains sub-functions that launch and control the GUI and its callbacks.
Each callback refers to a relevant object stored in hrtf.fig. It implements the
HRTF synthesis by retrieving data from the user and transferring it to the other scripts
for processing. The pseudo-code of hrtf.m is attached in the Appendix.
Figure 3.1 of hrtf.fig in GUIDE
The following are descriptions of the sub-functions in hrtf.m:
Sub-function Name Link with which object in
hrtf.fig
Function
listbox1_Callback “Listed File” list box It uses a list box to display
the files in a directory. When the user double-clicks
a list item, one of the following happens:
If the item is a file, the GUI opens the file appropriately
for its file type.
If the item is a directory, the GUI reads the contents of
that directory into the list box.
If the item is a single dot (.), the GUI updates the display
of the current directory.
If the item is two dots (..), the GUI changes to the
directory one level up and populates the list box with
the contents of that directory.
load_listbox X This sub-function retrieves
the path of a directory and takes the handles structure
from listbox1_Callback as its input arguments.
selectedfile_Callback “Input File Name”
edit box
This function displays
the input file name.
outfile_Callback “Output File Name”
edit box
This function displays
the output file name.
savefile_name_Callback “Choose Output File
Name”
push button
This function prompts a
dialog box for the user to
enter the output file name.
hori_button_Callback “Horizontal Surround”
push button
This function calls
hori_sur.m and returns its
parameters when the button
is pressed.
verti_button_Callback “Vertical Surround“
push button
This function calls
verti_sur.m and returns its
parameters when the button
is pressed.
initial_button_Callback “Reset” push button This function initializes
all the states set by
hori_button_Callback and
verti_button_Callback.
processing_Callback “Process” push button This function generates a
call to process the HRTF
synthesis.
playback_Callback “Play” push button This function plays the
HRTF sound by calling the
wavplay function.
close_Callback “Close” push button This function calls
modaldlg.m.
figure1_CloseRequestFcn X This function calls
close_Callback.
Table 3.2 sub-functions in hrtf.m of my project and their objectives
3.1.1.2 verti_sur.m
It is another script that contains functions to launch and control the GUI and
its callbacks. The pseudo-code is in the Appendix.
Figure 3.2 of verti_sur.fig in GUIDE
There are two special functions, “uiwait” and “uiresume”, which stop and
resume MATLAB program execution. In the dialog, a uicontrol callback calls
uiresume and destroys the dialog box. The “uiwait” function is a convenient wrapper
for the waitfor command, typically used in conjunction with a dialog box. It
blocks the execution of the M-file that created the dialog until the
user responds to the dialog box. When used in conjunction with a modal dialog,
“uiwait” causes the MATLAB program to wait before returning execution to the Close
button callback.
The following are descriptions of the sub-functions of verti_sur.m:
Sub-function Name Link with which object
in verti_sur.fig
Function
verti_azim_radio_Callback
“Specify azimuth
angle” radio button
This function calls another
function, mutual_exclude,
which deselects
verti_semi_radio_Callback,
the other member of the
group of two radio buttons.
azim_angle_popup_Callback
“Specify azimuth
angle” popup menu
This function prompts a
list of azimuth angles when
the user presses the arrow.
verti_semi_radio_Callback
“Semi_Circle azimuth
angle” radio button
This function calls another
function, mutual_exclude,
which deselects
verti_azim_radio_Callback,
the other member of the
group of two radio buttons.
sc_azim_angle_popup_Callback
“Semi_Circle azimuth
angle” popup menu
This function prompts a
list of azimuth angles when
the user presses the arrow.
verti_reset_button_Callback “Reset” push button This function resets all
handle objects in
“verti_sur.fig”.
mutual_exclude X This function ensures that
all other buttons in the
group are deselected when
one of them is selected and
active.
verti_close_button_Callback “Close” push button This function resumes
program execution and
then runs delete(fig).
load_picture X This function displays
the path of the
HRTF sound.
Show_pic_Callback “UpdatePic” push
button
This function calls the
load_picture function.
Table 3.3 sub-functions in verti_sur.m of my project and their objectives
3.1.1.3 hori_sur.m
Figure 3.3 of hori_sur.fig in GUIDE
The following are descriptions of the sub-functions in hori_sur.m:
Sub-function Name Link with which object in
hori_sur.fig
Function
hori_end_azim_Callback “End Azimuth Degree”
popup menu
This function displays a list
of end azimuth angles when
the user presses the arrow.
hori_elevslider_Callback “hori_elevslider” slider This function uses a slider
to specify the elevation angle,
since sliders allow the
selection of continuous
values within a specified
range.
hori_elev_Callback “Selected Elevation Degree“
edit box
This function displays the
selected elevation degree.
hori_close_button_Callback “Close” push button This function resumes the
program execution of
figure1 and then runs
delete(fig).
load_picture X This function displays
the path of the
HRTF sound.
show_Pic_Callback “UpdatePic” push button This function calls the
load_picture function.
Table 3.4 sub-functions in hori_sur.m of my project and their objectives
3.1.2 HRTF codes for verti_final.m and hori_final.m
These two scripts take a monophonic input signal x(n) from a wave file
and convolve it with the appropriate pairs of HRTFs to produce the resulting signal,
presented binaurally. Each checks and receives parameters from hrtf.m, then lists the
measurements used in the processing and convolves them with the input signal.
Figure 3.4 Flow chart of hori_final.m and verti_final.m. The flow is: Start; check the
parameters received from the interface windows; check the number of measurements in
the chosen directory; determine and set the processing angles; read the sound into a
double array; ensure the sound frequency is 44.1 kHz; set J = 1; call the sub-function
cir_con for circular convolution; if J is not greater than the number of measurements,
set J = J + 1 and repeat the convolution; otherwise, write the output HRTF sound; End.
Pseudo-Code of verti_final.m %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_final(verti_choice,verti_val,filename,outfile_name)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Check parameters (verti_choice & verti_val) and then,
Set (azim_choice, azimangle, verti_gp_index, choose_index)
Set azim_dir=sprintf('%s', azim_choice)
Set directory to the location of the MIT-lab-contributed data files
Set the wave files in the selected directory into the matrix (filelist)
Set num_file = Number of files in the selected directory
%%
LOOP FOR i= 1 to num_file by 1
Set len=length(filelist(i).name)
Set elevangle(i)=str2num(filelist(i).name(2:4))
END OF FOR LOOP
Read the wave file into [input,fs,nbits] using wavread
Set newlength=num_file*fix(length(input)/num_file)
Set input=input(1:newlength)
Set zero padding number(n_z)=128-1
Set n_c=(newlength+(n_z*num_file))/num_file
Set x1=par_ser(input',128,n_c,num_file)
[m,n]=size(x1)
LOOP FOR k=1 to m by 1
Call the subfunction hrtf = readhrtf( elevangle(k), azimangle,choose_index )
Set left_rep=hrtf(:,1)
Set right_rep=hrtf(:,2)
Set len_imp=length(left_rep)
%% Do block circular convolution
Set x2=x1(k,:)
Call the subfunction left_blk_con(k,:)=cir_con(n_c,x2,left_rep')
Call the subfunction right_blk_con(k,:)=cir_con(n_c,x2,right_rep')
Set left_blk_lap(k,:)=left_blk_con(k,n_z+1:n_c)
Set right_blk_lap(k,:)=right_blk_con(k,n_z+1:n_c)
END OF FOR LOOP
Call the subfunction group_file
Retrieve data from group_file
Create the output wave file
Pseudo-code of hori_final.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction hori_final(start_angle,end_angle,elevangle,filename,outfile_name)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Set elevangle=10*(round(elevangle/10))
Set choose_index=1
Set elev_dir=sprintf('%s%d','elev', elevangle)
Set the MIT-lab-contributed data files of the selected directory into the matrix (filelist)
IF start_angle=0 & end_angle=360, then
Set num_file=Number of files in the selected directory
ELSEIF start_angle=0 & end_angle=180
Call Subfunction [num_file]= half_circle(elevangle,filelist);
ELSE
Error_Message='Wrong number of input arguments','Input Argument Error'
ENDIF start_angle=0 & end_angle=360
LOOP FOR i= 1 to num_file by 1
Set len=length(filelist(i).name)
Set azimangle(i)=str2num(filelist(i).name(len-7:len-5))
END OF FOR LOOP
Read the wave file into [input,fs,nbits] using wavread
Set newlength=num_file*fix(length(input)/num_file)
Set input=input(1:newlength)
Set zero padding number(n_z)=128-1
Set n_c=(newlength+(n_z*num_file))/num_file
Set x1=par_ser(input',128,n_c,num_file)
[m,n]=size(x1)
LOOP FOR k=1 to m by 1
Call the subfunction hrtf = readhrtf( elevangle, azimangle(k),choose_index )
Set left_rep=hrtf(:,1) and right_rep=hrtf(:,2)
Set len_imp=length(left_rep)
Set x2=x1(k,:)
Call the subfunction left_blk_con(k,:)=cir_con(n_c,x2,left_rep')
Call the subfunction right_blk_con(k,:)=cir_con(n_c,x2,right_rep')
Set left_blk_lap(k,:)=left_blk_con(k,n_z+1:n_c)
Set right_blk_lap(k,:)=right_blk_con(k,n_z+1:n_c)
END OF FOR LOOP
Call the subfunction group_file to retrieve data to create hrtf wave file
Figure 3.5 Pseudo-codes of verti_final.m and hori_final.m
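The flow that these two pseudo-codes share can be sketched as follows. This is an illustrative Python sketch, not the thesis code: the mono input is cut into equal blocks (the role of par_ser.m), each block is convolved with the left and right impulse responses for its angle, and the per-block results are concatenated into the two channels. The short filters below stand in for the 128-tap MIT measurements and are purely hypothetical.

```python
def par_ser(signal, num_blocks):
    """Split a mono signal into num_blocks equal parts (role of par_ser.m)."""
    n = num_blocks * (len(signal) // num_blocks)   # trim, like newlength above
    blk = n // num_blocks
    return [signal[i * blk:(i + 1) * blk] for i in range(num_blocks)]

def convolve(x, h):
    """Direct linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def spatialize(signal, hrtf_pairs):
    """Convolve each block with its angle's (left, right) filter pair
    and concatenate the per-block results into two channels."""
    blocks = par_ser(signal, len(hrtf_pairs))
    left, right = [], []
    for block, (hl, hr) in zip(blocks, hrtf_pairs):
        left += convolve(block, hl)
        right += convolve(block, hr)
    return left, right
```

The thesis scripts additionally use circular rather than linear convolution per block and overlap the block boundaries; that refinement is described in the next subsection.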
In these two scripts, I call another script (cir_con.m) to do the circular convolution.
The cir_con.m script calls cirsft.m to cut the matrix to a suitable length and size.
Figure 3.6 Pseudo-codes of cir_con.m and cirsft.m
The convolution in cir_con.m uses the overlap-save method, which prevents the
prediction error at the beginning of each section of the clip. Processing the data by direct
Pseudo-Code of cir_con.m
%%%%%%%%%%%%%%%%%%%%%
Subfunction y=cir_con(n_c,x_in,cir_in)
%%%%%%%%%%%%%%%%%%%%%
Set x_inn=[x_in(1:length(x_in)) zeros(1,n_c-length(x_in))]
Set imp_in=[cir_in(1:length(cir_in)) zeros(1,n_c-length(cir_in))]
FlipMatrix(imp_in) to imp_cir1
Set m from 1 to n_c by 1
LOOP FOR k =1 to length of m by 1
Call subfunction imp_cir2=cirsft(m(k),n_c,imp_cir1)
Set y(k)=imp_cir2*x_inn'
END LOOP
Pseudo-Code of cirsft.m
%%%%%%%%%%%%%%%%%
Subfunction xx=cirsft(mm,nn,in1)
%%%%%%%%%%%%%%%%%
Set in2=in1(nn-rem(mm,nn)+1:nn)
Set xx=[in2 in1(1:nn-rem(mm,nn))]
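For illustration, the pair cirsft.m / cir_con.m can be mirrored in a few lines of Python. This is a hedged sketch of the same computation, not a transcription of the Matlab files: the filter is zero-padded to n_c points and time-reversed, and each output sample is the inner product of the input with a circular shift of the reversed filter, which is exactly an n_c-point circular convolution.

```python
def cirsft(m, seq):
    """Circularly shift seq to the right by m positions (role of cirsft.m)."""
    n = len(seq)
    m %= n
    return seq[n - m:] + seq[:n - m]

def cir_con(n_c, x_in, cir_in):
    """n_c-point circular convolution of x_in and cir_in (role of cir_con.m)."""
    x = list(x_in) + [0] * (n_c - len(x_in))      # zero-pad the input
    h = list(cir_in) + [0] * (n_c - len(cir_in))  # zero-pad the filter
    h_rev = h[::-1]                               # time-reverse the filter
    # y[k] is the inner product of x with the k-step circular shift of h_rev
    return [sum(a * b for a, b in zip(cirsft(k, h_rev), x))
            for k in range(1, n_c + 1)]
```

When n_c is at least len(x_in) + len(cir_in) - 1, this reduces to ordinary linear convolution; for example, cir_con(4, [1, 1], [1, 2]) gives [1, 3, 2, 0].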
convolution (the conv function in Matlab) would make the prediction error larger, because
the zero-valued signal is being predicted from at least some non-zero previous
samples. The conv function does not use the previous samples; it assumes zero values at
the beginning of each section of the clip. If the prediction error is too large, the difference
between the first sample of the current section and the last sample of the previous section
is very large, and a “tick-tack” sound appears at the junctions of sections after grouping. [8]
To minimize the difference between sections, the overlap-save method of circular
convolution is chosen. It requires that the input blocks overlap. The input blocks are then
circularly convolved with the impulse response. Because of the overlap redundancy at
the input, the circular artifact in the output (the first Nh-1 samples) can simply be
discarded. The following figure illustrates the overlap-save method. [9],[10]
Figure 3.7 Illustration of a sound sample and its prediction error, which is largest at the end of each section.
Figure 3.8 Block convolution using the overlap-save method. (a) input signal x(n) divided into
overlapping sections, overlap is Nh-1=2, (b) impulse response h(n), (c) output y(n) using direct
convolution, (d) output y1(n) for block circular convolution of x1(n) and h(n), (e) output y2(n), (f) output
y3(n), (g) output y4(n), and (h) sequential concatenation of block outputs after discarding the first two
samples of each block, which is equivalent to the direct convolution result. "|" represents
concatenation.
The following conditions are used in the overlap-save convolution:
The total number of blocks is K, the number of measurements in the specified directory.
The length of the FIR filter (measurements) is M (M = 128).
The length of one block of data is L (L > M). It is determined by
L = (length of input signal) / (number of measurements)
Each time, a block of data of length N = L + M - 1 is filtered.
The steps of this method are:
(1) Calculate the N-point DFT of x(n) and multiply it with the N-point DFT of h(n).
(2) Calculate the N-point IDFT and discard the first M-1 output samples. The last L samples are the desired filter output.
(3) Append the last M-1 samples of the block to the beginning of the new block of signal.
(4) Go to (1) until the end of the signal is reached.
Figure 3.9 Overlap-save method in this program: the signal is divided into blocks of length L, the
last M-1 samples of each block are saved and prepended to the next (the first block is prepended
with M-1 zeros), and the first M-1 samples of each output block are discarded.
The reason for discarding the first (M-1) output samples is that they are the
result of circular convolution; linear convolution starts at the Mth sample. For the
same reason, the last M-1 samples of each processed block are appended to the
beginning of the new block.
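The four steps above can be sketched in Python. This is a hedged, illustrative version: it uses a direct O(N²) circular convolution in place of the DFT/IDFT pair for readability (the result is identical), and all names are mine, not the thesis code's.

```python
def overlap_save(x, h, L):
    """Filter x with FIR h by overlap-save block convolution.

    Each length-N block (N = L + M - 1) carries the last M-1 samples of the
    previous block; after the N-point circular convolution, the first M-1
    output samples (the circular artifact) are discarded.
    """
    M = len(h)
    N = L + M - 1
    hp = list(h) + [0] * (N - M)          # filter zero-padded to N points
    x = [0] * (M - 1) + list(x)           # M-1 leading zeros for the first block
    y = []
    for start in range(0, len(x) - (M - 1), L):
        block = x[start:start + N]
        block += [0] * (N - len(block))   # zero-pad the final short block
        # N-point circular convolution (stands in for DFT * DFT, then IDFT)
        yc = [sum(block[m] * hp[(n - m) % N] for m in range(N))
              for n in range(N)]
        y.extend(yc[M - 1:])              # keep only the last L samples
    return y
```

Concatenating the kept samples reproduces the direct linear convolution, up to trailing zeros from padding the last block; for example, the first five samples of overlap_save([1, 2, 3, 4], [1, 1], 3) are [1, 3, 5, 7, 4].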
3.2 Input Parameters
The input to the system is a 16-bit mono wave file. This is also known as “CD
quality.” The wave data is in Pulse Code Modulation (PCM) format; each sample of
the signal is represented as a 16-bit signed integer. Provided functions open and read
the data, and the data format is assumed to be re-sampled at the beginning of the
function. The file is read into a dynamically allocated memory buffer.
3.3 Output Parameters
The processed file is a 16-bit stereo wave sound with the same sampling rate as
the input sound. The maximum supported HRTF length is 128 samples.
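For readers who want to reproduce the input handling outside Matlab, a minimal stdlib-only Python sketch of reading a 16-bit mono PCM wave file (the format described above) might look like this. The function name and checks are mine; the thesis itself uses Matlab's wavread instead.

```python
import struct
import wave

def read_mono_wave(path):
    """Read a 16-bit mono PCM wave file into signed-integer samples."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expected 16-bit samples"
        assert wf.getnchannels() == 1, "expected a mono file"
        fs = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # each sample is a little-endian 16-bit signed integer
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return samples, fs
```

The returned list plays the role of the dynamically allocated buffer described above, and fs carries the sampling rate needed for playback.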
Chapter 4 HRTF Results
The following figure is a monaural sound called “GloryBe.wav”. There are two
louder parts. Its frequency content is concentrated at 500-700 Hz in the spectrogram.
The waveform statistics:
Mono Sound
Min Sample Value: -31684
Max Sample Value: 32572
Peak Amplitude: -.05 dB
Possibly Clipped: 0
DC Offset: -.031
Minimum RMS Power: -52.83 dB
Maximum RMS Power: -7.27 dB
Average RMS Power: -18.51 dB
Total RMS Power: -15.76 dB
Using RMS Window of 50 ms
Figure 4.1 Graph of GloryBe.wav and this spectrogram
4.1 Comparison of the HRTF sound with common stereo sound
The following wave is created by HRTF synthesis; the sound comes from the
right-hand side of the origin.
The waveform statistics:
Left Right
Min Sample Value: -16053 -31226
Max Sample Value: 15432 28548
Peak Amplitude: -6.2 dB -.42 dB
Possibly Clipped: 0 0
DC Offset: .006 .006
Minimum RMS Power: -62.78 dB -59.29 dB
Maximum RMS Power: -13.57 dB -7.76 dB
Average RMS Power: -24.76 dB -19.29 dB
Total RMS Power: -22.04 dB -16.44 dB
Using RMS Window of 50 ms
Figure 4.2 Graph of GloryBe.wav after HRTF processing and its spectrogram
The following wave is created by an application (Cool Edit) that converts a
monaural sound to a stereo sound. This stereo wave has waveform statistics similar to
those of the HRTF sound:
Left Right
Min Sample Value: -15955 -31910
Max Sample Value: 16383 32767
Peak Amplitude: -6.02 dB 0 dB
Possibly Clipped: 0 7
DC Offset: -.016 -.031
Minimum RMS Power: -59.17 dB -53.16 dB
Maximum RMS Power: -13.28 dB -7.26 dB
Average RMS Power: -24.52 dB -18.5 dB
Total RMS Power: -21.78 dB -15.76 dB
Using RMS Window of 50 ms
Figure 4.3 Simulation of stereo sound of GloryBe.wav by one application and its spectrogram
Although these two graphs seem to have similar waveforms and spectrograms, they
can be distinguished by listening with headphones. The HRTF-based sound comes from the
right of the origin, but the stereo sound created by the other application has no explicit direction.
From the spectrograms, I conclude that the frequencies of the sound
wave in both cases were between 500-1000 Hz, where the green or red colour was. This is
similar to the original mono sound.
Figure 4.4 3D surface of the GloryBe.wav in frequency expression
4.2 Wave Statistics from Different Conditions
I use “GloryBe.wav” for the analysis.
a) The waveform statistics and wave structure of the
HRTF sound flowing from the front to the back:
Left Right
Min Sample Value: -14171 -32768
Max Sample Value: 14284 32767
Peak Amplitude: -7.21 dB 0 dB
Possibly Clipped: 0 90
DC Offset: .009 .008
Minimum RMS Power: -56.95 dB -51.47 dB
Maximum RMS Power: -12.58 dB -3.84 dB
Average RMS Power: -25.54 dB -16.22 dB
Total RMS Power: -23.4 dB -13.05 dB
Using RMS Window of 50 ms
Figure 4.5
b) The waveform statistics and wave structure of the HRTF sound flowing in a cycle (Figure 4.6):
Left Right
Min Sample Value: -16696 -32768
Max Sample Value: 15023 32767
Peak Amplitude: -5.86 dB 0 dB
Possibly Clipped 0 7
DC Offset: .009 .009
Minimum RMS Power: -54 dB -57.62 dB
Maximum RMS Power: -12.88 dB -4.19 dB
Average RMS Power: -24.09 dB -18.26 dB
Total RMS Power: -22.56 dB -15.03 dB
Using RMS Window of 50 ms
Figure 4.7 Spectrograms of the wave in Figure 4.6; the top is at elevation 50°
and the bottom is at elevation 0°
From these two spectrograms, one point can be stated: the energies
of the two channels were equivalent when the elevation was 90°, and larger energy
occurred as the elevation tended to 0°.
c) The waveform statistics and wave structure of HRTF-sound flowing in the front from -40° to 90°
Left Right
Min Sample Value: -29707 -29707
Max Sample Value: 31153 31153
Peak Amplitude: -.44 dB -.44 dB
Possibly Clipped: 0 0
DC Offset: .01 .01
Minimum RMS Power: -56.08 dB -56.08 dB
Maximum RMS Power: -7.17 dB -7.17 dB
Average RMS Power: -19.31 dB -19.31 dB
Total RMS Power: -16.45 dB -16.45 dB
Using RMS Window of 50 ms
Figure 4.8
d) The waveform statistics and wave structure of HRTF-sound flowing in the right from -40° to 90°
Left Right
Min Sample Value: -13013 -32569
Max Sample Value: 16976 32767
Peak Amplitude: -5.71 dB 0 dB
Possibly Clipped: 0 12
DC Offset: .009 .009
Minimum RMS Power: -57.34 dB -53.16 dB
Maximum RMS Power: -13.14 dB -5.26 dB
Average RMS Power: -25.78 dB -16.97 dB
Total RMS Power: -22.98 dB -14.16 dB
Using RMS Window of 50 ms
Figure 4.9
e) The waveform statistics and wave structure of the HRTF sound flowing in the back from -40° to 90°
Left Right
Min Sample Value: -23260 -23260
Max Sample Value: 25273 25273
Peak Amplitude: -2.26 dB -2.26 dB
Possibly Clipped: 0 0
DC Offset: .009 .009
Minimum RMS Power: -56.22 dB -56.22 dB
Maximum RMS Power: -7.82 dB -7.82 dB
Average RMS Power: -20.56 dB -20.57 dB
Total RMS Power: -17.92 dB -17.92 dB
Using RMS Window of 50 ms
Figure 4.10
f) The waveform statistics and wave structure
of HRTF-sound flowing in the left from -40° to 90°
Left Right
Min Sample Value: -32569 -13013
Max Sample Value: 32767 16976
Peak Amplitude: 0 dB -5.71 dB
Possibly Clipped: 12 0
DC Offset: .009 .009
Minimum RMS Power: -53.16 dB -57.34 dB
Maximum RMS Power: -5.26 dB -13.14 dB
Average RMS Power: -16.97 dB -25.78 dB
Total RMS Power: -14.16 dB -22.98 dB
Using RMS Window of 50 ms
Figure 4.11
g) The waveform statistics and wave structure of the HRTF sound flowing from the front to the back over the top of the head
Left Right
Min Sample Value: -23934 -23934
Max Sample Value: 27461 27461
Peak Amplitude: -1.53 dB -1.53 dB
Possibly Clipped: 0 0
DC Offset: .01 .01
Minimum RMS Power: -56.25 dB -56.25 dB
Maximum RMS Power: -7.69 dB -7.69 dB
Average RMS Power: -19.92 dB -19.93 dB
Total RMS Power: -17.1 dB -17.11 dB
Using RMS Window of 50 ms
Figure 4.12
h) The waveform statistics and wave structure
of the HRTF sound flowing from the right to the left over the top of the head
Left Right
Min Sample Value: -19262 -32493
Max Sample Value: 19764 32767
Peak Amplitude: -4.39 dB 0 dB
Possibly Clipped: 0 5
DC Offset: .01 .009
Minimum RMS Power: -54.88 dB -57.37 dB
Maximum RMS Power: -10.1 dB -5.05 dB
Average RMS Power: -22.68 dB -18.16 dB
Total RMS Power: -20.63 dB -14.92 dB
Using RMS Window of 50 ms
Figure 4.13
From the above figures, I can conclude that:
The amplitudes of the HRTF wave depend on the direction of flow of the wave. If the wave
flows on the left side, the left channel has a larger amplitude than the right
channel, and vice versa.
If the HRTF sound comes from the front or the back, the left and right channels have
the same wave amplitude.
Sounds processed to sound as though they originate from the front of a listener
may actually sound as though they originate from behind the listener.
The synthesis of sounds with non-zero elevations behaves similarly.
Chapter 5 User Manual of the 3D Simulation
All the scripts must run in Matlab version 6.0 or later. First, the user must open Matlab. In this example, the working directory is “d:\hrtf”.
To run the GUI script, the user should type “HRTF”, and a window then pops up.
Figure 5.1 Matlab 6.0 first page
Figure 5.2 Initial “HRTF based surround sound “ window
The “HRTF based Surround Sound Window” is designed with GUIDE. The names and descriptions of the items in this window are as follows:
1) It shows the current working directory. In this case, the path is “D:\hrtf”.
2) This list box shows all directories and files located in the working directory.
3) This shows the input file name when the user double-clicks a file name in the list box.
4) This button determines the output file name; the user inputs the file name through a prompt window.
5) This button is used to prompt another window, “Setting Horizontal Surround Sound Parameters”.
6) This button is used to prompt another window, “Setting Vertical Surround Sound Parameters”.
7) This button resets all parameters stored in memory, so the user can input choices again.
8) This button processes all parameters and outputs an HRTF-based sound.
9) This button plays the HRTF-based sound.
10) This button calls up a closing window when pressed.
5.1 Setting the input and output file name:
To choose the input file, the user only needs to press the file in the list box. If the user presses the “Choose Output File Name” button, a pop-up window asks the user to choose the output file name. The default name is “3D.wav”.
Figure 5.3 Pop-up window for choosing the output file name
After choosing the input and output names, the HRTF-based surround sound window appears as in the following figure. The input file name is “GloryBe.wav” and the output file name is “3D.wav”.
5.2 Setting Parameters of Horizontal Surround
After pressing the “Horizontal Surround” button, the following window is called:
Figure 5.4 “HRTF based surround sound” after choosing the input and output file names
Figure 5.5 Initial window of “Setting Horizontal Surround Sound Parameters”
The names and descriptions of the items in this window are as follows:
1, 2) These parts set the starting and ending azimuth angles. The starting angle is 0° and the ending angle is 180° or 360°.
3) The slider starts at -40° and ends at 90°. Each click of the bar increases the angle by 10°.
4) This part is updated by the above slider.
5) This area holds a picture that describes the path of the sound flow.
6) When the user presses the “Route of Wave” button, it updates the above picture.
7) The user presses this “Close” button to close this window and return the parameters to the “HRTF-based Surround Sound Window”.
If the user chooses an azimuth angle starting at 0° and ending at 360° and an elevation angle of 30°, and presses the “Route of Wave” button, the following window is displayed:
5.3 Setting Parameters of Vertical Surround
After pressing the “Vertical Surround” button, the following window is called:
Figure 5.6 Window of “Setting Horizontal Surround Sound Parameters” after setting parameters
Figure 5.7 Initial window of “Setting Vertical Surround Sound Parameters”
The names and descriptions of the items in this window are as follows:
1, 2) This pair of radio buttons selects which set of parameters should be set.
3) This pop-up menu sets the azimuth angle to 0°, 90°, 180°, or 270°.
4) This pop-up menu sets the azimuth angle from 0° to 180° or from 90° to 270°.
5) This area holds a picture that describes the path of the sound flow.
6) When the user presses the “Route of Wave” button, it updates the above picture.
7) This button resets all parameters stored in memory, so the user can input choices again.
8) The user presses this “Close” button to close this window and return the parameters to the “HRTF-based Surround Sound Window”.
If the user chooses the “Specify azimuth angle” radio button with an azimuth angle of 90° and presses the “Route of Wave” button, the following window is displayed:
If the user chooses the “Semi_Circle azimuth angle” radio button with an azimuth angle starting at 90° and ending at 270° and presses the “Route of Wave” button, the following window is displayed:
Figure 5.8 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Specify azimuth angle” is chosen
Figure 5.9 Window of “Setting Vertical Surround Sound Parameters” after pressing “Route of Wave”; a picture is loaded in the blank box when “Semi_Circle azimuth angle” is chosen
5.4 Processing the parameters
When the user presses the “Process” button, it calls the other scripts to do the HRTF processing. After that, it prompts a dialog:
5.5 Closing the window
Press the “Close” button to quit the program. An “Are you sure you want to close” box pops up. If the user clicks “Yes”, all windows close; otherwise, no action takes place.
Figure 5.10 Message box reporting the completion of processing
Figure 5.11 Message box asking the user to choose the close operation
Chapter 6 Conclusion
6.1 Problems with HRTF-based synthesis of spatial audio
Although it is simple to synthesize spatial audio with HRTFs, several problems arise. It is often reported that spatially-synthesized sounds lack externalization: sounds spatialized near the median plane (0° azimuth angle) sound as though they existed "inside" the head instead of "outside" it. Sounds are also perceived as though they originated either from the front of the listener or from behind (the so-called "front-back" confusions) [11]. Synthesis of sounds with non-zero elevations is difficult. Moreover, since everyone has a unique set of HRTFs, a listener who hears a sound spatialized with a "generalized" HRTF set may not perceive the sound in the intended spatial location [12]. In addition to these problems of sound quality, HRTF-based sound synthesis faces several computational challenges as well.
Many researchers believe that the solutions to the above problems involve a deeper understanding of the perceptual structure of HRTF data. By investigating the structure of HRTFs, researchers single out features such as the peaks and dips in the magnitude responses and the impulse responses, and examine specific spatial parameters such as azimuth, elevation, and distance. Future spatial audio synthesis algorithms can exploit this perceptual information to solve the problems in existing systems.
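As an illustration of this kind of feature extraction, a minimal Python sketch (hypothetical, not part of the project's MATLAB code) can locate the peaks and dips in a magnitude response:

```python
import numpy as np

def peaks_and_dips(mag_db):
    """Return the indices of local maxima (peaks) and local
    minima (dips) in a magnitude response given in dB."""
    m = np.asarray(mag_db, dtype=float)
    interior = np.arange(1, len(m) - 1)
    peaks = interior[(m[1:-1] > m[:-2]) & (m[1:-1] > m[2:])]
    dips = interior[(m[1:-1] < m[:-2]) & (m[1:-1] < m[2:])]
    return peaks, dips

# Toy magnitude response: peaks at indices 1 and 6, a dip at index 4.
p, d = peaks_and_dips([0, 3, 1, -2, -5, -1, 4, 2])
```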
Auditory localization [17] is still not fully understood, and thus developers cannot make effective price/performance decisions in the design of spatial audio systems. Furthermore, developers are often at a loss to explain why systems do not perform effectively.
6.2 Headphones
Most existing spatial audio systems require headphones; one needs to wear headgear to listen to 3-D sounds. Headphones are necessary because they fix the geometric relationship between the physical sound sources (the headphone drivers) and the ears. They also eliminate crosstalk between the binaural signals.
6.3 Future Work
A problem with real-time applications is time delay: the higher the quality, the longer the impulse response. To solve it, some authors propose hybrid (time-domain/frequency-domain) convolution [18]. In the current implementation of the program, processing a 30-second sound clip takes approximately an hour. A careful re-implementation in C++, possibly on dedicated hardware, could probably reduce this time.
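The cost comes from convolving the input with long impulse responses sample by sample. As a rough illustration (in Python rather than the project's MATLAB, and not the hybrid low-latency scheme of [18] itself), frequency-domain convolution computes the same result in O(N log N):

```python
import numpy as np

def fft_convolve(signal, impulse_response):
    """Linear convolution computed via the FFT."""
    n = len(signal) + len(impulse_response) - 1
    nfft = 1 << (n - 1).bit_length()   # zero-pad to the next power of two
    spectrum = np.fft.rfft(signal, nfft) * np.fft.rfft(impulse_response, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]

# Identical result to direct time-domain convolution, but far cheaper
# when the impulse response is long:
x = np.random.randn(44100)   # one second of audio at 44.1 kHz
h = np.random.randn(512)     # e.g. a 512-tap binaural impulse response
y = fft_convolve(x, h)
```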
In order to achieve greater realism of simulated sound trajectories, a finer resolution than that of the available HRTF measurements is preferable. This can be obtained through linear interpolation of the available data.
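A minimal sketch of such interpolation, assuming two HRIRs measured at neighbouring azimuths (Python for illustration; the array values are made up):

```python
import numpy as np

def interp_hrir(hrir_a, hrir_b, angle_a, angle_b, angle):
    """Linearly interpolate between two impulse responses
    measured at angle_a and angle_b (angle_a < angle < angle_b)."""
    w = (angle - angle_a) / (angle_b - angle_a)
    return (1.0 - w) * np.asarray(hrir_a) + w * np.asarray(hrir_b)

# Synthesize a response for 32.5 deg from measurements at 30 and 35 deg.
h30 = np.array([1.0, 0.5, 0.25])
h35 = np.array([0.8, 0.6, 0.20])
h_mid = interp_hrir(h30, h35, 30.0, 35.0, 32.5)
```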
Acoustic environment modeling refers to combining 3-D spatial location cues with distance, motion, and ambience cues to create a complete simulation of an acoustic scene. By simulating the acoustical interactions of the natural world, we can achieve stunningly realistic recreations with 3-D positional control [16]. The Doppler shift should be taken into account if we deal with large distance variations [15], because the pitch of a moving source changes. Finally, reverberation in an uncontrolled synthetic environment should consider the frequency-dependent effects of the medium and the reflecting surfaces [13], [14], [15].
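For the Doppler shift, the perceived pitch of a source moving radially at speed v toward a stationary listener is f' = f·c/(c − v). A small illustrative helper (hypothetical, not part of the project code):

```python
def doppler_pitch(f_source, v_radial, c=343.0):
    """Perceived frequency for a source moving radially at v_radial m/s
    (positive = toward the listener), listener stationary, speed of
    sound c in m/s."""
    return f_source * c / (c - v_radial)

# A 440 Hz source approaching at 20 m/s is heard sharper (~467 Hz),
# and flatter (~416 Hz) while receding:
f_toward = doppler_pitch(440.0, 20.0)
f_away = doppler_pitch(440.0, -20.0)
```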
Appendices
Homepage of my project
The web site address of this project is http://hrtf.bravepages.com/. The web page contains some 3-D sounds produced with the HRTF measurements.
The user can also obtain the source code and measurements on another page. A button on the homepage leads to http://hrtf.bravepages.com/source_code.htm
A prompt window asks for a password; the password is "fyp2002".
Pseudo-code of MATLAB scripts

Pseudo-code of hrtf.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
MAINFUNCTION hrtf %% HRTF Application M-file for hrtf.fig
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF nargin <= 1 , Then
IF nargin == 0, Then
Set initial_dir equals to current directory
ELSEIF nargin == 1 & exist(varargin{1},'dir')
initial_dir = varargin{1};
ELSE
Error_Message='Input argument must be a valid directory'
Stop the program
ENDIF nargin == 0
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data
Call SUBFUNCTION load_listbox(initial_dir)
Return figure handle as first output argument
IF nargout > 0, Then
varargout{1} = fig
ENDIF nargout > 0
ELSEIF ischar(varargin{1})
%% INVOKE NAMED SUBFUNCTION OR CALLBACK
%%
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%%%%
SUBFUNCTION listbox1_Callback
%%%%%%%%%%%%%%%%%%%
IF value of 'SelectionType' in figure1 is 'open'
Get the index value from listbox1 FUNCTION
Get the file name from listbox1 FUNCTION
Set filename = file_list{index_selected}
IF value of sorted_index(index_selected) is 1
Change directory to (filename);
Call Subfunction load_listbox(pwd,handles)
ELSE
[path,name,ext,ver] = fileparts(filename)
Set 'String' parameter of selectedfile equals to [path,name,ext,ver]
ENDIF value of sorted_index(index_selected) is 1
ENDIF value of 'SelectionType' in figure1 is 'open'
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION load_listbox(dir_path)
%%%%%%%%%%%%%%%%%%%%%%%%%%
Add (dir_path) directory to MATLAB's current search path
Change directory to (dir_path)
Set dir_struct = dir(dir_path)
Set [sorted_names,sorted_index] = sortrows({dir_struct.name}')
Set file_names = sorted_names
Set is_dir = [dir_struct.isdir]
Set sorted_index = [sorted_index]
Stores the variable data in the figure's application data.
Update the current directory name of selected file
%%%%%%%%%%%%%%%%%%%
SUBFUNCTION selectedfile_Callback
%%%%%%%%%%%%%%%%%%%%
Wait for update
%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION outfile_Callback
%%%%%%%%%%%%%%%%%%%%%%
Wait for update
%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION savefile_name_Callback
%%%%%%%%%%%%%%%%%%%%%%
Set the output file name, the default name is 3D.wav
%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION hori_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set the verti_button callback inactive
Call Subfunction [val,elevangle]=hori_sur
IF val=1
start_angle=0
END_angle=180
ELSE
start_angle=0
end_angle=360
ENDIF val=1
Set choice =1
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION verti_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set the hori_button callback inactive
pos_size = get(handles.figure1,'Position')
Call Subfunction [verti_choice,verti_val] = verti_sur
Set choice=2
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION varargout = initial_button_Callback
%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Initialize the hori_button callback
Initialize the verti_button callback
%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION processing_Callback
%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set index_selected equals to the 'Value' stored in listbox1
Set file_list equals to the 'String' stored in listbox1
Set filename = file_list{index_selected}
Set outfile_name equals to the 'String' stored in outfile
%%
IF (choice==1)
Call SUBFUNCTION hori_final(start_angle,end_angle,elevangle,filename,outfile_name)
ENDIF choice==1
%%
IF (choice==2)
Call SUBFUNCTION verti_final(verti_choice,verti_val,filename,outfile_name)
ENDIF
%%%%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION playback_Callback
%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Set outfile_name equals to the 'String' stored in outfile
Read the wave file
Play the processed wave file
%%%%%%%%%%%%%%%%%%%%%%%
SUBFUNCTION figure1_CloseRequestFcn
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
Call SUBFUNCTION close_Callback
%%%%%%%%%%%%%%%%%%
SUBFUNCTION close_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Call subfunction modaldlg
CASE OF (output from subfunction modaldlg)
'no','cancel' : no action
'yes' : delete the frame(handles.figure1)
ENDCASE
Pseudo-code of verti_sur.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [verti_choice,verti_val] = verti_sur(varargin)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data.
Set OrigImageAxes parameters
Set OriginalImage parameters
Position figure
Wait for callbacks to run and window to be dismissed
IF the 'Value' in verti_azim_radio not equal to 0
Set verti_choice=1
Set verti_val= the 'Value' in azim_angle_popup
ENDIF
IF the 'Value' in verti_semi_azim_radio not equal to 0
Set verti_choice=2
Set verti_val= the 'Value' in sc_azim_angle_popup
ENDIF
Delete the figure
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
TRY %% executed until an error occurs
IF (nargout=0)
FEVAL -- Function evaluation
Set [varargout{1:nargout}] = feval(varargin{:})
ELSE
feval(varargin{:})
ENDIF
CATCH
Display(lasterr)
END
ENDIF
%%%%%%%%%%%%%%%%%
Subfunction load_picture(NewVal)
%%%%%%%%%%%%%%%%%
Subfunction called
IF NewVal==1, then
Set pic_name='verti_0.bmp'
ELSEIF NewVal==2
Set pic_name='verti_90.bmp'
ELSEIF NewVal==3
Set pic_name='verti_180.bmp'
ELSEIF NewVal==4
Set pic_name='verti_270.bmp'
ELSEIF NewVal==5
Set pic_name='verti_semi_0.bmp'
ELSEIF NewVal==6
Set pic_name='verti_semi_90.bmp'
ENDIF
SET I = imshow(pic_name)
Set 'Cdata' in OriginalImage to be I
%%%%%%%%%%%%%%%%%%
Subfunction verti_azim_radio_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Set relevant parameters to be active
Set irrelevant parameters to be inactive
Set off= verti_semi_radio
Call subfunction mutual_exclude(off)
Set NewVal equals to the 'Value' of azim_angle_popup
Stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%%
Subfunction azim_angle_popup_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_s_azim_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%% %%%%%%%%%%%%%%%%%
Subfunction verti_s_elev_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_s_elev_range_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_semi_radio_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
Set relevant parameters to be active
Set irrelevant parameters to be inactive
Set off= verti_azim_radio
Call subfunction mutual_exclude(off)
Set NewVal equals to the 'Value' of sc_azim_angle_popup + 4
Stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%%%%
Subfunction sc_azim_angle_popup_Callback
%%%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_azim_text_Callback
%% %%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_elev_text_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction verti_sc_elev_range_text_Callback
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction verti_reset_button_Callback
%% %%%%%%%%%%%%%%%%%
Set relevant parameters to be active and irrelevant parameters to be inactive
Set off= verti_semi_radio & verti_azim_radio
Call subfunction mutual_exclude(off), stores the variable data in the figure's application data
%%%%%%%%%%%%%%%%%%
Subfunction mutual_exclude(off)
%% %%%%%%%%%%%%%%%%
Subfunction called
Set the 'Value' in off=0
%%%%%%%%%%%%%%%%%%%
Subfunction verti_close_button_Callback
%% %%%%%%%%%%%%%%%%%
Subfunction called
Resume MATLAB program execution of figure1
% %%%%%%%%%%%%%%%%%%
Subfunction Show_pic_Callback
% %%%%%%%%%%%%%%%%%%
IF the 'Value' in verti_azim_radio not equals to 0
NewVal=the 'Value' in azim_angle_popup
ENDIF
IF the 'Value' in verti_semi_radio not equals to 0
NewVal=the 'Value' in sc_azim_angle_popup + 4
ENDIF
Call the subfunction load_picture(NewVal), stores the variable data in the figure's application data
Pseudo-code of hori_sur.m
%%%%%%%%%%%%%%%%%%%%%%
Subfunction [val,elevangle]=hori_sur(varargin)
%%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data.
Set OrigImageAxes parameters
Set OriginalImage parameters
Position figure
Wait for callbacks to run and window to be dismissed:
Set val equals to the 'Value' in hori_end_azim
Set elevangle equals to the 'String' in hori_elev
delete the figure
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%%%
Subfunction load_picture(NewPic)
%%%%%%%%%%%%%%%%%%
Subfunction called
IF NewPic=1, then
Set pic_name='hori_180.bmp'
ELSEIF NewPic=2, then
Set pic_name='hori_360.bmp'
END
Set I= imshow(pic_name)
Set 'Cdata' in OriginalImage to be I
%%%%%%%%%%%%%%%%%%
Subfunction hori_end_azim_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%
Subfunction hori_elevslider_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
Set the initial slider value and range
Get the new value for the elev angle from the slider
Set new value to nearest number
Set the 'String' in hori_elev to the new value
Stores the variable data in the figure's application data.
%%%%%%%%%%%%%%%%%%
Subfunction hori_elev_Callback
%%%%%%%%%%%%%%%%%%
Subfunction called
%%%%%%%%%%%%%%%%%%%
Subfunction hori_close_button_Callback
%%%%%%%%%%%%%%%%%%%
Subfunction called
Resumes the M-file execution
%%%%%%%%%%%%%%%%%%
Subfunction Show_Pic_Callback
%%%%%%%%%%%%%%%%%%
IF the 'Value' of hori_end_azim=1
Set NewPic=1
ELSEIF the 'Value' of hori_end_azim=2
Set NewPic=2
ENDIF
Call the subfunction load_picture(NewPic)
Stores the variable data in the figure's application data
Pseudo-code of modaldlg.m
%%%%%%%%%%%%%%%%%%%%%
Subfunction answer = modaldlg(varargin)
%%%%%%%%%%%%%%%%%%%%%
IF nargin = 0 or isnumeric(varargin{1}), then LAUNCH GUI
Open FIG-file
Generate and store a structure of handles to pass to callbacks
Returns a structure containing the handles of the objects in a figure
Stores the variable data in the figure's application data
Position figure
Wait for callbacks to run and window to be dismissed
IF ~ishandle(fig), then
Set answer = 'cancel'
ELSE
Returns a structure containing the handles of the objects in a figure
Delete the figure
ENDIF
ELSEIF ischar(varargin{1})
INVOKE NAMED SUBFUNCTION OR CALLBACK
Executed until an error occurs
Display(lasterr)
ENDIF nargin <= 1
%%%%%%%%%%%%%%%%
Subfunction noButton_Callback
%%%%%%%%%%%%%%%%
Subfunction called
Set answer = 'no'
Stores the variable data in the figure's application data
Resume MATLAB program execution of figure1
%%%%%%%%%%%%%%%%%
Subfunction yesButton_Callback
%%%%%%%%%%%%%%%%%
Subfunction called
Set answer = 'yes'
Stores the variable data in the figure's application data
Resume MATLAB program execution of figure1
Pseudo-Code of readhrtf.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [y] = readhrtf(elevangle,azim,choose_index)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Set ext = '.wav'
IF (choose_index=1) ,then
Call subfunction pathname = hrtfpath(pwd,filesep,'horizontal','H',ext,elevangle,azim,choose_index)
Read (pathname) wave file into [y,fs,nbits]
IF (fs ~= 44100 | nbits ~= 16) ,then
Error message='Incorrect wave file format. Expected 16 bit samples at 44.1 kHz sampling rate.'
ENDIF
ENDIF
IF ((choose_index==2) | (choose_index==3)) , then
Call subfunction pathname = hrtfpath(pwd,filesep,'vertical','H',ext,elevangle,azim,choose_index)
Read (pathname) wave file into [y,fs,nbits]
IF (fs ~= 44100 | nbits ~= 16), then
Error message='Incorrect wave file format. Expected 16 bit samples at 44.1 kHz sampling rate.'
ENDIF
ENDIF
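The format check that readhrtf.m performs can be sketched in Python with the standard-library wave module (illustration only; the project itself reads the files in MATLAB):

```python
import wave

def check_hrtf_wav(path):
    """Verify a wave file is 16-bit / 44.1 kHz, as readhrtf.m expects,
    and return (sample_rate, bits_per_sample)."""
    with wave.open(path, 'rb') as w:
        fs = w.getframerate()
        nbits = 8 * w.getsampwidth()
    if fs != 44100 or nbits != 16:
        raise ValueError('Incorrect wave file format. '
                         'Expected 16 bit samples at 44.1 kHz sampling rate.')
    return fs, nbits
```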
Pseudo-Code of hrtfpath.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [s] = hrtfpath(root,dir_ch,subdir,select,ext,elev,azim,choose_index)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IF (choose_index=1), then
Set s = sprintf('%s%s%s%selev%d%s%s%de%03da%s', root,dir_ch,subdir,dir_ch,round(elev),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
IF (choose_index=2), then
Set s = sprintf('%s%s%s%sazim%d%s%s%03de%03da%s',root,dir_ch,subdir,dir_ch,round(azim),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
IF (choose_index=3), then
Set s = sprintf('%s%s%s%sazimsemi%d%s%s%03de%03da%s',root,dir_ch,subdir,dir_ch,round(azim),...
dir_ch,select,round(elev),round(azim),ext)
ENDIF
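The sprintf calls above assemble file names in the naming convention of the MIT KEMAR measurement set (e.g. elev0/H0e045a.wav). The horizontal case (choose_index 1) can be sketched in Python as (illustrative only):

```python
import os

def hrtf_path(root, subdir, select, elev, azim, ext='.wav'):
    """Rebuild the horizontal-case path of hrtfpath.m:
    <root>/<subdir>/elev<E>/<select><E>e<AAA>a<ext>"""
    e, a = round(elev), round(azim)
    return os.path.join(root, subdir, 'elev%d' % e,
                        '%s%de%03da%s' % (select, e, a, ext))
```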
Pseudo-Code of groupfile.m
%%%%%%%%%%%%
Subfunction group_file
%%%%%%%%%%%%
Input Parameters :
left_blk_lap,right_blk_lap,num_file,choice,start_angle,end_angle,elevangle,verti_gp_index
Output Parameters : first_part,second_part
Check parameter choice
IF choice==1 or choice==2, then do horizontal HRTF or vertical HRTF respectively
LOOP FOR i= 1 to num_file by 1
Set first_part, second_part= horizontal concatenation of left_blk_lap(i,:) and right_blk_lap(i,:)
END OF FOR LOOP
ENDIF
Pseudo-Code of half_circle.m
%%%%%%%%%%%%%%%%%%%%%%%%
Subfunction [num_file]= half_circle(elevangle,filelist)
%%%%%%%%%%%%%%%%%%%%%%%%
Check the elevation angle (-40° to 90°) and set num_file according to the particular elevation angle
Pseudo-Code of par_ser.m
%%%%%%%%%%%%%%%%%%%%%%%
Subfunction out_ser=par_ser(input,a,n,b_num)
%%%%%%%%%%%%%%%%%%%%%%%
Set l_r=length(input), z_n=a-1, L=n-z_n
LOOP FOR k=1 to b_num by 1
IF k=1, then
Set r_discard(k,:)=[zeros(1,z_n) input(1:L)]
ELSEIF k=b_num
Set r_end=input((b_num-1)*L+1-z_n:length(input))
Set r_discard(k,:)=[r_end zeros(1,n-length(r_end))]
ELSE
Set r_discard(k,:)=input((k-1)*L+1:(k-1)*L+n)
ENDIF
END LOOP
Set out_ser=r_discard
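par_ser.m partitions the input into overlapping blocks of the kind used in overlap-save fast convolution. The same partitioning can be sketched in Python (an illustration, with a taken as the filter length and the middle-block boundaries following the standard overlap-save layout):

```python
import numpy as np

def par_ser(x, a, n, b_num):
    """Split x into b_num blocks of length n for overlap-save
    convolution with a length-a filter (overlap = a - 1 samples)."""
    z_n = a - 1          # samples of overlap carried between blocks
    L = n - z_n          # new samples consumed per block
    x = np.asarray(x, dtype=float)
    blocks = np.zeros((b_num, n))
    for k in range(b_num):
        if k == 0:
            blocks[k, z_n:] = x[:L]          # prepend a-1 zeros
        elif k == b_num - 1:
            tail = x[k * L - z_n:]           # last (possibly short) block
            blocks[k, :len(tail)] = tail     # zero-pad the end
        else:
            blocks[k] = x[k * L - z_n : k * L - z_n + n]
    return blocks
```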
References
[1] Blauert, J., Spatial Hearing, MIT Press, Cambridge, MA (1983)
[2] Yoichi Haneda, Shoji Makino, Yutaka Kaneda, Nobuhiko Kitawaki, "Common-Acoustical-Pole and Zero Modeling of Head-Related Transfer Functions", IEEE Transactions on Speech and Audio Processing, vol. 7, no. 2, Mar 1999
[3] MATLAB description, http://www.mathworks.com/products/matlab/
[4] HRTF Measurements of a KEMAR Dummy-Head Microphone, http://xenia.media.mit.edu/~kdm///hrtf.html
[5] KEMAR (Knowles Electronic Manikin for Acoustic Research), http://www.parmly.luc.edu/parmly/behav_psych_resrch.html
[6] Kendall, Gary S. (1995), "A 3-D sound primer: Directional hearing and stereo reproduction", Computer Music Journal, 19(4, Winter), 23-46
[7] Gardner, W.G., K.D. Martin (1995), "HRTF measurements of a KEMAR", J. Acoust. Soc. Am., 97(6), pp. 3907-3908
[8] Linear Prediction Analysis, http://www.en.polyu.edu.hk/~mwmak/notes/BEng_SP/linear_prediction_analysis.ppt
[9] S.K. Mitra, "Digital Signal Processing: A Computer-based Approach", McGraw-Hill, 1998
[10] Wataru Mayeda, "Digital Signal Processing", Prentice Hall Inc., 1993
[11] Wightman, F.L., & Kistler, D.J. (1992), "A model of HRTFs based on principal component analysis and minimum-phase reconstruction", Journal of the Acoustical Society of America, 91(3), 1637-1647
[12] Wightman, F.L., & Kistler, D.J. (1989), "Headphone simulation of free-field listening I: Stimulus synthesis", Journal of the Acoustical Society of America, 85(2), 858-867
[13] Gardner, W.G. (1992), "The Virtual Acoustic Room", Master's thesis, Dept. of Media Arts and Sciences, MIT
[14] Gardner, W.G. (1998), "Reverberation Algorithms", in Applications of Digital Signal Processing to Audio and Acoustics, Kahrs, M., and K. Brandenburg, Eds., Kluwer Academic, Norwell, MA
[15] Gardner, W.G. (1999), "3D Audio and Acoustic Environment Modeling"
[16] Begault, D.R. (1994), "3-D Sound for Virtual Reality and Multimedia", Academic Press, Cambridge, MA
[17] Barbara G. Shinn-Cunningham, Nathaniel I. Durlach, Richard M. Held, Massachusetts Institute of Technology, Cambridge (March 1998), "Adapting to supernormal auditory localization cues"
[18] Gardner, W.G., "Efficient Convolution without Input-Output Delay", presented at the 97th Convention of the Audio Engineering Society, San Francisco, Pre-print 3897, 1994