
Model-based Bayesian direction of arrival analysis for sound sources using a spherical microphone array

Christopher R. Landschoot and Ning Xiang^a)

Graduate Program in Architectural Acoustics, School of Architecture, Rensselaer Polytechnic Institute, Troy, New York 12180, USA

(Received 9 February 2019; revised 12 April 2019; accepted 17 April 2019; published online 31 December 2019)

In many room acoustics and noise control applications, it is often challenging to determine the directions of arrival (DoAs) of incoming sound sources. This work seeks to solve this problem reliably by beamforming, or spatially filtering, incoming sound data with a spherical microphone array via a probabilistic method. When estimating the DoA, the signal under consideration may contain one or multiple concurrent sound sources originating from different directions. This leads to a two-tiered challenge of first identifying the correct number of sources, followed by determining the directional information of each source. To this end, a probabilistic method of model-based Bayesian analysis is leveraged. This entails generating analytic models of the experimental data, individually defined by a specific number of sound sources and their locations in physical space, and evaluating how well each model fits the measured data. Through this process, the number of sources is first estimated, and then the DoA information of those sources is extracted from the most concise model that fits the experimental data. This paper presents the analytic models, the Bayesian formulation, and preliminary results to demonstrate the potential usefulness of this model-based Bayesian analysis for complex noise environments with potentially multiple concurrent sources. © 2019 Acoustical Society of America. https://doi.org/10.1121/1.5138126

[KTW] Pages: 4936–4946

I. INTRODUCTION

The purpose of the research presented in this paper is to offer a solution to the problem of localizing multiple concurrent sound sources through a model-based probabilistic approach. This work demonstrates that, given a set of sound signals recorded on a spherical microphone array (Meyer and Elko, 2002; Rafaely, 2015), the number of sound sources, as well as the directions from which they arrive, can be predicted algorithmically. This requires a process known as beamforming, or the spatial filtering of a sound signal using spherical harmonics theory based on a spherical microphone array (Williams, 1999). This is combined with the probabilistic methods of analysis called Bayesian model selection and parameter estimation (Knuth et al., 2015; Xiang and Fackler, 2015).

Localizing multiple concurrent sound sources simultaneously in complex sound environments can be a challenge, as there may be variations in the number of sources, along with their locations, characteristics, and strengths (Blandin et al., 2012; Bush and Xiang, 2018; Escolano et al., 2014). In addition to these variations, there can be unwanted interference from fluctuating background noise as well. Various solutions to this problem (Mohan et al., 2008) have begun to be explored, particularly in recent years with growing interest in immersive auditory virtual reality and augmented reality applications (Vorländer et al., 2015).

Although interest in spatial sound has grown greatly in recent history, the ideas surrounding it are not completely new. Theoretical work and early applications manifested themselves in the form of ambisonics (Craven and Gerzon, 1977; DuHamel, 1952), which began to provide some spatial information about a soundscape. Although these methods did contain spatial information about a sound signal, they did not address the localization of the sound sources or their characterization in any way. Without employing microphone array technology, the spatialization of sound was inherent to the recorded audio signals themselves rather than gleaned via post-processing (Furness, 1990). Using microphone array technology (Madhu and Martin, 2008), specifically spherical microphone arrays, an entire sound field can be analyzed without the directional ignorance or bias of traditional microphones (Meyer and Elko, 2002).

One aspect of decomposing complex soundscapes is performing a direction of arrival (DoA) analysis on the recorded signals. Many methods have been implemented to attempt to solve this problem. Recent examples can be seen in efforts utilizing various different microphone arrays, including sparse linear microphone arrays (Bush and Xiang, 2018; Nannuru et al., 2018) and a two-microphone array used in a room-acoustic study (Escolano et al., 2014).

A spherical microphone array, or spherical array, is simply a sphere with microphone capsules sampling its surface that can record sound signals simultaneously. Because of the principles of spherical microphone arrays, there are no inherent directional constraints; the recorded signals can be processed to simulate any orientation of directionality desired. This has allowed researchers to experiment with various configurations and methods of data processing in attempts to determine the best ways to filter sound and decompose complex soundscapes. These include spherical harmonic beamforming in combination with optimal array processing, frequency smoothing methods (Khaykin and Rafaely, 2012), spherical harmonics smoothing (Jo and Choi, 2017), and modal smoothing methods (Morgenstern and Rafaely, 2018). Sun et al. (2012) applied a spherical microphone array to localize reflections in rooms, while Nadiri and Rafaely (2014) localized multiple speakers in a reverberant environment. The array configurations do not even have to be fully spherical: hemispherical microphone arrays (Li and Duraiswami, 2005) can be more suitably deployed on a table top in conference-room applications, mounted in a ceiling, or deployed on the ground for outdoor sound source DoA estimation and the tracking of flying objects. Zuo et al. (2018) have formulated the theory of spatial sound intensity vectors in the spherical harmonic domain, applicable to a variety of acoustic scenarios. Jo and Choi (2018) proposed a solution to avoid ill-conditioned singularity when solving least-squares and eigenvalue problems to estimate the DoAs. Wong et al. (2019) discovered rules of thumb for how the estimation precision of an incident source's azimuth-polar DoA depends on the number of identical isotropic sensors.

a) Electronic mail: [email protected]

4936 J. Acoust. Soc. Am. 146 (6), December 2019 © 2019 Acoustical Society of America 0001-4966/2019/146(6)/4936/11/$30.00

Each method tested with spherical arrays helps improve the ability to determine the DoAs of sound sources, but some still rely on the basic concept of predicting source locations by correlating them directly with high sound energy levels. A bulk of recent work also exists using spherical harmonics in wave-field synthesis (Ahrens and Spors, 2012), sound radiation (Shabtai and Vorländer, 2015), or noise analysis (Zhao et al., 2018). Torres et al. (2013) applied a cylindrical microphone array in room-acoustic studies. Fernandez-Grande (2016) reconstructed an arbitrary sound field based on measurements with a spherical microphone array. Richard et al. (2017) applied a spherical microphone array to measure acoustic surface impedance at oblique incidence.

As for complex sound/noise source analysis, the ability to determine the likelihood of discrete source locations has not been well investigated unless the sources are clearly separated in space. This work applies 16 microphones mounted flush and uniformly on a rigid full sphere. The array has a spatial resolution of up to order two of spherical harmonics. With a large number of noise sources, the signals to be analyzed can blend together if the sound sources are located too close to each other in physical space. Resolving this situation requires model-based analysis, or even higher-order spherical arrays to accurately determine the noise sources, which, in turn, require more microphone channels.

In addition to the parameter estimation problems, which are solely associated with DoA estimation given a known number of sound sources, there is a need to solve the overarching question of how to reliably determine the number of sound sources without resorting to brute force by adding more microphones. The answer resides in model-based Bayesian analysis, a method that can estimate the number of sources and their attributes through probabilistic analysis rather than simply correlating high sound energy levels with sound source locations. This method leverages machine learning through an iterative process, allowing for a more reliable and consistent DoA analysis. To answer this overarching question of determining the correct number of concurrent sound sources, this work applies Bayesian model selection to the DoA estimation task when the number of sound sources is unknown prior to the analysis. This Bayesian formulation for model selection problems starts with the application of Bayes's theorem, followed by the incorporation of prior information, and then marginalization (Xiang and Goggans, 2001). Any interest in directional parameter values is deferred into the background of the current problem. This allows attention to be focused on estimating the probabilities for the number of concurrent sound sources.

There have been many recent efforts to apply Bayesian model selection to acoustics problems. Xiang and Goggans (2003) applied Bayesian model selection to determine the number of exponential decays present in acoustic enclosures by analyzing sound energy decay functions. Bayesian model selection has also been applied to room-acoustic modal analysis (Beaton and Xiang, 2017). Previous studies (Bush and Xiang, 2018; Escolano et al., 2014; Nannuru et al., 2018) of DoA analysis have also employed two levels of Bayesian inference. However, to the best of the authors' knowledge, model-based Bayesian inference has not yet been sufficiently studied using spherical microphone arrays. This paper demonstrates that the model-based Bayesian probabilistic approach can be applied to spatial sound field analysis with a set of sound signals recorded on a spherical microphone array, resulting in estimates of the number of sound sources as well as their DoAs. This requires two levels of Bayesian inference.

The remainder of this paper is organized as follows. Section II formulates the spherical harmonic models of potentially multiple sound sources. Section III briefly introduces a unified Bayesian framework. Section IV discusses experimental results using the two levels of Bayesian inference. Section V further discusses results pertaining to the Bayesian implementation. Section VI concludes the paper.

II. BEAMFORMING DATA AND MODELS

There are two important concepts at the heart of this DoA analysis method: spherical harmonic beamforming using spherical microphone arrays, and Bayesian analysis. Spherical harmonic beamforming is the way in which the recorded sound signals can be processed to map the sound energy around the spherical array. Bayesian analysis is the process of solving for the number of sound sources and the DoAs of those sources.

A. Spherical harmonics

This work utilizes principles of spherical harmonics to beamform spherical sound signals in order to map and model the sound energy of a soundscape. Spherical harmonics can be formulated by solving the spherical wave equation. They can be mathematically represented (Williams, 1999) as


$$Y_n^m(\theta,\phi) \equiv \sqrt{\frac{(2n+1)}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\; P_n^m(\cos\theta)\, e^{jm\phi}, \qquad (1)$$

where θ, φ are the elevation and azimuth angles, respectively, and m, n are integers representing the degree and the order of the spherical harmonics, respectively. P_n^m(·) is the Legendre function of degree m and order n (Williams, 1999), and j = √−1. The degree, m, can be thought of as representing different orientations of the spherical harmonics, while the order, n, provides an increase in resolution. The number of microphones sampling the sphere determines the order of spherical harmonics that the spherical array can achieve. This means that more microphone channels are required to achieve higher orders of spherical harmonics, i.e., higher resolution. For a more in-depth analysis of spherical harmonic concepts, refer to Williams (1999).
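As a minimal numerical sketch of Eq. (1), SciPy's `sph_harm` evaluates the same orthonormal spherical harmonics; note that its argument ordering differs from Eq. (1), taking the degree and order first and the azimuthal angle before the polar angle. The wrapper name `Ynm` below is purely illustrative.

```python
import numpy as np
from scipy.special import sph_harm

def Ynm(n, m, theta, phi):
    """Y_n^m at elevation theta and azimuth phi, following Eq. (1)'s ordering.
    SciPy's sph_harm(m, n, azimuth, polar) swaps both the (m, n) pair and
    the angle order, so this wrapper reorders the arguments."""
    return sph_harm(m, n, phi, theta)

# Y_0^0 is the constant 1/sqrt(4*pi) regardless of direction,
# and Y_1^0 reduces to sqrt(3/(4*pi)) * cos(theta).
print(Ynm(0, 0, 0.4, 1.0).real)   # ≈ 0.2821
print(Ynm(1, 0, 0.0, 0.0).real)   # ≈ 0.4886
```

The same orthonormal convention underlies the beamforming models in Sec. II B, so any replacement implementation should preserve the normalization of Eq. (1).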

B. Spherical harmonic beamforming models

The term “beamforming” describes the act of spatially filtering a signal; it can also be referred to as spherical beamforming when it spatially filters data with respect to a sphere. Just as the frequencies of audible sound signals can be filtered (e.g., using a bandpass filter), sound signals can also be filtered spatially. This means that it is possible to “listen” in a specified direction (so-called spatial filtering) while suppressing all other sounds that may be coming from other directions. The spatial filter direction can be referred to as a beam, and the directional pattern of this “listening” direction is often described as a beam pattern. For filtering multiple sound sources, an energy sum of multiple filter directions can be expressed as

$$H_S(\Psi_S;\theta,\phi) = \sum_{s=1}^{S} A_s\, \frac{|g(\Psi_s;\theta,\phi)|^2}{\max\left[\,|g(\Psi_s;\theta,\phi)|^2\,\right]}, \qquad (2)$$

where S is the number of concurrent sound sources, A_s represents the amplitude, and the fraction represents the normalized energy associated with the s-th source, with

$$g(\Psi_s;\theta,\phi) = 2\pi \sum_{n=1}^{N}\sum_{m=-n}^{n} \left[Y_n^m(\Psi_s)\right]^{*}\, Y_n^m(\theta,\phi), \qquad (3)$$

where Ψ_S = {θ_1, …, θ_S; φ_1, …, φ_S} collects the S listening directions (sound sources), which are fixed yet unknown, while Ψ_s = {θ_s, φ_s} denotes the direction of the s-th sound source. k = ω/c is the propagation coefficient. The symbol “∗” denotes the complex conjugate. Y_n^m(·) are the spherical harmonics of order n and degree m (Williams, 1999).
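Equations (2) and (3) translate directly into code. The sketch below uses hypothetical helper names and SciPy's spherical harmonics; the normalization by max|g|² is taken at the look direction itself, where the pattern peaks by the spherical-harmonic addition theorem.

```python
import numpy as np
from scipy.special import sph_harm

def g(theta_s, phi_s, theta, phi, N):
    """Eq. (3): beam kernel steered to (theta_s, phi_s), up to order N."""
    total = 0j
    for n in range(1, N + 1):
        for m in range(-n, n + 1):
            # [Y_n^m(Psi_s)]* Y_n^m(theta, phi); SciPy takes azimuth first
            total += np.conj(sph_harm(m, n, phi_s, theta_s)) * sph_harm(m, n, phi, theta)
    return 2 * np.pi * total

def H_model(sources, amps, theta, phi, N):
    """Eq. (2): normalized energy sum over S concurrent look directions.
    `sources` is a list of (theta_s, phi_s) tuples; max|g|^2 is evaluated
    at the look direction, where the beam's maximum occurs."""
    out = 0.0
    for (ts, ps), A in zip(sources, amps):
        e = abs(g(ts, ps, theta, phi, N)) ** 2
        out += A * e / abs(g(ts, ps, ts, ps, N)) ** 2
    return out

# A single unit-amplitude beam evaluates to 1 at its own look direction.
print(H_model([(1.2, 0.7)], [1.0], 1.2, 0.7, 3))   # ≈ 1.0
```

Evaluating `H_model` on a (θ, φ) grid produces beam maps such as those in Fig. 1; the double loop is a readability choice and could be vectorized for larger orders N.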

Figure 1 illustrates the beamforming results. Figure 1(a) shows two simultaneous beam patterns of order N = 3 [for S = 2 in Eq. (2)] at θ_1 = 70°, φ_1 = 90° and θ_2 = 110°, φ_2 = 270°, with A_1 = A_2, while Fig. 1(b) illustrates three simultaneous beam patterns of order N = 10 [for S = 3 in Eq. (2)] at θ_1 = 130°, φ_1 = 80°; θ_2 = 60°, φ_2 = 110°; and θ_3 = 120°, φ_3 = 320°, with A_1 = A_2 = A_3.

C. Spherical harmonic beamforming data

This research utilizes beamforming to help map how sound energy is distributed around the spherical microphone array. For more in-depth information on spherical beamforming, consult Rafaely (2015). The beamforming data D(θ_i, φ_i), derived from M microphones embedded in the rigid spherical surface of radius r, are expressed in terms of their normalized absolute energy values as

$$D(\theta,\phi) = \frac{|y(\theta,\phi)|^2}{\max\left[\,|y(\theta,\phi)|^2\,\right]}, \qquad (4)$$

and, according to Meyer and Elko (2002),

$$y(\theta,\phi) = \frac{4\pi}{M}\sum_{n=0}^{N}\sum_{m=-n}^{n}\left\{ w_{nm}^{*}(kr,\theta,\phi)\, \sum_{i=1}^{M} p_{\mathrm{mic}}(k,r,\theta_i,\phi_i)\left[Y_n^m(\theta_i,\phi_i)\right]^{*} \right\}, \qquad (5)$$

where p_mic(k, r, θ_i, φ_i) represents the i-th microphone output among the M microphone channels around the rigid sphere surface at angular position {θ_i, φ_i}, and

$$w_{nm}^{*}(kr,\theta,\phi) = \frac{Y_n^m(\theta,\phi)}{4\pi\, j^{\,n}}\left[\, j_n(kr) - \frac{j_n'(kr)}{h_n'(kr)}\, h_n(kr) \right]^{-1}, \qquad (6)$$

FIG. 1. (Color online) Spherical harmonics beam patterns in the form of two-dimensional normalized energy distributions. (a) Two different, simultaneous beamforming directions of order N = 3 at (70°, 90°) and (110°, 270°). (b) Three different, simultaneous beamforming directions of order N = 10 at (130°, 80°), (60°, 110°), and (120°, 320°).


for axis-symmetric beamforming in the plane-wave decomposition mode (Rafaely, 2015), where the bracketed term represents the spherical modal amplitude for a rigid sphere, with j_n(kr) and h_n(kr) being the spherical Bessel and Hankel functions of the first kind, respectively. The prime denotes the derivative with respect to the argument. The angular variables, θ and φ, represent the elevation and azimuth angles, respectively.
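The bracketed rigid-sphere term in Eq. (6) can be evaluated with SciPy's spherical Bessel routines, assembling the Hankel function of the first kind as h_n = j_n + j·y_n. The function names below are illustrative only, a sketch of how Eq. (6)'s weights might be computed.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def modal_amplitude(n, kr):
    """Bracket of Eq. (6): b_n(kr) = j_n(kr) - [j_n'(kr)/h_n'(kr)] h_n(kr),
    with h_n = j_n + 1j*y_n the spherical Hankel function of the first kind."""
    hn = spherical_jn(n, kr) + 1j * spherical_yn(n, kr)
    hn_p = (spherical_jn(n, kr, derivative=True)
            + 1j * spherical_yn(n, kr, derivative=True))
    return spherical_jn(n, kr) - (spherical_jn(n, kr, derivative=True) / hn_p) * hn

def w_conj(n, kr, Ynm_val):
    """Eq. (6): conjugate weight w*_{nm}; Ynm_val is Y_n^m(theta, phi)."""
    return Ynm_val / (4 * np.pi * (1j ** n)) / modal_amplitude(n, kr)

# Sanity check via the Wronskian identity: |b_n(kr)| = 1 / ((kr)^2 |h_n'(kr)|),
# so |b_0(1)| = 1/|1 + 1j| = 1/sqrt(2).
print(abs(modal_amplitude(0, 1.0)))   # ≈ 0.7071
```

Division by `modal_amplitude` is where the ill-conditioning noted by Jo and Choi (2018) arises: at frequencies where |b_n(kr)| is small, the weights grow large and regularization may be needed.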

III. MODEL-BASED BAYESIAN FRAMEWORK

This model-based approach utilizes Bayes's theorem to answer the two-tiered question:

(1) How many sound sources are present?

(2) What are the parameters of the present sound sources, e.g., incident angles and strengths?

This multi-layered approach utilizes Bayes's theorem in two separate ways: the first is model selection and the second is parameter estimation. For more information on these methods, consult Kass and Raftery (1995), Wasserman (2000), and Knuth et al. (2015). For a particularly conceptual discussion of Bayesian inference, demonstrating the basic ideas and key concepts and how they can be applied to acoustics, see Xiang and Fackler (2015).

A. Model selection

Bayesian model selection is a probabilistic method of evaluating a finite set of models, given a set of data, and then selecting the model that most appropriately represents the data. For this research, a large number of randomized beamforming evaluations are examined based on different models and subsequently compared to experimentally measured test data. These models are then iterated through, and continually improved, until they are appropriately correlated to the experimental data. This is by definition a probabilistic method, as it uses probability to determine which model is appropriate.

Oftentimes in different applications, such as this research, Bayes's theorem is leveraged to determine the probability with which a modeled set of data, H_i = [H_i(Ψ_i; θ, φ)], matches an experimentally measured set of data, D = [D(θ, φ)], as given in Eq. (4). The notations H_i and D within this work are in the form of two-dimensional matrices over θ, φ. This resides on the higher level of the two tiers, answering the question of how many sound sources are present. This can be determined by the probability of the model, H_i, given the data, D, represented as p(H_i|D). To this end, Bayes's theorem is applied,

$$p(H_i|D) = \frac{p(D|H_i)\, p(H_i)}{p(D)}. \qquad (7)$$

p(H_i|D) is termed the posterior probability of the model, H_i, and by expressing it in terms of Bayes's theorem, only three components are required to determine p(H_i|D). The model, H_i, is defined over the entire parameter space of the model; the parameter space of a model consists of all possible variations of unique parameters or parameter combinations that can be applied to the model. p(D) is the probability of observing the experimental data, and for this research it acts as a normalizing constant that is not of interest. p(H_i) is the prior probability of the model, H_i, and should be assigned based on any previous knowledge of the circumstances. In this research, each model is assigned equal prior probability in order to avoid giving preference to any of the models. Finally, p(D|H_i) is the marginal likelihood of a model, given the measured data, otherwise known as the “Bayesian evidence” (Knuth et al., 2015). This term is key to the model selection, as it allows consideration of the average likelihood of a model over its entire parameter space rather than just selecting the model that produces the highest likelihood.

To quantify the model evaluation, the Bayes factor (Kass and Raftery, 1995) is applied, which is used to compare two models, model H_i over model H_j, as

$$B_{ij} = \frac{p(H_i|D)}{p(H_j|D)} = \frac{p(D|H_i)}{p(D|H_j)}\,\frac{p(H_i)}{p(H_j)}, \qquad \forall\, i,j \in [1,N],\; i \neq j. \qquad (8)$$

For convenience, the Bayes factor is expressed on a logarithmic scale in units of “decibans” (Jeffreys, 1965),

$$L_{ij} = 10\,\log_{10}(B_{ij}) \quad (\text{decibans}). \qquad (9)$$

This allows the evidence for two models to be quantitatively compared against one another. Among a finite set of models, the highest positive Bayes factor, L_{ij}, indicates that the data prefer model H_i over H_j the most. Therefore, the Bayes factor is also applied to select the finite number of models under consideration in the following (Sec. IV B).
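In practice the evidences are computed in log form, so Eqs. (8) and (9) reduce to a simple conversion. The helper below is a sketch with illustrative names; it assumes equal model priors, as assigned in the text, so the Bayes factor reduces to the evidence ratio.

```python
import numpy as np

def bayes_factor_decibans(log_Z_i, log_Z_j):
    """Eqs. (8)-(9) with equal priors p(H_i) = p(H_j): B_ij is the
    evidence ratio Z_i / Z_j; return L_ij = 10*log10(B_ij) in decibans."""
    return 10.0 * (log_Z_i - log_Z_j) / np.log(10.0)

# An evidence ratio of e^3 favors model H_i by about 13 decibans.
print(round(bayes_factor_decibans(5.0, 2.0), 2))   # ≈ 13.03
```

Working with natural-log evidences, as returned by typical nested-sampling implementations, avoids numerical overflow for sharply peaked likelihoods.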

Overall, this process tends to prefer models that are a generally good fit over a large portion of their parameter space rather than a very good fit only where the maximum likelihood occurs. This essentially imposes a penalty on overcomplicated models if they only increase the maximum likelihood rather than the average likelihood compared to simpler models. This is the quantitative implementation of Occam's razor, which favors simplicity over complexity when comparing models that represent measured data (Sivia and Skilling, 2006).

B. Parameter estimation

On the lower level of the two-tiered problem, parameters must be estimated for the selected model, as stated in Sec. III A. The subscripts of H_i and Θ_i are dropped for simplicity throughout the following discussion, but bear in mind that the model, H, has been given via the model selection and contains a specific set of parameters, Θ = {θ, φ, A}, including both angular and amplitude parameters. Bayes's theorem can be applied to determine the corresponding probabilities of these estimated parameters, yielding

$$p(\Theta|D,H) = \frac{p(D|\Theta,H)\, p(\Theta|H)}{p(D|H)}. \qquad (10)$$


Just as Bayes's theorem is applied to the model selection in Eq. (7), in this context it is applied to determine p(Θ|D,H), the probability of Θ given the experimental data set, D, and the given model, H. However, now both of these quantities are dependent on the parameter set, Θ. This means that they are defined at a specific location in the parameter space according to Θ, rather than across the entire parameter space.

The probability p(Θ|D,H) is referred to as the posterior probability distribution of the model parameters. p(D|Θ,H) represents the likelihood that the measured data D would have been generated for a given value of Θ; it represents, after observing the data, D, the likelihood of obtaining the observed realization as a function of the parameters, Θ, encapsulated in the model, H. The term p(Θ|H) represents the prior distribution of the parameters given the model, H. This should be assigned uniformly to avoid any preference, according to the principle of maximum entropy (Jaynes, 1968; Knuth et al., 2015). Finally, p(D|H) corresponds to the marginal likelihood, or Bayesian evidence (Skilling, 2004), or evidence in short. Recall that this quantity is crucial to the model selection process of Sec. III A.

The posterior probability of the parameters fulfills the condition

$$\int_{\Theta} p(\Theta|D,H)\, d\Theta = \int_{\Theta} \frac{p(D|\Theta,H)\, p(\Theta|H)}{p(D|H)}\, d\Theta = 1, \qquad (11)$$

i.e., it becomes unity when integrated over the entire parameter space. Because the evidence, p(D|H), does not depend on Θ, this equation can be rearranged to

$$p(D|H) = \int_{\Theta} p(D|\Theta,H)\, p(\Theta|H)\, d\Theta. \qquad (12)$$

The evidence of a given model, p(D|H), is evaluated over the entire parameter space by integrating the product of the likelihood and the prior distribution. This is the same evidence value as in Eqs. (7) and (8), indicating that both processes, model selection and parameter estimation, involve evaluating the likelihood of a given model over its parameter space. Therefore, both levels of Bayesian inference can be solved within a unified framework, as elaborated in the following.
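As a conceptual illustration of Eq. (12) (not the nested sampler this work actually uses, described in Sec. III D), the evidence for a single parameter with a uniform prior can be approximated on a grid; the helper name is hypothetical.

```python
import numpy as np

def evidence_on_grid(log_like, grid):
    """Eq. (12) for one parameter theta with a uniform prior
    p(theta|H) = 1/(b - a): Z ≈ sum_i L(theta_i) * dtheta / (b - a).
    Uses the log-sum-exp trick for numerical stability."""
    a, b = grid[0], grid[-1]
    d = grid[1] - grid[0]
    ll = np.array([log_like(t) for t in grid])
    m = ll.max()
    return float(np.exp(m) * np.sum(np.exp(ll - m)) * d / (b - a))

# Gaussian-shaped likelihood: its integral sqrt(2*pi) is spread over
# a prior of width 20, so Z ≈ sqrt(2*pi)/20.
Z = evidence_on_grid(lambda t: -0.5 * t ** 2, np.linspace(-10.0, 10.0, 2001))
print(Z)   # ≈ 0.1253
```

The example makes the Occam penalty of Sec. III A concrete: widening the prior range (b − a) without improving the fit dilutes the average likelihood and lowers Z. Grid integration scales exponentially with the number of parameters, which is why the sampling methods of Sec. III D are needed.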

C. Unified Bayesian framework

Equation (10) is rewritten in simplified notation as

$$p(\Theta|D,H)\cdot Z = \mathcal{L}(\Theta)\cdot p(\Theta|H), \qquad (13)$$

where the evidence, Z = p(D|H), is determined by Eq. (12), and the likelihood, \mathcal{L}(\Theta) = p(D|\Theta,H), is specified (Beaton and Xiang, 2017; Jasa and Xiang, 2012) for this work as

$$\mathcal{L}(\Theta) \propto \frac{\Gamma(Q/2)}{2}\,(2\pi E)^{-Q/2}, \qquad (14)$$

with Γ(·) the gamma function and

$$E = \frac{1}{2}\sum_{j=1}^{J}\sum_{k=1}^{K}\left[ D(\theta_j,\phi_k) - H(\theta_j,\phi_k) \right]^2, \qquad (15)$$

where Q = JK is the total number of data points, with θ_1 ≤ θ_j ≤ θ_J and φ_1 ≤ φ_k ≤ φ_K covering the entire angular range under consideration. The model, H(θ_j, φ_k), and the data, D(θ_j, φ_k), are determined by Eqs. (2) and (4), respectively.

Equations (12) and (13) indicate that the Bayesian evidence plays a central role in model selection. The evidence relies on exploration of the likelihood over the entire parameter space, which is in line with the parameter estimation, relying on the estimation of the posterior in Eq. (10). The formulations in both Secs. III A and III B can therefore be accomplished within one unified framework. In this Bayesian framework, the two terms on the right-hand side of Eq. (13) are input quantities, particularly the likelihood function in Eq. (14), while the two terms on the left-hand side are output quantities: the evidence, Z, is the output for Bayesian model selection, and the posterior, p(Θ|D,H), is the output for Bayesian parameter estimation. Of central importance within this unified framework is the evidence, and the numerical sampling for the evidence is what follows in Sec. III D.

D. Sampling methods

The Bayesian framework applied to the prediction of

DoAs for sound sources requires calculations of the evi-

dence, and so different sampling methods must be put in

place. There are a number of numerically efficient methods,

including nested sampling (Skilling, 2004) and slice sam-

pling (Neal, 2003), among others.

Nested sampling is the main sampling method utilized in this work, as it sufficiently explores the parameter space without bias, whereas slice sampling is used sparingly to build robustness into the algorithm. These

sampling methods have begun to be used more often in

recent acoustics applications, and can be further explored in

Fackler et al. (2018), Jasa and Xiang (2012), and Sivia and

Skilling (2006). Nested sampling is efficient in the process

of model selection as the parameter space must be suffi-

ciently populated. The main steps in this implementation of

the sampling method are summarized as follows:

(1) Identify a model for evaluation. In this research, a

beamforming model will be used.

(2) Select a prior distribution for each parameter in the

model based on knowledge of the data under investiga-

tion. In this research, all parameters were assigned a

uniform distribution with limits based on prior knowl-

edge of the problem. This uniform assignment is based

on the principle of maximum entropy (Gregory, 2005;

Knuth et al., 2015).

(3) Create a sufficient population, P, of sample models

with parameters generated randomly from the assigned

prior distributions; in this case, P ¼ 500.

(4) Evaluate the likelihood of each of the P samples using Eq. (14).

(5) Identify the sample with the smallest likelihood value.

4940 J. Acoust. Soc. Am. 146 (6), December 2019 Christopher R. Landschoot and Ning Xiang


(6) Store the likelihood value of the least-likely sample to track

likelihood progression over the course of the analysis.

(7) Perturb the parameters of the least-likely sample in a

random fashion and reevaluate its likelihood.

(a) If the sample now has a higher likelihood, move

on to the next step. If not, repeat this step until the

sample moves to a position of higher likelihood

in the parameter space.

(b) Before moving on to the next step, replace the

previous least-likely sample by this new sample

with a higher likelihood associated with its

parameters in the population.

(8) Repeat steps (5)–(7) until the sample population has

satisfied a self-defined convergence criterion, or until

some maximum number of iterations is met.

(9) Use likelihood values tracked in step (6) to estimate the

integral in Eq. (12) based on an approximation tech-

nique established for nested sampling (Fackler et al., 2018; Skilling, 2004). The result of this integral evalu-

ates the evidence for the model selected in step (1).

(10) Repeat steps (1)–(9) for all models under consideration.

(11) Use the evidence values from step (9) to determine

which model best fits the data under investigation. This

is the model selection step.

(12) To select appropriate parameters, either

(a) Adopt the parameters from the sample model of

maximum likelihood in the final population of the

selected model, or

(b) Use the parameter distributions over the final pop-

ulation of the model selected as the new prior dis-

tributions. Repeat steps (3)–(7) using these new

prior distributions until the population parameters

have converged to within a desired tolerance.

Once these exploration criteria have been met, all of

these likelihood values can be organized into an ascending

list and then integrated over the entire set using Eq. (12) to

determine the evidence term. This allows for a quantifiable

method to examine how well a model fits the experimental

data on an overall scale rather than for a discrete set of its

parameters. Bush and Xiang (2018) and Fackler et al. (2018)

have recently detailed the nested sampling implementation.
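The steps above can be sketched compactly. The following is a minimal, illustrative nested-sampling loop, not the authors' implementation: it assumes a generic `log_likelihood(theta)` callable, uniform box priors given by hypothetical bounds `lo` and `hi`, and a fixed iteration count in place of a convergence criterion in step (8); practical implementations (Skilling, 2004) use more sophisticated replacement moves than the simple Gaussian perturbation of step (7) shown here:

```python
import math
import random

def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def nested_sampling(log_likelihood, lo, hi, P=500, iterations=2000):
    """Estimate ln(evidence), Eq. (12), following steps (3)-(9).

    lo, hi : per-parameter uniform-prior bounds (step 2);
    P      : live-sample population size (step 3).
    """
    dim = len(lo)
    draw = lambda: [random.uniform(lo[d], hi[d]) for d in range(dim)]
    live = [draw() for _ in range(P)]                  # step (3)
    logL = [log_likelihood(t) for t in live]           # step (4)
    log_Z = -math.inf
    for i in range(iterations):
        worst = min(range(P), key=lambda n: logL[n])   # step (5)
        # steps (6), (9): prior-mass shell width X_i - X_{i+1}, X_i = e^{-i/P}
        log_w = -i / P + math.log(1.0 - math.exp(-1.0 / P))
        log_Z = logaddexp(log_Z, log_w + logL[worst])
        # step (7): randomly perturb the worst sample until it improves
        while True:
            cand = [min(hi[d], max(lo[d], live[worst][d]
                    + 0.1 * (hi[d] - lo[d]) * random.gauss(0, 1)))
                    for d in range(dim)]
            cand_logL = log_likelihood(cand)
            if cand_logL > logL[worst]:
                live[worst], logL[worst] = cand, cand_logL
                break
    # step (9): add the strip of prior mass still held by the live points
    m = max(logL)
    log_mean_L = m + math.log(sum(math.exp(l - m) for l in logL) / P)
    return logaddexp(log_Z, -iterations / P + log_mean_L)
```

Because the shell weights sum to the total prior mass, the returned value is a weighted average of the sampled likelihoods and converges toward the integral of Eq. (12) as P grows.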

IV. EXPERIMENTAL RESULTS

The measurement process required to validate the

Bayesian analysis method consists of arranging a number of

sound sources around the spherical microphone array and

then recording and processing the data. Measurements of

room impulse responses between one sound source and the

spherical microphone array are carried out in an enclosure

with sufficiently large dimensions so as to enable the isolation

of the portion of the impulse response corresponding to the

direct sound. In this way the experimentally measured

impulse responses can be considered in a quasi-free field.

A. Experimental measurement method

To accomplish this while maintaining flexibility and

efficiency, this work utilizes a single sound source (a

loudspeaker) to take impulse response measurements at vari-

ous locations around the spherical array. The impulse

response measurements are then combined in various num-

bers to synthesize sound fields with multiple concurrent

sound events arriving from different directions. To create

noise sources from these locations, the impulse responses

taken at different locations are convolved with white noise

to simulate multiple white noise sources situated at different

locations around the spherical array. This allows the algorithm to be tested in several different ways.
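The synthesis just described can be sketched as follows; the impulse-response arrays here are random placeholders standing in for the measured multichannel responses, with the 16-channel layout taken from the array used in this work:

```python
import numpy as np

def synthesize_sources(irs, noise_len, rng):
    """Sum independent white-noise sources rendered through measured
    multichannel impulse responses.

    irs : list of (M, L) arrays, one per source position; each is
          convolved channel-by-channel with its own white-noise
          excitation, and the renders are summed into an M-channel mix.
    """
    M = irs[0].shape[0]
    L = max(ir.shape[1] for ir in irs)
    out = np.zeros((M, noise_len + L - 1))
    for ir in irs:
        noise = rng.standard_normal(noise_len)     # independent source
        for m in range(M):
            y = np.convolve(ir[m], noise)          # render one channel
            out[m, :y.size] += y
    return out

rng = np.random.default_rng(0)
# two hypothetical source positions, 16 channels, 256-tap responses
irs = [rng.standard_normal((16, 256)) for _ in range(2)]
mix = synthesize_sources(irs, noise_len=4800, rng=rng)
```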

When measuring the impulse responses, the loudspeaker

is situated approximately 1–1.5 m away from the spherical

array for various tests (see Fig. 2). The microphone array

itself is located approximately 1.5 m above the ground in

order to ensure the ability to separate the direct sound from

the floor reflection in the impulse responses. Logarithmic sine sweeps are played through the loudspeaker and recorded

by the spherical microphone array. All the responses to the

sweeps are averaged together to mitigate any abnormalities

and improve the signal-to-noise ratio. The output impulse

responses are then windowed to remove any reflections from

the floor and room. White noise is convolved with various

combinations of the impulse responses measured at different

locations. The data processed in Eq. (4) are a normalized

power spectral density function over frequency, as implicit in Eqs. (5) and (6). In order to obtain directional responses, one may also sum the signal power over the frequency range of interest for each angular direction. Rough prior knowledge of the frequency characteristics of the sound sources is beneficial for creating data with high signal-to-noise ratios, but it is generally less critical for signals that are broadband in nature, such as speech signals. This work processes

the frequency range between 400 Hz and 4 kHz for potential

speech applications.
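The band-limited power extraction can be sketched as follows; the tone signals and 48 kHz sampling rate are hypothetical, and in the actual processing this summation is applied per steered angular direction rather than per raw signal:

```python
import numpy as np

def band_power(x, fs, f_lo=400.0, f_hi=4000.0):
    """Sum the spectral power of each signal between f_lo and f_hi (Hz).

    x : (M, N) array of M signals with N samples at sampling rate fs.
    Returns the per-signal band power, normalized by the maximum.
    """
    X = np.fft.rfft(x, axis=-1)
    f = np.fft.rfftfreq(x.shape[-1], d=1.0 / fs)
    band = (f >= f_lo) & (f <= f_hi)
    p = np.sum(np.abs(X[..., band]) ** 2, axis=-1)
    return p / p.max()

fs = 48000
t = np.arange(fs) / fs
# signal 0: 1 kHz tone (in band); signal 1: 8 kHz tone (out of band)
x = np.stack([np.sin(2 * np.pi * 1000 * t), np.sin(2 * np.pi * 8000 * t)])
p = band_power(x, fs)
```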

B. Results

This section presents two sets of results to demonstrate

the outcomes of the Bayesian process and its advantage over

traditional methods. Spherical harmonic beamforming for two and three concurrent sound sources is carried out on the experimental data. The results here demonstrate the prediction capability of the model in Eq. (2) for the experimen-

tal data and that the two-level Bayesian inference

FIG. 2. (Color online) The measurement setup consisted of a speaker ori-

ented in a specific manner located approximately 1–1.5 m away from the

spherical microphone array at various positions around the array for the vari-

ous measurements.


quantitatively implements Occam’s razor to estimate the

number of sound sources present in the data. After the

Bayesian model selection, the estimated DoAs are also

accomplished given the selected model. Using these results,

the readers should better comprehend the two-tiered proce-

dure, which is advantageous over other alternatives.

Figure 3 illustrates the results for the set of two concurrent sound sources over an angular range of 360° × 180° for azimuth, φ, and elevation, θ. Figure 3(a) illustrates the sound energy distributions derived from experimentally measured data using Eqs. (4)–(6), while Fig. 3(b) illustrates the model predicted results using Eqs. (2) and (3), which allow for the visualization of the sound field distribution around the spherical microphone array in Cartesian coordinates. The grid resolution for these two-dimensional maps is 3.6° × 3.6° with grid points of K × J = 100 × 50 across the azimuth and elevation range as expressed in Eq. (15).

Figure 4 illustrates Bayesian evidence and Bayes's factor estimations over the different models H_S from Eq. (2). Each model represents a different number of sound sources. The evidence for each model is evaluated over 15 individual runs using nested sampling. According to the Bayesian model selection scheme discussed in Sec. III A, Fig. 4(a) illustrates Bayesian evidence estimations over the different models H_S in Eq. (2) for S = 1, 2, 3, 4, and 5. With an increasing number of sound sources, the evidence drastically increases from one source to two sources. From two sources to three and beyond, the mean evidence does not significantly increase; rather, it slightly decreases. Figure 4(b) illustrates the Bayes's factor estimates, L_ij from Eq. (9) in decibans, for i = 2, …, 5 over j = 1, …, 4. The highest Bayes's factor is at the case of two sources, expressing that the data prefer model H_2 over model H_1 the most, much more strongly than the preference of model H_3 over H_2, and so on. Figure 4(b) clearly indicates that the experimental data prefer the model H_2, indicating the presence of two sound sources. After model selection, the evidence estimate of the two-source model can be readily used to estimate the posterior probability using Eq. (10) or Eq. (13). At the same time, the likelihood values thoroughly sampled over the entire parameter space are also readily available. The Bayesian parameter estimation (in Sec. III B) finds the angular parameters as listed in Table I. Note that both the experimental and predicted data are analyzed at an angular resolution of 3.6°, and they inevitably contain errors, as listed in Table I and graphically illustrated in Fig. 4.
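The Bayes's factor comparison reduces to differences of log-evidences; a sketch with hypothetical ln-evidence values, taking the deciban as 10·log10 of the evidence ratio:

```python
import math

def bayes_factor_decibans(log_Z_i, log_Z_j):
    """Bayes's factor B_ij = Z_i / Z_j in decibans, i.e.
    10 * log10(Z_i / Z_j), computed from natural-log evidences."""
    return 10.0 * (log_Z_i - log_Z_j) / math.log(10.0)

# Hypothetical ln-evidence values for models H_1 ... H_5, with a
# sharp rise from one source to two, then a slight decline
log_Z = {1: -250.0, 2: -120.0, 3: -122.0, 4: -123.5, 5: -124.0}
L = {(i, i - 1): bayes_factor_decibans(log_Z[i], log_Z[i - 1])
     for i in range(2, 6)}
best = max(L, key=L.get)   # model pair with the largest Bayes's factor
```

For these hypothetical values, the largest factor compares H_2 against H_1, which is the pattern that selects the two-source model.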

In a similar fashion, Fig. 5 illustrates the results for the set of three concurrent sound sources over an angular range of 360° × 180° for azimuth, φ, and elevation, θ. The grid resolution is 3.6° × 3.6° with grid points of K × J = 100 × 50 across the azimuth and elevation range as expressed in Eq. (15). Figure 5(a) illustrates the sound energy distributions derived from experimentally measured data using Eqs. (4)–(6), while Fig. 5(b) illustrates the model predicted results using Eqs. (2) and (3) for the case S = 3, in which three sound sources are present. Given the limited angular resolution of

the 16-channel spherical microphone array, sole visual

FIG. 3. (Color online) Directional responses of two sound sources in the form of two-dimensional sound energy distribution. The directions of the sound sources are at (75°, 90°) and (270°, 90°). (a) Experimentally measured beamforming data. Two solid dots indicate the source DoAs. (b) Bayesian model predicted sound energy distribution. Two solid dots indicate the estimated DoAs.

FIG. 4. (Color online) Mean evidence estimates along with variances given the experimental data. The data contain two sound sources at (75°, 90°) and (270°, 90°). (a) The mean evidence in decibans, evaluated over 15 individual random sampling runs using nested sampling. (b) The Bayes's factor in decibans, comparing the evidence of the current number of sources to the previous number. (c) Magnified view of the evidence for three sources with variations.


inspection of the sound energy distributions would convey poorly resolved DoA information. Yet, the model-based Bayesian analysis still provides well-resolved DoA estimations even when the information provided by the spherical microphone array is limited.

Figure 6(a) illustrates Bayesian evidence estimations over the different models, H_S, from Eq. (2) for S = 1, 2, 3, 4, and 5. As the number of sound sources increases, the evidence drastically increases from two sources to three, while the increase from three sources to four is clearly less. No significant increase is observed from four sources to five, but rather a slight decrease. The Bayes's factor estimates, L_ij from Eq. (9), for i = 2, …, 5 over j = 1, …, 4, as shown in Fig. 6(b), show that the highest Bayes's

factor is at the case of three sources. Namely, the data prefer

model H3 over model H2 the most. This preference was

much higher than that of model H4 over H3, and so on.

Figure 6(b) clearly indicates that the experimental data

prefer the model H3, indicating the presence of three sound

sources. After the selection of the three source model, the

Bayesian evidence for this model is readily available. At the

same time, the likelihood values in Eq. (10) for this model

have already been thoroughly sampled over the entire

parameter space using nested sampling. Therefore, the

parameter values can easily be extracted from the parameter

set with the highest likelihood within the three source model.

The Bayesian parameter estimation (in Sec. III B) leads to

the angular parameters as listed in Table II. Note that both the experimental and predicted data are analyzed at an angular resolution of 3.6°.

V. DISCUSSION

The spherical harmonic beamforming models estab-

lished to predict the data will never be exactly what is

TABLE I. Experimentally measured and predicted DoAs for two concurrent sound sources. The variations are estimated using the Bayesian method over 15 runs. The errors are the differences between the experimental and predicted values. Experimental data are analyzed at an angular resolution of 3.6°.

Comparison    Direction of Arrival (φ, θ)
Experiment    (75°, 90°)            (270°, 90°)
Estimates     (70.58°, 95.46°)      (261.2°, 80.2°)
Deviation     (±1.55°, ±0.73°)      (±1.47°, ±0.66°)
Error         (4.42°, 5.46°)        (8.8°, 9.8°)

FIG. 5. (Color online) Directional responses of three sound sources in the form of two-dimensional sound energy distribution. The directions of the sound sources are (5°, 60°), (135°, 140°), and (270°, 90°). (a) Experimentally measured beamforming data. Three solid dots indicate the source DoAs. (b) Bayesian model predicted sound energy distribution. Three solid dots indicate the estimated DoAs.

FIG. 6. (Color online) Mean evidence estimates along with variances given the experimental data. The data contain three sound sources at (5°, 60°), (135°, 140°), and (270°, 90°). (a) The mean evidence in decibans, evaluated over 15 individual random sampling runs using nested sampling. (b) The Bayes's factor in decibans, comparing the evidence of the current number of sources to the previous number. (c) Magnified view of the evidence for four sources with variations.

TABLE II. Experimentally measured and predicted DoAs for three concurrent sound sources. The variations are estimated using the Bayesian method over 15 runs. The errors are the differences between the experimental and predicted data. Both data sets are analyzed at an angular resolution of 3.6°.

Comparison    Direction of Arrival (φ, θ)
Experiment    (5°, 60°)          (135°, 140°)        (270°, 90°)
Estimates     (8.7°, 72.4°)      (125.8°, 148.6°)    (254.1°, 71.5°)
Deviation     (±8.6°, ±8.2°)     (±17.5°, ±1.4°)     (±7.7°, ±7.4°)
Error         (3.7°, 12.4°)      (9.2°, 8.6°)        (15.9°, 18.5°)


measured, but rather approximations, as they are discretized

by a finite, limited number of microphones. Further, there is

always background noise present, as well as noise introduced

by the system, so it is only possible to approximately predict

the behavior of the sound field. This current work focuses on

predicting the attributes of the most apparent sound sources

in an acoustic environment. The two-tiered Bayesian analy-

sis method, including the model selection and parameter

estimation, seems to be an appropriate method for determin-

ing the number of sources first and then their DoAs.

Figure 3(a) shows two well-separated sound sources. One can recognize their directions, which should be located at the two solid dots. However, the regions of highest sound energy do not coincide exactly with the nominal DoAs, indicating that the physical placement of the sound sources at the listed directions (75°, 90°) and (270°, 90°) may itself be inaccurate. For this reason, the prediction errors listed in Tables I and II need to be evaluated with this source of experimental error in mind.

Tables I and II also indicate that this work can predict source parameters within varying tolerances depending on the number of sources. With three concurrent sources, there are more variations in the estimations, and so the estimation errors may reach ±18.5° for some source locations. Although this variance is large compared to the other predictions, which range closer to ±5° or ±10°, it is still

well within the resolution of a second-order spherical micro-

phone array. Note that the variance of measurement angles

are not absolute in their error, and one source of errors also

comes from experimental errors when placing the sound

sources. For example, an azimuthal (/) measurement error

of 18� may sound like an incredible amount of error.

However, this amount of error has a drastically different

meaning depending on the elevation (h) component. If the

elevation component has been correctly predicted as 0� or

180� (i.e., the top or the bottom of the sphere), the azimuthal

component effectively does not alter the location of the

source at all, as ð180�; 180�Þ and ð0�; 180�Þ for ð/; hÞ would

effectively represent the same location in physical space.

Therefore, the effect of the estimation errors in the azimuthal

angle drastically decreases as the elevation component

approaches the top or bottom of the sphere (i.e., 180� or 0�).Since the Bayesian model selection and the Bayesian DoA

estimation discussed in this paper rely essentially on forward

evaluations of the beamforming model in Eq. (2) and the

spherical harmonic beamforming process in Eq. (4), singu-

larities near the poles and equator as discussed, for example,

in some other DoA estimation approaches (Jo and Choi,

2018; Wong et al., 2019; Zuo et al., 2018) are avoided.
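The geometric point about azimuthal errors is easy to verify numerically; the sketch below computes the great-circle separation produced by a fixed 18° azimuthal error at several elevations, with elevation θ measured from the pole (0° and 180° at the poles), consistent with the convention used here:

```python
import math

def great_circle_deg(phi1, th1, phi2, th2):
    """Great-circle angle (degrees) between two directions given as
    (azimuth phi, elevation theta measured from the pole), in degrees."""
    p1, t1, p2, t2 = (math.radians(a) for a in (phi1, th1, phi2, th2))
    cosg = (math.cos(t1) * math.cos(t2) +
            math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosg))))

# an 18-degree azimuth error evaluated at several elevations
errors = {th: great_circle_deg(0.0, th, 18.0, th) for th in (90, 45, 10, 0)}
```

At θ = 90° the full 18° separation is realized, while toward the pole (θ → 0°) the same azimuthal offset produces a vanishing angular separation.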

This trend in decreased performance with an increase in

the number of sound sources or ambiguity of the sound field

also manifests itself in the confidence of the model selection

process. For the case of two concurrent sources, the

Bayesian evidence estimates alone present stable estimations

among individual sampling runs. They also show behavior

consistent with Occam’s razor, knowing the test scenario to

be the two sound sources. As the number of sources

increased throughout all of the test runs, the variance over

individual sampled runs became slightly larger. The experi-

mentally measured data, given a second-order spherical

microphone array, are considered to carry sufficient informa-

tion. The Bayes’s factor, which represents relative Bayesian

evidence, is at a maximum for the three source model.

The trend of increased ambiguity from two sound sour-

ces to three, resulting in less confidence in the results, makes logical sense, as this ambiguity can be a result of a higher

number of sound sources or a higher noise level. A remedy

for the ambiguity produced by higher numbers of sources is

to increase the order of the spherical microphone array with

which the data are recorded. The spherical microphone array

utilized in this work can only beamform up to the third order

given its 16 channels, providing for a rather limited spatial

resolution. Increasing the number of microphones on the

spherical array, thus increasing its order, would provide a

higher spatial resolution and an increase in the confidence in

model selection as well as parameter estimation. There is, of

course, a limit to the increase in confidence level. Just as

three sound sources appeared more ambiguous for a third-

order spherical array, higher order spherical arrays will reach

a limit in spatial resolution as well. This will lead to a gen-

eral trend of decreasing confidence level with an increase in

the number of sound sources.

Full spherical microphone/sensor arrays are more suit-

able for applications when sound sources are expected

around the arrays from all possible directions, such as moor-

ing in deep oceans or hanging in open spaces. Furthermore,

the Bayesian formulation based on the spherical harmonic

theory is also straightforwardly extended to hemispherical or

cylindrical array configurations.

There have been many previous attempts to determine DoAs appropriately. A simple approach that merely correlates high sound energy with noise sources, however, would fail to identify the correct number of sources and DoAs for the case of three concurrent sources, as illustrated in Fig. 5; the Bayesian model selection instead leverages probability.

The Bayesian approach presented in this paper allows DoAs

to be algorithmically predicted without visual inspection,

making it especially effective when sound sources may be

partially masked or close to others. A good example is the

triple source data presented in Fig. 5. Visually, it would be

very challenging to determine the number of sources present.

If just the peak energy values were measured, the correct

number of sources may not be correctly determined, let

alone their correct locations. This Bayesian method provides

an improvement of sound source localization without having

to increase the resolution of the spherical microphone array.

VI. CONCLUDING REMARKS

The present work applies the Bayesian method to beam-

formed models, comparing them to experimental data that

are recorded by a spherical microphone array in order to esti-

mate the DoAs of concurrent sound sources. Through a two-

tiered approach to this problem of first estimating the num-

ber of sound sources and then estimating their DoAs, both of

these pieces of information can be reliably estimated. This

Bayesian method provides an improvement of the detection

of sound sources over previous methods. The Bayesian


method in this work is integral to solving this problem, as

well as other problems that may arise in acoustics.

Although this work demonstrates the feasibility of

Bayesian model selection as a means to determine the DoAs

of sound sources, there is a great deal that can be improved

upon in future research. This research focuses on estimating the locations of direct sound sources and ignores any reflections in the space, which is directly relevant only to open

space scenarios. For future room acoustical applications, this

method of DoA analysis could potentially be extended to

determine the locations of reflections within an enclosed

space. Spatial impulse responses can be measured with a

spherical array in a space such as a concert hall. This proce-

dure may be able to identify the origins of troublesome

reflections, accomplishing a task that would otherwise be

left to the acoustician’s intuition.

This method has also only been tested on a 16-channel spherical microphone array for proof of concept. This second-order spherical array offers relatively limited spatial resolution. Increasing the number of microphones, and thus the array order, would increase the spatial clarity, allowing for a more definitive localization of concurrent sound sources and for more sound sources to be localized. Whereas this

research only tested up to three sound sources, many com-

plex sound fields have far more than simply three distinct

sources occurring at the same time. In addition to this, there

are other methods of beamforming and ways to improve the

quality of the signal in recent works, which can be applied to

increase the quality of the measurement and DoA analysis.

Another topic to be explored is applying this Bayesian

algorithm to different types of sound sources. For this exper-

iment, only white noise was chosen because of its rich spec-

tral content. But it would be interesting to determine how

well this algorithm would perform for other, more natural sounds, such as speech or musical instruments. This

could open up the door for a range of applications.

Implementation of alternative Markov chain Monte Carlo

sampling techniques could also provide another avenue to improve computational efficiency, hopefully motivating practical applications.

ACKNOWLEDGMENTS

The authors are grateful to Dr. J. Botts, Dr. J. Braasch,

and Dr. S. Clapp for the insightful discussions during the

early stages of this work. The authors also thank Mr.

Jonathan Kawasaki and Mr. Jonathan Matthew for help in

the instrumentation of the spherical harmonics microphone

arrays.

Ahrens, J., and Spors, S. (2012). “Wave field synthesis of a sound field

described by spherical harmonics expansion coefficients,” J. Acoust. Soc.

Am. 131(3), 2190–2199.

Beaton, D., and Xiang, N. (2017). “Room acoustic modal analysis using

Bayesian inference,” J. Acoust. Soc. Am. 141(6), 4480–4493.

Blandin, C., Ozerov, A., and Vincent, E. (2012). “Multi-source TDOA esti-

mation in reverberant audio using angular spectra and clustering,” Signal

Process. 92, 1950–1960.

Bush, D., and Xiang, N. (2018). “A model-based Bayesian framework for

sound source enumeration and direction of arrival estimation using a

coprime microphone array,” J. Acoust. Soc. Am. 143(6), 3934–3945.

Craven, P. G., and Gerzon, M. A. (1977). “Coincident microphone simula-

tion covering three dimensional space and yielding various directional out-

puts,” U.S. patent US 4042779 A (August 16, 1977).

DuHamel, R. H. (1952). “Pattern synthesis for antenna arrays on circular,

elliptical and spherical surfaces,” Tech. Rep. 16, Electrical Engineering

Research Laboratory, University of Illinois, IL.

Escolano, J., Xiang, N., Perez-Lorenzo, J. M., Cobos, M., and Lopez, J. J.

(2014). “A Bayesian direction-of-arrival model for an undetermined num-

ber of sources using a two-microphone array,” J. Acoust. Soc. Am. 135(2),

742–753.

Fackler, C. J., Xiang, N., and Horoshenkov, K. (2018). “Bayesian acoustic

analysis of multilayer porous media,” J. Acoust. Soc. Am. 144(4),

3582–3592.

Fernandez-Grande, E. (2016). “Sound field reconstruction using a spherical

microphone array,” J. Acoust. Soc. Am. 139(3), 1168–1178.

Furness, R. K. (1990). “Ambisonics—An overview,” in 8th Int. Conf.: The Sound of Audio (May 1990), pp. 181–190.

Gregory, P. C. (2005). Bayesian Logical Data Analysis for the Physical Sciences (Cambridge University Press, Cambridge, UK), pp. 184–242.

Jasa, T., and Xiang, N. (2012). “Nested sampling applied in Bayesian room-

acoustics decay analysis,” J. Acoust. Soc. Am. 132(5), 3251–3262.

Jaynes, E. (1968). “Prior probabilities,” IEEE Trans. Syst. Sci. Cyber. 4(3),

227–241.

Jeffreys, H. (1965). Theory of Probability, 3rd ed. (Oxford University Press,

Oxford, UK), pp. 193–244.

Jo, B., and Choi, J.-W. (2017). “Spherical harmonic smoothing for localiz-

ing coherent sound sources,” IEEE/ACM Trans. Audio Speech Lang.

Process. 25(10), 1969–1984.

Jo, B., and Choi, J.-W. (2018). “Direction of arrival estimation using nonsin-

gular spherical ESPRIT,” J. Acoust. Soc. Am. 143(3), EL181–EL187.

Kass, R. E., and Raftery, A. E. (1995). “Bayes factors,” J. Amer. Stat.

Assoc. 90(430), 773–795.

Khaykin, D., and Rafaely, B. (2012). “Acoustic analysis by spherical micro-

phone array processing of room impulse responses,” J. Acoust. Soc. Am.

132(1), 261–270.

Knuth, K. H., Habeck, M., Malakare, N. K., Mubeen, A. M., and Placek, B.

(2015). “Bayesian evidence and model selection,” Digital Signal Process.

47, 50–67.

Li, Z., and Duraiswami, R. (2005). “Hemispherical microphone arrays for

sound capture and beamforming,” in Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE), pp. 106–109.

Madhu, N., and Martin, R. (2008). Acoustic Source Localization with Microphone Arrays (John Wiley & Sons, Chichester, UK), pp. 135–166.

Meyer, J., and Elko, G. (2002). “A highly scalable spherical microphone

array based on an orthonormal decomposition of the soundfield,” in

ICASSP, Proc. IEEE Int. Conf. on Acoust., Speech and Signal Process.

Vol. 2, pp. 1781–1784.

Mohan, S., Lockwood, M. E., Kramer, M. L., and Jones, D. L. (2008).

“Localization of multiple acoustic sources with small arrays using a coher-

ence test,” J. Acoust. Soc. Am. 123(4), 2136–2147.

Morgenstern, H., and Rafaely, B. (2018). “Modal smoothing for analysis of

room reflections measured with spherical microphone and loudspeaker

arrays,” J. Acoust. Soc. Am. 143(2), 1008–1018.

Nadiri, O., and Rafaely, B. (2014). “Localization of multiple speakers under

high reverberation using a spherical microphone array and the direct-path

dominance test,” IEEE/ACM Trans. Audio Speech Lang. Process. 22(10),

1494–1505.

Nannuru, S., Koochakzadeh, A., Gemba, K. L., Pal, P., and Gerstoft, P.

(2018). “Bayesian learning for beamforming using sparse linear arrays,”

J. Acoust. Soc. Am. 144(5), 2719–2729.

Neal, R. M. (2003). “Slice sampling,” Ann. Statist. 31(3), 705–741.

Rafaely, B. (2015). Fundamentals of Spherical Array Processing, 1st ed.

(Springer GmbH, Berlin), Chap. 5.

Richard, A., Fernandez-Grande, E., Brunskog, J., and Jeong, C.-H. (2017).

“Estimation of surface impedance at oblique incidence based on sparse

array processing,” J. Acoust. Soc. Am. 141(6), 4115–4125.

Shabtai, N. R., and Vorländer, M. (2015). “Acoustic centering of sources with

high-order radiation patterns,” J. Acoust. Soc. Am. 137(4), 1947–1961.

Sivia, D., and Skilling, J. (2006). Data Analysis: A Bayesian Tutorial, 2nd

ed. (Oxford University Press, New York), Chap. 9.

Skilling, J. (2004). “Nested sampling,” AIP Conf. Proc. 735(1), 395–405.

Sun, H., Mabande, E., Kowalczyk, K., and Kellermann, W. (2012).

“Localization of distinct reflections in rooms using spherical microphone

array eigenbeam processing,” J. Acoust. Soc. Am. 131(4), 2828–2840.


Torres, A. M., Lopez, J. J., Pueo, B., and Cobos, M. (2013). “Room acous-

tics analysis using circular arrays: An experimental study based on sound

field plane-wave decomposition,” J. Acoust. Soc. Am. 133(4), 2146–2156.

Vorländer, M., Schröder, D., Pelzer, S., and Wefers, F. (2015). “Virtual real-

ity for architectural acoustics,” J. Build. Perform. Simul. 8(1), 15–25.

Wasserman, L. (2000). “Bayesian model selection and model averaging,”

J. Math. Psychol. 44(1), 92–107.

Williams, E. (1999). Fourier Acoustics: Sound Radiation and Near Field Acoustical Holography, 2nd ed. (Academic, London), pp. 185–232.

Wong, K. T., Morris, Z. N., and Nnonyelu, C. J. (2019). “Rules-of-thumb to

design a uniform spherical array for direction finding—Its Cramér-Rao

bounds’ nonlinear dependence on the number of sensors,” J. Acoust. Soc.

Am. 145(2), 714–723.

Xiang, N., and Fackler, C. (2015). “Objective Bayesian analysis in

acoustics,” Acoust. Today 11(2), 54–61.

Xiang, N., and Goggans, P. M. (2001). “Evaluation of decay times in cou-

pled spaces: Bayesian parameter estimation,” J. Acoust. Soc. Am. 110(3),

1415–1424.

Xiang, N., and Goggans, P. M. (2003). “Evaluation of decay times in cou-

pled spaces: Bayesian decay model selection,” J. Acoust. Soc. Am.

113(3), 2685–2697.

Zhao, S., Dabin, M., Cheng, E., Qiu, X., Burnett, I., and Liu, J. C. (2018).

“Mitigating wind noise with a spherical microphone array,” J. Acoust.

Soc. Am. 144(6), 3211–3220.

Zuo, L., Pan, J., and Ma, B. (2018). “Fast DOA estimation in the spectral

domain and its applications,” Prog. Electromagn. Res. M 66, 73–85.
