ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION...

5
Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000 DAFX-1 ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION SYSTEM BY COMPENSATING INTERFERING REFLEXIONS A. Sontacchi, R. Hoeldrich Institute of Electronic Music and Acoustics University of Music and Dramatic Arts Graz [email protected], [email protected] ABSTRACT The antique stereophonic recording and playback format is going to be replaced by new surround sound formats in the near future. At the moment, various surround techniques are being investigated in many artistic and technical applications. The main concern is to find an appropriate recording and playback format which supports the natural spatial hearing cues. Therefore, surround sound systems should provide a homogeneous and coherent sound field image, both for recorded and synthesized sound fields [1]. In a homogeneous sound reproduction system, no direction is treated preferentially. Coherent sound field image means that the image remains stable under changes of the listener position, though the image may change as a natural sound field does. The Holophony and Ambisonic system described by Nicol and Emerit [2] is the basic approach. This system will be extended by a new approach to compensate the interfering reflections of the reproduction room. Further possibilities to determine higher order Ambisonic signals using the beam forming approach are investigated. 1. INTRODUCTION This paper deals with a improved implementation of a sound spatialisation system rendering a recorded or synthesized 3D sound field using an arbitrary loudspeaker array in any desired room. First of all, the main spatial hearing cues must be considered. The following mechanisms indicate the position of sound in 3D space to the auditory cortex: Time differences between the ears (ITD, interaural time difference) Intensity differences between the ears (ILD, interaural level difference) Complex distance cues (reverberation, spectral balance modifications) Spectral cues caused by the shape of the head and the body (HRTFs Head Related Transfer Functions) Sensory augmentation (supplementary visual cues) Conventional cinema or TV surround systems (e.g. Dolby Prologic, DTS etc.) do not accomplish all these requirements because they only attempt to give a general impression of the space. Due to additional visual cues they work quite well and satisfying. HRTF related systems perform very accurately but 2. AMBISONIC AND HOLOPHONY 2.1. Ambisonic

Transcript of ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION...

Page 1: ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION …decoy.iki.fi/dsound/ambisonic/motherlode/source/... · 2017-03-10 · Proceedings of the COST G-6 Conference on Digital Audio

Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000

DAFX-1

ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTIONSYSTEM BY COMPENSATING INTERFERING REFLEXIONS

A. Sontacchi, R. Hoeldrich

Institute of Electronic Music and AcousticsUniversity of Music and Dramatic Arts Graz

[email protected], [email protected]

ABSTRACT

The antique stereophonic recording and playback format is goingto be replaced by new surround sound formats in the near future.At the moment, various surround techniques are beinginvestigated in many artistic and technical applications. The mainconcern is to find an appropriate recording and playback formatwhich supports the natural spatial hearing cues. Therefore,surround sound systems should provide a homogeneous andcoherent sound field image, both for recorded and synthesizedsound fields [1]. In a homogeneous sound reproduction system,no direction is treated preferentially. Coherent sound field imagemeans that the image remains stable under changes of the listenerposition, though the image may change as a natural sound fielddoes. The Holophony and Ambisonic system described by Nicoland Emerit [2] is the basic approach. This system will beextended by a new approach to compensate the interferingreflections of the reproduction room. Further possibilities todetermine higher order Ambisonic signals using the beamforming approach are investigated.

1. INTRODUCTION

This paper deals with a improved implementation of a soundspatialisation system rendering a recorded or synthesized 3Dsound field using an arbitrary loudspeaker array in any desiredroom. First of all, the main spatial hearing cues must beconsidered. The following mechanisms indicate the position ofsound in 3D space to the auditory cortex:

• Time differences between the ears (ITD, interaural timedifference)

• Intensity differences between the ears (ILD, interaural leveldifference)

• Complex distance cues (reverberation, spectral balancemodifications)

• Spectral cues caused by the shape of the head and the body(HRTFs Head Related Transfer Functions)

• Sensory augmentation (supplementary visual cues)

Conventional cinema or TV surround systems (e.g. DolbyPrologic, DTS etc.) do not accomplish all these requirements

because they only attempt to give a general impression of thespace. Due to additional visual cues they work quite well andsatisfying. HRTF related systems perform very accurately but

2. AMBISONIC AND HOLOPHONY

2.1. Ambisonic

Page 2: ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION …decoy.iki.fi/dsound/ambisonic/motherlode/source/... · 2017-03-10 · Proceedings of the COST G-6 Conference on Digital Audio

Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000

DAFX-2

Figure 1. Recording of a single source with

direction .

Let us consider1 the sound field in a point ),(rr = , caused by aplanar wave (reference wave) arriving from direction , as givenin Eq. 1 (and Fig.1).

)cos(),(Referenz

== rjkePrkjePrS (1)

The vector k represents the wavenumber 2=k and the

direction of arrival. Furthermore, the vector r describes thepoint of interest (listening position). During reproduction, Nloudspeakers are arranged symmetrically along a circuit (Fig. 2).Each of the loudspeakers ( nnP , ) radiates a plane wave (Eq. 2).

)cos(.),( nn jkrenPrkjenPrnS == (2)

Figure 2. Reproduction

The superposition of the individual waves is described by thesystem wave SAmbisonic.

==

====

N

n

njkrenP

N

n

rkjenPN

nrnSrS n

1

)cos(

11),(),(Ambisonic

(3)

In order to reproduce the original sound field, the sound fieldcaused by the Ambisonic system has to meet Eq. 4.

),(Ambisonic),(Referenz rSrS (4)

Planar waves are specified by a constant pressure term and anoscillating phase term (e.g. see Eq.1). The oscillating term can bedeveloped in a Bessel-Fourier series [7]. Using this property, wecan rewrite equation 1 and 3:

1 The following considerations are restricted to the 2-dimensional

case.

=+=

=

1)](cos[)(2)(0

)cos(),(Referenz

mmkrmJmiPkrJP

jkrePrS

=++=

1)]sin()sin()cos())[cos((2)(0

mmmmmkrmJmikrJP

==

====

N

n

njkrenP

N

n

rnkjenP

N

nrnSrS

1

)cos(11

),(),(Ambisonic

=++=

N

mnmmnmmkrmJmikrJnP

1 1)]sin()sin()cos())[cos((2)(0

Equation 4 is met if the coefficients of the mcos and msinfunctions are identical. Therefore we get the matching conditionsgiven in equation 5.

==

==

==

N

nnmnPmP

N

nnmnPmP

N

nnPP

1)sin()sin(

1)cos()cos(

1 = ,...,1m (5)

This system consists of an infinite number of equations. Underpractical considerations the number of equation is limited by thechosen system order m. The higher the system order, the moreaccurate the original sound field will be reproduced. Furthermore,the area of correct reproduction will be enlarged.

2.1.1 Encoding

On the left hand side of Eq. 5, we already see the necessary

SW = 707.0cos= SX sin= SY

2cos= SU 2sin= SV

Page 3: ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION …decoy.iki.fi/dsound/ambisonic/motherlode/source/... · 2017-03-10 · Proceedings of the COST G-6 Conference on Digital Audio

Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000

DAFX-3

2.1.2 Decoding

The decoding equations can be obtained by rewriting Equation 5in matrix form.

xAb = (6)

with Tb ,...],,,,[ VUYXW=

[ ]Tx NPPPP ,...3,2,1=

=

)sin()2sin()1sin(

)sin()2sin()1sin(

)cos()2cos()1cos(

111

Nmmm

N

NA

Vector b represents the Ambsionic signals, and the matrix A isdefined by the arrangement of the loudspeaker array. Theunknown loudspeaker feeds Pn are described by vector x. Weobtain the loudspeaker feeds by solving the following normalequations

bAAAx TT= 1)( (7)

The loudspeaker feeds are found by weighting the Ambisonicsignals according to the loudspeaker position. If the loudspeakerarray is symmetrically arranged along a circuit (2 dimensionalcase, in the 3 dimensional case along a sphere) this will lead to:

))2sin(2)2cos(2)sin(2)cos(2(1

iViUiYiXWNiP ++++=

As a restriction for the decoding array, the number of theloudspeakers must exceed the number of the encoded Ambisonicsignals and the loudspeaker array should be as symmetrical aspossible, otherwise a parameterization of the loudspeaker feeds isnecessary [8], [9]. Using signals of higher order spherical

2.2. Holophony

[ ]=S

dSnSrPSSrRrGSrRrGSSrPRrP )()()()(4

1)(

SrRr

SrRrjke

SrRrG =)(

Page 4: ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION …decoy.iki.fi/dsound/ambisonic/motherlode/source/... · 2017-03-10 · Proceedings of the COST G-6 Conference on Digital Audio

Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000

DAFX-4

Figure 5. Playback situation with Holophony.

The advantages of the Holophony system are the simplerecording procedure and the simultaneous elimination of thereproduction room during playback. On the other hand, adrawback of the playback system is the use of adaptive filters.The number of filters is MN . The length of the filter impulseresponse is determined by the length of the room reverberationtime.

Considering the various advantages and disadvantages, itseems to be useful to combine both systems. Under a fewassumptions, it can be proofed that Ambisonic is a special case ofHolophony. In [2], it is shown how higher order Ambisonicsignals can be derived out of the Holophony array signals.Therefore, the Holophony approach is used for recording. Therecorded signals are transformed into Ambsionic signals, and theAmbsonic approach is applied to the reproduction system.Nevertheless the approach does not account for the reflections ofthe reproduction room. In the following part a new extension forthe combined system is proposed that compensate thesereflections.

3. COMPENSATION OF REFLECTIONS

In general, we are confronted with interfering room reflectionsduring reproduction. These reflections can be interpreted asadditional sources. The new approach presents a possibility tocompensate these reflections by using the three dimensional roomimpulse response in a prefiltering process of the Ambisonicsignals. The room impulse response of each loudspeaker isrecorded with a microphone array and transformed to anAmbisonic representation.

Consider a 1st order, 2 dimensional Ambisonic system. Theroom impulse response caused by the jth loudspeaker at angle j

is described by equation 8.

),(1

)0()(, kjtt

kkatttjh

kjj=

+= (8)

Each reflection can be expressed as a delayed (tj,k) source with thegain ak originating from angle kj, . The associated Ambisonic

representation of the reflective part of the 3D room impulseresponse is expressed by equation 9 (discrete time index n).

),(1

)(, kjmn

kkan

rjhW=

= (9a)

),cos(),(1

)(, kjkjmn

kkan

rjhX=

= (9b)

),sin(),(1

)(, kjkjmn

kkan

rjhY=

= (9c)

In order to compensate the reflections, they are interpreted asnegative sources. In the free field condition, the loudspeaker feedsLi for i=1 to N are obtained by equation 10.

=

oYoXoW

D

NL

L1(10)

Wo, Xo and Yo are the original Ambisonic signals. The matrix Dwith elements1 di,c describes the weighting of each signal withregard to the geometrical arrangement of the loudspeaker array.The jth loudspeaker feed of the original (uncompensated) system isgiven in equation 11.

)(j,3)(j,2)(j,1)(j noYdnoXdnoWdnL ++= (11)

To compensate the interfering reflections caused by the jth loudspeaker its feed Lj (Eq. 10) is convolved with the reflectivepart of its room impulse response,

)(,

)(j)(,,

nrjhWnLn

comprjLW = (12a)

)(,

)(j)(,,

nrjhXnLn

comprjLX = (12b)

)(,

)(j)(,,

nrjhYnLn

comprjLY = (12c)

and we obtain the loudspeaker feeds Li,j that compensate thereflection caused by the jth

speaker.

+=

comprjLYcomprjL

XcomprjLW

D

oYoXoW

D

L

L

,,

,,

,,

jN,

j,1(13)

1 The index i of the decoder matrix D element di,c describesthe used loudspeaker Li and the index c depends on the usedchannel (Ambisonic signal).

Page 5: ENHANCED 3D SOUND FIELD SYNTHESIS AND REPRODUCTION …decoy.iki.fi/dsound/ambisonic/motherlode/source/... · 2017-03-10 · Proceedings of the COST G-6 Conference on Digital Audio

Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy, December 7-9, 2000

DAFX-5

Eq. 13 can be rewritten as:

+

+

+

=

oYoXoW

nrjhYnn

rjhYnrjhY

nrjhXn

rjhXnnrjhX

nrjhWn

rjhWnrjhWn

D

L

L

)(,

d)()(,

d)(,

d

)(,

d)(,

d)()(,

d

)(,

d)(,

d)(,

d)(

jN,

j1,

j,3j,2j,1

j,3j,2j,1

j,3j,2j,1

If all impulse responses caused by the N loudspeakers are takeninto account, we obtain the compensated loudspeaker feeds givenin equation 14.

=+

==

==+

=

===+

=

oYoXoW

N

in

rihYnN

in

rihYN

in

rihY

N

in

rihXN

in

rihXnN

in

rihX

N

in

rihWN

in

rihWN

i rihWn

D

1)(

,i,3d)(1

)(,i,2d

1)(

,i,1d

1)(

,i,3d1

)(,i,2d)(

1)(

,i,1d

1)(

,i,3d1

)(,i,2d

1 ,i,1d)(

*NL

*1L

(14)

The proposed procedure requires C2 FIR filters depending on thenumber of channels C. (C=2m+1 for two dimensional andC=(m+1)2 for three dimensional Ambisonic systems of order m).However, the amount of filters is independent of the number ofloudspeakers.

4. THE BEAM FORMING APPROACH

In the chapter 2, the approach of Emerit and Nicol [2] has beenproposed to obtain higher order Ambisonic signals. Thetransformation is mathematical very elegant but a drawback arisesconcerning the required number and the arrangement of themicrophones. Therefore, a new approach using beam forming toobtain the Ambisonic signals is considered. At the moment,investigations are directed to find the proper microphone array. Infigure 6, the signal flow chart is depicted for one target function.

Figure 6. Beam forming signal flow chart for broad bandsignals.

The target functions are defined by the different microphonecharacteristics.

5. CONCLUSION

The basic approaches of Ambisonic and Holophony have beenintroduced. Both systems can be merged to combine theadvantages of each approach. A new extension of the Ambisonicsystem has been proposed which offers the possibility tocompensate the interfering room reflections of the playbackroom. Further investigations should yield arbitrarily higher orderAmbisonic signals using the beam forming approach.

The proposed extension concerning the interfering roomreflections can be used to generate artificial 3D reverb and seemsto be very promising for virtual reality applications.

REFERENCES

[1]