Noise reduction and binaural cue preservation of multi- microphone algorithms Simon Doclo, Tim van...

23
Noise reduction and binaural Noise reduction and binaural cue preservation of multi- cue preservation of multi- microphone algorithms microphone algorithms Simon Doclo , Tim van den Bogaert, Marc Moonen, Jan Wouters Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium Dept. of Neurosciences (ExpORL), KU Leuven, Belgium Oldenburg, June 28 2007

Transcript of Noise reduction and binaural cue preservation of multi- microphone algorithms Simon Doclo, Tim van...

Noise reduction and binaural Noise reduction and binaural

cue preservation of multi-cue preservation of multi-

microphone algorithms microphone algorithms

Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters

Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium Dept. of Neurosciences (ExpORL), KU Leuven, Belgium

Oldenburg, June 28 2007

22

OverviewOverview

• Problem statement

o Improve speech intelligibility + preserve spatial awareness

o Bilateral vs. binaural processing

• Binaural signal processing using multi-channel Wiener filter

o MWF: noise reduction and preservation of speech cues, noise cues are distorted

o Extension of MWF to preserve binaural cues of all components:

– MWFv: partial estimation of noise component

– MWF-ITF: extension with Interaural Transfer Function

o Physical and perceptual evaluation

• Reduce bandwidth requirements of wireless link

o Distributed binaural MWF

33

Problem statementProblem statement

• Many hearing impaired are fitted with a hearing aid at both earso Signal processing to selectively enhance useful speech signal

and improve speech intelligibility

o Signal processing to preserve directional hearing and spatial awareness

o Multiple microphone available: spectral + spatial processing• Binaural auditory cues:

o Interaural Time Difference (ITD) – Interaural Level Difference (ILD)

o Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localisation

o ITD: f < 1500Hz, ILD: f > 2000Hz

IPD/ITD

ILD

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

44

Bilateral vs. BinauralBilateral vs. Binaural

Bilateral system

Independent left/right processing:Preservation of binaural cuesfor localisation ?

Binaural system

More microphones: better performance ? preservation of binaural cues ?

Need of binaural link

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

55

• Bilateral system:

o Independent processing of left and right hearing aid

o Localisation cues are distorted

RMS error per loudspeaker when accumulating all responses of the different test conditions (NH = normal hearing, NO = hearing impaired without hearing aids, O = omnidirectional configuration, A = adaptive directional configuration)

[Van den Bogaert et al., 2006]

Bilateral vs. BinauralBilateral vs. Binaural

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

also effect on intelligibility through binaural hearing advantage

77

• Bilateral system:

o Independent processing of left and right hearing aid

o Localisation cues are distorted

• Binaural system:

o Cooperation between left and right hearing aid (e.g. wireless link)

o Assumption: all microphone signals are available at the same timeObjectives/requirements for binaural

algorithm:

1. SNR improvement: noise reduction, limit speech distortion

2. Preservation of binaural cues (speech/noise) to exploit binaural hearing advantage

3. No assumption about position of speech source and microphones

[Van den Bogaert et al., 2006]

Bilateral vs. BinauralBilateral vs. Binaural

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

1010

Configuration and signalsConfiguration and signals

• Configuration: microphone array with M microphones at left and right hearing aid, communication between hearing aids

noise component

0, 0, 0, 0( ) = ( ) , 0) =( 1m m mVY X m M 00, 0,, 0( ) = ( )( 0) , = 1mm mY V m MX

speech component

0 0 1 1( ) = ( ) ( ), ( ) = ( ) ( )H HZ Z W Y W Y

• Use all microphone signals to compute output signal at both ears

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1111

Overview of cost functionsOverview of cost functions

Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears

trade-off noise reduction and speech

distortion

Speech-distortion weighted multi-channel Wiener filter

(SDW-MWF)[Doclo 2002, Spriet 2004]binaural cue

preservation of speech + noise

Partial estimation of noise component

(MWFv)[Klasen 2005]

Extension with ITD-ILD or Interaural Transfer

Function (ITF)

[Doclo 2005, Klasen 2006, Van den Bogaert 2007]

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1212

• Binaural SDW-MWF: estimate of speech component in microphone signal at both ears (usually front microphone) + trade-off between noise reduction and speech distortion

Binaural multi-channel Wiener Binaural multi-channel Wiener filterfilter

0

1

= , = ,x v M xx y v

M x v x

R R 0 rR r R R R

0 R R r

0

1

2 2

0, 0 0

11, 1

( )H H

r

HHr

XJ E

X

W X W VW

W VW X1=SDWW R r

speech componentin front microphones noise

reductionspeech distortion

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

estimate

o Depends on second-order statistics of speech and noise

o Estimate Ry during speech-dominated time-frequency segments, estimate Rv during noise-dominated segments, requiring robust voice activity detection (VAD) mechanism

o No assumptions about positions of microphones and sources

o Adaptive (LMS-based) algorithm available [Spriet 2004, Doclo 2007]

1313

Binaural multi-channel Wiener Binaural multi-channel Wiener filterfilter

• Interpretation for single speech source:

o Spectral and spatial filtering operation

with (spatial) coherence matrix and P (spectral) power

o Equivalent to superdirective beamformer (diffuse noise field) or delay-and-sum beamformer (spatially white noise field) +single-channel WF-based postfilter (spectral subraction)

Spatial separation between speech and noise sources

SNR

0

1 1*

,0 0,1 1 /

Hv v

SDW rH Hv v v s

AP P

Γ A A Γ AW

A Γ A A Γ A

• Binaural cues (ITD-ILD) :

Perfectly preserves binaural cues of speech component

Binaural cues of noise component speech component !!

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1414

• Partial estimation of noise component

o Estimate of sum of speech component and scaled noise component

o Relationship with SDW-MWF: mix with reference microphone signals

reduction of noise reduction performance

works for multiple noise sources

Partial noise estimation (MWFv)Partial noise estimation (MWFv)

0

1

0

1

0,

2

0, 0

1, 11,

( )r

Hr

Hr

r

X

X

V

VJ E

W YW

W Y

0

1

0

1

2 2

0, 0 0

1,

0,

1 11,

0 1( ) ,r

r

H Hr

H Hr

XJ

VE

X V

W X W VW

W X W V

0

1

0 0, ,0

1 1, ,1

(1 )

(1 )

r SDW

r SDW

Z Y Z

Z Y Z

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1515

Interaural Wiener filter (MWF-Interaural Wiener filter (MWF-ITF)ITF)

• Extension of SDW-MWF with binaural cueso Add term related to binaural cues of noise (and speech)

component

o Possible cues: ITD, ILD, Interaural Transfer Function (ITF)

( ) = ( ) ( ) ( )x vtot SDW cue cueJ J J J W W W W

0 0

1 1

Hv v

out Hv

ZITF

Z

W V

W V

0 10

1 1 1

*0, 1,0, 0 1

*1, 1 11, 1,

( , )

( , )

r rrv vin

r vr r

E V VV r rITF

V r rE V V

R

Re.g.

0

1

2 2

0, 0 0

11, 1

2 2

0 1 0 1

( ) =H H

r

tot HHr

H x H H v Hin in

XJ E

X

E ITF E ITF

W X W VW

W VW X

W X W X W V W V

ITF preservation speech ITF preservation noise

o Closed form expression!o large changes direction of speech increase weight o Implicit assumption of single noise source

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1717

Simulation setupSimulation setup

• Identification of HRTFs:o Binaural recordings on CORTEX MK2 artificial head

o 2 omni-directional microphones on each hearing aid (d=1cm)

o LS = -90:15:90, 90:30:270, 1m from head

o Room reverberation: T60=140 ms (and T60=510 ms)

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1818

Experimental resultsExperimental results

• Simulations:

o SxNy, SNR = 0 dB on left front microphone (broadband)

o fs = 20.48 kHz

• MWF algorithmic parameters:

o batch procedure, perfect VAD

o L=96, =5

o MWFv for different , MWF-ITF for different ,• Physical evaluation:

o Speech = HINT, noise = babble noise

o Speech intelligibility: SNR

o Localisation: ITD / ILD

• Perceptual evaluation:

o Preliminary study with NH subjects

o Speech intelligibility: SRT

o Localisation: localise S and N

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1919

Physical evaluationPhysical evaluation

• Performance measures:

o Intelligibility weighted SNR improvement (left/right)

o ILD error (speech/noise component) power ratio

x xx out i in i

i

ILD ILD ILD

o ITD error (speech/noise component) phase of cross-correlation

x i x ii

ITD I ITD 1

* *0, 1, 0 10

{ } { }x i r r x xITD E X X E Z Z

L i L ii

SNR I SNR

importance of i-th frequencyfor speech intelligibility

low-pass filter 1500 Hz

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

0 0.2 0.4 0.6 0.8 10

10

20

S

NR

left

[dB

]

0 0.2 0.4 0.6 0.8 10

10

20

S

NR

rig

ht [

dB]

0 0.2 0.4 0.6 0.8 10

2

4

6

IL

D s

peec

h [d

B]

0 0.2 0.4 0.6 0.8 10

2

4

6

IL

D n

oise

[dB

]

0 0.2 0.4 0.6 0.8 10

0.5

1

IT

D s

peec

h [r

ad]

auditec 60deg (=5, L=96, N=4)

0 0.2 0.4 0.6 0.8 10

0.5

1

IT

D n

oise

[ra

d]

Physical evaluation: MWFvPhysical evaluation: MWFvS0N60

2323

• Procedure:o headphone experiments, using measured HRTFs

o Filters are calculated off-line on VU speech-weighted noise as S and multitalker babble noise as N

o All stimuli presented at comfort level, 5 NH subjects (ongoing)

Perceptual evaluationPerceptual evaluation

Headphones

HRTFx

HRTFv

speech

noise

GBinaural filter

Mic L

R

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

• Speech intelligibility:o Adaptive procedure to find 50% Speech Reception Threshold (SRT)

• Localisation:o S and N components (telephone) are sent separately through filter

o Localise S and N in room where HRTFs were measured

o Level roving 6 dB, 3 repetitions per condition for each subject

2424 N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a mwf02 0dB

Perceptual evaluation: MWFvPerceptual evaluation: MWFv• Algorithms: unprocessed, state-of-the art bilateral, MWF, MWFv

(=0.2)

• Conditions: S0N60, S45N315 and S90N270 (T60=510 ms)

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a unproc 0dB

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a classic 0dB

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a mwf0 0dB

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

2525

Perceptual evaluation: MWFvPerceptual evaluation: MWFv

• With state-of-the-art systems: preservation of binaural cues only within central angle of frontal hemisphere.

• Binaural MWF:

o preserves localization cues for speech source

o preserves localization cues for noise source(s) with small mixing

o Recent SRT experiments (N=2) show no substantial SRT difference between =0 and =0.2

• Ongoing research:

o Perceptual evaluation (SRT and localisation) for MWF-ITF

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

2828

Bandwidth constraintsBandwidth constraints

• Binaural MWF:o 2M microphone signals are transmitted over wireless link

• Reduce bandwidth requirement of wireless link:o Transmit one signal from contralateral ear

Problem statement

Binaural processing

Bandwidthreduction

Conclusions

– Front contralateral microphone signal

– Output of contralateral fixed (e.g. superdirective) beamformer

– Output of MWF using only M contralateral microphone signals

– Iterative distributed binaural MWF scheme

2929

Physical evaluationPhysical evaluation

60 90 120 180 270 300 -60 60 -120 120 120 210 60 120 180 210 60 120 180 270

8

10

12

14

16

18

20

22Performance comparison of MWF-based binaural algorithms

noise source(s) angle (°)

AI

wei

ghte

d S

NR

impr

ovem

ent

(dB

)

MWF-fullMWF-frontMWF-contraMWF-iter

Problem statement

Binaural processing

Bandwidthreduction

Conclusions

Performance of dB-MWF close to full binaural MWF !

3030

Contralateral directivity patternContralateral directivity pattern

T60=140 ms

S0N120

Left HA -50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - contralateral (N=4,120 deg,=5,SNR=14.6277)

Fullband

SNR=14.6dB

B-MWF

-45

-40

-35

-30

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - front contralateral (N=3)

Fullband

MWF-front

SNR=10.5dB

-55

-50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - MWF contralateral (N=4,120 deg,=5,SNR=14.2051)

Fullband

MWF-contra

SNR=14.2dB

-50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

dB-MWF

SNR=14.2dB

3131

ConclusionsConclusions

• State-of-the art signal processing in (bilateral) HAs: preservation of binaural cues only within central angle of frontal hemisphere

• Binaural MWF:

o Substantial noise reduction (MWF 4 3 > 2)

o Preservation of binaural speech cues

o Distortion of binaural noise cues

o No assumptions about positions and microphones VAD

• Compromise between noise reduction and binaural cue preservation can be achieved with extensions of MWF

o Mixing with microphone signals

o Interaural Transfer Function

• Reduction of bandwidth using distributed MWF

Problem statement

Binaural processing

Bandwidthreduction

Conclusions