Equalization of Audio Channels

A Practical Approach for Speech Communication

Nils Westerlund

November, 2000


Abstract

Many occupations today require the use of personal protective equipment, such as a mask to protect the employee from dangerous substances or a pair of ear-muffs to damp high sound pressure levels. The goal of this Master thesis is to investigate the possibility of placing a microphone for communication purposes inside such a protective mask, as well as the possibility of placing the microphone inside a person's auditory meatus, and to perform a digital channel equalization of the speech path in question in order to enhance the speech intelligibility.

Subjective listening tests indicate that the speech quality and intelligibility can be considerably improved using some of the methods described in this thesis.


Acknowledgements

I would like to express my gratitude to Dr. Mattias Dahl for his support and extraordinary ability to explain complex systems and relationships in an understandable way. I would also like to express my appreciation to Svenska EMC-Lab in Karlskrona for letting me use their semi-damped room for measurement purposes.


Contents

1 Channel Equalization — An Introduction
1.1 Non-Adaptive Methods
1.2 Adaptive Channel Equalization

2 Equalization of Mask Channel
2.1 Gathering of Measurement Data
2.2 Coherence Function
2.3 Channel Equalization using tfe
2.4 Adaptive Channel Equalization
2.4.1 The LMS Algorithm
2.4.2 The NLMS Algorithm
2.4.3 The LLMS Algorithm
2.4.4 The RLS Algorithm
2.5 Minimum-Phase Approach
2.6 Results of Mask Channel Equalization

3 Equalization of Mouth-Ear Channel
3.1 Gathering of Measurement Data
3.2 Coherence Function of Mouth-Ear Channel
3.3 Channel Equalization Using tfe
3.4 Adaptive Channel Equalization
3.4.1 The LMS Algorithm
3.5 Results of Mouth-Ear Channel Equalization

4 Identification of “True” Mouth-Ear Channel
4.1 Basic Approach

5 Conclusions
5.1 Further Work

A MatLab functions
A.1 LMS Algorithm
A.2 NLMS Algorithm
A.3 LLMS Algorithm
A.4 RLS Algorithm
A.5 Minimum-Phase Filter Design
A.6 Coherence and Transfer Function
A.7 Estimate of “True” Channel


Chapter 1

Channel Equalization — An Introduction

Figure 1.1: System with input and output signals and the corresponding system in the z-domain.

A linear time-invariant system h(n) takes an input signal x(n) and produces an output signal y(n) which is the convolution of x(n) and the unit sample response h(n) of the system, see fig. 1.1. The input, the output and the system are assumed to be real and only real signals will be considered in this thesis. The convolution described above can be written as

y(n) = x(n) ∗ h(n) (1.1)

where the convolution operation is denoted by an asterisk ∗. In the z-domain, the convolution corresponds to a multiplication given by

Y(z) = X(z)H(z) (1.2)

where Y(z) is the z-transform of the output y(n), X(z) is the z-transform of the input x(n) and H(z) is the z-transform of the unit sample response h(n) of the system.

In many practical applications there is a need to correct the distortion caused by the channel and in this way recover the original signal x(n). In this thesis, this corrective operation will be called channel equalization.


1.1 Non-Adaptive Methods

Figure 1.2: System h(n) cascaded with its inverse system hI(n) results in an identity system.

A cascade connection of a system h(n) and its inverse hI(n) is illustrated in fig. 1.2. Suppose the distorting system has an impulse response h(n) and let hI(n) denote the impulse response of the inverse system. We can then write

d(n) = x(n) ∗ h(n) ∗ hI(n) = x(n) (1.3)

where d(n) is the desired signal, i.e. the original input signal x(n). This implies that

h(n) ∗ hI(n) = δ(n) (1.4)

where δ(n) is a unit impulse. In the z-domain, (1.4) becomes

H(z)HI(z) = 1 (1.5)

Thus, the transfer function for the inverse system is

HI(z) = 1/H(z) (1.6)

Note that the zeros of H(z) become the poles of the inverse system and vice versa.
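As a concrete illustration of (1.4)–(1.6), the following minimal MatLab sketch (the channel coefficients are chosen arbitrarily for the example) passes a signal through a simple first-order channel and then through its inverse:

h = [1 -0.5];          % example channel H(z) = 1 - 0.5z^-1, zero at z = 0.5
x = randn(1000,1);     % arbitrary input signal
y = filter(h,1,x);     % distorted output y(n) = x(n)*h(n)
d = filter(1,h,y);     % inverse system HI(z) = 1/(1 - 0.5z^-1), pole at z = 0.5
max(abs(d-x))          % practically zero: the cascade is an identity system

If the zero had been located outside the unit circle, the corresponding pole of the inverse system would make HI(z) unstable; this observation is part of the motivation for the minimum-phase discussion in section 2.5.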

If the characteristics of the system are unknown, it is often necessary to excite the system with a known input signal, observe the output, compare it with the input and then determine the characteristics of the system. This operation is called system identification [1]. If we obtain an output signal y(n) from a system h(n) excited with a known input signal x(n), we could of course use the z-transforms of y(n) and x(n) to form

H(z) = Y(z)/X(z) (1.7)

However, this is an analytical example and the transfer function H(z) is most likely infinite in duration. A more practical approach is based on a correlation method. The crosscorrelation of the signals x(n) and y(n) is given by

rxy(l) = Σ_{n=−∞}^{∞} x(n)y(n − l) ,   l = 0, ±1, ±2, . . . (1.8)


The index l is the lag parameter¹ and the subscripts xy on the crosscorrelation sequence rxy(l) indicate the sequences being correlated. If the roles of x(n) and y(n) are reversed, we obtain

ryx(l) = Σ_{n=−∞}^{∞} y(n)x(n − l) ,   l = 0, ±1, ±2, . . . (1.9)

Thus,

rxy(l) = ryx(−l) (1.10)

Note the similarities between the computation of the crosscorrelation of two sequences and the convolution of two sequences. Hence, if the sequence x(n) and the folded sequence y(−n) are provided as inputs to a convolution algorithm, the convolution yields the crosscorrelation rxy(l), i.e.

rxy(l) = x(l) ∗ y(−l) (1.11)

In the special case when x(n) = y(n) the operation results in the autocorrelation of x(n), rxx(l).

Recall that y(n) = x(n) ∗ h(n). Inserting this expression for y(n) into (1.11) yields

rxy(l) = h(−l) ∗ rxx(l) (1.12)

In the z-domain, (1.12) becomes

Pxy(z) = H∗(z)Pxx(z) (1.13)

where H∗(z) is the complex conjugate of H(z) and Pxx(z) is the power spectral density of x(n). The transfer function for the identified system is then

H∗(z) = Pxy(z)/Pxx(z) (1.14)

where Pxy(z) is the cross spectral density between x(n) and y(n). If rxy(l) is replaced by ryx(−l) in (1.12), the complex conjugate in (1.14) is eliminated and we obtain the following estimate of the transfer function:

H(z) = Pyx(z)/Pxx(z) (1.15)

The MatLab² function tfe³ uses this method to estimate a transfer function of the system in question [4]. In later sections it will be clear that this method is both straightforward and powerful when identifying a given system.

¹ Also commonly referred to as a (time) shift parameter.
² MatLab is a trademark of The MathWorks, Inc.
³ Transfer Function Estimate.
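As a minimal sketch of the correlation method in (1.15), the spectral densities can be estimated directly from the correlation sequences; the variable names are assumed, and the author's averaged, windowed version is listed in appendix A.6:

N   = length(x);          % x: known input signal, y: measured output (column vectors)
ryx = xcorr(y,x,N-1);     % crosscorrelation ryx(l) as in (1.9)
rxx = xcorr(x,x,N-1);     % autocorrelation rxx(l)
Pyx = fft(ryx);           % cross spectral density estimate
Pxx = fft(rxx);           % power spectral density estimate of x(n)
H   = Pyx./Pxx;           % transfer function estimate according to (1.15)

In practice the spectra should be averaged over several data sections, as tfe does, in order to reduce the variance of the estimate.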

1.2 Adaptive Channel Equalization

Another approach to equalizing a channel is to use adaptive algorithms. There is a vast number of application areas for adaptive algorithms and the mathematical theory is quite complex and reaches beyond the scope of this thesis. Therefore, in this section, only a brief description of the basic principles of adaptive filtering will be given [2].

A block diagram of an adaptive filter is shown in fig. 1.3. It consists of a shift-varying filter and an adaptive algorithm for updating the filter coefficients.

Figure 1.3: Basic structure for an adaptive filter.

The goal of an adaptive FIR filter is to find the Wiener filter w(n) that minimizes the mean-square error

ξ(n) = E{|d(n) − d̂(n)|²} = E{|e(n)|²} (1.16)

where E{·} is the expected value and d̂(n) is the estimate of the desired signal d(n).

We know that if x(n) and d(n) are jointly wide-sense stationary processes, the filter coefficients that minimize the mean-square error ξ(n) are found by solving the Wiener-Hopf equations [2]

Rxx w = rdx (1.17)

where Rxx denotes the autocorrelation matrix of x(n), w denotes the vector containing the filter coefficients and rdx denotes the crosscorrelation vector of d(n) and x(n).

Solving the Wiener-Hopf equations is a complex mathematical operation that includes an inversion of the autocorrelation matrix Rxx. If the input signal or the desired signal is nonstationary, this operation would have to be performed iteratively. Instead, the requirement that w(n) should minimize the mean-square error at each time n can be relaxed and a coefficient update equation of the form

w(n + 1) = w(n) + ∆w(n) (1.18)

can be used. In this equation ∆w(n) is a correction that is applied to the filter coefficients w(n) at time n to form a new set of coefficients, w(n + 1), at time n+1. Equation (1.18) is the heart of all adaptive algorithms used in this thesis.⁴

⁴ Except for the RLS algorithm described in section 2.4.4.

Since the error function ξ(n) is a quadratic function, its error surface can be viewed as a “bowl” with the minimum error at the bottom of this bowl. The idea of adaptive filtering is to find the optimal vector w(n) by taking small steps towards the minimum error. The update equation for this vector is

w(n + 1) = w(n) − µ∇ξ(n) (1.19)

where µ is the step size and ∇ξ(n) is the gradient vector of ξ(n). Note that the steps are taken in the negative direction of the gradient vector since this vector points in the direction of steepest ascent.

The gradient can be estimated instantaneously from the product of e(n) and x(n). Introducing this estimate in (1.19) yields

w(n + 1) = w(n) + µe(n)x(n) (1.20)

which is the well-known Least Mean Squares (LMS) algorithm. Further developments of this algorithm include Normalized LMS (NLMS) and Leaky LMS (LLMS). All of these algorithms will be evaluated in later sections of this thesis [2].

In fig. 1.4, a block scheme that can be used for adaptive channel equalization is shown. The original signal s(n) is passed through some sort of system (a channel) that distorts the input signal, and this distorted signal is then used as input to the adaptive algorithm. The output signal d̂(n) from the adaptive causal filter is subtracted from the desired signal d(n) and the result forms the error e(n). The error is the second input signal to the adaptive algorithm.

If the system is a non-trivial system, it will not only affect the spectral characteristics of the input signal but also introduce a delay⁵. This is the reason why the delay ∆ is so important.

Another important property is that if the channel to be equalized is causal, the equalizing filter will be non-causal if no delay of the filtered signal x(n) is acceptable. However, only causal Finite Impulse Response (FIR) adaptive filters will be used in this thesis and these filters will indeed introduce a delay on the signal. Also note that an FIR filter can of course only approximate an Infinite Impulse Response (IIR) filter with a certain precision, if such a filter is needed for an optimal solution [3].

⁵ That is, if the impulse response of the system is more complex than a zero-centered unit sample.
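The structure in fig. 1.4 translates almost directly into MatLab code. The sketch below uses the lms function listed in appendix A.1; the channel coefficients, filter length and step size are arbitrary example values:

s  = randn(10000,1);            % original signal s(n)
c  = [0.2 1 -0.4 0.1];          % example channel impulse response
x  = filter(c,1,s);             % distorted signal, input to the adaptive filter
L  = 100;                       % adaptive filter length
D  = L/2;                       % delay, roughly half the filter length
d  = [zeros(D,1); s(1:end-D)];  % delayed original signal used as desired signal d(n)
mu = 0.001;                     % small step size, see section 2.4.1
[yout,e,w] = lms(x,d,mu,L);     % w approximates the channel equalizing filter

After convergence, filtering the distorted signal through w gives an estimate of the delayed original signal.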


Figure 1.4: Basic structure for an adaptive channel equalizer.


Chapter 2

Equalization of Mask Channel

In this chapter, a protective mask is studied. The goal was to equalize the distortion of human speech caused by this mask. In order to collect the necessary data for this study, a measurement setup was assembled to record data on site.

2.1 Gathering of Measurement Data

The gathering of measurement data was made with the help of a test dummy head¹, two DAT-recorders² and a signal analyzer³. The test dummy used was constructed specially for audio measurements and was equipped with a loudspeaker placed in its mouth. A microphone was mounted on the inside of the mask and the mask was then attached to the test dummy head, see fig. 2.1. To damp disturbing environmental noise, the complete arrangement was placed behind particle boards covered with insulation wool.

The signal analyzer was used to generate noise bandlimited to 12.8 kHz and one of the DAT-recorders, the SV3800 model, was used to record noise and speech sequences while the other was used for playback of speech sequences. The sampling frequency was 48 kHz with a resolution of 16 bits and the information on the DAT-tapes was then stored as wav-files using the software Cool Edit 2000. The wav-files were finally read by MatLab for further processing. A block scheme of the complete setup is shown in fig. 2.2.

The first action taken was to reduce the amount of data by sampling rate conversion. Using the MatLab function decimate, the sampling frequency was reduced in two steps: first from 48 kHz to 24 kHz and then from 24 kHz to 12 kHz. Hence, the amount of data was reduced to one fourth. For a detailed description of how decimate works, see [5].

¹ Head Acoustic
² Sony TCD-D8 and Panasonic SV3800
³ Hewlett-Packard 36570A
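The two-stage sampling rate conversion described above can be sketched as follows (the variable name x48 for the recorded 48 kHz sequence is an assumption for the example):

x24 = decimate(x48,2);   % 48 kHz -> 24 kHz (lowpass filtering and downsampling by 2)
x12 = decimate(x24,2);   % 24 kHz -> 12 kHz, one fourth of the original amount of data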


Figure 2.1: (a) Test dummy head equipped with a loudspeaker in its mouth. The microphone is placed inside the mask. (b) Placement of the microphone in the mask.

Figure 2.2: Block scheme of the complete measuring setup.

2.2 Coherence Function

A powerful tool for investigating the properties of input-output signals is the coherence function. If Pxx and Pyy are the power spectral densities of the input signal x(n) and the output signal y(n) respectively, and Pxy is the cross spectrum of the input and output signal, the coherence function Cxy can be calculated as

Cxy = |Pxy|² / (Pxx Pyy) (2.1)

A coherence function equal to one means that a perfectly linear and noise-free system is being measured. Thus, the coherence function gives a direct measure of the quality of the estimated frequency response.

In appendix A.6 a MatLab function that calculates the coherence function is listed.
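A usage sketch of that function for the mask data might look as follows; the variable names inputNoise and outputNoise follow the naming used in section 2.3 and are otherwise assumptions:

nfft = 2048;                                               % FFT length used for Cxy
[Txy1,Txy2,Cxy] = sysest(inputNoise,outputNoise,nfft,1);   % windowed, averaged estimates
f = (0:nfft/2-1)*12000/nfft;                               % frequency axis at Fs = 12 kHz
plot(f,Cxy), xlabel('Frequency [Hz]')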


Figure 2.3: The coherence function Cxy of the mask. The input signal was a flat bandlimited noise sequence with variance σ² = 1 (FFT-length 2048).

2.3 Channel Equalization using tfe

First, the impulse response of the system was estimated using the MatLab function tfe. For a detailed description of how this function works, see [4]. A short summary of the theory behind tfe is given in section 1.1. An alternative function, custom made by the author, is listed in appendix A.6.

The data was divided into non-overlapping sections and then windowed by a Hanning window. The magnitude squared of the Discrete Fourier Transforms (DFT) of the input noise sections were averaged to form Pxx. The products of the DFTs of the input and output noise sections were averaged to form Pxy. A one-sided spectrum is returned by tfe and in order to perform an Inverse FFT (IFFT), the spectrum has to be converted to a two-sided spectrum. This spectrum can then be used as input to the MatLab function ifft and in this way the corresponding impulse response for the transfer function can be calculated. For a detailed description of the MatLab function ifft, see [5].
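A minimal sketch of that conversion is given below; the variable names are assumed and the same mirroring trick is used in the author's truechan function in appendix A.7:

nfft = 2048;
[Txy,F] = tfe(inputNoise,outputNoise,nfft);   % one-sided estimate, nfft/2+1 points
Txy2 = [Txy; conj(flipud(Txy(2:end-1)))];     % mirror into a two-sided spectrum
h = real(ifft(Txy2));                         % impulse response of the estimated channel
h = h(1:256);                                 % truncate to the desired filter length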

The channel transfer function and impulse response for different filter lengths are shown in fig. 2.4. Calculating a channel equalizing filter for the mask using tfe is easily done simply by switching the input parameters. That is, if the tfe function call to estimate a channel is Txy=tfe(inputNoise, outputNoise), the function call to estimate an equalizing filter for the same channel would be Txy_inv=tfe(outputNoise, inputNoise). The result of this operation is shown in fig. 2.5.

2.4 Adaptive Channel Equalization

The MatLab function tfe calculates the transfer function using “brute force”. However, an alternative approach is to use adaptive methods. In this section, an investigation based on LMS, NLMS, LLMS and RLS (Recursive Least Squares) adaptive FIR filters is carried out.


Figure 2.4: The left column shows impulse responses for the mask. The filters were calculated using the correlation method. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz [5].

2.4.1 The LMS Algorithm

The first adaptive algorithm used for channel equalization was the LMS algorithm according to

w(n + 1) = w(n) + µe(n)x(n) (2.2)

An implementation of the LMS algorithm is listed in appendix A.1.

Step size

The correct choice of the step size µ is of great importance when using the LMS algorithm or other LMS-based algorithms. The maximum step size can easily be approximated by

0 < µ < 2 / (p E{|x(n)|²}) (2.3)

where p is the filter length and E{|x(n)|²} is estimated with

E{|x(n)|²} = (1/p) Σ_{m=n−p+1}^{n} |x(m)|² (2.4)


Figure 2.5: The left column shows impulse responses for the mask channel equalizing filter. The filters were calculated using the correlation method. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.

In reality, this step size approximation can seldom or never be used. Instead, as a rule of thumb, a step size at least an order of magnitude smaller than the maximum value allowed should be used [2]. Nevertheless, there are applications that may allow larger step sizes.
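In MatLab, the bound (2.3) and the rule of thumb can be sketched as follows; p is an example filter length and x is assumed to hold the input signal to the adaptive filter:

p      = 200;                       % adaptive filter length
Ex2    = mean(x(end-p+1:end).^2);   % estimate of E{|x(n)|^2} according to (2.4)
mu_max = 2/(p*Ex2);                 % upper bound on the step size, (2.3)
mu     = 0.1*mu_max;                % rule of thumb: an order of magnitude smaller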

Delay

The choice of delay has a substantial effect on the quality of the channel equalizer. The Mean Square Error (MSE) measures the quality in this case. As a rule of thumb, the delay can be chosen equal to half the adaptive filter length [3]. In fig. 2.6 the MSE is plotted as a function of the delay. It is clear that a delay of about 100 samples gives the least MSE if the filter length is 200. Note that the introduction of a delay is crucial for the quality of a channel equalizer but that the length of the delay is not critical. According to the figure, the delay could have been as short as 50 samples or as long as 150 samples while maintaining a low MSE. However, leaving out the delay results in an unacceptably high MSE.

Figure 2.6: MSE plotted as a function of the delay ∆. The length of the adaptive filter is L=200.
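A curve like the one in fig. 2.6 can be produced with a simple sweep over the delay, using the lms function from appendix A.1; s (original signal), x (channel output) and mu are assumed from earlier sketches:

L      = 200;                                 % adaptive filter length
delays = 0:10:400;                            % candidate delays in samples
mse    = zeros(size(delays));
for k = 1:length(delays)
  D = delays(k);
  d = [zeros(D,1); s(1:end-D)];               % desired signal: delayed original s(n)
  [yout,e] = lms(x,d,mu,L);
  mse(k) = mean(e(round(end/2):end).^2);      % MSE over the second half, after convergence
end
plot(delays,mse), xlabel('Delay [Samples]')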

The physical delay that the limited speed of sound propagation c introduces to the system is best illustrated with a plot of the impulse response of the mask, i.e. the crosscorrelation between the loudspeaker and the microphone. This plot is shown in fig. 2.7 and is based on an estimate made by the Hewlett-Packard 36570A signal analyzer. Note that the amplitude of the impulse response is not correctly scaled.

The crosscorrelation is approximately zero during the time 0–0.2 ms. This delay is due to the propagation time for the first sound wave that reaches the microphone. If we approximate the speed of sound as c ≈ 330 m/s and the delay as ∆ ≈ 2 · 10⁻⁴ s, the distance L between the loudspeaker and the microphone can be calculated by

L = ∆ · c (2.5)

which yields a distance between the loudspeaker and the microphone of about 6.5 cm. This distance corresponds well to the real distance.

Figure 2.7: The plots show the impulse response of the mask, i.e. the crosscorrelation between the loudspeaker and the microphone. The lower plot is a zoomed version of the upper plot.

Filter length

The filter length is of course a key parameter in all sorts of filter design. Theoretically, the length can be chosen arbitrarily, but in a realization of the filter in, for example, a digital signal processor (DSP), the length of the filter is limited by memory size as well as by mathematical complexity. Hence, we have a classical trade-off between efficiency and quality. To motivate the choice of filter length, fig. 2.8 shows the MSE plotted as a function of the filter length. The results from all LMS-based adaptive algorithms used in this thesis are plotted. Note that when the filter length increases beyond a certain point, the MSE actually increases. The reason for this is that as the number of filter coefficients is increased, the error due to stochastic “jumps” of these coefficients on the error surface also increases. This error is called the excess MSE.

Results

The LMS algorithm was used to perform both a channel identification and a channel equalization. The corresponding plots are shown in figs. 2.9-2.10.

2.4.2 The NLMS Algorithm

Normalized LMS (NLMS) uses a time varying step size as follows

µ(n) = β / (xᵀ(n)x(n) + ε) = β / (||x(n)||² + ε) (2.6)

In this thesis, only real signals are used, hence the transpose in the denominator of (2.6). If x(n) were a complex signal, the transpose would be a Hermitian transpose. Also note that to avoid division by zero, a small constant ε is introduced in the denominator.

Figure 2.8: The MSE plotted as a function of filter length for LMS, NLMS, LLMS and RLS. The delay is half the length of the filter plus eight samples due to the physical delay introduced by the system. Three different step sizes were used for each algorithm (except for the RLS algorithm). The input signal was 50 000 samples of flat bandlimited noise with variance σ² = 1, except for the RLS algorithm where 10 000 samples of noise were used.

If equation (2.6) is inserted into equation (2.2), we obtain

w(n + 1) = w(n) + β (x(n) / ||x(n)||²) e(n) (2.7)

Under suitable statistical assumptions it can be shown that the NLMS algorithm will converge if 0 < β < 2 [2]. Therefore, the NLMS algorithm requires no knowledge about the statistics of the input signal in order to calculate the step size.

Another advantage of the NLMS algorithm is its insensitivity to the amplification of the gradient noise that a high-amplitude input signal introduces. This insensitivity comes from the normalization in (2.7).

The delay requirement is the same as when using the LMS algorithm.


Figure 2.9: The left column shows impulse responses for the mask. The filters were calculated using the LMS algorithm. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.

2.4.3 The LLMS Algorithm

If any eigenvalue of the autocorrelation matrix is zero, the LMS algorithm does not converge as expected. The LLMS algorithm (Leaky LMS) solves this problem by adding a “leakage coefficient” γ to the filter coefficient update according to

w(n + 1) = (1 − µγ)w(n) + µe(n)x(n) (2.8)

The leakage coefficient forces the filter coefficients to zero if either the input signal or the error signal becomes zero. The obvious drawback of this method is that a bias is introduced into the solution. This bias becomes evident in fig. 2.8. In this case, the LLMS algorithm has approximately twice as large an MSE as the other algorithms in the plot.

The delay requirement is the same as when using the LMS and NLMS algorithms.

2.4.4 The RLS Algorithm

If an increased computational complexity is acceptable, the time for convergence can be reduced considerably by using the RLS algorithm. For a more thorough description of this algorithm, see [2]. One important property of the RLS algorithm is that the step size depends on the size of the error: if the estimate d̂(n) is close to the desired signal d(n), small corrections of the filter coefficients will be made. Hence, the step size will be large at the beginning of the convergence and then, as d̂(n) approaches d(n), become smaller and smaller.

Figure 2.10: The left column shows impulse responses for the equalizing filter. The filters were calculated using the LMS algorithm. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.

A plot of the MSE as a function of the filter length is shown in fig. 2.8. Due to the complexity of the algorithm, the plot has been calculated from 10 000 samples of flat bandlimited noise (σ² = 1).


2.5 Minimum-Phase Approach

If we were to design a channel equalizer for hi-fi audio purposes, a linear-phase filter would be the only acceptable choice, since all frequencies are delayed equally when passed through such a filter. In the case of the mask, this constraint is substantially relaxed. This channel equalizer is supposed to operate in a telephone network (PSTN) using the frequency band 300-3400 Hz. Since the channel equalizer is designed to operate in such a large system, it is desirable to reduce the delay caused by the filtering and in this way minimize the total delay introduced by the whole system, i.e. the PSTN.

One powerful method of minimizing the delay of a system is to design it as a minimum-phase filter. A minimum-phase filter has all of its zeros inside of, or possibly on, the unit circle. This type of filter can be obtained from a linear-phase filter by reflecting all of the zeros that are outside the unit circle to the inside of the unit circle. The resulting filter will have minimum phase and, except for a scaling factor, the same magnitude as the linear-phase filter [6].
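A hedged sketch of this zero reflection for a short FIR filter h is given below; for the long filters used in this chapter, the cepstrum-based design listed in appendix A.5 is numerically more robust than explicit root finding:

r    = roots(h);             % zeros of the linear-phase filter H(z)
k    = abs(r) > 1;           % zeros outside the unit circle
r(k) = 1./conj(r(k));        % reflect them to the inside of the unit circle
hmin = real(poly(r));        % rebuild the FIR coefficients
hmin = hmin*max(abs(freqz(h,1,512)))/max(abs(freqz(hmin,1,512)));   % restore the scaling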


Figure 2.11: The upper plot shows the impulse response for a minimum-phase filter and the lower plot shows the impulse response for a linear-phase filter. Note how the “centre of gravity” of the linear-phase filter has been shifted to form the minimum-phase filter.

The plots in fig. 2.11 show the impulse responses for a mask channel equalizing minimum-phase filter and the corresponding linear-phase filter. The filter length of the linear-phase filter is 128 and thus the delay when using this filter will be 64 samples due to its symmetry. In figure 2.12 the corresponding amplitude functions are plotted. It is clear that the minimum-phase filter indeed results in approximately the same amplitude as the linear-phase filter.

Figure 2.12: Amplitude for the minimum-phase filter (upper plot) and linear-phase filter (lower plot). Note that the overall performance is approximately the same for both filters.

Another interesting question is how the phase behaves over the frequency band. This is illustrated in fig. 2.13. The group delay τg is defined as

τg = −dθ(ω)/dω (2.9)

where θ(ω) is the phase. For a linear-phase system the group delay is, by definition, constant. The group delay for the two filters is shown in fig. 2.14. It is clear that the usage of a linear-phase filter of length L will introduce a constant delay of L/2 samples. If a minimum-phase filter is used, the delay will be reduced substantially, but on the other hand it will not be constant over the frequency band.
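The group delay curves in fig. 2.14 can be reproduced with the Signal Processing Toolbox function grpdelay; hlin and hmin are assumed to hold the linear-phase and minimum-phase filter coefficients:

fs = 12000;
[gdl,f] = grpdelay(hlin,1,512,fs);   % linear-phase filter: constant delay of L/2 samples
[gdm,f] = grpdelay(hmin,1,512,fs);   % minimum-phase filter: small but frequency-dependent delay
plot(f,gdl,f,gdm), xlabel('Frequency [Hz]'), ylabel('Samples')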


Figure 2.13: The phase for the minimum-phase filter (upper plot) and linear-phase filter (lower plot).

0 1000 2000 3000 4000 5000 6000−100

−50

0

50

100

Sam

ples

Frequency [Hz]

Minimum PhaseLinear Phase

Figure 2.14: The group delay for the minimum-phase filter and the linear-phase filter.


2.6 Results of Mask Channel Equalization

When talking about speech quality and speech intelligibility it is hard to decide what is “high quality speech” and “low quality speech”. One needs some sort of measure to be able to draw conclusions on whether one speech sample is “better” than another. There is nevertheless a great deal of subjective opinion about speech quality and intelligibility.

In the case of the mask channel equalization, both correlation methods and adaptive methods proved to be powerful tools for channel equalization. Both methods managed to substantially improve the speech quality and intelligibility using reasonable filter lengths. A subjective listening test showed that at a sampling frequency of Fs = 12 kHz, a filter length of about L = 100 taps significantly improved the speech quality. There was also little or no difference at all between the results of the different adaptive algorithms, and this is the reason why only the LMS algorithm is used as the adaptive method in chapter 3, where an equalization of the mouth-ear channel is performed.

When a minimum-phase filter was used to equalize the mask channel, a subjective listening test could not distinguish a speech sample filtered by such a filter from a speech sample filtered by a corresponding linear-phase filter. This suggests that minimum-phase filters can be used without loss of speech quality in speech communication systems.


Chapter 3

Equalization of Mouth-Ear Channel

In chapter 2 we saw that it is possible to equalize the channel that a mask represents using both correlation methods and adaptive methods. We now move on to the next issue: placing the microphone inside a person's auditory meatus and identifying and equalizing the channel between the mouth and the ear. The first problem that arises is how to generate a noise signal. When using the test dummy head, a signal analyzer could be used to generate the reference input signals (see section 2.1). Now, when placing the microphone inside a human auditory meatus, the skull itself represents the channel to be equalized. Thus, the test subject himself must generate a broadband noise-like signal to excite the channel/skull. This may seem like an impossible task but in fact it is quite possible to generate a broadband noise-like sound. The power spectral density of such a noise-like sound, made by a human speech organ, is shown in fig. 3.1.


Figure 3.1: Power spectrum of a broadband noise-like sound generated by the human speech organ.


3.1 Gathering of Measurement Data

The equipment used was a DAT-recorder¹, a custom made microphone amplifier, two microphones² and a pair of ear-muffs³. One of the microphones was placed in front of the test subject's mouth and the other was placed inside the test subject's auditory meatus. The ear-muffs were then placed on the test subject's head. This is advantageous since the signal path outside the skull is damped considerably. Also, a pair of ear-muffs damps disturbing or even harmful environmental noise.

The test subject was placed in a semi-damped room and pronounced a number of sentences chosen a priori. He also tried to make noise-like sounds. The two-channel data was recorded at 44.1 kHz and then the sampling frequency was reduced to 11.025 kHz in the same manner as the data from the mask measurements (see section 2.1). For a block scheme of the complete measurement setup, see figs. 3.2 and 3.3.

Figure 3.2: Block scheme of the complete measurement setup.

Figure 3.3: Microphone placement in auditory meatus.

¹ Sony TCD-D8
² Sennheiser
³ Hellberg


3.2 Coherence Function of Mouth-Ear Channel

Using the noise signal generated by the human speech organ, the coherence was calculated as in (2.1). The result is shown in fig. 3.4.

Figure 3.4: Coherence function of mouth-ear channel (FFT-length 2048).

As described in section 2.2, the ideal coherence function is equal to one, which means that a perfectly linear and noise-free system is being measured. As fig. 3.4 illustrates, this is not the case with the mouth-ear channel. It has been shown that sound propagation through the skull is linear in the frequency band of interest [7]. Furthermore, the signal recorded in the auditory meatus is severely damped (see section 3.3), and this indicates that the problem is a poor signal-to-noise ratio (SNR) rather than a non-linearity. An additional problem is that the excitation signal is not used as the reference signal when identifying the system.

However, one should not concentrate on the coherence function to the exclusion of other information about the signals. It will be clear in later sections that a satisfactory channel equalization can be performed even though the coherence function at some frequencies (or frequency bands) falls far below unity.


3.3 Channel Equalization Using tfe

The MatLab function tfe uses correlations to calculate a transfer function, as described in (1.15), section 1.1. The transfer functions and impulse responses for a number of different filter lengths are shown in figs. 3.5-3.6. The procedure for calculating the impulse response from the transfer function given by tfe was the same as in section 2.3. It is evident that the skull performs a relatively simple low-pass filtering with a cut-off frequency of about 500 Hz and a stop-band damping of about 30-40 dB.

The strange behaviour of the impulse response can probably be explained by the aggravating circumstances mentioned in section 3.2.


Figure 3.5: The left column shows impulse responses for the mouth-ear channel. The filters were calculated using the correlation method. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.

3.4 Adaptive Channel Equalization

Due to the small differences between the results of the different adaptive algorithms used in section 2.4, only the standard LMS algorithm will be used as an example of adaptive channel equalization of the mouth-ear channel.



Figure 3.6: The left column shows impulse responses for the mouth-ear channel equalizing filter. The filters were calculated using the correlation method. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.

3.4.1 The LMS Algorithm

As in the case with the mask, a number of parameters must be calculated to obtain an effective equalizing filter.

Step size and Filter Length

To find a proper step size, the MSE was plotted as a function of the filter length L. Different fractions of the maximum step size were used and the result is shown in fig. 3.7. According to this figure, a step size of about one fifth of the maximum allowed step size seems to be a reasonable choice. For a start, the delay was chosen as half the filter length. Later, a more thorough investigation of the optimal delay is performed.

Figure 3.7: The MSE plotted as a function of filter length for the LMS algorithm. The delay was half the filter length. Five different step sizes were used and the input signal was noise generated by a human speech organ.

Delay

As in the case of the mask channel equalization, a proper delay must be chosen. However, a problem arises when the human speech organ is considered. The test dummy head used in the mask channel equalization had a loudspeaker placed in its “mouth”. Hence, the speech or noise was generated at a certain isolated point. When a person is talking, this is not the case. Instead, the vocal cords act together with the throat, mouth and nostril cavities to form sounds. This means that the speech or noise is no longer generated at one isolated point. Rather, the sound is the result of many cooperating systems. Since we are forced to use a human skull instead of a test dummy head to collect data, it is difficult to predict an optimal delay for a mouth-ear channel equalizer.

According to [3], a delay of half the filter length is a rule of thumb. This rule can of course be used without further investigations, but a simple plot of the MSE as a function of the delay can offer interesting information about the optimal delay. Fig. 3.8 shows the MSE plotted as a function of delay for eight mouth-ear channel equalizing filters. The filter lengths are L = 10, L = 30, L = 50, L = 70, L = 90, L = 110, L = 256 and L = 512 and the delay was 0–2L. Using this information, the LMS algorithm was used to identify and equalize the mouth-ear channel. The results of these operations are shown in figs. 3.9-3.10.



Figure 3.8: The MSE plotted as a function of delay for eight mouth-ear channel equalizing filters, each of different length L and with a delay of 0–2L. The filters were calculated using the LMS algorithm.

3.5 Results of Mouth-Ear Channel Equalization

The mouth-ear channel represents a far more complex system and measurement environment than the mask channel does. The speech signal inside the auditory meatus is severely damped and this means that great demands are made upon the microphones and amplifiers. Furthermore, the signal that is used as the reference signal, i.e. the speech signal at the mouth, is not the excitation signal of the system/skull. Most likely, these factors contribute to a poor coherence function and a poor estimate of the channel. Nevertheless, it is possible to significantly enhance the speech intelligibility by using some of the methods described in this chapter. The performance of the correlation method was particularly good, while some problems were encountered when trying to make the adaptive algorithms converge properly.



Figure 3.9: The left column shows impulse responses for the mouth-ear channel. The filters were calculated using the LMS algorithm. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.



Figure 3.10: The left column shows impulse responses for the mouth-ear channel equalizing filter. The filters were calculated using the LMS algorithm. The filter lengths are L=50, L=110, L=256 and L=512. The right column shows the corresponding transfer functions. The transfer functions were calculated using the MatLab function freqz.


Chapter 4

Identification of “True” Mouth-Ear Channel

Most types of measurements in some way affect the item being measured. In the case of the mouth-ear channel identification and equalization, the cables, analog-to-digital converters (ADC) and the microphones form a system that distorts the signal in some way. However, it is possible to equalize this system as well and in this way find an approximation of the “true” channel.

4.1 Basic Approach

In fig. 4.1 a principal block scheme illustrates how the measurements are performed. SE is the signal recorded in the auditory meatus, i.e. the Ear, and SM is the signal recorded at the Mouth. HT is the true ear-mouth channel and H is the true ear-mouth channel distorted by the measurement equipment.

The microphones, cables and ADCs can be viewed as a system. Suppose we perform a measurement and use equipment/system GM to record data at the mouth and equipment/system GE to record data in the auditory meatus. We then have the situation shown in fig. 4.2. H1 is the first estimate of the channel. This setup means that

SM = SEHT (4.1)

The output from H1 will be SEGEH1 and the output from GM will be SMGM. This means that

H1 = (SM GM) / (SE GE) (4.2)

Then the microphones are switched, so that the equipment that was used to record data in the auditory meatus in the first measurement now is placed in front of the mouth and vice versa. Fig. 4.3 shows this new setup. Note that GE and GM are switched. This means that

H2 = (SM GE) / (SE GM) (4.3)

Substituting (4.1) into (4.2) and (4.3) gives H1 = HT GM/GE and H2 = HT GE/GM. When H1 and H2 are multiplied, the equipment responses cancel, H1H2 = HT², and we obtain

HT = √(H1H2) (4.4)

i.e. the true channel.


Figure 4.1: Block scheme of how the measurements are performed.

The result of applying the operations described in this section to the mouth-ear channel equalizing problem is shown in fig. 4.4. A MatLab function for estimating HT is listed in appendix A.7.
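A usage sketch of that function, with the four recorded noise sequences named as in the function header of appendix A.7:

[Htrue,htrue,lchm_Hinv,rchm_Hinv] = ...
    truechan(lchm_innoise,lchm_outnoise,rchm_innoise,rchm_outnoise);
plot(htrue)   % impulse response of the estimated "true" channel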



Figure 4.2: Block scheme where the measurement equipment is viewed as a system: one in front of the mouth, GM, and one in the auditory meatus, GE.


Figure 4.3: Block scheme of the second measurement where GM and GE are switched, resulting in another identified channel, H2.



Figure 4.4: Estimated transfer functions (upper plot) and impulse response for the true channel (lower plot).


Chapter 5

Conclusions

The goal of this Master thesis has been to investigate the possibility of placing a microphone for communication purposes inside a protective mask, as well as the possibility of placing the microphone inside a person's auditory meatus, and to digitally equalize the speech path in question. A number of methods have been evaluated, both adaptive and non-adaptive. The work shows that the correlation method is a powerful and straightforward way of identifying a system. Subjective listening tests indicate that this method was able to identify and equalize the mask channel with a satisfactory result and with reasonable filter lengths.

The mouth-ear channel presented more difficulties because of its “non-ideal” circumstances. The mask was attached to a test dummy head equipped with a loudspeaker in its mouth and bandlimited noise was used as reference signal. When the mouth-ear channel was to be identified, a real human skull had to be used and the test subject had to excite the skull himself. Partly because of this, a proper transfer function for this channel was difficult to find. The work also shows that the speech signal detected inside the auditory meatus is substantially damped, and this raises the requirements on the measurement equipment because of the low SNR. Another factor that affects the final result is that the excitation signal of the skull is not used as the reference/desired signal. Instead, the speech at the test subject's mouth is used as the desired signal when identifying an equalizing filter. This makes the identification process far more complex than in the case with the test dummy head and the protective mask.

Nevertheless, subjective listening tests revealed that a substantial improvement in speech intelligibility was achieved when using the correlation method. The adaptive methods performed less well, mainly because of convergence problems.

5.1 Further Work

Further improvements may be achieved by using one or more of the suggestions below:

• To identify a channel between the mouth and the auditory meatus, low-noise microphones and amplifiers probably have to be used. This would most likely raise the SNR and improve the final results.

• It has been shown that the sound pressure level varies depending on where inside the auditory meatus the microphone is placed [8]. It is possible that a small change in the position of the microphone may increase the SNR to some extent.

• The excitation signal used when identifying the mouth-ear channel probably causes problems. This signal is far from ideal and furthermore it does not stem from the vocal cords. Another way of exciting the skull would be to simply talk for a few minutes and in this way excite all the frequencies needed for an identification of the channel.


Appendix A

MatLab functions

A.1 LMS Algorithm

function [yout,eout,f]=lms(x,d,mu,nord)
% [yout,eout,f]=lms(x,d,mu,nord)
%
% x    - Input Signal
% d    - Desired Signal
% mu   - Step size
% nord - Filter length
% yout - Filter output
% eout - Error during convergence
% f    - Filter taps
%
% (c)Nils Westerlund, 2000

L=length(x);
f=zeros(nord,1);
yout=zeros(1,L);
eout=zeros(1,L);
for K=nord:L,
  xn=x(K:-1:K-nord+1);
  y=xn'*f;
  yout(K)=y;
  e=d(K)-y;
  eout(K)=e;
  f=f+mu*e*xn;
end


A.2 NLMS Algorithm

function [yout,eout,f]=nlms(x,d,mu,nord)
% [yout,eout,f]=nlms(x,d,mu,nord)
%
% x    - Input Signal
% d    - Desired Signal
% mu   - Step size
% nord - Filter length
% yout - Filter output
% eout - Error during convergence
% f    - Filter taps
%
% (c)Nils Westerlund, 2000

L=length(x);
f=zeros(nord,1);
yout=zeros(1,L);
eout=zeros(1,L);
for K=nord:L,
  xn=x(K:-1:K-nord+1);
  y=xn'*f;
  yout(K)=y;
  e=d(K)-y;
  eout(K)=e;
  nrm=xn'*xn+eps;
  f=f+mu*e*(xn/nrm);
end


A.3 LLMS Algorithm

function [yout,eout,f]=llms(x,d,mu,gamma,nord)
% [yout,eout,f]=llms(x,d,mu,gamma,nord)
%
% x     - Input Signal
% d     - Desired Signal
% mu    - Step size
% gamma - Leakage factor
% nord  - Filter length
% yout  - Filter output
% eout  - Error during convergence
% f     - Filter taps
%
% (c)Nils Westerlund, 2000

L=length(x);
f=zeros(nord,1);
yout=zeros(1,L);
eout=zeros(1,L);
for K=nord:L,
  xn=x(K:-1:K-nord+1);
  y=xn'*f;
  yout(K)=y;
  e=d(K)-y;
  eout(K)=e;
  f=(1-mu*gamma)*f+mu*e*xn;
end


A.4 RLS Algorithm

function [W]=rls(x,d,nord,lambda)
% [W]=rls(x,d,nord,lambda)
%
% x      - Input Signal
% d      - Desired Signal
% nord   - Filter length
% lambda - Forgetting factor
% W      - Filter taps
%
% (c)Nils Westerlund, 2000

x=x(:)';
d=d(:)';
delta=0.001;
P=inv(delta)*eye(nord);
xflip=fliplr(x);
xflip=[zeros(1,nord-1) xflip zeros(1,nord-1)];
W=zeros(length(xflip)-2*nord+2,nord);
z=zeros(5,1);
g=zeros(50,1);
alpha=0;
for k=1:length(xflip)-2*nord+1,
  z=P*xflip(end-k-nord+1:end-k)';
  g=z/(lambda+xflip(end-k-nord+1:end-k)*z);
  alpha=d(k+1)-xflip(end-k-nord+1:end-k)*W(k,:).';
  W(k+1,:)=W(k,:)+alpha*g.';
  P=(P-g*z.')/lambda;
end
W=W(end,:);


A.5 Minimum-Phase Filter Design

function [x2,h]=minfas(Admag,Ndft)
% [x2,h]=minfas(Admag,Ndft)
%
% Admag - Desired frequency response
% Ndft  - Length of DFT
% x2    - Minimum-phase Impulse response
% h     - Linear-phase Impulse response
%

Admag=Admag(:);
Admag=Admag';
fs=12000;
f=100*(1.2589).^(3:length(Admag)+2);
Admagi=Admag;
Admagi=[Admagi 0];
Ad=10.^(Admagi/20);
Ad=Ad(:);
Mag(1:Ndft/2+1)=Ad;
Mag(Ndft/2+2:Ndft)=flipud(Ad(2:Ndft/2));
xehat=real(ifft(log(Mag)));
xhat=2*xehat;
xhat(1)=xehat(1);   % the zeroth cepstral coefficient must not be doubled
N=Ndft/2;
x2=real(ifft(exp(fft(xhat(1:N),Ndft))));
x2=x2(1:N);

% ------------------------------------------
% Linear phase - FFT method
% ------------------------------------------
L=Ndft/2;
M=(L)/2;
Adh=Ad(1:length(Ad)-1).*exp(-j*2*pi*M*((0:length(Ad)-2))'/Ndft);
Magh(1:Ndft/2)=Adh;
Magh(Ndft/2+1)=Ad(Ndft/2+1);
Magh(Ndft/2+2:Ndft)=flipud(conj(Adh(2:Ndft/2)));
h=real(ifft(Magh));
h=h(1:L+1);


A.6 Coherence Function and Estimate of Transfer Function

function [Txy_H1,Txy_H2,Cxy]=sysest(x,y,nfft,winflag)
% [Txy_H1,Txy_H2,Cxy]=sysest(x,y,nfft,winflag)
%
% x       - Input Signal
% y       - Output Signal
% nfft    - FFT Length
% winflag - 1 -> windowing, 0 -> no windowing
% Txy_H1  - H1-estimate of transfer function
% Txy_H2  - H2-estimate of transfer function
% Cxy     - Coherence Function
%
% (c)Nils Westerlund, 2000

x=x(:);
y=y(:);
win=hanning(nfft);
k=fix(length(x)/nfft);
u=inv(nfft)*sum(abs(win).^2);
Pxx=zeros(nfft,1);
Pxy=zeros(nfft,1);
Pyy=zeros(nfft,1);
if(winflag)
  disp('Windowing...')
else
  disp('No windowing...')
end
for i=0:k-1
  if(winflag)
    xw=win.*x(i*nfft+1:(i+1)*nfft);
    yw=win.*y(i*nfft+1:(i+1)*nfft);
  else
    xw=x(i*nfft+1:(i+1)*nfft);
    yw=y(i*nfft+1:(i+1)*nfft);
  end
  X=fft(xw,nfft);
  X2=abs(X).^2;
  Y=fft(yw,nfft);
  Y2=abs(Y).^2;
  XY=Y.*conj(X);
  Pxx=Pxx+X2;
  Pyy=Pyy+Y2;
  Pxy=(Pxy+XY);
end
Txy_H1=Pxy./Pxx;
Txy_H2=Pyy./conj(Pxy);
Txy_H1=Txy_H1(1:nfft/2);
Txy_H2=Txy_H2(1:nfft/2);
Cxy=(abs(Pxy).^2)./(Pxx.*Pyy);
Cxy=Cxy(1:nfft/2);

A.7 Estimate of “True” Channel

function [Htrue,htrue,lchm_Hinv,rchm_Hinv]=...
    truechan(lchm_innoise,lchm_outnoise,rchm_innoise,rchm_outnoise)
% [Htrue,htrue,lchm_Hinv,rchm_Hinv]=...
%    truechan(lchm_innoise,lchm_outnoise,rchm_innoise,rchm_outnoise)
%
% lchm_innoise  - Left channel at mouth, input noise
% lchm_outnoise - Left channel at mouth, output noise
% rchm_innoise  - Right channel at mouth, input noise
% rchm_outnoise - Right channel at mouth, output noise
% Htrue         - "True" channel transfer function
% htrue         - Impulse response for "true" channel
% lchm_Hinv     - Est. of equ. transfer func., left ch. at mouth
% rchm_Hinv     - Est. of equ. transfer func., right ch. at mouth
%
% (c)Nils Westerlund, 2000

[lchm_Hinv,F]=tfe(lchm_outnoise,lchm_innoise,512);
[rchm_Hinv,F]=tfe(rchm_outnoise,rchm_innoise,512);
nfft=2*length(lchm_Hinv);
Htrue=sqrt(lchm_Hinv.*rchm_Hinv);
Htrue=[Htrue;flipud(conj(Htrue(2:end-1)))];
htrue=real(ifft(Htrue));
htrue=[htrue(nfft/2+1:end);htrue(1:nfft/2)];
Htrue=Htrue(1:nfft/2);


Bibliography

[1] Proakis J. G., Manolakis D. G. (1996). Digital Signal Processing: Principles, Algorithms and Applications (Prentice-Hall).

[2] Hayes M. H. (1996). Statistical Digital Signal Processing and Modeling (Wiley).

[3] Widrow B., Stearns S. D. (1985). Adaptive Signal Processing (Prentice-Hall).

[4] MatLab Reference Guide - System Identification Toolbox.

[5] MatLab Reference Guide.

[6] Parks T. W., Burrus C. S. (1987). Digital Filter Design (Wiley).

[7] Hakansson B., Carlsson P., Brandt A., Stenfelt S. (1995). “Linearity of sound transmission through the human skull in vivo,” J. Acoust. Soc. Am. 99, 2239-2243.

[8] Hellstrom P-A., Axelsson A. (1991). “Miniature microphone probe tube measurements in the external auditory canal,” J. Acoust. Soc. Am. 93(2), 907-919.
