[IEEE 2011 National Conference on Communications (NCC) - Bangalore, India (2011.01.28-2011.01.30)]...

Significance of the LP-MVDR Spectral Ratio Method in Whisper Detection

Arpit Mathur and Rajesh M. Hegde
Indian Institute of Technology Kanpur
Kanpur 208016 India
{arpitmat, rhegde}@iitk.ac.in

Abstract: A new spectral ratio method is proposed in this paper for detecting whispered segments within a normally phonated speech stream. The method is based on computing the ratio of the linear prediction (LP) spectrum to the minimum variance distortionless response (MVDR) spectrum. Both the linear prediction method and the LP residual method by themselves are found to be inadequate in modelling medium to high frequencies in the speech signal. In contrast, the MVDR method is robust in modelling spectra at all frequencies. This difference in spectral estimation between the two is utilized in the proposed spectral ratio method to separate whispered segments, which have fewer harmonics and more noise, from normally phonated segments of speech. A comparative analysis of the proposed method with other methods like the LP residual and the spectral flatness methods is described. Whisper detection experiments are conducted on the CHAINS database. The proposed method shows reasonable improvements as noted from the ROC curves and the whisper diarization error rate.

I. INTRODUCTION

Whispered speech is a natural speech mode when secrecy or quietness of the conversation is required. Whispered speech is different from voiced speech in many ways. Whisper results from the vocal cords being held rigid rather than vibrating periodically, leading to the characteristic suppression in intensity as well as comprehensibility of the generated speech signal. Whisper is hence characterized by a shift of formants to higher frequencies, concentration of spectral energy in higher frequency bands and lack of a specific harmonic structure. Conventional speech recognition engines are reported to give poor results with whispered speech signals. Hence there is a need to recognize the whispered segments and adopt suitable modelling techniques for better speech recognition results. Linear prediction [1] is the most widely used and accepted method for whisper detection [2] and also whisper island detection [3], [4]. Other methods like vocal effort change point detection have also been used for improved whisper detection within normally phonated audio streams in this context [5].

These methods in general parameterize an AR spectrum using a least-squares error method. However, the LP model works well only for speech formants as in voiced speech, and more specifically at low frequencies. The most undesirable effect of these techniques is that the LP model tends to overestimate the power at formant frequencies. Moreover, increasing the model order increases the overestimation rather than correcting it. Hence this method is able to resolve harmonics but is poor at estimating the power at those frequencies in the spectrum, which leads to poor characterization of the vocal tract transfer function. The minimum variance distortionless response (MVDR) spectrum [6], on the other hand, is capable of modelling the power of the spectrum efficiently at all harmonic frequencies due to the nature of the estimation method. The MVDR spectrum also responds to an increase in model order and improves the model at higher harmonic frequencies [7]. As whisper is characterized by formant shifts and an overall increased concentration of power (due to more noise) in high frequency bands, the models must provide good modelling results in those bands. Thus exploiting the MVDR spectrum is desirable for robust whisper detection. Moreover, MVDR coefficients can be computed from the LP coefficients themselves [8], making the process computationally less expensive.

The paper begins with a discussion on the inadequacy of the LP spectrum to model signal power at higher frequencies and hence in detecting whispered segments within normally phonated speech. The proposed spectral ratio method is described next with an algorithm to detect whispered speech segments. Whisper segmentation plots are illustrated for the proposed spectral ratio method and also for two conventional methods, the spectral flatness method [9], [10] and the LP residual method. Experimental results on the CHAINS database [11], [12] are reported next, followed by concluding remarks on the significance and future scope of the work presented herein.

II. THE LP SPECTRUM AND WHISPER DETECTION

Let the input speech signal be $s(n)$. The LP method models speech as an autoregressive process

$$\hat{s}(n) = \sum_{k=1}^{m} a_k s(n-k) \tag{1}$$

Hence the residual signal or the modelling error $e(n)$ is

$$e(n) = s(n) - \sum_{k=1}^{m} a_k s(n-k) \tag{2}$$

$$E(z) = S(z)\left(1 - \sum_{k=1}^{m} a_k z^{-k}\right) \tag{3}$$

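This is mathematically equivalent to applying an FIR filter $A(z) = 1 - \sum_{k=1}^{m} a_k z^{-k}$ to the input signal $s(n)$. The output of the filter is the residual $e(n)$, which plays the role of the excitation term $G u(n)$ in the model of [1], and its spectrum is the error spectrum $E(z)$.

$$E(z) = S(z) A(z) \tag{4}$$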
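Equations (1) to (4) can be illustrated with a minimal sketch, assuming the autocorrelation method with the Levinson-Durbin recursion as the way the coefficients $a_k$ are obtained (the paper does not state which LP solver is used). The function names `lp_coefficients` and `lp_residual` are hypothetical.

```python
import numpy as np

def lp_coefficients(s, m):
    """Return (a, err): predictor coefficients a_1..a_m of Eq. (1) and the
    final prediction error energy, via the Levinson-Durbin recursion.
    Assumption: autocorrelation method on the whole input segment."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    # Autocorrelation lags r[0..m]
    r = np.array([np.dot(s[: n - k], s[k:]) for k in range(m + 1)])
    a = np.zeros(m)
    err = r[0]
    for i in range(m):
        # Reflection coefficient for model order i + 1
        k = (r[i + 1] - np.dot(a[:i], r[i:0:-1])) / err
        a_prev = a[:i].copy()
        a[:i] = a_prev - k * a_prev[::-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lp_residual(s, a):
    """Residual e(n) of Eq. (2): filter s(n) with A(z) = 1 - sum_k a_k z^-k."""
    fir = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(np.asarray(s, dtype=float), fir)[: len(s)]
```

For a signal that really is autoregressive, the estimated `a` should approach the generating coefficients as the segment gets longer, and the residual energy should be smaller than the signal energy.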

978-1-61284-091-8/11/$26.00 ©2011 IEEE

It is well known that an infinite order AR model can always model the signal arbitrarily closely [1]. Hence equation (4) is an exact equation. But limiting the order of the AR model to $m$ leads to an approximation. Also, as the signal is finite ($n = 0, \ldots, N-1$), the error can at best be minimized with a finite order AR model. Let the minimum possible error for this model be $E_{min}$. Hence the equation becomes

$$S(z) = \frac{E_{min}}{1 - \sum_{k=1}^{m} a_k z^{-k}} \tag{5}$$

with nonzero $E_{min}$ and the transfer function being $1/A(z)$. The estimated power $\hat{P}(\omega)$ of the speech signal $S(z)$ can now be calculated by taking the square of the modulus as

$$\hat{P}(\omega) = \frac{|E(\omega)|^2}{\left|1 - \sum_{k=1}^{m} a_k z^{-k}\right|^2} \tag{6}$$

However, the error minimization strategy is the key to the inadequacy of LP spectra in estimating the power spectra at medium and high frequencies. To illustrate this, let us define the total error as

$$E_{net} = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} |E(\omega)|^2 \, d\omega \tag{7}$$

The standard Parseval's relation has been used in this context. Further substituting the value of $|E(\omega)|$ from (4) and $S(\omega)$ from (5), we have the total error as

$$E_{net} = \frac{E_{min}^2 T}{2\pi} \int_{-\pi/T}^{\pi/T} \frac{P(\omega)}{\hat{P}(\omega)} \, d\omega \tag{8}$$

Hence the total error can be represented in terms of the integral of the ratio of the actual power to the estimated power. As the model is just an approximation of the actual spectrum, it is important for a model to approximate the powers at the spectral frequencies relevant for whisper detection. To illustrate this further, consider the ratio inside the integral in (8). The numerator is the actual spectral power and the denominator is the estimated spectral power.
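The substitution that leads to (8) can be written out step by step. This is a sketch under the assumptions that $P(\omega) = |S(\omega)|^2$ denotes the actual power, that $\hat{P}(\omega)$ is the squared modulus of the order-$m$ model in (5), and that the Parseval factor in (7) is $T/2\pi$, as in (8).

```latex
\begin{align*}
|E(\omega)|^2 &= |S(\omega)|^2\,|A(\omega)|^2 = P(\omega)\,|A(\omega)|^2
  && \text{from (4)} \\
\hat{P}(\omega) &= \frac{E_{min}^2}{|A(\omega)|^2}
  \;\Rightarrow\; |A(\omega)|^2 = \frac{E_{min}^2}{\hat{P}(\omega)}
  && \text{from (5)} \\
E_{net} &= \frac{T}{2\pi}\int_{-\pi/T}^{\pi/T} P(\omega)\,\frac{E_{min}^2}{\hat{P}(\omega)}\,d\omega
  = \frac{E_{min}^2\,T}{2\pi}\int_{-\pi/T}^{\pi/T}\frac{P(\omega)}{\hat{P}(\omega)}\,d\omega
  && \text{from (7)}
\end{align*}
```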

In this context, two cases can be distinguished with respect to the spectral ratio.

A. Case 1: The estimated power $\hat{P}(\omega)$ is overestimated by $\epsilon$ times the actual power $P(\omega)$

The ratio within the integral in this case turns out to be

$$\frac{P(\omega)}{P(\omega)(1+\epsilon)} = \frac{1}{1+\epsilon} = 1 - \frac{\epsilon}{1+\epsilon} \tag{9}$$

B. Case 2: The estimated power $\hat{P}(\omega)$ is underestimated by $\epsilon$ times the actual power $P(\omega)$

The ratio within the integral in this case turns out to be

$$\frac{P(\omega)}{P(\omega)(1-\epsilon)} = 1 + \frac{\epsilon}{1-\epsilon} \tag{10}$$

Comparing the two aforementioned cases, the effect on the error is greater when $P(\omega) > \hat{P}(\omega)$ than when the inequality sign is reversed: for the same $\epsilon$, the ratio deviates from unity by $\epsilon/(1-\epsilon)$ in Case 2 but only by $\epsilon/(1+\epsilon)$ in Case 1. Hence the error in Case 2 is much larger than in Case 1. However, error minimization strategies do not take these cases into account, resulting in overestimation at certain frequencies. This overestimation at crucial harmonic frequencies is liable to give poor modelling results for whisper detection. The inaccurate modelling at higher frequencies by LP compounds the problem for whisper detection.
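The asymmetry between (9) and (10) can be checked numerically. The short sketch below evaluates both ratios for a few illustrative values of $\epsilon$ (the values and function names are chosen here for illustration only). For $\epsilon = 0.5$, for instance, the ratio is 2/3 in Case 1 and 2 in Case 2.

```python
def ratio_overestimate(eps):
    """Eq. (9): actual/estimated power when the estimate is (1 + eps) times the actual power."""
    return 1.0 / (1.0 + eps)

def ratio_underestimate(eps):
    """Eq. (10): actual/estimated power when the estimate is (1 - eps) times the actual power.
    Requires 0 <= eps < 1."""
    return 1.0 / (1.0 - eps)

if __name__ == "__main__":
    for eps in (0.1, 0.5, 0.9):
        over = ratio_overestimate(eps)
        under = ratio_underestimate(eps)
        # Deviation from the ideal ratio of 1 in each case
        print(f"eps={eps}: Case 1 deviation {over - 1.0:+.3f}, "
              f"Case 2 deviation {under - 1.0:+.3f}")
```

The deviation in Case 2 grows without bound as $\epsilon$ approaches 1, while in Case 1 it never exceeds 1 in magnitude.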

III. WHISPER DETECTION USING MVDR

As discussed in the preceding section, the LP envelope overestimates the actual speech spectrum at harmonic frequencies. This overestimation is of concern in whisper detection, as the harmonic frequencies shift to the higher frequency regions in whispered segments. A whisper spectrum is characterized by low SNR and hence high noise content. Hence there is a pronounced overestimation effect in LP based spectrum estimation at the formants that are shifted to a higher frequency, typically 1.2 to 1.4 times the normal frequency [10]. However, the MVDR spectral estimation method ensures that the signal of interest is not distorted in any frequency range. From a whisper detection standpoint, the smoothness of the MVDR spectrum in all spectral regions assumes great importance. We briefly describe the MVDR method of spectral estimation in the next subsection as a prelude to the discussion on the proposed spectral ratio for whisper detection.

A. The minimum variance spectrum estimation and whisper detection

The MVDR spectrum estimate [7] is a nonparametric, data adaptive technique that can be used to obtain better resolution than the DFT based spectrum estimation methods. The MVDR spectral estimate of order $M$ is given by

$$R^{mvdr}_{M}(e^{j\omega}) = \frac{1}{\mathbf{v}^H(\omega)\,\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)}, \tag{11}$$

where $\mathbf{R}_x$ is the $M \times M$ data autocorrelation matrix and

$$\mathbf{v}(\omega) = [1, e^{j\omega}, e^{j2\omega}, e^{j3\omega}, \ldots, e^{j(M-1)\omega}]^T. \tag{12}$$
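A direct evaluation of (11) and (12) can be sketched as follows. The autocorrelation estimator (biased, normalized by the segment length), the frequency grid on $[0, \pi]$ and the function name `mvdr_spectrum_direct` are assumptions of this sketch, not details given in the paper.

```python
import numpy as np

def mvdr_spectrum_direct(x, M, n_freq=256):
    """Evaluate Eq. (11) on n_freq frequencies in [0, pi].
    Assumption: R_x is the M x M symmetric Toeplitz matrix built from the
    biased autocorrelation estimate of the real-valued segment x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(M)]) / n
    idx = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
    R_inv = np.linalg.inv(r[idx])
    omega = np.linspace(0.0, np.pi, n_freq)
    # Column f of V is the steering vector v(omega_f) of Eq. (12)
    V = np.exp(1j * np.outer(np.arange(M), omega))
    # Quadratic form v^H R^{-1} v for every frequency at once
    denom = np.real(np.einsum("if,ij,jf->f", V.conj(), R_inv, V))
    return omega, 1.0 / denom
```

For unit-variance white noise, $\mathbf{R}_x$ is close to the identity, so the estimate is approximately flat at $1/M$. This is a convenient sanity check on the scaling convention used here.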

This estimate has some interesting properties which we briefly mention below.

It can be efficiently computed by exploiting the relationship with linear prediction methods as

$$R^{mvdr}_{M}(e^{j\omega}) = \frac{1}{\sum_{k=-M}^{M} \mu(k) e^{-j\omega k}} \tag{13}$$

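where the parameters $\mu(k)$ are obtained by a simple non-iterative computation involving the linear prediction coefficients, which are themselves obtained by minimizing the prediction error variance $P_e$ [8], as

$$\mu(k) = \begin{cases} \dfrac{1}{P_e} \sum_{i=0}^{M-k} (M + 1 - k - 2i)\, a_i a^{*}_{i+k}, & k = 0, \ldots, M \\ \mu^{*}(-k), & k = -M, \ldots, -1 \end{cases} \tag{14}$$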
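The LP route to the MVDR spectrum in (13) and (14) can be sketched as below. Several conventions are assumptions of this sketch: the weight in (14) is taken as $(M + 1 - k - 2i)$, the $a_i$ are taken to be the prediction error filter coefficients with $a_0 = 1$ (that is, the negatives of the predictor coefficients in (1)), $P_e$ is the error variance returned by the same recursion, and the matching matrix form uses an $(M+1) \times (M+1)$ autocorrelation matrix. The function names are hypothetical.

```python
import numpy as np

def levinson(r, M):
    """Prediction error filter c = [1, c_1, ..., c_M] and error variance P_e
    from autocorrelation lags r[0..M] (Levinson-Durbin recursion)."""
    c = np.zeros(M + 1)
    c[0] = 1.0
    err = r[0]
    for i in range(1, M + 1):
        k = -np.dot(c[:i], r[i:0:-1]) / err
        c[: i + 1] = c[: i + 1] + k * c[: i + 1][::-1]
        err *= 1.0 - k * k
    return c, err

def mvdr_mu(c, P_e):
    """mu(k) for k = 0..M as in Eq. (14), for real-valued coefficients."""
    M = len(c) - 1
    mu = np.zeros(M + 1)
    for k in range(M + 1):
        i = np.arange(M - k + 1)
        mu[k] = np.sum((M + 1 - k - 2 * i) * c[i] * c[i + k]) / P_e
    return mu

def mvdr_spectrum_from_lp(c, P_e, omega):
    """Eq. (13). For a real signal mu(-k) = mu(k), so the sum is a cosine series."""
    mu = mvdr_mu(c, P_e)
    k = np.arange(1, len(mu))
    denom = mu[0] + 2.0 * np.cos(np.outer(omega, k)) @ mu[1:]
    return 1.0 / denom
```

Under these conventions the result can be compared numerically against the direct matrix form $1 / (\mathbf{v}^H \mathbf{R}_x^{-1} \mathbf{v})$ built from the same lags, which avoids an explicit matrix inversion per frame.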

The filter bank interpretation of MVDR is most insightful for our problem. The MVDR spectrum at a given frequency $\omega_k$ can be viewed as the power at the output of an FIR filter whose coefficients $\beta = [h(0), h(1), \ldots, h(M-1)]^T$ are obtained as a solution to the following constrained optimization problem

$$\min_{\beta} \; \beta^H \mathbf{R}_x \beta \quad \text{subject to} \quad \beta^H \mathbf{v}(\omega_k) = 1.$$

The linear constraint ensures that the signal of interest is not distorted, and the minimization of the output power minimizes leakage from other frequencies.
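The link between this constrained problem and the estimate in (11) can be made explicit. The standard Lagrange multiplier solution, written here with the symbol $\beta_{opt}$ introduced only for this sketch and assuming $\mathbf{R}_x$ is Hermitian and invertible, gives a filter whose output power is exactly the reciprocal quadratic form of (11):

```latex
\begin{align*}
\beta_{opt} &= \frac{\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)}
                   {\mathbf{v}^H(\omega)\,\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)}, \\
\beta_{opt}^H\,\mathbf{R}_x\,\beta_{opt}
  &= \frac{\mathbf{v}^H(\omega)\,\mathbf{R}_x^{-1}\,\mathbf{R}_x\,\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)}
          {\left(\mathbf{v}^H(\omega)\,\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)\right)^2}
   = \frac{1}{\mathbf{v}^H(\omega)\,\mathbf{R}_x^{-1}\,\mathbf{v}(\omega)}.
\end{align*}
```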

B. The LP-MVDR spectral ratio method for whisper detection

It is important to note that the MVDR spectrum is a smoother spectrum when compared to the LP spectrum. This is on account of the fact that an MVDR spectrum at any frequency can be represented as a harmonic average of the LP spectra of a particular order [8].

$$\frac{1}{P_{MV}(\omega)} = \sum_{k=0}^{p} \frac{1}{P^{(k)}_{LP}(\omega)} \tag{15}$$

This averaging effect smooths out the spectrum in the regions of sharp rise, i.e., at the harmonics. Thus the MVDR spectrum tends to have lower amplitude than the corresponding LP spectrum at the harmonics. Figure 1 shows the various spectra (80th order) for a short segment of normally phonated speech. From the illustration in Figure 1, it is clear that an LP to MVDR ratio spectrum can be used to identify the whisper segments in speech, since this ratio is expected to be higher where the speech signal has significant harmonics in the higher frequency region than it is for the normally phonated speech spectrum. In the context of whispered speech, where the harmonic shifts are prominent, this ratio is expected to be high. Also, the ratio is expected to be robust to wide band noise because the LP spectrum of high order can still model the spectrum and the averaging effect of MVDR will eliminate the effect of wide band noise when a spectral ratio is taken. Formally, the LP-MVDR spectral ratio is defined as

Fig. 1. Diagram illustrating the various spectra for a short segment of normally phonated speech: MVDR (red), LP (blue), and FFT (gray) spectra. Axes: frequency in Hz (horizontal), magnitude in dB (vertical).

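$$X = \frac{\hat{P}_{LP}(\omega)}{P_{MV}(\omega)} = \frac{|E(\omega)|^2 \left(\sum_{l=-M}^{M} \mu(l) e^{-j\omega l}\right)}{\left|1 - \sum_{k=1}^{m} a_k z^{-k}\right|^2} \tag{16}$$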

The LP-MVDR ratio spectrum is computed on a short time basis for each speech data window. The ratio spectrum is further smoothed and a threshold is decided depending on the penalties fixed based on the False Alarm Rate and Detection Failure Rate. The whispered segments can then be segmented from the normally phonated segments of speech.

The salient steps used for whisper detection using the LP-MVDR ratio spectrum are described below:

∙ Hamming window the test speech waveform using a frame size of 20 ms and a frame overlap of 50%.

∙ Compute the MVDR coefficients using Equation 14 for each frame.

∙ Compute the smooth MVDR power spectrum for a sufficient number of frequency points.

∙ Compute the linear predictor coefficients for each frame.

∙ Compute the LP power spectrum for the same number of frequency points as used for computing the MVDR power spectrum.

∙ Compute the LP to MVDR ratio spectrum.

∙ Select the threshold according to the penalties required for False Alarm Rate and Detection Failure Rate to segment the whisper from within normally phonated speech.
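The steps above can be sketched end to end as follows. This is a minimal illustration, not the authors' implementation: the model order `M=12`, the reduction of each frame's ratio spectrum to a scalar (mean log ratio), the moving average length and the small diagonal loading are all assumptions, and every function name is hypothetical. The 20 ms frame, 50% overlap, Hamming window and forty frequency points follow the text.

```python
import numpy as np

def levinson(r, M):
    """Prediction error filter [1, c_1..c_M] and error variance from lags r[0..M]."""
    c = np.zeros(M + 1)
    c[0] = 1.0
    err = r[0]
    for i in range(1, M + 1):
        k = -np.dot(c[:i], r[i:0:-1]) / err
        c[: i + 1] = c[: i + 1] + k * c[: i + 1][::-1]
        err *= 1.0 - k * k
    return c, err

def lp_mvdr_ratio(frame, M=12, n_freq=40):
    """LP-to-MVDR ratio spectrum of one windowed frame at n_freq points in [0, pi]."""
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(M + 1)]) / n
    r[0] = r[0] * (1.0 + 1e-6) + 1e-12  # tiny diagonal loading (assumption)
    c, P_e = levinson(r, M)
    omega = np.linspace(0.0, np.pi, n_freq)
    lags = np.arange(M + 1)
    # LP power spectrum: P_e / |A(e^{jw})|^2
    A = np.exp(-1j * np.outer(omega, lags)) @ c
    P_lp = P_e / np.abs(A) ** 2
    # MVDR power spectrum from the same LP coefficients, Eqs. (13) and (14)
    mu = np.array([np.sum((M + 1 - k - 2 * np.arange(M - k + 1))
                          * c[: M - k + 1] * c[k:]) for k in lags]) / P_e
    P_mv = 1.0 / (mu[0] + 2.0 * np.cos(np.outer(omega, lags[1:])) @ mu[1:])
    return P_lp / P_mv

def frame_scores(x, fs, M=12, n_freq=40, frame_ms=20.0):
    """One scalar score per 20 ms Hamming-windowed frame with 50% overlap."""
    L = int(round(fs * frame_ms / 1000.0))
    hop = L // 2
    w = np.hamming(L)
    scores = []
    for start in range(0, len(x) - L + 1, hop):
        ratio = lp_mvdr_ratio(x[start:start + L] * w, M, n_freq)
        scores.append(np.mean(np.log(ratio)))  # assumed scalar summary
    return np.array(scores)

def detect_whisper(scores, threshold, smooth=5):
    """Moving-average smoothing followed by thresholding (True = whisper)."""
    kernel = np.ones(smooth) / smooth
    return np.convolve(scores, kernel, mode="same") > threshold
```

With these conventions the ratio is never below one, because (15) implies that the MVDR spectrum does not exceed the LP spectrum of the same order. The threshold passed to `detect_whisper` would in practice be chosen from the desired trade-off between false alarms and detection failures.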

IV. PERFORMANCE EVALUATION

In this section, we evaluate the proposed LP-MVDR ratio spectrum for detection of whispered segments within normally phonated speech streams from the CHAINS corpus [11], [12]. The detection results of the proposed method are also compared with the conventional spectral flatness measure and the widely used LP residual autocorrelation method. The results are presented as ROC curves and in terms of a whisper diarization error rate.

A. Whisper database

The CHAINS speech corpus [11] consists of 36 speakers with recordings done in two different sessions that were two months apart. The two recording sessions provided speech in six different speaking styles. The first recording session (solo) was carried out in a professional recording studio and speakers were recorded in a sound-attenuated booth. The recordings in the released corpus were done using a Neumann U87 condenser microphone. The second recording session (whisper) was carried out in a quiet office environment, using an AKG C420 headset condenser microphone. Both the whispered speech reading and solo reading parts of the corpus are used in our experiments. In the solo mode, the speakers are asked to speak with their natural pace and tone. In the whispered mode, on the other hand, subjects read all texts in a whisper. Any involuntary switch to modal voicing is interpreted as a dysfluency and leads to a restart of the phrase. The texts spoken in the corpus contain sections of four famous fables, twenty TIMIT sentences and nine sentences from the CSLU speaker identification corpus. Segments of solo and whispered speech that form a complete sentence are appended in our experiments to test the changes in the proposed ratio spectrum with the change in the mode of speaking.

B. Results of whisper segmentation

In order to illustrate the segmentation performance of the proposed ratio spectrum, experiments were conducted using a forty point LP-MVDR ratio spectrum. The segmentation results illustrated herein are based on the following paragraph from the CHAINS corpus.

"One fine day it occurred to the Members of the Body that they were doing all the work and the Belly was having all the food......the Hands could hardly move."

Whisper segmentation results with the proposed ratio spectrum are illustrated in Figure 2. The ratio spectrum was smoothed using a moving average filter and the corresponding segmentation result is shown in Figure 3.

Fig. 2. Whisper segmentation using the LP-MVDR ratio spectrum. LP-MVDR ratio spectrum (red) and speech signal with whisper (blue). Axes: sample number (horizontal), amplitude (vertical).

Fig. 3. Whisper segmentation using the smoothed LP-MVDR ratio spectrum. Legend: Speech Signal with Whisper, LP-MVDR Ratio Spectrum. Axes: sample number (horizontal), amplitude (vertical).

C. Comparison with other methods

In this section we compare the results of whisper detection with other methods like the spectral flatness method and the LP residual autocorrelation method.

1) Spectral flatness method of whisper detection: Spectral flatness [9] measures the flatness of the speech spectrum. A perfectly flat spectrum has a spectral flatness measure of zero. On the other hand, a spectrum with peaks and troughs will yield a value greater than zero but less than one. This measure is often used to separate voiced and unvoiced segments of speech [9]. Whispered speech has properties very close to unvoiced speech, i.e., high noise content, lack of clear harmonic structure and concentration of energy in the higher frequency region. The spectral flatness measure is therefore a natural candidate for whisper detection. An analysis of the spectral flatness method for whisper detection can be found in [10]. The results of whisper segmentation using the spectral flatness method are shown in Figure 4.

Fig. 4. Whisper segmentation using the Spectral Flatness method (in red). Legend: Spectral Flatness, Speech Signal with Whisper. Axes: sample number (horizontal), amplitude (vertical).

2) LP residual autocorrelation method: In this method, the speech is first segmented by a Hamming window of 20 ms with 50% overlap. Then the corresponding LP residual is used to segment the whispered and normally phonated parts [2]. LP overestimates the spectrum at the harmonic frequencies, as explained earlier. Hence the difference between the actual spectrum and the LP estimate is expected to show troughs at the harmonic frequencies. The correlation of this residual is computed between the two halves of a window and its maximum value is taken. This correlation measure is further smoothed over short duration windows and data clustering is done using k-means clustering. As whisper contains uncorrelated noise, the correlation thus computed is expected to be lower than that for the voiced segment. The result is shown in Figure 5, where the upper level of the output represents the whispered segment.
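The half-window correlation step of the LP residual method can be sketched as below. This is one possible reading of the description above, not the procedure of [2]: the mean removal, the normalization and the use of the full cross-correlation over all lags are assumptions, the function name is hypothetical, and the later smoothing and k-means clustering stages are omitted.

```python
import numpy as np

def half_window_correlation(residual_frame):
    """Maximum normalized cross-correlation between the two halves of an
    LP residual frame. Values lie in [-1, 1]; periodic (voiced) residuals
    are expected to score higher than noise-like (whispered) ones."""
    e = np.asarray(residual_frame, dtype=float)
    half = len(e) // 2
    a = e[:half] - np.mean(e[:half])
    b = e[half:2 * half] - np.mean(e[half:2 * half])
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    if norm == 0.0:
        return 0.0  # silent frame: no meaningful correlation
    xcorr = np.correlate(a, b, mode="full") / norm
    return float(np.max(xcorr))
```

A residual made of regularly spaced pulses, as produced by voiced excitation, gives a value near one, while a white noise residual gives a clearly lower value.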

D. Experimental results on whisper detection

Whisper detection experiments were conducted on the CHAINS database and the results are illustrated as an ROC curve in Figure 6. In Figure 6, the True Positive Rate (TPR) is defined as

$$TPR = \frac{\text{No. of correctly detected whisper segments}}{\text{Total no. of whisper segments}} \tag{17}$$

while the False Positive Rate (FPR) is defined as

$$FPR = \frac{\text{No. of wrongly detected whisper segments}}{\text{Total no. of phonated segments}} \tag{18}$$
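Definitions (17) and (18) translate directly into code. The sketch below assumes per-segment boolean ground truth labels and detector decisions, and traces ROC points by sweeping a threshold over a per-segment score; the function names and the convention that a score above the threshold means "whisper" are assumptions.

```python
import numpy as np

def tpr_fpr(is_whisper, detected):
    """TPR of Eq. (17) and FPR of Eq. (18).
    Assumes both whispered and phonated segments are present."""
    is_whisper = np.asarray(is_whisper, dtype=bool)
    detected = np.asarray(detected, dtype=bool)
    true_pos = np.sum(detected & is_whisper)
    false_pos = np.sum(detected & ~is_whisper)
    tpr = true_pos / np.sum(is_whisper)
    fpr = false_pos / np.sum(~is_whisper)
    return float(tpr), float(fpr)

def roc_points(is_whisper, scores, thresholds):
    """One (TPR, FPR) pair per threshold; a segment is flagged when score > threshold."""
    scores = np.asarray(scores, dtype=float)
    return [tpr_fpr(is_whisper, scores > t) for t in thresholds]
```

Plotting the TPR against the FPR over a range of thresholds gives a curve of the kind compared across methods in this section.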

Fig. 5. Whisper segmentation using the LPR-ACF method (in red). Legend: Quantized LP Residual ACF, Speech Signal with Whisper. Axes: sample number (horizontal), amplitude and quantized amplitude (vertical).

In our experiments on whisper detection, short pauses in phonated speech also exhibited high values of the LP-MVDR ratio spectrum. This can be observed in the ROC curve in Figure 6. This is expected, as short pauses usually contain only noise due to inhalation of air through the nostrils. However, the presence of short pauses increases the FPR. Hence an increased FPR is observed at higher threshold values, as indicated at the start of the ROC curve. Removal of short pauses is expected to give even better results. Note that the proposed ratio spectrum gives reasonably better performance than the other two conventional methods in terms of the area under the ROC curve.

Fig. 6. ROC curve illustrating the whisper detection performance of the LP-MVDR ratio spectrum along with other methods. Legend: LP-MACF, Spectral Flatness, LP-MVDR ratio spectrum. Axes: False Positive Rate (FPR) (horizontal), True Positive Rate (TPR) (vertical).

For evaluating the performance of the detection methods, a performance index similar to the one in [5] is used. The possible errors in whisper detection are generally the false alarm (FA) and the detection failure (DF). Hence the whisper diarization error rate (WDER) using the above sources of error can be defined as

$$WDER = \frac{C_1 \cdot FA + C_2 \cdot DF}{N_f} \tag{19}$$

where $N_f$ denotes the total number of speech frames, and $C_1$ and $C_2$ are the weights assigned to FA and DF respectively. Note that $C_1$ and $C_2$ are selected based on the penalties fixed for the false alarm rate and the detection failure rate respectively. The whisper detection results in terms of WDER are shown in Table I.

TABLE I
COMPARISON OF WDER FOR THE THREE WHISPER DETECTION METHODS

Spectral Flatness | Modified Autocorrelation | LP-MVDR Ratio
0.4046            | 0.3507                   | 0.2869
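Equation (19) can be evaluated with a few lines of code. The sketch below assumes FA and DF are frame counts; the default weights of 1 and the function name `wder` are illustrative choices, since the weights used in the experiments are not stated.

```python
def wder(false_alarms, detection_failures, n_frames, c1=1.0, c2=1.0):
    """Whisper diarization error rate of Eq. (19).
    false_alarms, detection_failures: frame counts (assumption);
    c1, c2: penalty weights for false alarms and detection failures."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (c1 * false_alarms + c2 * detection_failures) / n_frames
```

For example, 10 false alarm frames and 20 detection failure frames out of 100 frames give a WDER of 0.3 with unit weights.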

V. CONCLUSIONS

A new method based on the linear prediction to minimum variance spectral ratio is proposed for detection of whispered speech segments in normally phonated speech streams. The difference in spectral estimation between the two techniques is utilized in the proposed spectral ratio method to effectively separate the whispered segments from normally phonated segments of speech. The ratio spectrum gives reasonably better performance in terms of the area under the ROC curve and the WDER. However, the presence of short pauses and breaths reduces the efficiency of the method in certain regions of the ROC curve. Along with this issue, we are also currently addressing the issue of model order selection, which can further improve the whisper detection performance.

VI. ACKNOWLEDGMENT

This work was funded by the BITCOE under project numbers 20080252 and 20080253.

REFERENCES

[1] J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, no. 4, pp. 561-580, Apr. 1975.

[2] M. A. Carlin, B. Y. Smolenski, and S. J. Wenndt, "Unsupervised Detection of Whispered Speech in the Presence of Normal Phonation," in Proc. INTERSPEECH-2006, paper 1990-Mon3CaP.13, 2006.

[3] C. Zhang and J. H. L. Hansen, "Advancements in Whisper-Island Detection Using the Linear Predictive Residual," in Proc. ICASSP 2010, pp. 5170-5173, 2010.

[4] C. Zhang and J. H. L. Hansen, "Advancements in whisper-island detection within normally phonated audio streams," in Proc. INTERSPEECH-2009, pp. 860-863, 2009.

[5] C. Zhang and J. H. L. Hansen, "Effective Segmentation based on Vocal Effort Change Point Detection," in Proc. ITRW, Aalborg, 2008.

[6] P. J. Sherman and K. N. Lou, "On the family of ML Spectral Estimates for mixed spectrum identification," IEEE Trans. Signal Processing, vol. 39, pp. 644-655, Mar. 1991.

[7] M. N. Murthi and B. D. Rao, "Minimum Variance Distortionless Response (MVDR) Modelling of Voiced Speech," in Proc. ICASSP 1997, Munich, Germany, vol. 3, pp. 1687-1690, 1997.

[8] J. P. Burg, "The relationship between maximum entropy spectra and maximum likelihood spectra," Geophysics, vol. 37, pp. 375-376, 1972.

[9] A. Gray and J. D. Markel, "A Spectral Flatness measure for studying the Autocorrelation method of Linear Prediction of Speech Analysis," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-22, no. 3, pp. 207-217, 1974.

[10] T. Ito, K. Takeda, and F. Itakura, "Analysis and recognition of whispered speech," Speech Communication, vol. 45, no. 2, pp. 139-152, 2005.

[11] F. Cummins, M. Grimaldi, T. Leonard, and J. Simko, "The CHAINS corpus: Characterizing Individual Speakers," in Proc. SPECOM, pp. 431-435, 2006.

[12] M. Grimaldi and F. Cummins, "Speaker Identification Using Instantaneous Frequencies," IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 6, pp. 1097-1111, 2008.
