Post on 09-Jun-2020

Calibration based on duration quality measures function in noise robust speaker recognition for NIST SRE’12

Miranti Indar Mandasari, Rahim Saeidi and David van Leeuwen.

Biometric Technologies in Forensic Science (BTFS) Conference, 14 October 2013

Outline

● Introduction,

● Speaker recognition system,

● Corpora,

● Experiment setup,

● Calibration techniques,

– Conventional linear, and

– Quality measure function (QMF).

● Performance measures,

● Results, and

● Conclusion.

Introduction

● The importance of likelihood ratio calibration in speaker recognition:

– Likelihood ratio as a preferable form of score for forensic purposes,

– Acknowledged by the speaker recognition community through speaker recognition evaluation (SRE) by NIST, and

– Often, scores produced by the system are not in likelihood ratio form.

● Classic challenges in speaker recognition:

– Short duration, and

– Noisy speech.

Speaker recognition system

● Speech enhancement and feature extraction stage:

– Dynamic noise suppression rule and Wiener filter,

– 60-dimensional MFCC features, and

– Speech activity detection and feature warping.

● Modeling stage:

– Gender-dependent universal background model (UBM) with 2048 components,

– 400-dimensional i-vectors,

– 200-dimensional linear discriminant analysis (LDA),

– Pre-PLDA modeling: i-vector centering, within-class covariance normalization (WCCN), and i-vector length normalization, and

– Probabilistic linear discriminant analysis (PLDA) scoring.

Corpora

● NIST SRE'12 database:

– Duration variability, and

– Noise conditions (crowd & HVAC):

● Clean / no alteration,

● 15 dB noisy, and

● 6 dB noisy.

● Three datasets in the experiments:

– Development set from I4U (Dev-I4U),

– Evaluation set from I4U (Eval-I4U), and

– NIST SRE 2012 protocols (Eval-SRE'12).

● I4U is a joint effort of 9 research institutes and universities across 4 continents to participate in the NIST SRE'12 evaluation.

Calibration

● Calibration is:

– The ability to set a threshold optimally if scores are used for decisions, or

– The ability to produce likelihood ratios that lead to minimum Bayes' risk for any cost function.

● Calibration techniques:

– Linear calibration with 2 parameters (conventional), and

– Linear calibration with additional quality measure function (QMF).

● Calibration stages:

– Training calibration parameters: Dev-I4U, and

– Evaluation of calibration: Dev-I4U, Eval-I4U, and Eval-SRE'12.
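The second definition can be made concrete. Given a calibrated log-likelihood ratio, the minimum-Bayes-risk decision compares it with a threshold that depends only on the target prior and the error costs. The following is a minimal sketch; the function names and the example prior are illustrative assumptions, not the NIST SRE'12 operating points.

```python
import math

def bayes_threshold(p_target, c_miss=1.0, c_fa=1.0):
    """Log-LR threshold that minimises expected cost for a given prior and costs."""
    return math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))

def decide(log_lr, p_target, c_miss=1.0, c_fa=1.0):
    """Accept the same-speaker hypothesis when the log LR exceeds the threshold."""
    return log_lr > bayes_threshold(p_target, c_miss, c_fa)

# Example with an assumed target prior of 1 %: the threshold is log(99).
print(bayes_threshold(0.01))
print(decide(5.0, 0.01), decide(4.0, 0.01))
```

With well-calibrated likelihood ratios the same scores can serve any prior and cost setting; only the threshold changes.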

Linear Calibration

$\log \mathrm{LR} = w_0 + w_1 s$

where LR is the likelihood ratio, $w_0$ the offset parameter, $w_1$ the scaling parameter, and $s$ the raw score.

● This two-parameter linear calibration is referred to as conventional calibration,

● It is a monotonically increasing score-to-likelihood-ratio transformation, so the discriminability stays the same, and

● The parameters w0 and w1 are found by minimizing cross-entropy (or Cllr) on a development set, i.e., by logistic regression.
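Training the two parameters can be sketched as follows. This is a minimal illustration, assuming plain gradient descent on a cross-entropy objective that weights target and non-target trials equally; the function names, step size, and toy scores are assumptions, and a real system would normally use a dedicated logistic regression solver.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def cross_entropy(w0, w1, tar, non):
    """Class-balanced cross-entropy of the calibrated log LRs w0 + w1 * s."""
    c_tar = sum(softplus(-(w0 + w1 * s)) for s in tar) / len(tar)
    c_non = sum(softplus(w0 + w1 * s) for s in non) / len(non)
    return c_tar + c_non

def train_linear_calibration(tar, non, step=0.1, n_iter=5000):
    """Fit offset w0 and scaling w1 by gradient descent on the cross-entropy."""
    w0, w1 = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for s in tar:  # target trials push the log LR up
            g = -sigmoid(-(w0 + w1 * s)) / len(tar)
            g0 += g
            g1 += g * s
        for s in non:  # non-target trials push the log LR down
            g = sigmoid(w0 + w1 * s) / len(non)
            g0 += g
            g1 += g * s
        w0 -= step * g0
        w1 -= step * g1
    return w0, w1

# Toy development scores (made up for illustration only).
tar = [1.0, 2.0, 3.0, 0.0]
non = [-1.0, -2.0, 0.5, -3.0]
w0, w1 = train_linear_calibration(tar, non)
print(w0, w1)
```

Because the fitted scaling is positive, the mapping preserves the ordering of the raw scores, which is why discrimination is unchanged.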

QMF calibration

● QMF stands for quality measure function,

● QMF calibration is a linear calibration approach with quality measures as extra terms, and

● There are 4 proposed duration QMFs.

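Quality Measure Function (QMF)

$\log \mathrm{LR} = w_0 + w_1 s + Q(d_m, d_t; w_2, \ldots)$

where LR is the likelihood ratio, $w_0$ the offset parameter, $w_1$ the scaling parameter, $s$ the raw score, $d_m$ the duration of the model segment, $d_t$ the duration of the test segment, and $w_2, \ldots$ the extra offset parameters. The quality measure function $Q$ acts as a duration-dependent offset.

[Equations: the four duration quality measure functions, Q1 to Q4.]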
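Applying a QMF calibration can be sketched as below. The functional form used here, a weighted sum of log-durations, is a hypothetical stand-in chosen for illustration and is not claimed to be one of the four proposed QMFs; the function names and weights are likewise assumptions.

```python
import math

def qmf_log_duration(d_model, d_test, w2, w3):
    """Hypothetical duration QMF: weighted log-durations (durations in seconds)."""
    return w2 * math.log(d_model) + w3 * math.log(d_test)

def calibrate_with_qmf(score, d_model, d_test, w):
    """Calibrated log LR = w0 + w1 * score + Q(d_model, d_test)."""
    w0, w1, w2, w3 = w
    return w0 + w1 * score + qmf_log_duration(d_model, d_test, w2, w3)

# Illustrative weights only; in practice they would be trained on a development set.
w = (-1.0, 2.0, 0.5, 0.5)
print(calibrate_with_qmf(1.0, 60.0, 10.0, w))
print(calibrate_with_qmf(1.0, 60.0, 100.0, w))
```

For trials that share the same durations the QMF term is a constant shift, but across trials with different durations it can reorder the calibrated scores, so both calibration and discrimination may change.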

Performance measures (the lower the values, the better the performance)

● Equal error rate, E= or EER.

– Showing discrimination performance.

● Primary cost, Cprimary, of NIST SRE'12.

– Showing discrimination and calibration performances.

● Cost of log likelihood ratio, Cllr.

– Showing discrimination (minimum Cllr) and calibration (Cmc) performances.
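Two of these measures can be sketched in a few lines. The code below is a minimal illustration with assumed function names: Cllr follows its standard definition on natural-log likelihood ratios, while the EER is approximated by a simple threshold sweep rather than by the ROC convex hull that evaluation toolkits typically use.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def cllr(target_llrs, nontarget_llrs):
    """Cost of log-likelihood ratio, in bits, from natural-log LRs."""
    c_tar = sum(softplus(-x) for x in target_llrs) / len(target_llrs)
    c_non = sum(softplus(x) for x in nontarget_llrs) / len(nontarget_llrs)
    return 0.5 * (c_tar + c_non) / math.log(2.0)

def eer(target_scores, nontarget_scores):
    """Approximate equal error rate from a sweep over observed thresholds."""
    best_gap, best_eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        p_miss = sum(s < thr for s in target_scores) / len(target_scores)
        p_fa = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, best_eer = gap, 0.5 * (p_miss + p_fa)
    return best_eer

# Toy scores (made up for illustration only).
tar = [1.0, 2.0, 3.0, 0.0]
non = [-1.0, -2.0, 0.5, -3.0]
print(eer(tar, non), cllr(tar, non))
```

A system that always outputs a log LR of zero has a Cllr of exactly 1 bit, which is a useful reference point when reading Cllr values.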

Results

[Figure: EER (%) on Dev-I4U for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4. Annotations on the plot mark the performance measure (EER and C-primary), the dataset, the calibration technique, and the trials based on noise conditions.]

[Figure: Cllr on Dev-I4U for the Clean, 15 dB, and 6 dB conditions, with bars labelled N.A., O, Q1, Q2, Q3, and Q4, each split into minimum Cllr and Cmc. Annotations on the plot mark the performance measure (Cllr, min. Cllr, and Cmc), the dataset, and the trials based on noise conditions.]

● Cmc is the miscalibration cost: Cmc = Cllr - min. Cllr.
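The miscalibration cost Cmc = Cllr - min. Cllr can be sketched as follows. This is a minimal illustration with assumed function names: minimum Cllr is obtained by an optimal monotonic recalibration of the scores using the pool-adjacent-violators (PAV) algorithm, here without the Laplace smoothing that some toolkits add.

```python
import math

def cllr(tar_llrs, non_llrs):
    """Cost of log-likelihood ratio, in bits, from natural-log LRs."""
    sp = lambda x: max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    c_tar = sum(sp(-x) for x in tar_llrs) / len(tar_llrs)
    c_non = sum(sp(x) for x in non_llrs) / len(non_llrs)
    return 0.5 * (c_tar + c_non) / math.log(2.0)

def pav(labels):
    """Isotonic (non-decreasing) fit to 0/1 labels that are sorted by score."""
    blocks = []  # each block holds [sum of labels, number of trials]
    for y in labels:
        blocks.append([float(y), 1])
        # Merge while the previous block's mean is not below the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] * blocks[-1][1] >= blocks[-1][0] * blocks[-2][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    out = []
    for s, n in blocks:
        out.extend([s / n] * n)
    return out

def min_cllr(tar_scores, non_scores):
    """Cllr after optimal monotonic recalibration of the scores."""
    trials = [(s, 1) for s in tar_scores] + [(s, 0) for s in non_scores]
    # Targets first among tied scores, so that ties are always pooled.
    trials.sort(key=lambda t: (t[0], -t[1]))
    post = pav([y for _, y in trials])
    n_tar, n_non = len(tar_scores), len(non_scores)
    prior_odds = n_tar / n_non
    c_tar = c_non = 0.0
    for (_, y), p in zip(trials, post):
        if y == 1:  # p > 0 in any block that contains a target
            c_tar += math.log2(1.0 + prior_odds * (1.0 - p) / p)
        else:       # p < 1 in any block that contains a non-target
            c_non += math.log2(1.0 + p / ((1.0 - p) * prior_odds))
    return 0.5 * (c_tar / n_tar + c_non / n_non)

def cmc(tar_llrs, non_llrs):
    """Miscalibration cost: Cllr minus minimum Cllr."""
    return cllr(tar_llrs, non_llrs) - min_cllr(tar_llrs, non_llrs)

# Toy log LRs (made up for illustration only).
tar = [1.0, 2.0, 3.0, 0.0]
non = [-1.0, -2.0, 0.5, -3.0]
print(min_cllr(tar, non), cmc(tar, non))
```

Minimum Cllr depends only on the ordering of the scores, so it reflects discrimination, while Cmc isolates the loss caused by the score-to-likelihood-ratio mapping.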

Results on Dev-I4U

[Figure: EER (%) on Dev-I4U for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

[Figure: C-primary on Dev-I4U for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

[Figure: Cllr on Dev-I4U for the Clean, 15 dB, and 6 dB conditions, with bars labelled N.A., O, Q1, Q2, Q3, and Q4, each split into minimum Cllr and Cmc.]

Results on Eval-I4U

[Figure: EER (%) on Eval-I4U for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

[Figure: C-primary on Eval-I4U for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

[Figure: Cllr on Eval-I4U for the Clean, 15 dB, and 6 dB conditions, with bars labelled N.A., O, Q1, Q2, Q3, and Q4, each split into minimum Cllr and Cmc.]

Results on Eval-SRE'12

[Figure: EER (%) on Eval-SRE'12 for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

[Figure: Cllr on Eval-SRE'12 for the Clean, 15 dB, and 6 dB conditions, with bars labelled N.A., O, Q1, Q2, Q3, and Q4, each split into minimum Cllr and Cmc.]

[Figure: C-primary on Eval-SRE'12 for the Clean, 15 dB, and 6 dB conditions, comparing no calibration, conventional calibration, and QMF calibration with Q1, Q2, Q3, and Q4.]

Distribution of active speech duration in I4U and SRE'12 trials.

Conclusion

● Linear calibration with QMFs as additional terms shows a positive gain in system performance compared to the conventional linear calibration with two terms.

● Adding 1–2 extra parameters to the linear calibration through the QMF approach has the potential to improve both the calibration and the discrimination performance of a speaker recognition system.

● When applying a QMF, it is important to design a development set that matches the variability of duration in the evaluation set.

Thank you!

&

Questions?