Andrew Rosenberg - Lecture 20: Model Adaptation

Transcript of Andrew Rosenberg - Lecture 20: Model Adaptation

  • 1/35

    Lecture 20: Model Adaptation

    Machine Learning

    April 15, 2010

  • 2/35

    Today

    Adaptation of Gaussian Mixture Models: Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR)

    Application: Speaker Recognition (UBM-MAP + SVM)

  • 3/35

    The Problem

    I have a little bit of labeled data, and a lot of unlabeled data.

    I can model the training data fairly well.

    But we always fit training data better than testing data.

    Can we use the wealth of unlabeled data to do better?

  • 4/35

    Let's use a GMM

    GMMs to model labeled data. In the simplest form, one mixture component per class.

  • 5/35

    Labeled training of GMM

    MLE estimators of parameters. Or these can be used to seed EM.

    $$\mu_i = \frac{\sum_t p(i|x_t)\,x_t}{\sum_t p(i|x_t)} = \frac{\sum_{x_t \in c_i} x_t}{n_i}$$

    $$w_i = \frac{\sum_t p(i|x_t)}{N} = \frac{n_i}{N}$$

    $$\Sigma_i = \frac{\sum_t p(i|x_t)\,(x_t - \mu_i)(x_t - \mu_i)^T}{n_i}$$
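
A minimal numpy sketch of these estimators, assuming hard labels (one mixture component per class, so p(i|x_t) is just a 0/1 indicator); all names here are illustrative rather than from the lecture.

```python
import numpy as np

def mle_gmm_params(X, y, n_classes):
    """MLE weights, means, and covariances from labeled data.

    X: (N, d) features; y: (N,) integer class labels.
    With one component per class, p(i|x_t) is the indicator that
    x_t carries label i, so the sums reduce to per-class statistics.
    """
    N, d = X.shape
    weights = np.zeros(n_classes)
    means = np.zeros((n_classes, d))
    covs = np.zeros((n_classes, d, d))
    for i in range(n_classes):
        Xi = X[y == i]
        n_i = len(Xi)
        weights[i] = n_i / N              # w_i = n_i / N
        means[i] = Xi.mean(axis=0)        # mu_i = sum_t x_t / n_i
        diff = Xi - means[i]
        covs[i] = diff.T @ diff / n_i     # Sigma_i = sum_t (x_t - mu_i)(x_t - mu_i)^T / n_i
    return weights, means, covs
```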

  • 6/35

    Adapting the mixtures to new data

    Essentially, let EM start with the MLE parameters as seeds. Expand the available data for EM, and proceed until convergence.
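
As a concrete reading of this slide, the sketch below seeds scikit-learn's GaussianMixture with parameters estimated from a small labeled set and then lets EM run on the pooled labeled and unlabeled data; the toy data and component count are placeholders, not the lecture's example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy stand-ins: a little labeled data per class, a lot of unlabeled data.
X_lab = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y_lab = np.repeat([0, 1], 20)
X_unlab = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(4, 1, (500, 2))])

# MLE seeds from the labeled data (one component per class).
means = np.array([X_lab[y_lab == i].mean(axis=0) for i in (0, 1)])
covs = np.array([np.cov(X_lab[y_lab == i].T) for i in (0, 1)])
weights = np.bincount(y_lab) / len(y_lab)

# Start EM from those seeds and run it on all of the available data.
gmm = GaussianMixture(
    n_components=2,
    covariance_type="full",
    weights_init=weights,
    means_init=means,
    precisions_init=np.linalg.inv(covs),  # sklearn takes precisions, not covariances
)
gmm.fit(np.vstack([X_lab, X_unlab]))      # proceed until convergence
```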

  • 7/35

    Adapting the mixtures to new data

    Essentially, let EM start with the MLE parameters as seeds. Expand the available data for EM, and proceed until convergence.

  • 8/35

    Problem with EM adaptation

    The initial labeled seeds could contribute very little to the final model.

  • 9/35

    Problem with EM adaptation

    The initial labeled seeds could contribute very little to the final model.

  • 10/35

    MAP Adaptation

    Constrain the contribution of the unlabeled data.

    Let the alpha terms dictate how much weight to give to the new, unlabeled data compared to the existing estimates.

    $$\hat{\mu}_i = \alpha_i \frac{\sum_u p(i|x_u)\,x_u}{\sum_u p(i|x_u)} + (1 - \alpha_i)\,\mu_i$$

    $$\hat{w}_i = \alpha_i \frac{\sum_u p(i|x_u)}{U} + (1 - \alpha_i)\,w_i$$

    $$\hat{\Sigma}_i = \alpha_i \frac{\sum_u p(i|x_u)\,(x_u - \mu_i)(x_u - \mu_i)^T}{U} + (1 - \alpha_i)\,\Sigma_i$$
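
A numpy sketch of these updates, following the slide's normalizations (the data mean uses the soft count, the weight and covariance terms use U); a single fixed alpha shared across components is an assumption made here to keep the example short.

```python
import numpy as np
from scipy.stats import multivariate_normal

def map_adapt(weights, means, covs, X_u, alpha=0.5):
    """MAP-adapt GMM parameters toward unlabeled data X_u of shape (U, d)."""
    U = len(X_u)
    K, _ = means.shape

    # Posteriors p(i|x_u) under the current model.
    lik = np.stack([w * multivariate_normal.pdf(X_u, m, c)
                    for w, m, c in zip(weights, means, covs)], axis=1)  # (U, K)
    post = lik / lik.sum(axis=1, keepdims=True)

    new_w, new_mu, new_cov = np.empty_like(weights), np.empty_like(means), np.empty_like(covs)
    for i in range(K):
        p = post[:, i]
        n_i = p.sum()
        data_mu = (p[:, None] * X_u).sum(axis=0) / n_i
        diff = X_u - means[i]
        data_cov = np.einsum("u,ud,ue->de", p, diff, diff) / U
        # Convex combination of new-data statistics and existing estimates.
        new_mu[i] = alpha * data_mu + (1 - alpha) * means[i]
        new_w[i] = alpha * (n_i / U) + (1 - alpha) * weights[i]
        new_cov[i] = alpha * data_cov + (1 - alpha) * covs[i]
    new_w /= new_w.sum()   # keep the weights a proper distribution
    return new_w, new_mu, new_cov
```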

  • 11/35

    MAP adaptation

    The movement of the parameters is constrained.

  • 12/35

    MLLR adaptation

    Another idea: Maximum Likelihood Linear Regression. Apply an affine transformation to the means. Don't change the covariance matrices.

    $$\hat{\mu} = W\mu$$

  • 13/35

    MLLR adaptation

    Another view on adaptation. Apply an affine transformation to the means. Don't change the covariance matrices.

    $$\hat{\mu} = W\mu$$

  • 14/35

    MLLR adaptation

    The new means are the MLE of the means with the new data.

    $$\hat{\mu}_i = W_i\mu_i = \frac{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)\,x}{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)}$$

  • 15/35

    MLLR adaptation

    The new means are the MLE of the means with the new data.

    $$\hat{\mu}_i = W_i\mu_i = \frac{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)\,x}{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)}$$

  • 16/35

    MLLR adaptation

    The new means are the MLE of the means with the new data.

    $$\hat{\mu}_i = W_i\mu_i = \frac{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)\,x}{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)}$$

    $$W_i = \frac{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)\,x}{\sum_x p(i|x, w_i, \mu_i, \Sigma_i)}\,\left(\mu_i^{-1}\right)^T$$
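
The closed form on this slide is hard to recover from the notes, so the sketch below takes one simple reading of the idea: compute each component's target mean (the posterior-weighted mean of the new data) and fit a single shared affine transform W by weighted least squares over the components that use it, leaving the covariances untouched. The shared, bias-augmented W and the least-squares fit are assumptions of this sketch, not necessarily the lecture's exact estimator.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mllr_adapt_means(weights, means, covs, X_new):
    """MLLR-style sketch: move the means with one shared affine transform W."""
    K, d = means.shape
    lik = np.stack([w * multivariate_normal.pdf(X_new, m, c)
                    for w, m, c in zip(weights, means, covs)], axis=1)
    post = lik / lik.sum(axis=1, keepdims=True)       # p(i | x)
    occ = post.sum(axis=0)                            # soft count per component
    targets = (post.T @ X_new) / occ[:, None]         # MLE of each mean under the new data

    ext = np.hstack([means, np.ones((K, 1))])         # [mu_i; 1] so W carries a bias
    sw = np.sqrt(occ)[:, None]
    # Weighted least squares: minimize sum_i occ_i * || target_i - W [mu_i; 1] ||^2
    W_t, *_ = np.linalg.lstsq(sw * ext, sw * targets, rcond=None)
    W = W_t.T                                         # (d, d+1)

    new_means = ext @ W.T                             # covariances are left unchanged
    return new_means, W
```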

  • 17/35

    Why MLLR?

    We can tie the transformation matrices of mixture components.

    For example: You know that the red and green classes are similar. Assumption: Their transformations should be similar.

  • 18/35

    Why MLLR?

    We can tie the transformation matrices of mixture components.

    For example: You know that the red and green classes are similar. Assumption: Their transformations should be similar.

  • 19/35

    Application of Model Adaptation

    Speaker Recognition. Task: Given speech from a known set of speakers, identify the speaker.

    Assume there is training data from each speaker. Approach:

    Model a generic speaker. Identify a speaker by its difference from the generic speaker.

    Measure this difference by adaptation parameters.

  • 20/35

    Speech Representation

    Extract a feature representation of speech. Samples every 10 ms.

    MFCC, 16 dimensions.
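
For concreteness, a sketch of this step with librosa, which is one possible tool (the lecture does not name one); the 10 ms hop and 16 coefficients follow the slide, and the filename and 16 kHz rate are assumptions.

```python
import librosa

# Load one utterance; the path and sampling rate are illustrative.
y, sr = librosa.load("speech.wav", sr=16000)

# One 16-dimensional MFCC vector every 10 ms (160 samples at 16 kHz).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=16, hop_length=160)
frames = mfcc.T   # shape (n_frames, 16): one row per 10 ms sample
```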

  • 21/35

    Similarity of sounds

    [Figure: scatter of MFCC1 vs. MFCC2 with clusters for the sounds /s/, /b/, /o/, /u/]

  • 22/35

    Universal Background Model

    If we had labeled phone information, that would be great.

    But it's expensive and time consuming. So just fit a GMM to the MFCC representation of all of the speech you have.

    Generally all but one example, but we'll come back to this.
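
A sketch of fitting the UBM: pool the MFCC frames from all of the speech you have and fit one GMM to them; the random stand-in data, 64 components, and diagonal covariances are choices made for the example, not taken from the lecture.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for the pooled MFCC frames of all speakers, shape (n_frames, 16);
# in practice this is np.vstack of the per-utterance frames extracted above.
pooled = rng.normal(size=(20000, 16))

# The UBM is just a GMM over everyone's speech, with no phone labels.
ubm = GaussianMixture(n_components=64, covariance_type="diag")
ubm.fit(pooled)
```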

  • 23/35

    MFCC Scatter

    [Figure: scatter of MFCC1 vs. MFCC2 with clusters for /s/, /b/, /o/, /u/]

  • 24/35

    UBM fitting

    [Figure: MFCC1 vs. MFCC2 scatter with /s/, /b/, /o/, /u/]

  • 25/35

    MAP adaptation

    When we have a segment of speech to evaluate:

    Generate MFCC features.

    Use MAP adaptation on the UBM Gaussian Mixture Model.

  • 26/35

    MAP Adaptation

    [Figure: MFCC1 vs. MFCC2 scatter with /s/, /b/, /o/, /u/]

  • 27/35

    MAP Adaptation

    [Figure: MFCC1 vs. MFCC2 scatter with /s/, /b/, /o/, /u/]

  • 28/35

    UBM-MAP

    Claim: The differences between speakers can be represented by the movement of the mixture components of the UBM.

    How do we train this model?

  • 29/35

    UBM-MAP training

    [Diagram: Training Data with Speaker N held out -> UBM Training; Speaker N -> MAP -> Supervector]

    Supervector: a vector of the adapted means of the Gaussian mixture components.

    $$x_i = [\mu_0\ \mu_1\ \ldots\ \mu_k]^T \qquad t_i = \text{Speaker ID}$$

    Train a supervised model with these labeled vectors.
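
A sketch of the training loop this slide describes: MAP-adapt only the UBM means toward each training utterance, stack the adapted means into a supervector, pair it with the speaker ID, and feed the labeled vectors to a multiclass SVM. The helper names, the means-only adaptation, and the `utterances` list of (frames, speaker_id) pairs are assumptions of the sketch.

```python
import numpy as np
from sklearn.svm import SVC

def map_adapt_means(ubm, frames, alpha=0.5):
    """MAP-adapt only the UBM means toward one utterance's frames."""
    post = ubm.predict_proba(frames)                # p(i | x_u), shape (U, K)
    occ = post.sum(axis=0) + 1e-10                  # soft counts per component
    data_means = (post.T @ frames) / occ[:, None]   # per-component data means
    return alpha * data_means + (1 - alpha) * ubm.means_

def supervector(ubm, frames, alpha=0.5):
    """x_i = [mu_0 mu_1 ... mu_k]^T: the adapted means stacked into one vector."""
    return map_adapt_means(ubm, frames, alpha).ravel()

# utterances: assumed list of (frames, speaker_id) pairs for the training data.
X_sv = np.array([supervector(ubm, frames) for frames, _ in utterances])
t = np.array([speaker_id for _, speaker_id in utterances])

# Train the supervised model on the labeled supervectors.
svm = SVC(kernel="linear")
svm.fit(X_sv, t)
```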

  • 30/35

    UBM-MAP training

    [Diagram: Training Data with Speaker N held out -> UBM Training; Speaker N -> MAP -> Supervector -> Multiclass SVM Training]

    $$x_i = [\mu_0\ \mu_1\ \ldots\ \mu_k]^T \qquad t_i = \text{Speaker ID}$$

    Repeat for all training data.

  • 31/35

    UBM-MAP Evaluation

    [Diagram: Test Data -> UBM -> MAP -> Supervector -> Multiclass SVM -> Prediction]
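
The evaluation side of the same sketch, reusing the hypothetical `supervector` helper from the training example: adapt the UBM to the test segment, build its supervector, and let the SVM predict the speaker.

```python
# test_frames: MFCC frames of one test segment (assumed (U, 16) array).
test_sv = supervector(ubm, test_frames).reshape(1, -1)
predicted_speaker = svm.predict(test_sv)[0]
```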

  • 32/35

    Alternate View

    Do we need all this? What if we just train an SVM on labeled MFCC data?

    [Diagram: Labeled Training Data -> Multiclass SVM Training; Test Data -> Multiclass SVM -> Prediction]

  • 33/35

    Results

    UBM-MAP (with some variants) is the state of the art in Speaker Recognition.

    Current state-of-the-art performance is about 97% accuracy (~2.5% EER) with a few minutes of speech.

    Direct MFCC modeling performs about half as well, ~5% EER.

  • 34/35

    Model Adaptation

    Adaptation allows GMMs to be seeded with labeled data.

    Incorporation of unlabeled data gives a more robust model.

    The adaptation process can be used to differentiate members of the population: UBM-MAP.

  • 35/35

    Next Time

    Spectral Clustering