Speaker Verification System using SVM
Jun-Won Suh Intelligent Electronic Systems
Human and Systems Engineering, Department of Electrical and Computer Engineering
Speaker Verification System using SVM
Research Progress: Jun-Won Suh
Outline – Summary of the Ph.D. Dissertation of Vincent Wan
• Speaker verification system
Extracting features
• Creating models of speakers
Generative models, discriminative models
Making generative models discriminative
• Developing speaker verification using SVMs
• My interest: improving our system
Speaker verification system
• Authenticate a person’s claimed identity
• Text dependent and independent
The system models the sound of the client’s voice, based on the physical characteristics of the client’s vocal tract.
A generic speaker verification system
• Feature extraction
• Enrolment
Creates a model of the client’s voice
• Pattern matching
• Decision theory
Extracting features
• Building models of speakers depends on frequency analysis of the speaker’s voice.
• Linear predictive coding (LPC)
LPC assumes that speech can be modelled as the output of a linear all-pole filter excited by periodic pulses (voiced speech) or random noise (unvoiced speech).
The LPC coefficients are obtained by minimizing the mean squared prediction error.
• Perceptual linear prediction (PLP)
PLP combines LPC analysis with psychophysics knowledge of the human auditory system.
Example: the human ear has a higher frequency resolution at low frequencies.
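The LPC step can be sketched in a few lines. The code below is a minimal illustration, not the feature extractor of any particular system: it assumes the autocorrelation method with the Levinson-Durbin recursion, and the function names `autocorrelation` and `lpc` are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], the (unnormalized) autocorrelation of one frame."""
    n = len(signal)
    return [sum(signal[t] * signal[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Estimate LPC coefficients a[1..order] with the Levinson-Durbin recursion.

    The predictor is s_hat[t] = sum_k a[k] * s[t - k]. The second return
    value is the residual prediction-error energy that the recursion minimizes.
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / error
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        error *= (1.0 - k_i * k_i)
    return a[1:], error
```

Calling `lpc(frame, 12)` on a windowed speech frame would return twelve predictor coefficients. The order 12 is only an example value, not one taken from the slides.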
Creating models of speakers
• Generative models
Gaussian Mixture Model (GMM), Hidden Markov Model (HMM)
Models are probability density estimators that attempt to capture all of the fluctuations and variations of the data.
• Discriminative models
Polynomial classifiers, Support Vector Machines (SVM)
Models are optimized to minimize the error on a set of training samples.
Models draw the boundary between classes and ignore the fluctuations within each class.
• Making generative models discriminative
Generative models estimate the within-class probability densities and do not directly minimize a classification error.
Discriminative models typically achieve higher performance in classification tasks.
Making generative models discriminative
• GMM-LR/SVM combination
GMM likelihood ratio, with $M$ the client model and $\Omega$ the world model:
$$\frac{P(X \mid M)}{P(X \mid \Omega)}, \qquad y = \log P(X \mid M) - \log P(X \mid \Omega)$$
Bayes decision rule:
$$S(X) = \log P(X \mid M) - \log P(X \mid \Omega)$$
Bengio proposed that, because the probability estimates are not perfect, a better version would be
$$S(X) = a \log P(X \mid M) + b \log P(X \mid \Omega) + c$$
The input to the SVM is the two-dimensional vector made up of the log-likelihoods of the client and world models.
A limitation of these approaches arises from frame-by-frame discrimination.
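As a rough illustration of the decision rule on this slide, the sketch below scores an utterance as a weighted combination of the client and world log-likelihoods. Everything specific is an assumption made for brevity: frames are one-dimensional, each model is a tuple `(weights, means, variances)`, log-likelihoods are averaged per frame, and the names `gmm_log_likelihood` and `score` are hypothetical. In the GMM-LR/SVM combination the values of `a`, `b` and `c` would come from a trained SVM rather than being set by hand.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of 1-D frames under a GMM.

    Averaging over frames is an assumption made here so that utterances
    of different lengths give comparable values.
    """
    total = 0.0
    for x in frames:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(p)
    return total / len(frames)


def score(frames, client, world, a=1.0, b=-1.0, c=0.0):
    """S(X) = a * log P(X|client) + b * log P(X|world) + c.

    The defaults (a=1, b=-1, c=0) reduce to the plain log-likelihood ratio.
    """
    ll_client = gmm_log_likelihood(frames, *client)
    ll_world = gmm_log_likelihood(frames, *world)
    return a * ll_client + b * ll_world + c
```

A claim would be accepted when `score(...)` exceeds a threshold, which can be absorbed into `c`.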
Importance of kernels
• Early SVMs using polynomial and RBF kernels
Training posed optimization problems whose computational cost was unsustainable.
Clustering algorithms used to reduce the training data also reduced accuracy.
Frame-level training inputs discard useful speaker classification information.
• SVM using score-space kernels
Variable-length utterances can be classified at the sequence level.
Classifying sequences using score-space kernels
• The score-space kernel enables SVMs to classify whole sequences.
• A variable length sequence of input vectors is mapped explicitly onto a single point in a space of fixed dimension.
• The score-space is derived from the likelihood score.
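• The likelihood ratio score-space
For a sequence $X = \{x_1, \ldots, x_N\}$, the general score-space mapping is
$$\psi_{\hat{F}}^{f}(X) = \hat{F} f(\{p_k(X \mid M_k, \theta_k)\})$$
The likelihood ratio score-space uses the log likelihood ratio as the function $f$:
$$f(\{p_k(X \mid M_k, \theta_k)\}) = \log \frac{P(X \mid M_1, \theta_1)}{P(X \mid M_2, \theta_2)}$$
so that
$$\psi(X) = \nabla_{\theta} \log \frac{P(X \mid M_1, \theta_1)}{P(X \mid M_2, \theta_2)}$$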
Computing the score-space vectors
Define the global likelihood of a sequence $X = \{x_1, \ldots, x_{N_l}\}$
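One common way to write this likelihood and its derivatives is sketched below. It rests on two assumptions that are not stated on the slide: the frames are independent given the model, and the model is a GMM with mixture weights $a_j$, means $\mu_j$ and covariances $\Sigma_j$. The symbol $\gamma_{ij}$ (the posterior of component $j$ for frame $x_i$) is introduced here for illustration, and the weight derivative ignores the sum-to-one constraint.

```latex
\begin{align}
\log P(X \mid M, \theta) &= \sum_{i=1}^{N_l} \log P(x_i \mid M, \theta),
\qquad
P(x_i \mid M, \theta) = \sum_{j} a_j \, \mathcal{N}(x_i; \mu_j, \Sigma_j) \\
\gamma_{ij} &= \frac{a_j \, \mathcal{N}(x_i; \mu_j, \Sigma_j)}{P(x_i \mid M, \theta)} \\
\frac{\partial}{\partial \mu_j} \log P(X \mid M, \theta) &= \sum_{i=1}^{N_l} \gamma_{ij} \, \Sigma_j^{-1} (x_i - \mu_j) \\
\frac{\partial}{\partial a_j} \log P(X \mid M, \theta) &= \sum_{i=1}^{N_l} \frac{\gamma_{ij}}{a_j}
\end{align}
```

Each derivative is a sum over frames, so the resulting vector has the same length whatever the value of $N_l$.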
Computing the score-space vectors (continued)
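• The fixed-length vector of the likelihood ratio kernel can be expressed as
$$\psi(X) = \nabla_{\theta} \left[ \log P(X \mid M_1, \theta_1) - \log P(X \mid M_2, \theta_2) \right]$$
• Stacking the derivatives for each model, with $\psi_k(X) = \nabla_{\theta_k} \log P(X \mid M_k, \theta_k)$, the final likelihood ratio kernel is
$$\psi(X) = \begin{bmatrix} \psi_1(X) \\ -\psi_2(X) \end{bmatrix}$$
• The dimensionality of the score-space is equal to the total number of parameters in the generative models. Because every utterance maps to a vector of this fixed size, the SVM can classify complete utterance sequences.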
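The mapping from a variable-length utterance to a fixed-length score-space vector can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the dissertation: frames are one-dimensional, each model is a tuple `(weights, means, variances)`, only the weights and means are differentiated (a full score-space would also include the variances), and all function names are hypothetical.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def log_likelihood_gradient(frames, weights, means, variances):
    """Gradient of sum_i log P(x_i | GMM) w.r.t. the weights and the means.

    The result has length 2 * n_components, whatever len(frames) is.
    """
    n = len(weights)
    d_weights = [0.0] * n
    d_means = [0.0] * n
    for x in frames:
        comp = [gaussian(x, m, v) for m, v in zip(means, variances)]
        p = sum(w * g for w, g in zip(weights, comp))
        for j in range(n):
            posterior = weights[j] * comp[j] / p
            d_weights[j] += comp[j] / p
            d_means[j] += posterior * (x - means[j]) / variances[j]
    return d_weights + d_means


def lr_score_vector(frames, client, world):
    """Stack the client gradient and the negated world gradient."""
    g1 = log_likelihood_gradient(frames, *client)
    g2 = log_likelihood_gradient(frames, *world)
    return g1 + [-g for g in g2]


def linear_kernel(u, v):
    """Inner product of two score-space vectors."""
    return sum(a * b for a, b in zip(u, v))
```

Two utterances of different lengths produce vectors of the same dimension, so `linear_kernel(lr_score_vector(x1, client, world), lr_score_vector(x2, client, world))` is well defined and could be passed to an SVM as a precomputed kernel value.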
Experimental Results on PolyVar
• The data is noisy.
• The data has many more client tests than YOHO.
Conclusion
• Add a GMM-LR/SVM model to our verification system
• Add a score-space kernel to the SVM
Need to compare the computational requirements of the Fisher and LR kernels.
References
• V. Wan, Speaker Verification using Support Vector Machines, Ph.D. dissertation, University of Sheffield, June 2003
• V. Wan, Building Sequence Kernels for Speaker Verification and Speech Recognition, University of Sheffield
• S. Bengio and J. Mariéthoz, Learning the Decision Function for Speaker Verification, IDIAP, 2001