Advances in WP2
Chania Meeting – May 2007
www.loquendo.com
Summary
• Unsupervised Adaptation
• Adaptation on Hiwire DB
Supervised vs Unsupervised Adaptation
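Supervised Adaptation
[Block diagram: the speech parameters of the adaptation set and their transcriptions are passed to the ASR for forced segmentation; the forced segmentations feed the Adaptation Module, which turns the general models into adapted models.]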
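Unsupervised Adaptation
[Block diagram: the speech parameters are first recognized by the ASR; a confidence-based selection of the recognized utterances forms the adaptation set; the ASR transcriptions are then used for forced segmentation, and the forced segmentations (ASR segmentations) feed the Adaptation Module, which turns the general models into adapted models.]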
Adaptation on HIWIRE DB
Kinds of Adaptation
Two kinds of adaptation were performed:
• Multi-Condition: the adaptation data of all the speakers and all noise conditions are pooled. The models are adapted to the channel, the noise conditions, and the aspects common to non-native speakers.
• Speaker-Dependent: adaptation and tests are performed for each speaker separately, and all results are finally averaged. The models are adapted mainly to the speaker's voice, but also to the channel and noise conditions.
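As a minimal sketch of how the two set-ups partition the adaptation data, assuming a hypothetical record layout of (speaker id, noise condition, utterance id) and hypothetical function names, none of which come from the slides:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (speaker id, noise condition, utterance id).
utterances = [
    ("spk1", "Clean", "u1"), ("spk1", "HN", "u2"),
    ("spk2", "LN", "u3"), ("spk2", "MN", "u4"),
]


def multi_condition_sets(utts):
    """One pooled adaptation set covering every speaker and noise condition."""
    return {"all": [utt for _, _, utt in utts]}


def speaker_dependent_sets(utts):
    """One adaptation set per speaker; each speaker is adapted and tested alone."""
    sets = defaultdict(list)
    for speaker, _, utt in utts:
        sets[speaker].append(utt)
    return dict(sets)


def averaged_result(per_speaker_accuracy):
    """Speaker-dependent results are reported as the mean over speakers."""
    return mean(per_speaker_accuracy.values())


print(multi_condition_sets(utterances))
print(speaker_dependent_sets(utterances))
```

In the pooled case a single model set is adapted once; in the speaker-dependent case the adaptation is repeated per speaker and only the final scores are averaged.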
Adaptation Types
Two types of adaptation were tested:
• Supervised: the transcriptions of the sentences available in HDB (the Hiwire database) are used to perform forced segmentation of the adaptation utterances, providing the labels needed by the adaptation process, which is intrinsically supervised.
• Unsupervised: the transcriptions of the sentences are not used, to simulate an in-the-field adaptation, and are approximated by the ASR outputs. Only the adaptation utterances recognized with a certain degree of confidence are used in the adaptation process, to avoid divergence due to incorrectly labeled data.
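The confidence-based selection described for the unsupervised case can be sketched as a simple filter. This is only an illustration: the `Hypothesis` record, the utterance-level confidence score in [0, 1], and the 0.8 threshold are assumptions, since the slides do not say how confidence is computed or which threshold is used.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognized adaptation utterance (hypothetical record layout)."""
    utterance_id: str
    text: str          # ASR output, used in place of the true transcription
    confidence: float  # assumed utterance-level confidence in [0, 1]


def select_adaptation_set(hypotheses, threshold=0.8):
    """Keep only utterances whose ASR output is trusted enough to act as a label."""
    return [h for h in hypotheses if h.confidence >= threshold]


hyps = [
    Hypothesis("utt1", "first recognized sentence", 0.93),
    Hypothesis("utt2", "second recognized sentence", 0.41),
    Hypothesis("utt3", "third recognized sentence", 0.80),
]
selected = select_adaptation_set(hyps)
print([h.utterance_id for h in selected])
```

A higher threshold discards more data but lowers the risk of adapting on wrong labels; a lower one does the opposite.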
Multi-Condition Adaptation
Word accuracy (%) by noise condition; E.R. = error reduction (%)

Adaptation  Type    Denoising  Clean  LN    MN    HN    AVG   E.R. %
No          -       No         90.5   49.1  27.5   5.0  43.0  -
LHN cons    Supv    No         97.5   81.1  59.2  13.4  62.8  34.7
LHN spec    Supv    No         98.2   90.9  79.6  34.8  75.9  57.7
No          -       EM         90.2   71.9  55.0  16.6  58.4  27.0
LHN cons    Supv    EM         90.6   97.1  79.3  31.1  74.5  55.3
LHN spec    Supv    EM         98.0   93.2  83.7  35.5  77.6  60.7
LHN cons    Unsupv  EM         94.3   87.2  76.8  31.5  72.5  51.7
LHN spec    Unsupv  EM         93.7   85.5  73.7  27.1  70.0  47.4
• Adaptation is done with all the speakers and noise conditions together
• It adapts to channel, noise conditions, and non-native common aspects
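The AVG and E.R. columns of the table can be checked with a few lines of arithmetic. The slides do not state the E.R. formula; the definition below (relative reduction of the word error rate, measured against the row with no adaptation and no denoising) is an assumption that happens to reproduce the published column to within rounding. The dictionary layout and function names are illustrative only.

```python
# Word accuracy (%) for Clean, LN, MN, HN, copied from the table.
# Keys are (adaptation method, adaptation type, denoising).
ACCURACY = {
    ("No", "-", "No"): (90.5, 49.1, 27.5, 5.0),
    ("LHN cons", "Supv", "No"): (97.5, 81.1, 59.2, 13.4),
    ("LHN spec", "Supv", "No"): (98.2, 90.9, 79.6, 34.8),
    ("No", "-", "EM"): (90.2, 71.9, 55.0, 16.6),
    ("LHN cons", "Supv", "EM"): (90.6, 97.1, 79.3, 31.1),
    ("LHN spec", "Supv", "EM"): (98.0, 93.2, 83.7, 35.5),
    ("LHN cons", "Unsupv", "EM"): (94.3, 87.2, 76.8, 31.5),
    ("LHN spec", "Unsupv", "EM"): (93.7, 85.5, 73.7, 27.1),
}
BASELINE = ("No", "-", "No")  # assumed reference row for E.R.


def average(accuracies):
    """Unweighted mean over the four noise conditions."""
    return sum(accuracies) / len(accuracies)


def error_reduction(avg_accuracy, baseline_avg):
    """Assumed definition: relative reduction of the error rate (100 - accuracy)."""
    baseline_error = 100.0 - baseline_avg
    return 100.0 * (avg_accuracy - baseline_avg) / baseline_error


baseline_avg = average(ACCURACY[BASELINE])
results = {}
for key, accs in ACCURACY.items():
    avg = average(accs)
    results[key] = (avg, error_reduction(avg, baseline_avg))
    print(f"{key}: AVG={avg:.1f}  E.R.={results[key][1]:.1f}")
```

Because the table values are themselves rounded to one decimal, the recomputed figures can differ from the published ones by up to about 0.1 point.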
Multi-Condition Adaptation
[Bar chart: word accuracy (%) for Clean, LN, MN, HN, and AVG. Series: No-Adapt No-Den, No-Adapt EM-Den, Supv-Adapt No-Den, Supv-Adapt EM-Den, Unsupv-Adapt EM-Den.]
Comments
• Supervised multi-condition adaptation gives a good performance improvement. It works well even without denoising, since it incorporates channel, noise, and non-native accent information into the models.
• The best average results are obtained with supervised adaptation combined with denoising (60.7% E.R.).
• As expected, unsupervised adaptation is inferior to supervised adaptation (51.7% vs. 60.7% E.R.), but it proves to be an effective technique for real-life applications, where transcriptions of the speech material are not available.
Speaker Adaptation
• Adaptation is done speaker by speaker
• Starting Models: Microphone 16 kHz
• Denoising method is SNR-dependent Ephraim-Malah spectral attenuation
[Bar chart: word accuracy (%) for Clean, LN, MN, HN, and AVG. Series: No Adapt, Adapt Supv, Adapt Unsupv.]
Comments
• Speaker adaptation is very effective on HDB. The error reduction achieved by supervised adaptation plus Ephraim-Malah noise reduction is quite large.
• The main improvements are in noisy conditions.
• As expected, unsupervised adaptation is inferior to supervised adaptation because of the errors introduced by the ASR transcriptions, but its gain is still substantial.
Workplan
• Selection of suitable benchmark databases (m6)
• Baseline set-up for the selected databases (m8)
• LIN adaptation method implemented and tested on the benchmarks (m12)
• Experimental results on Hiwire database with LIN (m18)
• Innovative NN adaptation methods and algorithms for acoustic modeling and experimental results (m21)
• Further advances on new adaptation methods (m24)
• Unsupervised Adaptation: algorithms and experimentation (m33)