Tight Coupling between ASR and MT in Speech-to-Speech Translation
Tight Coupling between ASR and MT in
Speech-to-Speech Translation
Arthur Chan
Prepared for Advanced Machine Translation Seminar
This Seminar Introduction (4 slides)
A Conceptual Model of Speech-to-Speech Translation
[Diagram: waveforms -> Speech Recognizer -> decoding result(s) -> Machine Translator -> translation -> Speech Synthesizer -> waveforms]
Motivation of Tight Coupling between ASR and MT
• The 1-best output of ASR could be wrong.
• MT could benefit from the wide range of supplementary information provided by ASR:
  • N-best list
  • Lattice
  • Sentence-/word-based confidence scores
    • E.g. word posterior probability
  • Confusion network
    • Or consensus decoding (Mangu 1999)
• Some observed that
  • MT quality depends on word error rate (WER).
Scope of this talk
[Diagram: waveforms -> Speech Recognizer -> (1-best? N-best? Lattice? Confusion network?) -> Machine Translator -> translation -> Speech Synthesizer -> waveforms]
1. Should we combine the two?
2. How tight should the coupling be?
Topics Covered Today
• The concept of coupling
  • The “tightness” of coupling between ASR and X
  • (Ringger 95)
• Interfaces between ASR and MT in loose coupling
  • What could ASR provide?
  • What could MT use?
• Very tight coupling
  • Ney’s formulae
  • AT&T approach
• Combination of features of ASR and MT
  • Direct modeling
The Concept of Coupling
Classification of Coupling of ASR and Natural Language Understanding (NLU)
• Proposed in Ringger 95, Harper 94
• 3 dimensions of ASR/NLU coupling
  • Complexity of the search algorithm
    • Simple N-gram?
  • Incrementality of the coupling
    • On-line? Left-to-right?
  • Tightness of the coupling
    • Tight? Loose? Semi-tight?
Tightness of Coupling
Tight
Semi-Tight
Loose
Summary of Coupling between ASR and NLU
Implication on ASR/MT coupling
• The classification generalizes to many systems
  • Loose coupling
    • Any system which uses a 1-best, N-best, or lattice output for 1-way module communication
  • Tight coupling
    • AT&T FST-based system
  • Semi-tight coupling
    • [Filled in a quote here]
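A minimal sketch of the loose-coupling case, assuming the recognizer hands the translator an N-best list with log scores in a single one-way pass. Each hypothesis is rescored with a translation score and the best combination is kept. The names translate_nbest and toy_translate, the weights, and the toy scores are hypothetical illustrations and are not taken from any system cited in these slides.

```python
def translate_nbest(nbest, translate, asr_weight=1.0, mt_weight=1.0):
    """Pick the hypothesis maximizing a weighted sum of ASR and MT log scores.

    nbest: list of (asr_log_score, source_words)
    translate: callable(source_words) -> (mt_log_score, target_words)
    Returns (combined_score, source_words, target_words).
    """
    best = None
    for asr_log_score, source_words in nbest:
        mt_log_score, target_words = translate(source_words)
        combined = asr_weight * asr_log_score + mt_weight * mt_log_score
        if best is None or combined > best[0]:
            best = (combined, source_words, target_words)
    return best


# Toy stand-in for an MT system (hypothetical scores and placeholder outputs).
TOY_TABLE = {
    ("recognize", "speech"): (-0.5, ["T_RECOGNIZE", "T_SPEECH"]),
    ("wreck", "beach"): (-3.0, ["T_WRECK", "T_BEACH"]),
}


def toy_translate(source_words):
    return TOY_TABLE[tuple(source_words)]


# The ASR 1-best ("wreck beach") is listed first, but it loses once the
# MT score is added to the ASR score.
nbest = [(-0.75, ["wreck", "beach"]), (-1.0, ["recognize", "speech"])]
best = translate_nbest(nbest, toy_translate)
print(best[1], best[2])
```

With only the 1-best passed along, the second hypothesis would never reach the translator; this is the motivation for richer interfaces such as N-best lists and lattices.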
Interfaces in Loose Coupling
Perspectives
• What output could an ASR generate?
  • Not all of them are used, but the unused ones could mean opportunities in the future.
• What algorithms could MT use given certain inputs?
  • On-line algorithms are a focus
Decoding of HMM-based ASR
• Decoding of HMM-based ASR
  • Searching for the best path in a huge HMM-state lattice.
• 1-best ASR result
  • The best path one could find from backtracking.
• State lattice (next page)
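The search and backtracking described above can be sketched with a textbook Viterbi pass over a tiny state lattice. This is a toy illustration only: the two states, the probabilities, and the function name viterbi are assumptions, and a real recognizer would add pruning and work over a far larger state space.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_score, best_state_path) for an observation sequence."""
    # trellis[t][s] = best log score of any path ending in state s at time t
    trellis = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(observations)):
        trellis.append({})
        backptr.append({})
        for s in states:
            # Best predecessor state for s at time t
            prev, score = max(
                ((p, trellis[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            trellis[t][s] = score + log_emit[s][observations[t]]
            backptr[t][s] = prev
    # Backtrack from the best final state to recover the 1-best path
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return trellis[-1][last], path


def logs(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: {k2: math.log(v) for k2, v in row.items()} for k, row in table.items()}


# Hypothetical two-state model
states = ["A", "B"]
log_start = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = logs({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = logs({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

score, path = viterbi(["x", "y", "y"], states, log_start, log_trans, log_emit)
print(path, math.exp(score))
```

The back-pointer table kept here is the "backtracking information" from which N-best lists and word lattices are derived.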
Things one could extract from the state lattice
• From the backtracking information:
  • N-best list
    • The N best decoding results from the state lattice
  • Lattice
    • A lattice of the decoding, but at the word level
• From the lattice:
  • N-best list
  • Confusion network
    • Or “consensus decoding” (Mangu 99)
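To illustrate how an N-best list can be read off a word lattice, here is a minimal sketch that assumes the lattice is an acyclic graph whose arcs carry a word and a log probability. The data layout and the name nbest_from_lattice are hypothetical. It uses a plain best-first search, which returns complete paths in score order only because every arc log probability is at most zero.

```python
import heapq
import math


def nbest_from_lattice(edges, start, final, n):
    """Return up to n (log_score, words) pairs, best first.

    edges maps a node to a list of (next_node, word, log_prob) arcs.
    Assumes an acyclic lattice and log_prob <= 0 on every arc.
    """
    # Heap entries: (cost so far, tie-breaker, node, words so far),
    # where cost is the negated sum of arc log probabilities.
    heap = [(0.0, 0, start, ())]
    counter = 1
    results = []
    while heap and len(results) < n:
        cost, _, node, words = heapq.heappop(heap)
        if node == final:
            # Complete paths are popped in order of increasing cost
            results.append((-cost, list(words)))
            continue
        for nxt, word, log_prob in edges.get(node, []):
            heapq.heappush(heap, (cost - log_prob, counter, nxt, words + (word,)))
            counter += 1
    return results


# Hypothetical two-slot word lattice
lattice = {
    0: [(1, "recognize", math.log(0.6)), (1, "wreck", math.log(0.4))],
    1: [(2, "speech", math.log(0.7)), (2, "beach", math.log(0.3))],
}

top3 = nbest_from_lattice(lattice, start=0, final=2, n=3)
for log_score, words in top3:
    print(round(math.exp(log_score), 2), " ".join(words))
```

Enumerating whole paths like this grows quickly with lattice size; it is meant only to show the relationship between the two representations.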
Other things one could extract from the decoder
• Begin time and end time
  • Useful in time-sensitive applications
  • E.g. multi-modal applications
• Sentence-/word-based confidence scores
  • Found to be pretty useful on many other occasions
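One simple way to obtain a word-level confidence score is to approximate the word posterior probability from an N-best list. The sketch below is a simplification under stated assumptions: hypotheses are aligned by word position rather than by time, the N-best scores are treated as unnormalized log probabilities, and the name word_posteriors is hypothetical.

```python
import math
from collections import defaultdict


def word_posteriors(nbest):
    """nbest: list of (log_score, words). Returns {(position, word): posterior}."""
    # Normalize hypothesis scores into posteriors; subtracting the maximum
    # keeps the exponentials numerically stable.
    max_log = max(score for score, _ in nbest)
    weights = [math.exp(score - max_log) for score, _ in nbest]
    total = sum(weights)
    posteriors = defaultdict(float)
    for weight, (_, words) in zip(weights, nbest):
        # Each hypothesis votes for its words with its own posterior mass
        for position, word in enumerate(words):
            posteriors[(position, word)] += weight / total
    return dict(posteriors)


# Hypothetical 3-best list
hyps = [
    (math.log(0.5), ["recognize", "speech"]),
    (math.log(0.3), ["recognize", "beach"]),
    (math.log(0.2), ["wreck", "beach"]),
]
conf = word_posteriors(hyps)
print(conf[(0, "recognize")], conf[(1, "speech")])
```

A word that appears in most high-scoring hypotheses gets a posterior near 1, which a downstream MT module could use to decide how much to trust it.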
Experimental Results
How does MT use the output? What decoding algorithms does it use?
Tight Coupling
Literature
Eric K. Ringger, “A Robust Loose Coupling for Speech Recognition and Natural Language Understanding”, Technical Report 592, Computer Science Department, University of Rochester, 1995
[The AT&T paper]