1
Machine Translation
Dai Xinyu
2006-10-27
2
Outline
Introduction
Architecture of MT
Rule-Based MT vs. Data-Driven MT
Evaluation of MT
Development of MT
MT problems in general
Some Thinking about MT from recognition
3
Introduction
Machine translation: the use of computers to translate from one language to another
• The classic acid test for natural language processing.
• Requires capabilities in both interpretation and generation.
• About $10 billion spent annually on human translation.
http://www.google.com/language_tools?hl=en
"I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need do is strip off the code in order to retrieve the information contained in the text"
4
Introduction - MT past and present
mid-1950's - 1965: Great expectations
The dark ages for MT: Academic research projects
1980's - 1990's: Successful specialized applications
1990's: Human-machine cooperative translation
1990's - now: Statistical-based MT; hybrid-strategies MT
Future prospects: ???
5
Interest in MT
Commercial interest:
U.S. has invested in MT for intelligence purposes
MT is popular on the web—it is the most used of Google’s special features
EU spends more than $1 billion on translation costs each year.
(Semi-)automated translation could lead to huge savings
6
Interest in MT
Academic interest:
One of the most challenging problems in NLP research
Requires knowledge from many NLP sub-areas, e.g., lexical semantics, parsing, morphological analysis, statistical modeling,…
Being able to establish links between two languages allows for transferring resources from one language to another
7
Related Areas to MT
Linguistics
Computer Science: AI, Compilers, Formal Semantics, …
Mathematics: Probability, Statistics, …
Informatics
Recognition
8
Architecture of MT -- (Levels of Transfer)
9
Rule-Based MT vs. Data-Driven MT
Rule-Based MT
Data-Driven MT: Example-Based MT, Statistics-Based MT
10
Rule-Based MT
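[Figure: rule-based MT. Labels: linguistics, semantics, cognitive science, artificial intelligence; write rules; rules; translation system; natural language input x; translation result]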
11
Rule-Based MT
12
Hmm, every time he sees “banco”, he either types “bank” or “bench” … but if he sees “banco de…”, he always types “bank”, never “bench”…
Man, this is so boring.
Translated documents
13
Example-Based MT
origins: Nagao (1981)
first motivation: collocations, bilingual differences of syntactic structures
basic idea:
human translators search for analogies (similar phrases) in previous translations
MT should seek matching fragments in bilingual database, extract translations
aim to have less complex dictionaries, grammars, and procedures
improved generation (using actual examples of TL sentences)
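The "seek matching fragment" step can be sketched as a nearest-example lookup. This is only a minimal illustration, not Nagao's method: the example pairs are made up, the function name `best_match` is hypothetical, and token-level similarity from Python's `difflib` stands in for whatever matching a real EBMT system would use. A real system would also adapt the retrieved translation rather than return it unchanged.

```python
from difflib import SequenceMatcher

# Made-up toy bilingual database of (source, target) example pairs.
EXAMPLES = [
    ("el banco está cerrado", "the bank is closed"),
    ("me senté en el banco", "I sat on the bench"),
    ("el banco de madera", "the wooden bench"),
]

def best_match(sentence, examples):
    """Return (score, source, target) for the stored example closest to `sentence`.

    Similarity is difflib's ratio over word tokens: 2 * matches / total tokens.
    """
    query = sentence.split()
    scored = [
        (SequenceMatcher(None, query, src.split()).ratio(), src, tgt)
        for src, tgt in examples
    ]
    return max(scored)  # highest score wins

score, source, target = best_match("el banco está abierto", EXAMPLES)
print(score, source, target)
```

The retrieved pair shares three of four words with the query, so only the last word would still need translating by some other means.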
14
EBMT still going
Bilingual corpus: collection, storage, searching and matching, …
15
Statistical MT Basics
Based on the assumption that translations exhibit observable statistical regularities
origins: Warren Weaver (1949), Shannon’s information theory
core process is the probabilistic ‘translation model’, taking SL words or phrases as input and producing TL words or phrases as output
succeeding stage involves a probabilistic ‘language model’, which synthesizes TL words as ‘meaningful’ TL sentences
16
Statistical MT
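[Figure: statistical MT. Labels: natural language input x_1, x_2, …, x_n; learning system; statistical learning; build model; probability model; prediction system; natural language input x_{n+1}; prediction p̂(x_{n+1})]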
17
Statistical MT schema
18
Statistical MT processes
Bilingual corpora: original and translation
little or no linguistic ‘knowledge’, based on word co-occurrences in SL and TL texts (of a corpus), relative positions of words within sentences, length of sentences
Alignment: sentences aligned statistically (according to sentence length and position)
Decoding: compute probability that a TL string is the translation of a SL string (‘translation model’), based on:
frequency of co-occurrence in aligned texts of corpus
position of SL words in SL string
Adjustment: compute probability that a TL string is a valid TL sentence (based on a ‘language model’ of allowable bigrams and trigrams)
search for TL string that maximizes these probabilities:
$\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\, P(e)$
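The search in the last step follows from Bayes' rule: P(e | f) = P(f | e) P(e) / P(f), and P(f) is the same for every candidate e, so it drops out of the argmax. A minimal sketch of that selection is below. It works on single words rather than strings, the function name `decode` is hypothetical, and all probabilities are made-up numbers chosen only to show the language model and translation model pulling in different directions.

```python
def decode(f, candidates, tm, lm):
    """Return the candidate e that maximizes P(f | e) * P(e).

    tm maps (f, e) to P(f | e); lm maps e to P(e). Missing entries count as 0.
    """
    return max(candidates, key=lambda e: tm.get((f, e), 0.0) * lm.get(e, 0.0))

# Made-up toy probabilities, for illustration only.
tm = {("banco", "bank"): 0.5, ("banco", "bench"): 0.9}  # translation model P(f | e)
lm = {"bank": 0.02, "bench": 0.005}                     # language model P(e)

print(decode("banco", ["bank", "bench"], tm, lm))
```

Here "bench" has the higher translation-model score, but "bank" wins once the language model is multiplied in (0.5 × 0.02 = 0.01 against 0.9 × 0.005 = 0.0045).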
19
Language Modeling
Determines the probability of some English sequence $e_1^l$ of length $l$
$P(e)$ is normally approximated as:
$P(e_1^l) \approx P(e_1)\, P(e_2 \mid e_1) \prod_{i=3}^{l} P(e_i \mid e_{i-m}^{i-1})$
where m is size of the context, i.e. number of previous words that are considered,
m=1, bi-gram language model
m=2, tri-gram language model
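For the bi-gram case (m=1), the approximation can be sketched with plain relative-frequency counts. This is a minimal illustration under stated assumptions: the two training sentences are made up, the function names are hypothetical, there is no smoothing, and sentence boundary markers `<s>` and `</s>` are added so that the first word is conditioned on a start symbol instead of using a separate P(e_1) term.

```python
from collections import Counter

def train_bigram(sentences):
    """Count context words and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams

def sentence_prob(sentence, contexts, bigrams):
    """Product of P(cur | prev) = count(prev, cur) / count(prev); 0.0 if unseen."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return 0.0
        p *= bigrams[(prev, cur)] / contexts[prev]
    return p

# Made-up toy training corpus.
contexts, bigrams = train_bigram(["the cat sat", "the dog sat"])
print(sentence_prob("the cat sat", contexts, bigrams))
print(sentence_prob("cat the sat", contexts, bigrams))
```

Without smoothing, any sentence containing an unseen bigram gets probability zero, which is why practical language models redistribute some probability mass to unseen events.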
20
Translation Modeling
Determines the probability that the foreign word f is a translation of the English word e
How to compute P(f | e) from a parallel corpus?
Statistical approaches rely on the co-occurrence of e and f in the parallel data: If e and f tend to co-occur in parallel sentence pairs, they are likely to be translations of one another
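One well-known way to turn co-occurrence into P(f | e) is expectation-maximization in the style of IBM Model 1 (Brown et al., 1993, listed under Further Reading). The sketch below is a simplified illustration, not the full model: it omits the NULL word, uses a made-up three-sentence German-English corpus, and the function name `train_model1` is hypothetical.

```python
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(f | e) from (foreign_tokens, english_tokens) pairs."""
    f_vocab = {f for fs, _ in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform start
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of f aligned to e
        total = defaultdict(float)   # expected count of e aligned to anything
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm   # E-step: share of f credited to e
                    count[(f, e)] += frac
                    total[e] += frac
        for (f, e), c in count.items():       # M-step: renormalize per English word
            t[(f, e)] = c / total[e]
    return dict(t)  # only word pairs that co-occurred at least once

# Made-up toy parallel corpus (foreign side first).
pairs = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = train_model1(pairs)
print(t[("das", "the")], t[("Haus", "the")])
```

Because "das" appears with "the" in two sentence pairs while "Haus" appears with it in only one, the estimate for P(das | the) rises above P(Haus | the) after the first iteration and keeps growing.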
21
SMT issues
ignores previous MT research (new start, new ‘paradigm’)
basically ‘direct’ approach: replaces SL word by most probable TL word, reorders TL words
decoding is effectively kind of ‘back translation’
originally wholly word-based (IBM ‘Candide’ 1988); now predominantly phrase-based (i.e. alignment of word groups); some research on syntax-based
mathematically simple, but huge amount of training (large databases)
problems for SMT:
translation is not just selecting the most frequent ‘equivalent’ (wider context)
no quality control of corpora
lack of monolingual data for some languages
insufficient bilingual data (Internet as resource)
lack of structure information of language
merit of SMT: evaluation as integral process of system development
22
Rule-Based MT & SMT
SMT black box: no way of finding how it works in particular cases, why it succeeds sometimes and not others
RBMT: rules and procedures can be examined
RBMT and SMT are apparent polar opposites, but gradually ‘rules’ incorporated in SMT models
first, morphology (even in versions of first IBM model)
then, ‘phrases’ (with some similarity to linguistic phrases)
now also, syntactic parsing
23
Rule-Based MT & SMT
Comparison from the following perspectives:
Theory background
Knowledge expression
Knowledge discovery
Robustness
Extension
Development cycle
24
Evaluation of MT
Manual: precision / fluency / integrality; 信 达 雅 (faithfulness, expressiveness, elegance)
Automatic evaluation:
BLEU: percentage of word sequences (n-grams) occurring in reference texts
NIST
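The n-gram percentage behind BLEU can be sketched as a clipped ("modified") n-gram precision. This is only the core ingredient: full BLEU also combines several n-gram orders with a geometric mean and applies a brevity penalty, both omitted here. The function names are hypothetical and the sentences are toy examples.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Share of candidate n-grams found in the references, with clipping.

    Each candidate n-gram is credited at most as many times as it occurs
    in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], c)
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

reference = "the cat is on the mat".split()
print(modified_precision("the cat sat on the mat".split(), [reference], 2))
print(modified_precision("the the the the the the the".split(), [reference], 1))
```

Clipping is what keeps the degenerate second candidate from scoring perfectly: "the" occurs only twice in the reference, so only two of its seven tokens are credited.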
25
Development of MT - MT System
26
MT Development - Research
[Figure: MT approaches plotted by Knowledge Acquisition Strategy (from all manual to fully automated) against Knowledge Representation Strategy (from shallow/simple to deep/complex). Labels in the figure: hand-built by experts; hand-built by non-experts; learn from annotated data; learn from un-annotated data; word-based only; electronic dictionaries; phrase tables; syntactic constituent structure; semantic analysis; interlingua; original direct approach; typical transfer system; classic interlingual system; example-based MT; original statistical MT; "New Research Goes Here!"]
27
MT problems in general
Characteristics of language: ambiguous, dynamic, flexible
Knowledge: how to express, how to discover, how to use
28
Some Thinking about MT from recognition
Human Cerebra
Memory
Progress - Learning
Model
Pattern
Translation by human…
Translation by machine…
29
Further Reading
Arturo Trujillo, Translation Engines: Techniques for Machine Translation, Springer-Verlag London Limited, 1999
P.F. Brown, et al., A Statistical Approach to Machine Translation, Computational Linguistics, 1990, 16(2)
P.F. Brown, et al., The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, 1993, 19(2)
Bonnie J. Dorr, et al., Survey of Current Paradigms in Machine Translation
Makoto Nagao, A Framework of a Mechanical Translation between Japanese and English by Analogy Principle, In A. Elithorn and R. Banerji (Eds.), Artificial and Human Intelligence. NATO Publications, 1984
W.J. Hutchins, Machine Translation: Past, Present, Future. Chichester: Ellis Horwood, 1986
Daniel Jurafsky & James H. Martin, Speech and Language Processing, Prentice-Hall, 2000
Christopher D. Manning & Hinrich Schütze, Foundations of Statistical Natural Language Processing, Massachusetts Institute of Technology, 1999
James Allen, Natural Language Understanding, The Benjamin/Cummings Publishing Company, Inc., 1987