Machine Translation at DARPA
![Page 1: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/1.jpg)
Approved for Public Release, Distribution Unlimited
Machine Translation at DARPA
Joseph Olive, Program Manager
![Page 2: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/2.jpg)
Agenda
●Pre-GALE Programs and Studies
●DARPA and the Language Community
●GALE Plans
●GALE MT Evaluation
●GALE Accomplishments
●Future Research
![Page 3: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/3.jpg)
Language Research at DARPA
●Four Decades of Research
●Continuous progress
● Limited vocabulary, single talker
● Speaker-independent speech recognition
● Large vocabulary
● Machine translation
● Natural language processing
●TIDES and EARS
● Great Accomplishments
● Need for a New Program
![Page 4: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/4.jpg)
GALE Program Goal
Enable Automated Processes & English-Speaking Soldiers and Commanders to Absorb & Analyze All Incoming Information in a Timely Manner
Genres
• Newswire
• Broadcast news
• Newsgroups
• Talk shows
• ...
Languages
• Arabic
• Chinese
• ...
Topics
• Unbounded
![Page 5: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/5.jpg)
Planning for GALE
●The community offered:
● More Data
● Evaluations
  ● Word Error Rate - WER
  ● Bilingual Evaluation Understudy - BLEU
●DARPA Questions:
● What are the applications for the research?
● When is a technology good enough?
● What is new?
● How will progress be measured?
![Page 6: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/6.jpg)
Pre-GALE Studies
●Main question – how good is good enough?
●New MT study
● Interpolation between human and machine translation
● Analysts as subjects
● The birth of Human-Targeted Translation Error Rate - HTER
●HTER is the GALE MT metric
![Page 7: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/7.jpg)
HTER Translation Evaluation
[Diagram: HTER evaluation workflow. Elements: Foreign Language Text & Speech; GALE Machine Translation Engine; GALE Machine Translation; Translators, Evaluators, and Adjudicator; Gold Standard Translation; Human Editors who conduct the comparison. Questions posed during comparison: Which is right? Can it be ambiguous? Is it an idiom?]

$$\text{Accuracy} = 1 - \frac{\text{No. of errors}}{\text{No. of words}}$$
![Page 8: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/8.jpg)
HTER Editing Example
Machine translation: The statement said that the brothers in the military wing to regulate Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.

Corrected machine translation: The statement said that the your brothers in the military wing to regulate Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.
1 error

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.
5 errors

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Qaeda Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.
6 errors

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Qaeda Jihad organization base in the country Mesopotamia had carried out the assassination of one of the criminal tyrants in the city of penalty Baquba.
11 errors in 33 words (67% accuracy)

Legend: Deletion, Insertion

Human-Translated Reference: The statement said that “your brothers in the military wing of the Al-Qaeda Jihad Organization in Mesopotamia carried out an assassination of one of the criminal tyrants in the city of Baquba.”
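
The arithmetic on this slide (11 errors in 33 words, 67% accuracy) can be reproduced with a minimal sketch. The function names `hter_accuracy` and `word_edit_count` are hypothetical and not part of any GALE tooling. The edit counter is a plain word-level Levenshtein distance: it is a simplification that ignores phrase shifts, and the slide's own counting convention may differ, so it should not be expected to reproduce the editors' count exactly.

```python
def hter_accuracy(num_errors: int, num_words: int) -> float:
    """Accuracy = 1 - (number of errors / number of words), as on the HTER slide."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return 1.0 - num_errors / num_words


def word_edit_count(hypothesis: str, target: str) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions).

    Assumption: whitespace tokenization and no shift operation, so this only
    approximates the edits a human post-editor would be charged for.
    """
    hyp, tgt = hypothesis.split(), target.split()
    prev = list(range(len(tgt) + 1))  # distances from the empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, t in enumerate(tgt, start=1):
            cost = 0 if h == t else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# The slide's final tally: 11 errors in a 33-word machine translation.
print(f"{hter_accuracy(11, 33):.0%}")  # prints 67%
```

In practice the error count would come from comparing the machine output with its human-corrected version; here it is taken directly from the slide.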
![Page 9: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/9.jpg)
New Technologies Implemented in GALE
●Topic-Dependent Language Modeling
●Morphology
●Extraction
●Syntax Analysis
●Hierarchical Classes
●Long Distance Language Models
●Semantic Analysis
●Predicate Argument Analysis
![Page 10: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/10.jpg)
Arabic Translation Targets – Structured Language
[Figure: Arabic translation targets by phase (Baseline, Φ1 to Φ5), with separate series for translation from text and translation from speech, and markers for completed phases and pre-GALE performance. Axes: Accuracy (%) and % of documents exceeding accuracy targets, each scaled 40 to 90. Targets are written as % accuracy / % of documents. Value labels on the chart: 35, 55, 65/80, 75/80, 75/90, 80/90, 85/85, 85/90, 90/85, 90/90, 90/95.]

Targets include accuracy and consistency
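
A target such as 90/95 pairs an accuracy level with the share of documents that must reach it. A minimal sketch of that check follows. The function name `meets_target` and the sample scores are hypothetical, and treating "exceeding" as "greater than or equal to" is an assumption, since the slide does not define the boundary case.

```python
def meets_target(doc_accuracies, accuracy_target, doc_fraction_target):
    """Return True if enough documents reach the accuracy target.

    doc_accuracies: per-document accuracy values in percent.
    accuracy_target: required accuracy in percent (e.g. 90).
    doc_fraction_target: required share of documents in percent (e.g. 95).
    Assumption: a document exactly at the target counts as passing.
    """
    if not doc_accuracies:
        raise ValueError("need at least one document")
    passing = sum(1 for a in doc_accuracies if a >= accuracy_target)
    return 100.0 * passing / len(doc_accuracies) >= doc_fraction_target


# Hypothetical per-document accuracies, for illustration only.
scores = [96, 92, 91, 90, 93, 88, 95, 97, 94, 90]

print(meets_target(scores, 90, 90))  # True: 9 of 10 documents reach 90%
print(meets_target(scores, 90, 95))  # False: 90% of documents is below 95%
```

Checking the share of documents as well as the accuracy level is what makes the target sensitive to consistency: a high average cannot hide a tail of badly translated documents.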
![Page 11: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/11.jpg)
Arabic Translation Results – Newswire
[Figure: Arabic newswire translation accuracy. Axes: % Accuracy (60 to 100) against % of documents (0 to 99). Series: Ph 4 and Target. Label: Phase 4, 90.0.]
![Page 12: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/12.jpg)
Arabic progress
[Figure: Arabic Machine Translation. % error (0.0 to 40.0) by phase (P1 to P4) for four genres: NW (Formal Text), WB (Semi-Formal Text), BN (Formal Audio), BC (Semi-Formal Audio).]
![Page 13: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/13.jpg)
Chinese Progress
[Figure: Chinese machine translation progress. Panels: Formal Text, Semi-Formal Text, Formal Audio, Semi-Formal Audio.]
![Page 14: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/14.jpg)
Human vs. Machine
GALE is as good as a single human in Arabic

[Figure: four panels plotting Percent Accuracy (70 to 105) against Percent of Documents, each with the series pass 1, pass 2, GALE P4, and P4 target. Panels: Human vs. Machine Arabic Formal Text; Human vs. Machine Arabic Semi-Formal Text; Human vs. Machine Chinese Formal Text; Human vs. Machine Chinese Semi-Formal Text.]
![Page 15: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/15.jpg)
Improving Translation of Chinese Speech
●Chinese transcription error rates are extremely low, but increase along with perplexity
●Improvement in translation of Chinese speech will require work in lowering perplexity
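| Evaluation Set | Formal Audio PPL | Formal Audio CER | Semi-Formal Audio PPL | Semi-Formal Audio CER | Overall PPL | Overall CER |
|---|---|---|---|---|---|---|
| Phase 2 | 21 | 2.7 | 33 | 14.8 | 26 | 8.5 |
| Phase 3 | 30 | 4.6 | 33 | 18.7 | 31 | 11.7 |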
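
The slide does not say how the PPL column was computed. As a point of reference, the standard definition of perplexity is the exponential of the average negative log-probability a language model assigns to each token. The sketch below uses that definition; the function name `perplexity` and the sample probabilities are hypothetical and are not GALE data.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over N token probabilities."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


# A model that is equally unsure among 4 choices at every step has a
# perplexity of about 4; more confident predictions give a lower value.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
confident = perplexity([0.9, 0.8, 0.95, 0.85])
print(uniform > confident)  # True
```

Lower perplexity means the language model finds the test data more predictable, which is the sense in which the slide links lower perplexity to lower transcription error.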
![Page 16: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/16.jpg)
Phoneme Transcription Experiment, Human Vs. Machine
●Overall Goal
● Assess the bounds of human phonetic recognition and compare with machines
●Previous Work
● Human recognition tested on artificial stimuli
● Results show that human accuracy is extremely high
● Artificial stimuli lack the complexity of natural speech
●The Problem
● Isolate phonetic recognition from language biases
● Human phonetic discrimination abilities are intimately tied with language, phonotactic and prosodic processing, and lexical and semantic familiarity
●Solution
● Use natural speech for stimuli
● Use transcribers who lack prosodic, phonotactic, lexical, and semantic information, but share a phoneme space
![Page 17: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/17.jpg)
Phoneme Transcription Experiment, Human Vs. Machine
●Japanese speakers – Italian transcribers
●15 Human Subjects
●420 phonemes per subject

| System | Subst | Del | Ins | PER |
|---|---|---|---|---|
| ASR HMM-CI | 19.6 | 7.9 | 7.4 | 34.9 |
| Human, Average | 15.3 | 8.6 | 5.9 | 29.9 |
| Human, Best | 9.0 | 4.0 | 4.3 | 17.2 |
| Human, Worst | 16.6 | 10.7 | 10.2 | 37.5 |

●The difference between human and machine performance was around 10%
●Result indicates that progress in STT will require improved language models
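
The PER column in the phoneme experiment can be cross-checked against the other three columns. The sketch below assumes PER (phoneme error rate) is the sum of the substitution, deletion, and insertion rates, which is the usual convention but is not stated on the slide. The helper name `per_from_components` is hypothetical; the figures are copied from the slide's table.

```python
# (substitutions, deletions, insertions, reported PER), all in percent.
rows = {
    "ASR HMM-CI": (19.6, 7.9, 7.4, 34.9),
    "Human average": (15.3, 8.6, 5.9, 29.9),
    "Human best": (9.0, 4.0, 4.3, 17.2),
    "Human worst": (16.6, 10.7, 10.2, 37.5),
}


def per_from_components(subst, deletions, insertions):
    """Sum the three error rates, rounded to one decimal like the table."""
    return round(subst + deletions + insertions, 1)


for name, (s, d, i, reported) in rows.items():
    total = per_from_components(s, d, i)
    print(f"{name}: components sum to {total}, reported PER {reported}")
```

The ASR and worst-human rows match the reported PER exactly, and the other two rows differ by 0.1, which is consistent with the components having been rounded separately.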
![Page 18: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/18.jpg)
Systems in Use Today
FOUO
BBN Broadcast Monitoring System & Web Monitoring System
Real-time translation of Arabic, Chinese, Spanish*, or Farsi* broadcasts and web text into English
*Farsi and Spanish were funded by outside sources.

IBM Translingual Automated Language Exploitation System

“The Baghdad system was under extensive operation and the users were very pleased with its capability”
– LTC John Venhaus, commanding officer for Joint PSYOP Group at CENTCOM (Oct. 2007)

“We are excited about the upgrades and think the program is a great asset to the Global War on Terror and beyond.”
– SFC Douglas Wilderman, 10th Special Forces Group (A) (Nov. 2008)
![Page 19: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/19.jpg)
Broadcast Monitoring System* Arabic example
1. Real-time streaming video (~5 min delay)
2. Automatic transcription of Arabic speech
3. Automatic translation of Arabic transcript

Sample Fielded Arabic Translation
Although there are no official sources, and accurate numbers of dead, many believe that the number this year is the largest since the American invasion of Iraq and the fall of Saddam Hussein’s regime two thousand three.
The estimated number of civilians killed daily in Iraq at least one hundred and twenty persons as well as the wounded.
![Page 20: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/20.jpg)
DARPA Present Status
Success
● GALE – Groundbreaking Improvements in machine translation of Arabic and Chinese text and speech, in some cases approaching human performance
● TRANSTAC – New state of the art in two way multi-lingual communication by speech for tactical use
● Deployment – GALE and TRANSTAC technologies have been integrated into operational systems and transitioned to users.
![Page 21: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/21.jpg)
DARPA Present Status (Continued)
Limitations
● Lack of Flexibility – No ability to communicate or monitor informal language
  ● Conversations, chat, messaging, etc. are mostly informal
  ● Technology does not exist to cope with informal language models
● Lack of Reliability – Error propagation in multiple dialogue turns
  ● To perform multi-turn conversations and chat we need extremely high translation accuracies
  ● Need human-machine dialogue to clarify and disambiguate input to reduce probability of error
● Lack of Robustness – No capability to translate speech signals of less than 25 dB SNR
  ● Conversations and monitored conversations often do not have a clean signal
  ● Transcriptions of degraded signals are unusable
● Lack of Generality – Costly and time-consuming methods to develop a new language
  ● Cannot duplicate the GALE effort for each new language and dialect
    ● Huge parallel corpora – $60M-$160M/language
  ● Parallel corpora are insufficient
    ● e.g. Chinese corpora already consist of 200 million words
    ● Requires expensive and time-consuming annotations
![Page 22: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/22.jpg)
Future Language Research Areas
● One-way translation – Monitoring
  ● Improvement of translation quality in languages very different from English (e.g. Chinese)
  ● Inclusion of informal genres – conversation, e-mail, web chat, messaging
  ● Extension into Arabic dialects – Modern Standard Arabic is seldom used in informal genres
  ● Fast acquisition of new language capabilities
  ● Robustness to noise
● Two-way translation – Communication
  ● Human-machine dialogue
  ● Human-human and human-computer verbal and text interaction
● Information retrieval – linguistically enabled search
  ● Accurate retrieval of relevant, non-redundant information
  ● Natural language query capability
● Language Understanding
  ● Grounded language comprehension through experiential learning of objects, actions, and consequences

These four thrusts share many underlying technologies
![Page 23: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/23.jpg)
Future Algorithm Research
●Rugged Syntactic, Semantic Role Labeling, and Predicate-Argument Analysis
● Unconstrained topics and genres
● Use semantic equivalences
● Analysis of incomplete sentences and/or analysis of inconclusive acoustic output
● Projection of syntax and SRL from known to unknown languages
●Powerful Language Models
● Modeling non-adjacent words
● Utilizing syntactic and semantic information
● Using wild cards for incomplete sentences and/or inconclusive acoustic output
●Analysis and Translation of Longer Input
● Discourse threading
● Prosodic cues
● Coherency of topics
● Co-reference resolution
● Content analysis
![Page 24: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/24.jpg)
Future Algorithm Research (Continued)
●Increasing reliability of two-way communication and natural language query
● Human-machine dialogue for clarification and disambiguation
● Automatic error detection
● Ambiguity resolution
● Language generation
● Multimodal input
●Semantic Role Labeling and Dependency Parsing Analysis in Both Source and Target Languages
●Dialects
● Translation from one dialect to another (e.g. Modern Standard Arabic to dialectal Arabic)
● Dialect detection and identification
●New Techniques in Automatic Evaluation of Translation Quality as a Target for Optimization and Automatic Quality Assessment
●Language Understanding
![Page 25: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/25.jpg)
www.darpa.mil
![Page 26: Machine Translation at DARPA](https://reader035.fdocuments.us/reader035/viewer/2022062323/568161b3550346895dd17d13/html5/thumbnails/26.jpg)
Abstract: Defense Advanced Research Projects Agency (DARPA) Program Manager Joseph Olive will discuss the Chinese and Arabic machine translation work being carried out under DARPA's Global Autonomous Language Exploitation Program. Topics will include preparation for the program, the evaluation paradigm, the current status, and potential future research directions.