Speech-to-Speech Translation: A New Direction for the Speech Industry
SpeechTEK West February 21-23, 2007
Mark Seligman, CEO
Converser for Healthcare is the world’s first commercially available speech-to-speech translation system for wide-ranging conversations. (Input via handwriting, touchscreen, and keyboard is also enabled.)
Converser for Healthcare is an affordable, reliable, portable translation system which can improve communication 24/7 between healthcare workers and patients with limited English proficiency.
Overview
• Automatic Spoken Language Translation (SLT)
– an age-old dream
• Practical SLT systems are now coming into use
– … but users must cooperate and compromise
• History: three classes of SLT systems
– categorized by degree of user cooperation and linguistic or topical coverage
• Demo
• Market
• Commercial and research activity
Star Trek? Not!
• The goal: speak as usual
– freely shift topics
– full range of vocabulary, idioms, structures
– spontaneous language: fragments, false starts, hesitations
– mumble
– converse in noisy environments
– ignore the translation program
• For now: some cooperation, compromise
The scientific problem: component integration
• Component technologies: speech recognition (SR), machine translation (MT), text-to-speech (TTS)
– imperfect, hard to integrate
• Each is usable, but combination may fall below usefulness threshold
– error rates combine, compound
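A rough illustration of how error rates compound: if each stage succeeds independently with some probability, the end-to-end success rate is the product of the stage rates. The per-stage figures below are hypothetical, not measurements from any system, and the independence assumption is a simplification.

```python
from math import prod

def pipeline_success(stage_rates):
    """End-to-end success rate, assuming the stages fail independently."""
    return prod(stage_rates)

# Hypothetical per-stage success rates (illustrative only).
rates = {"SR": 0.90, "MT": 0.85, "TTS": 0.98}

overall = pipeline_success(rates.values())
print(f"End-to-end success: {overall:.1%}")
```

With these assumed figures, three stages that each look usable alone combine to roughly 75% end to end, which is how a chain of acceptable components can drop below a usefulness threshold.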
Class One
• Class One: voice-driven phrase book
– linguistic coverage: narrow
– topical coverage: narrow
– cooperation required: low
• Fixed expressions or templates only
– “I’d like a bottle of [beer, wine, soda], please.”
– “I’d like a bottle of [BEVERAGE], please.”
• Advantages for user
– no need to carry a book -- use telephone
– selection of phrase by voice rather than finger
– translation output pronounced by native speaker
• Technology
– Speech recognition: IVR
– MT: flat lookup, template or example-based
– Engineering exercise: low risk
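The template lookup described for Class One can be sketched as below. This is a minimal illustration, not how any listed product works: the Spanish target phrases, the names `PREFIX`, `SUFFIX`, `BEVERAGES`, and `translate`, and the choice of Spanish are all assumptions, and the input is taken to be text already produced by the speech recognizer.

```python
# Hypothetical phrase-book entry: one English template with a single slot.
PREFIX = "i'd like a bottle of "
SUFFIX = ", please."
TARGET_TEMPLATE = "Quisiera una botella de {BEVERAGE}, por favor."

# Hypothetical slot fillers and their translations.
BEVERAGES = {"beer": "cerveza", "wine": "vino", "soda": "refresco"}

def translate(utterance):
    """Return the target phrase if the utterance fits the template, else None."""
    text = utterance.strip().lower()
    if not (text.startswith(PREFIX) and text.endswith(SUFFIX)):
        return None  # outside the phrase book: no translation offered
    filler = text[len(PREFIX):len(text) - len(SUFFIX)]
    target_word = BEVERAGES.get(filler)
    if target_word is None:
        return None  # slot value not in the fixed vocabulary
    return TARGET_TEMPLATE.format(BEVERAGE=target_word)

print(translate("I'd like a bottle of wine, please."))
print(translate("Where is the station?"))
```

Anything that does not match the fixed pattern is simply rejected, which is why coverage is narrow and engineering risk is low.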
Phraselator by VoxTec
Class Two
• Class Two: robust speech translation in narrow domains
– linguistic coverage: broad
– topical coverage: narrow
– cooperation required: medium
• Examples
– Uh, could I reserve a double room for next Tuesday, please?
– I need to, um, I need a double room please. That’s for next Tuesday.
– Hello, I’m calling about reserving a room. I’d be arriving next week on Tuesday.
• Advantages
– Lots of experience
– Can optimize SR, MT: special grammars (patterns)
– Interlingua possible for MT
• Challenges
– Robust parsing still imperfect, so MT input is dirty
– Some user frustration inevitable, but balanced by freedom
– Risk: medium
Class two: Worldwide Research
• CMU/Univ Karlsruhe (USA/Germany)
• ATR (Japan)
• IRST (Italy)
• ETRI (Korea)
• GETA-CLIPS (France)
• CAS-NLPR (China)
• IBM (USA)
Class two: Research
SYSTEM | DEVELOPER | TIME | DOMAINS | LANGUAGES | MT | VOCAB
Head Transducers | AT&T Labs (USA) | 1996 | Travel information access | English-Chinese / English-Spanish | Statistical | 1200/1300
JANUS-III | CMU (USA) | 1997- | Hotel reservation, flight/train ticket booking, etc. | English-German, Japanese, Spanish, etc. | Multi-engine | open
ATR-MATRIX | ATR-SLT (Japan) | 1998-2001 | Hotel reservation | Japanese-English, German, etc. | Pattern-based | 2000
Verbmobil | Univ. of Karlsruhe, DFKI etc. (Ger.) | 1993-2000 | Meeting appointment | German, English, Japanese | Multi-engine | 10000/2500
Lodestar | CAS-NLPR (China) | 1999 | Hotel reservation, travel information access | Chinese-Japanese, English | Multi-engine | 2000
Class Three
• Class Three: highly interactive speech translation with broad linguistic and topical coverage
– linguistic coverage: broad
– topical coverage: broad
– cooperation required: extensive
• User achieves broad coverage by supervising
• SR: need dictation for broad coverage
• MT: need broad coverage, good quality
– Must be modifiable to enable interactive correction
In the beginning …
French: Qu’est-ce que vous étudiez? (What do you study?)
English: Computer science. (L’informatique.)
French: Qu'est-ce que vous faites plus tard? (What are you doing later?)
English: I'm going skiing. (Je vais faire du ski.)
French: Vous n'avez pas besoin de travailler? (You don't need to work?)
English: I'll take my computer with me. (Je prendrai mon ordinateur avec moi.)
French: Où est-ce que vous mettrez l'ordinateur pendant que vous skiez? (Where will you put the computer while you ski?)
English: In my pocket. (Dans ma poche.)
Market: U.S. Healthcare
• 200,000 potential customers
• Healthcare venues
• 6,003 hospitals (2003 www.USNews.com)
• 836,156 physicians (2001 www.ama.com)
• 15-20 minutes/meeting
• $45-$150/hour for human interpreter
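As a back-of-the-envelope reading of the last two figures, the cost of a human interpreter per meeting can be estimated from meeting length and hourly rate. Pairing the shortest meeting with the lowest rate and the longest with the highest is an assumption made here for illustration; the slide does not say how the ranges combine, and real billing may involve minimum increments.

```python
def cost_per_meeting(minutes, hourly_rate):
    """Interpreter cost in dollars for one meeting, billed pro rata."""
    return minutes * hourly_rate / 60

# Ranges taken from the slide: 15-20 minutes per meeting, $45-$150 per hour.
low = cost_per_meeting(15, 45)
high = cost_per_meeting(20, 150)
print(f"${low:.2f} to ${high:.2f} per meeting")
```

Under these assumptions a single interpreted meeting costs between roughly $11 and $50.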
Value Proposition
Operational
– significant ROI
– 24/7 access to interpreting
– reduced patient waiting time
– more efficient use of employees (keep staff in their positions)
– patient SAFETY (real and perceived)
– reduced liability: bilingual transcripts of interaction with patients
– compliance
Communication benefits
– privacy
– more verifiability, consistency than with human interpreter
– informed consent
Worldwide Market
• IDC
– Cross-language software: $67 billion (2000) to $237 billion (2005)
– Worldwide e-business globalization support: > $540 billion
– Multilingual communications, collaboration tools: $5 billion (by 2008)
• Allied Business Intelligence, Inc.
– Worldwide human translation: $5.7 billion (in 2006)
• Global Reach
– 70%+ of online population not native English speakers
Markets
• Defense and Security
– services, intelligence, allies
– law enforcement
• Travel and Tourism
• Language Instruction/Education
• Government Service
– immigration
– welfare, food stamps, etc.
• Business
– B2C: customer service
– B2B: multinational firms, global partners/operations
• Consumer
– online affinity/personal portals (e.g. online dating)
Some Current Research/Commercial Activity
• Spoken Translation, Inc. (Converser)
• IBM (Mastor)
• Sehda (S-Minds)
• SpeechGear (Compadre Interpreter)
• VoxTec (Phraselator)
• Sony/Sharp/NEC (tourist)
• Ectaco (Dictionary +)
• MIT (flight domain)
• CMU (Arabic for military)
• BBN (Arabic for military)
Thank you!
To view demo visit:
www.ConverserforHealthcare.com