Subtitling & translation of weblectures by Carlos Turró Ribalta
Rec: All Lecture Capture Workshop, 11 December 2013
Carlos Turró
Universitat Politècnica de València
EC FP7 ICT project #287755
Motivation
12 Nov 2013 2
• Video lecture repositories and MOOCs
• Thousands of hours of video lectures available
• Hundreds of hours of video lectures recorded every week
• Most video lectures only available in their original language
• No subtitles
Motivation
• Transcriptions and translations are needed
• Accessibility for people with disabilities
• Accessibility for speakers of different languages
• Search and analysis functions
• Automated topic finding
• …
• How do we get there?
The transLectures approach
1. Automatic Speech Recognition (ASR) and Machine Translation (MT)
• Adaptation: taking advantage of the characteristics of video lecture repositories
• High-quality automatic transcriptions and translations
2. Interactive postediting: intelligent interaction for reduced effort
Goals
• Development of an engine for adaptation & intelligent interaction
• Implementation
• Case studies: VideoLectures.NET & Polimedia
• Real-life evaluation
• Integration into Opencast Matterhorn
http://opencast.org/matterhorn/
The transLectures partners

| No. | Name | Country |
|---|---|---|
| 1 | Universitat Politècnica de València | Spain |
| 2 | Xerox SAS | France |
| 3 | Institut Jožef Stefan | Slovenia |
| 3+ | Knowledge for All Foundation | UK |
| 4 | RWTH Aachen University | Germany |
| 5 | EML – European Media Laboratory | Germany |
| 6 | DDS – Deluxe Digital Studios | UK |

36 Months
Now we are in M25
Statistical transcription (and translation)
[Diagram: Sound → ASR Engine → Transcription. The ASR Engine draws on an Acoustic Model and a Language Model.]
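For reference, the ASR pipeline on this slide matches the usual statistical decision rule for speech recognition. The symbols below are notation introduced for this note, not taken from the slides: X is the sequence of acoustic observations and W a candidate word sequence.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid X) \;=\; \arg\max_{W} \underbrace{P(X \mid W)}_{\text{acoustic model}} \; \underbrace{P(W)}_{\text{language model}}
```

Statistical machine translation can be written in the same form, with the source sentence in place of X and a translation model in place of the acoustic model.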
Statistical transcription (and translation)
[Diagram: Manually transcribed voice → Modeling Engine → Acoustic Model and Language Model.]
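To make the language-model side of this training step concrete, here is a minimal sketch that estimates bigram probabilities from a few transcribed sentences. The function name, the toy transcripts, and the add-one smoothing are assumptions chosen for illustration; the slides do not describe the actual training procedure.

```python
from collections import Counter


def train_bigram_lm(transcripts):
    """Return a function prob(prev, word) with add-one-smoothed bigram estimates."""
    history_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for sentence in transcripts:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        history_counts.update(tokens[:-1])            # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    vocab_size = len(vocab)

    def prob(prev, word):
        # Add-one (Laplace) smoothing keeps unseen bigrams above zero.
        return (bigram_counts[(prev, word)] + 1) / (history_counts[prev] + vocab_size)

    return prob


# Toy transcripts (hypothetical data).
transcripts = ["the lecture starts now", "the lecture ends now"]
prob = train_bigram_lm(transcripts)
print(prob("the", "lecture"))  # bigram seen twice in the transcripts
print(prob("the", "ends"))     # unseen bigram, lower probability
```

An acoustic model is trained from the same kind of data, pairing the audio with its manual transcription, but that step is far more involved and is not sketched here.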
Architecture of transLectures
[Diagram components: Lecture, Slides, Extra content, Language Model, Transcription, Translation, Intelligent interaction, Result]
Languages
• Transcription (ASR): EN, SL, ES
• Translation (MT): EN>SL, SL>EN, EN>ES, ES>EN, EN>FR, EN>DE
Case study: VideoLectures.NET
15000 lectures
Case study: Polimedia
10000 Learning Objects
Demo
http://translectures.videolectures.net
http://polimedia.upv.es/catalogo
http://translectures.eu/player/
Scientific evaluations
• Transcription results
• WER: Word Error Rate (%); lower is better
• Goal: WER < 20%
• Languages: EN, SL, ES
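As a reminder of what the WER figure measures, the sketch below computes it for one sentence pair as (substitutions + deletions + insertions) divided by the number of reference words, using a word-level edit distance. The function name and the example sentences are made up for illustration; real evaluations aggregate the counts over a whole test set.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate in percent between a reference and a hypothesis string."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution ("start") and one insertion ("today") against 5 reference words.
print(word_error_rate("the lecture starts at nine",
                      "the lecture start at nine today"))  # 40.0
```

Because insertions count as errors, WER can exceed 100% for very poor hypotheses.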
Scientific evaluations
• Translation results
• BLEU; higher is better
• Goal: BLEU > 30
• Language pairs: EN>SL, SL>EN, EN>ES, ES>EN, EN>FR, EN>DE
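For readers unfamiliar with BLEU, here is a simplified single-sentence sketch: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty, scaled to 0-100. It is an illustration under simplifying assumptions (one reference, no smoothing, whitespace tokenization); published BLEU scores are computed over a whole corpus, and the function names here are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU (0-100) against a single reference."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngram_counts(hyp, n), ngram_counts(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matches == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


score = sentence_bleu("the cat sat on the mat", "the cat sat on a mat")
print(round(score, 1))
```

A single wrong word already removes several matching 3-grams and 4-grams, which is why BLEU drops quickly even for translations that read well.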
Y2 results and comparison
Massive adaptation
• Characteristics of video lectures:
  • Just one person
  • Known speaker
  • Clear talking
  • No interruptions
  • Focused on a topic
  • Slides
Massive adaptation
• Known speaker and topic
• Slides
• Related documents
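One common way to exploit slides and related documents is to interpolate a general language model with one estimated from the lecture's own text. The sketch below shows the idea with unigram models. The interpolation weight, the function names, and the toy texts are assumptions, and the slides do not state that transLectures adapts its models in exactly this way.

```python
from collections import Counter


def unigram_model(text):
    """Relative word frequencies of a text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(general, in_domain, weight=0.3):
    """P(w) = weight * P_in_domain(w) + (1 - weight) * P_general(w)."""
    vocab = set(general) | set(in_domain)
    return {
        word: weight * in_domain.get(word, 0.0) + (1 - weight) * general.get(word, 0.0)
        for word in vocab
    }


# Toy texts (hypothetical): a general corpus and the text of the lecture slides.
general = unigram_model("the weather is nice and the model is small")
slides = unigram_model("the acoustic model and the language model")
adapted = interpolate(general, slides, weight=0.3)

# Words that are frequent on the slides gain probability in the adapted model.
print(adapted["model"] > general["model"])
```

The same scheme extends to n-gram models, and the weight would normally be tuned on held-out text from the lecture domain.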
Intelligent interaction
• Postediting automatic transcriptions/translations
• The user invests the least possible effort
• The system learns the most from it
• Confidence measures
• Fast constrained search
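A minimal sketch of how confidence measures could direct the user's effort: only the least confident words are offered for review, up to a fixed budget. The word and confidence pairs, the budget parameter, and the function name are hypothetical; the slides do not specify the selection strategy.

```python
def words_to_review(words, budget):
    """Return indices of the `budget` lowest-confidence words, in reading order."""
    ranked = sorted(range(len(words)), key=lambda i: words[i][1])
    return sorted(ranked[:budget])


# Hypothetical recognizer output: (word, confidence in [0, 1]).
hypothesis = [("the", 0.98), ("acustic", 0.41), ("model", 0.93),
              ("is", 0.97), ("trainned", 0.35), ("offline", 0.88)]

for index in words_to_review(hypothesis, budget=2):
    print(index, hypothesis[index][0])
```

The corrected words can then be fed back to the recognizer, so the remaining text is re-decoded with those words held fixed.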
Implementation and integration
• VideoLectures.NET
• Polimedia
• Opencast Matterhorn
The tL player
Online HTML5 video player with editing capabilities. The user interface has three different editing layouts and full keyboard support. User interaction statistics are analyzed to improve the user experience and develop a user model.
Manual upload of lectures
transLectures: tools available
• The transLectures-UPV Toolkit (TLK) for ASR
• www.translectures.eu/tlk
• RWTH Aachen: rASR, Jane (MT)
• http://www-i6.informatik.rwth-aachen.de/web/Software/
Note that you need an acoustic model and a language model
transLectures: tools at M30
• The tL player (& editor)
• tL Opencast Matterhorn module
• Cloud service for testing
• Coming soon at M30 (www.translectures.eu)
More info at the OCWC conference (Ljubljana) in April 2014
Next steps for transLectures
• Keep improving ASR and MT results
• Keep improving tL open source tools (TLK, tL player)
• External user evaluations (VL.NET and Polimedia)
• External trials: implementation in other universities
Next EU project: EMMA
• MOOC related project
• transLectures work: adding 7 new transcription systems (English, Italian, Spanish, French, Dutch, Portuguese and Estonian)
• … and 8 translation systems (from Italian, Spanish, French, Dutch, Portuguese and Estonian into English; and from English into Italian and Spanish)
• Beginning in 2014
www.translectures.eu
My mail (Carlos Turro)
Project coordinator: Alfons Juan-Ciscar
EC FP7 ICT Programme – Project Number 287755
Thanks!