TAUS MT Showcase, MT Applications in the EU Public Sector, Andrejs Vasiljevs, Tilde

Posted on 17-Jan-2015


This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme. For the latest updates go to http://www.statmt.org/mosescore/ or follow us on Twitter - #MosesCore

Transcript of TAUS MT Showcase, MT Applications in the EU Public Sector, Andrejs Vasiljevs, Tilde

Andrejs Vasiļjevs, andrejs@tilde.com

MT Showcase | LocWorld2014 | Dublin | 04.06.2014

MT applications in the EU public sector

• Language technology developer

• Localization service provider

• Leadership in smaller languages

• Offices in Riga (Latvia), Tallinn (Estonia) and Vilnius (Lithuania)

• 130 employees

• Strong R&D team

• 9 PhDs and candidates

• Trusted partner of the EU for significant research projects

eGovernment

G2C: Government to Citizens

G2B: Government to Businesses

G2E: Government to Employees

G2G: Government to Governments

C2G: Citizens to Governments

Goals of e-Government

• Better business environment

• Customers on-line, NOT in line

• Improving efficiency

• Increase participation

• Reach marginalized groups

Multilingual Europe

80 European languages

24 official languages

1.1B EUR: the EU's annual spending on translation services

The five most widely spoken foreign languages in Europe (% of European Union population)

• English: 38%

• French: 12%

• German: 11%

• Spanish: 7%

• Russian: 5%

Europeans able to hold a conversation in an additional language (% of European Union population)

• At least one: 54%

• At least two: 25%

• At least three: 10%

EU countries where people can speak at least one language in addition to their mother tongue

• Luxembourg: 98%

• Latvia: 95%

• Netherlands: 94%

• Malta: 93%

• Slovenia, Lithuania: 92%

• Sweden: 91%

EU countries where the majority of people cannot speak any foreign language

• Hungary: 65%

• Italy: 62%

• United Kingdom, Portugal: 61%

• Ireland: 60%

What role does translation play in your everyday life? Is translation important? (% of EU population)

• Important: 43%

• Very important: 16%

• No role: 30%

“The language of Europe is translation”

Umberto Eco

Preserving the European cultural and linguistic diversity

Securing at affordable costs the free flow of information and thought across language boundaries

Providing each language community with the most advanced technologies

… so that maintaining their mother tongue does not turn into a disadvantage

Credits: Hans Uszkoreit

EU MULTILINGUALITY IN PRACTICE: CASE STUDY

In October 2010, a Spanish lawyer turned to the Ombudsman, complaining that many public consultations are only published in English, for example, consultations concerning a new partnership to help small and medium-sized enterprises and concerning the freedom of movement of workers.

“The Commission should ensure that all European citizens are able to understand its public consultations, which should [..] be published in all the official languages. Its failure to do so is an instance of maladministration.”

4 October 2012, The European Ombudsman, P. Nikiforos Diamandouros

[European] Commission [has] to ensure that every EU citizen's right to address the EU institutions in any of the EU official languages is fully respected and implemented by ensuring that public consultations are available in all EU official languages,[..] and that there is no language-based discrimination [..]

European Parliament resolution 2012/2676(RSP)

The e-Government Challenge

Fulfill the vision of e-Government

AND

the promise of language diversity

machine translation

What MT serves best

Short shelf life

Immediacy

Large volume

Multiple languages

Where it works

Embedded in webpages

Multilingual online services

Social media

Multi-lingual chat

Mobile devices

Customizable, trainable

Domain specific

On-demand

In the cloud

Real-time

Security

Privacy

Specific requirements

Case Study: LATVIA

Case study: MT for eGov in Latvia

Language situation in Latvia

Population: 2.1 million

1.6 million native Latvian speakers

Large Russian-speaking population (36%)

Lack of parallel data

Complex language structure

Highly inflected

GOAL

• To provide e-services to the entire population and all linguistic groups

• To develop technologies for supporting Latvian in the information society

• To facilitate access to the information of European Union institutions

• To integrate into the infrastructure of EU multilingual services

MT @ eGov.LV

Online translation service

Translation widget for integration in eGov service sites

Standardized API for universal integratability

INTEGRATABILITY

Custom machine translation as easy and affordable as a cup of coffee

• upload your data: TMX, XLIFF, DOC, PPTX, XLSX, PDF, XLZ, TXT

• combine it with the data on the LetsMT public repository

• generate your custom MT with a few mouse clicks

• run your MT system on the LetsMT cloud

• use it in your CAT tool with the LetsMT plug-in

• integrate through the LetsMT API in an online or desktop app
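The last step above, integration through an API, can be sketched as follows. This is a minimal illustration only: the endpoint URL, the query parameter names (`system`, `text`) and the `X-Api-Key` header are hypothetical placeholders, not the actual LetsMT API, and the request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only (not the real LetsMT URL).
BASE_URL = "https://mt.example.invalid/translate"


def build_translate_request(text, system_id, api_key):
    """Build (but do not send) a GET request asking one MT system to translate text.

    The parameter and header names are assumptions made for this sketch.
    """
    query = urlencode({"system": system_id, "text": text})
    return Request(
        f"{BASE_URL}?{query}",
        headers={"X-Api-Key": api_key, "Accept": "application/json"},
        method="GET",
    )


req = build_translate_request("Labdien, pasaule!", "demo-lv-en", "YOUR-KEY")
print(req.get_method(), req.full_url)
```

In an online or desktop app, a request like this would be sent with `urllib.request.urlopen` and the response parsed according to whatever format the real service documents.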

2.4 billion parallel sentences

5.8 billion monolingual sentences

126 languages

548 MT systems currently provided

Recently added: EUBookshop data

3.6 billion words from 135,000 documents

48 languages

>10 million parallel sentences for some language pairs

Data for SMT training

custom terminology

incremental data

custom MT

Online Terminology Services

Translation

Training

SMT System Training and adaptation

Online Translation Service

Input Text for Translation

Parallel corpus

Monolingual corpus

Bilingual term collections

Monolingual Term Extraction

Trained SMT Model

Bilingual Term Extraction

Translated Text
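The role of the bilingual term collections in the workflow above can be illustrated with a toy sketch. It only shows how terms from a collection might be located in the input text; the term entries, the `find_terms` helper and the exact-match strategy are assumptions made for illustration and do not describe how the actual system works.

```python
import re

# Toy bilingual term collection (English -> Latvian); entries are illustrative.
TERMS = {
    "machine translation": "mašīntulkošana",
    "public sector": "publiskais sektors",
}


def find_terms(text, terms):
    """Return (source term, target term, start offset) for each term found in text."""
    hits = []
    for term, target in terms.items():
        # Whole-word, case-insensitive match of the exact surface form.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((term, target, match.start()))
    # Report hits in the order they occur in the text.
    return sorted(hits, key=lambda hit: hit[2])


hits = find_terms("Machine translation for the public sector.", TERMS)
for source, target, offset in hits:
    print(offset, source, "->", target)
```

Exact string matching like this would miss inflected forms, which matters for a highly inflected language such as Latvian; a real term extraction step would need morphological processing.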

Multiple translation options

• copy texts into the online translator

• upload entire files and documents

• or translate in your own work environment (CAT tools)


Use case: MT to support EU Presidency

Case Study: LATVIA

Case study: MT as a Public Service in Lithuania

Productivity evaluation

• Czech: 25.1%

• Polish: 28.5%

• Latvian: 32.9%

Productivity evaluation

Estonian

Public MT

Better than Google

Better than Google & Bing

[Chart: Tilde, Google, Microsoft and Tartu Uni systems compared for English-Latvian, Latvian-English, English-Lithuanian, Lithuanian-English, English-Estonian and Estonian-English; vertical axis from 0 to 50. Data labels as listed: 37.38, 44.15, 28.8, 38.42, 24.22, 37.97, 35.04, 42.92, 18.59, 32.56, 21.45, 35.3, 26.95, 37.32, 16.86, 33.09, 17.42, 30.14, 16.7, 26.11]

Towards the EU Public MT Infrastructure

Deliver the strategic vision and the functional, technical and operational specifications of the infrastructure for an EU public service for automated translation and other multilingual services and resources

Design a sustainable governance model for the multilingual infrastructure

Foster multi-stakeholder alliances to ensure commitment and support for the deployment of the MLi and its usage

Machine translation

bringing governments and citizens closer

tilde.com

andrejs@tilde.com

Author: - Copyright: Stocklib © Robert Wilson