MT@EC
European Commission machine translation supporting e-government
Spyridon Pilos
Head of language applications
Directorate-General for Translation
MT@Work
Brussels, 5.12.2014
European Commission machine translation and public administrations
• MT@EC: a service for the EU
• The context of the free trial
• Implementation
• What next?
Why does the Commission need MT?
• The Commission…
  - DGT has 1700 translators
  - Over 2 M pages translated in 2013
• But…
  - …just to make europa.eu fully multilingual: almost 6.8 M documents to be translated, or 8 500 translators/year!
The result: thousands of non-translated documents (and this does not include user-generated content)
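The scale of the gap can be checked with simple arithmetic. The sketch below uses only the rounded figures quoted on the slide ("almost 6.8 M" documents, "8 500" translators/year, "1700" DGT translators); the variable names are illustrative, and the per-translator throughput is an implied value that the slide itself does not state.

```python
# Rounded figures as quoted on the slide.
documents_to_translate = 6_800_000   # "almost 6.8 M documents"
translator_years_needed = 8_500      # "8 500 translators/year"
dgt_translators = 1_700              # "DGT has 1700 translators"

# Implied throughput behind the slide's estimate (an inference, not a stated figure).
implied_documents_per_translator_year = documents_to_translate / translator_years_needed

# How many times the current DGT translator headcount the estimate represents.
staffing_multiple = translator_years_needed / dgt_translators

print(f"Implied throughput: {implied_documents_per_translator_year:.0f} documents per translator-year")
print(f"Required capacity: {staffing_multiple:.0f}x the current number of DGT translators")
```

On these figures, making europa.eu fully multilingual by human translation alone would take about five times DGT's translator headcount for a year.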
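There are also interactions with and between actors in the Member States
[Diagram: administrations, businesses and citizens in Member State X and Member State Y, together with EU administrations, linked by A2A (administration to administration), A2B (administration to business) and A2C (administration to citizen) interactions. Two kinds of interaction are marked: "First type" and "Second type".]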
Vision
Wouldn’t it be great if I could start using a public service in any Member State from any place and obtain the information in my mother tongue?
• ISA = programme for interoperability solutions for public administrations
• EIF = European Interoperability Framework
EIF*: 12 Underlying principles
• Need for EC action: Subsidiarity and Proportionality
• User needs and expectations: User Centricity, Inclusion and Accessibility, Security and Privacy, Multilingualism, Administrative Simplification, Transparency, Preservation of Information
• Collaboration: Openness, Reusability, Technological Neutrality and Adaptability, Effectiveness and Efficiency
* European Interoperability Framework: http://ec.europa.eu/isa/documents/isa_annex_ii_eif_en.pdf
The role of Machine Translation
MT is the only viable solution for:
• quick and cheap access to information in foreign languages
• understanding information received in a foreign language that otherwise could not be used or would require substantial time and costs to translate
• making multilingual use of websites possible
• facilitating cross-lingual information search and analytics
That is why machine translation (MT) is a critically important technology for multilingual Europe
MT@EC: a European Commission product
• Released: 26 June 2013 (version 1.0)
• Version 2.0 released on 3 July 2014
• Languages: all 24 EU official languages, 552 language pairs (62 direct)
• Technology: statistical machine translation using the open source software Moses, co-funded by EU Framework Programmes for research and innovation
• Development by DGT: between 2010 and 2013, co-funded by the ISA* programme (action 2.8)
* Interoperability solutions for public administrations: http://ec.europa.eu/isa/actions/02-interoperability-architecture/2-8action_en.htm
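The figure of 552 language pairs follows from counting every ordered (source, target) combination of the 24 official languages. A minimal sketch of that count is given below; the ISO 639-1 codes are supplied here for illustration, and the slide does not say how the pairs that are not "direct" are served, so the code only counts them.

```python
from itertools import permutations

# ISO 639-1 codes of the 24 official EU languages (list supplied for illustration).
EU_OFFICIAL_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
]

# A language pair is directional: en->fr and fr->en are two different pairs.
language_pairs = list(permutations(EU_OFFICIAL_LANGUAGES, 2))

DIRECT_PAIRS = 62  # figure quoted on the slide
non_direct_pairs = len(language_pairs) - DIRECT_PAIRS

print(f"{len(EU_OFFICIAL_LANGUAGES)} languages give {len(language_pairs)} ordered pairs")
print(f"{DIRECT_PAIRS} direct, {non_direct_pairs} not direct")
```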
MT@EC description
• Delivery:
  - web user interface (human to machine)
  - web services (machine to machine)
• Special features:
  - User interface in 24 languages
  - Source document format/formatting maintained [not for pdf]
  - Specific output formats for translation: tmx and xliff
  - Translation can also be returned by email
  - Can translate multiple documents to multiple languages
  - Indication of quality for language pairs (using BLEU scores)
  - Feedback mechanism (using EU Survey)
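BLEU, the metric mentioned among the special features, scores MT output by its n-gram overlap with a human reference translation. The slides do not say how DGT computes the scores shown per language pair, so the following is only a simplified, unsmoothed sentence-level sketch with whitespace tokenisation; production scores are normally computed over a whole test corpus. The function names and the example sentences are illustrative.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU of one candidate against one reference (0.0 to 1.0)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: an n-gram is credited at most as often as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precisions.append(math.log(matches / total))

    # Brevity penalty: candidates shorter than the reference are penalised.
    brevity_penalty = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the committee adopted the report on public procurement"
    print(sentence_bleu(reference, reference))
    print(sentence_bleu("the committee adopted a report on public procurement", reference))
```

A higher score means closer agreement with the reference, which is why it can serve as a rough per-language-pair quality indication rather than a judgement on any single document.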
MT@EC security
• Secure hosting in the EC data centre
• Access through ECAS (EC Authentication Service)
• Secure document transfers:
  - over sTESTA*, a very secure private network between public administrations in the EU, separate from the internet
  - over the internet (through a secure https connection)
* You can check if your organisation has access to sTESTA on: https://portal.testa.eu/jetspeed/portal/homepage/about.psml.
MT@EC is already available for…
• … the staff of European institutions and bodies: Commission, Parliament, Council, Court of Justice, Court of Auditors, Economic and Social Committee, Committee of the Regions, European Central Bank, European Investment Bank etc.
• … online services funded or supported by the EU
• … real-life trial and pilot projects with public administrations in the EU Member States
• … collaboration projects with EMT* universities
* European Masters in Translation
Online services connected to MT@EC in production
Service: description (URL)
• IMI: Internal Market Information System (http://ec.europa.eu/internal_market/imi-net/index_en.html)
• SOLVIT: an online problem-solving network concerning misapplication of Internal Market law by public authorities (http://ec.europa.eu/solvit/)
• nLex: a common gateway to national law (http://eur-lex.europa.eu/n-lex/)
Online services connected to MT@EC in test
Service: description (URL)
• e-Justice: the future electronic one-stop-shop in the area of justice (http://e-justice.europa.eu/)
• ODR: platform to facilitate the out-of-court resolution of consumer disputes (Alternative Dispute Resolution) (http://ec.europa.eu/consumers/redress_cons/adr_en.htm)
• CircaBC: Communication and Information Resource Centre for Administrations, Businesses and Citizens (https://circabc.europa.eu/)
• EU Survey: tool for creating multilingual online surveys (http://ec.europa.eu/eusurvey/)
Online services to be connected to MT@EC: in preparation
Service: description (URL)
• TED: TED (Tenders Electronic Daily) is the online version of the 'Supplement to the Official Journal of the European Union', dedicated to European public procurement (http://ted.europa.eu/)
• Joinup: an open collaborative platform supporting interoperability in Europe (https://joinup.ec.europa.eu/)
Online services interested in using MT@EC: discussions initiated (indicative list)
Service: description (URL)
• EURES: the European employment services network (European Job Mobility portal) (https://ec.europa.eu/eures/)
• EQF: the portal supporting the implementation of the European Qualifications Framework for lifelong learning (http://ec.europa.eu/eqf/home_en.htm)
• ESCO: the multilingual classification of European Skills, Competences, Qualifications and Occupations, which identifies and categorises skills and competences, qualifications and occupations in all 22 European languages and supports EURES and other similar portals (https://ec.europa.eu/esco/)
• EPALE: the European Portal for Adult Learning (http://ec.europa.eu/epale)
MT@EC for Public Administrations
Context: MT@EC "Pilot operation" phase until Q4/2014 (ISA)
Objective: Develop and test in real-life conditions methods and structures for the most efficient use of MT@EC by different beneficiaries (including PAs); normal operation of service.
Conditions
• PAs participate on a voluntary basis.
• No cost for PAs other than use of internal resources.
• No commitment by DGT on use of service after the end of the pilot.
Output
• Service delivery models (including pricing)
• Operational support structure and methods
MT@EC for Public Administrations
• Free real-life trial: staff members can have direct access to the standard MT@EC service [upon request by the individual PA staff member]
• The organisation can participate in a customisation pilot project, where DGT can also build specific engines with the organisation's own data. [Administrative Agreement between PA and DGT needed, to be signed by the end of June 2015]
Customisation pilots for PA
• Pilot A: Connect a PA information system to the standard MT@EC service.
• Pilot B: DGT builds custom engines with PA data, available through MT@EC to all
• Pilot C: DGT builds custom engines with PA data, available through MT@EC only to the PA
• Pilot D: DGT builds custom engines with PA data, for PA to run in PA premises
• Pilot E: DGT assists PA to build own custom engines to run in PA premises
If you are interested, email [email protected]
Ongoing pilots
Country: name of administration (type), pilot
• Finland: Prime Minister's Office (central translation service), pilot C
• Germany: Bundessprachenamt (translation service of the Armed Forces), pilot E
• Greece: Hellenic Quality Assurance and Accreditation Agency for Higher Education (education administration), pilot A
Discussions were held with more PAs but did not lead to signature of agreements on pilots, usually because:
• there was no need for custom engines
• the necessary data were not enough or could not be shared
• resources could not be made available for the work to be performed on the PA side.
Special types of "pilots"
• Networks (Association des Conseils d’État et Cours administratives suprêmes de l'UE, Réseau des Présidents des Cours suprêmes judiciaires de l'UE, Legivoc project)
• New languages (Norwegian)
Staff access to MT@EC
• Get an individual ECAS user name and password (self-registration) using your work email address. [go to https://webgate.ec.europa.eu/cas/eim/external/register.cgi and follow the instructions]
• Send an email to [email protected] asking for the activation of access to the service.
• DGT will activate your access and inform you by email.
Users - total
Country            Registered   Using
Austria                     3       3
Belgium                     5       3
Bulgaria                    1       1
Croatia                     0       0
Cyprus*                    77      46
Czech Republic*            25      15
Denmark                     0       0
Estonia                     3       3
Finland                     2       2
France*                    21      15
Germany*                   30      28
Greece*                    37      23
Hungary                     1       0
Ireland                     0       0
Italy                       2       1
Latvia                      0       0
Lithuania                   1       1
Luxembourg                  3       2
Malta                       0       0
Netherlands                 8       8
Poland                      0       0
Portugal*                   7       5
Romania                     9       7
Slovakia*                  86      39
Slovenia*                  13       7
Spain*                      9       7
Sweden*                     3       3
UK                          1       1
TOTAL                     347     220 (63.4%)
* Countries where national events were organised
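The totals on the slide (347 registered, 220 using, 63.4%) can be reproduced from the per-country rows. A small sketch follows; the data are copied from the table, while the variable names and the tuple layout are illustrative choices.

```python
# (country, registered users, users actually using the service), as listed on the slide.
USERS_BY_COUNTRY = [
    ("Austria", 3, 3), ("Belgium", 5, 3), ("Bulgaria", 1, 1), ("Croatia", 0, 0),
    ("Cyprus", 77, 46), ("Czech Republic", 25, 15), ("Denmark", 0, 0),
    ("Estonia", 3, 3), ("Finland", 2, 2), ("France", 21, 15), ("Germany", 30, 28),
    ("Greece", 37, 23), ("Hungary", 1, 0), ("Ireland", 0, 0), ("Italy", 2, 1),
    ("Latvia", 0, 0), ("Lithuania", 1, 1), ("Luxembourg", 3, 2), ("Malta", 0, 0),
    ("Netherlands", 8, 8), ("Poland", 0, 0), ("Portugal", 7, 5), ("Romania", 9, 7),
    ("Slovakia", 86, 39), ("Slovenia", 13, 7), ("Spain", 9, 7), ("Sweden", 3, 3),
    ("UK", 1, 1),
]

total_registered = sum(registered for _, registered, _ in USERS_BY_COUNTRY)
total_using = sum(using for _, _, using in USERS_BY_COUNTRY)
usage_rate_percent = 100 * total_using / total_registered

print(f"Registered: {total_registered}, using: {total_using}, rate: {usage_rate_percent:.1f}%")
```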
Requests per user
Only one 32%
2 to 9 54%
10 or more 14%
Top 40 users
Domain                   Requests
Economy and finance           674
Agriculture                   218
Foreign affairs                92
European affairs               61
Health                         61
Modernisation                  55
Education                      48
Local government               48
Transport                      37
Telecom                        20
Statistical authority          14
Employment                     12
Interior                       11
Justice                        11
Police                         11
Country                  Requests
Germany                       633
Slovakia                      313
France                        156
Greece                        125
Cyprus                         75
Portugal                       22
Finland                        15
Spain                          14
Slovenia                       12
Bulgaria                       10
Czech Republic                 10
Lithuania                      10
Implementation
• Usually individuals ask for their own translations.
• In some cases a translation service centralises requests (for example through a functional mailbox).
• No guidelines on feedback or evaluation were imposed by DGT. Quality is "fit for purpose" (compliance with user requirements). A feedback function is available in MT@EC.
• Translation to/from non-EU languages is very important in several cases.
• Translators will not use MT unless it is integrated in their translation workflow so that they can post-edit easily.
• The original is sometimes hand-written or "confidential".
Feedback
• Different depending on whether it comes from translators or other users
• Little understanding of statistical MT technology and its constraints
• Several problems were pointed out:
  - document formats and formatting
  - national names and acronyms
  - non-translation of "common" words
  - omission of words
  - consistency
  - syntax, grammar etc.
Hint: Do not test on only one document to draw general conclusions. Usefulness depends greatly on factors such as type of document, quality of original, domain and language pair.
Intermediate conclusions (1)
On the pilots
• In most cases the generic engines were sufficient.
• It is difficult to find data of sufficient quality and quantity for building engines, and ownership and confidentiality are an issue.
• Lack of clarity on the status of the service after the end of the pilot discouraged investment on the side of PAs.
• Translation services asked for guidelines for evaluation and structured feedback.
• Information to technicians should be provided in their own language.
• Need more clarity on the scope of "public administration".
Intermediate conclusions (2)
On the service
• Do not need too much security: sTESTA to internet https
• The interface should be multilingual
• A tool for translators and other users: different attitudes.
• Use depends on "fitness for purpose" and not on some general quality of language
On communication
• Difficult to find the right network to promote (used ISA, EUPAN, COTSOES, DGT Field Offices in MS etc.)
• Promotion in national events in the language of the country (even in videoconference) worked best.
OK
OK
MT@EC for EMT universities
• Free use for teaching or research.
• Mutually beneficial project-based cooperation.
• The teacher/researcher may ask for access to see what it looks like and check whether it is relevant for his/her work.
• If interested, s/he sends a short project description (title, duration, objectives, approach, expected volume of requests) and a list of the other persons who need access.
• At the end of the project s/he informs DGT of the outcome of the project or study, as well as any other feedback considered useful to improve the service and its use.
Status: On 30.11.2014 we had 103 registered users, of which 75 are students, from 21 universities in 12 countries (11 EU MS and CH), of which 9 have communicated a research/teaching plan.
What next?
From MT@EC... to the CEF automated translation platform
CEF.AT will:
• build on the existing MT@EC service
• put emphasis on secure, quality, customisable MT
MT@EC Outline
[Diagram. Labelled components: Users and Services; Customised interfaces; ENGINES HUB; DISPATCHER managing MT requests; MT engines by language, subject…; DATA HUB; MT data (language resources specific for each MT engine); Language resources built around Euramis; DATA MODELLING; USER FEEDBACK. Caption: Generic MT & piloting customisation.]
CEF.AT platform Outline
[Diagram. Labelled components: DISPATCHER managing MT requests; MT engines by language, domain…; Engines factory; Language resources; DSIs; Multilingual corpora; Monolingual corpora; NLP Tools; Other. Captions: "From data to engines", "Collect and clear". The service: SECURE (and performing), QUALITY, CUSTOMISABLE.]
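The outline names a dispatcher that manages MT requests and a set of engines organised by language and domain. One way such routing could work is sketched below. This is purely illustrative: the class names, the engine names and the rule of falling back from a domain-specific engine to a generic one are assumptions, not a description of how MT@EC or CEF.AT is implemented.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Engine:
    """A hypothetical MT engine for one language direction and, optionally, one domain."""
    name: str
    source: str
    target: str
    domain: Optional[str] = None  # None marks a generic engine


class Dispatcher:
    """Hypothetical dispatcher that picks an engine for each request."""

    def __init__(self) -> None:
        self._engines: Dict[Tuple[str, str, Optional[str]], Engine] = {}

    def register(self, engine: Engine) -> None:
        self._engines[(engine.source, engine.target, engine.domain)] = engine

    def route(self, source: str, target: str, domain: Optional[str] = None) -> Engine:
        # Assumed rule: prefer a custom engine for the domain, otherwise use the generic one.
        for key in ((source, target, domain), (source, target, None)):
            engine = self._engines.get(key)
            if engine is not None:
                return engine
        raise LookupError(f"no engine registered for {source}->{target}")


if __name__ == "__main__":
    dispatcher = Dispatcher()
    dispatcher.register(Engine("en-fr-generic", "en", "fr"))
    dispatcher.register(Engine("en-fr-justice", "en", "fr", domain="justice"))
    print(dispatcher.route("en", "fr", "justice").name)
    print(dispatcher.route("en", "fr", "health").name)
```

Keeping the generic engine as the fallback matches the observation from the pilots that generic engines were sufficient in most cases, with custom engines used only where a domain has its own data.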
Real-life trial and customisation pilots for Public Administrations
- There is still time for your organisation to participate in a pilot (sign the agreement by the end of June 2015).
- Any staff member of a public administration can ask for access at any time.
- Access will be free of charge until further notice.
- Service delivery models (including pricing) will be developed only under the Connecting Europe Facility.
- Lessons learned from the pilots will be used for developing the operational support structure and methods for the CEF.
Useful links
• DGT MT page on europa.eu: http://ec.europa.eu/dgs/translation/translationresources/machine_translation/index_en.htm
• ISA page on action 2.8 Machine translation: http://ec.europa.eu/isa/actions/02-interoperability-architecture/2-8action_en.htm
  Includes:
  - The ISA Work programme 2010-2014 for MT@EC
  - Presentations for public administrations
  - and more…
• CEF work programme for 2014, where section 3.1.7 is on the CEF.AT platform: https://ec.europa.eu/digital-agenda/sites/digital-agenda/files/WP2014%20-%20official%20published.pdf
• Language technologies (CEF, H2020,…): http://ec.europa.eu/digital-agenda/language-technologies
• Language technology resources (DGT-TM, EuroVoc,…): http://ec.europa.eu/jrc/en/language-technologies