Challenges for the Language Technology Industry

Challenges for the LT Industry

Antoine Isaac

LT-Innovate Summit 2014

Brussels, June 25, 2014

Europe’s platform to access cultural heritage

Currently 33M objects

Built on descriptive metadata from a broad, heterogeneous network

Audiovisual collections

National Aggregators

Regional Aggregators

Archives

Thematic collections

Libraries

Musées Lausannois

Culture.fr

The European Library

European Film Gateway

Europeana Fashion

2,300 galleries, museums, archives and libraries

Platform implies network

Accessing items from 36 countries

top 16

Portal interface in 31 languages

Metadata in 33 languages

Serving Europe’s citizens

5M visits on Europeana.eu

7M Facebook impressions

API use…

Facilitating re-use on the language side?

Our network needs automatic translation tools to address information needs all over Europe

Gathering/linking existing multilingual data

Related projects applying NLP tools

E.g. a project (PATHS) developed techniques to enrich English and Spanish collections

1) Identification of key entities

2) Detection of (typed) similarities between objects, using metadata

3) “Background links” to external resources such as Wikipedia

4) Classification of objects against a hierarchy of topics

Applying these to other languages would require work

1) -> requires language-specific tools (PoS tagging, lemmatization)

2) -> straightforward to apply to new languages

3) -> requires language-specific tools

4) -> depends on (3) and on translation of some topics

http://www.paths-project.eu/eng/Resources/Semantic-Enrichment-of-Cultural-Heritage-content-in-PATHS
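As an illustration of technique 2 above, here is a minimal sketch of metadata-based similarity between two objects. It is not the PATHS implementation: it assumes that "typed" means one score per metadata field, it uses plain Jaccard token overlap as a stand-in measure, and the field names and sample records are hypothetical. Because it only compares tokens, it needs no language-specific tooling, which fits the point that this step transfers easily to new languages.

```python
import re


def tokens(text):
    """Lowercase a metadata string and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def field_similarity(a, b):
    """Jaccard overlap between the token sets of two metadata values."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def typed_similarities(obj_a, obj_b):
    """Return one similarity score per metadata field present in both objects."""
    shared = obj_a.keys() & obj_b.keys()
    return {f: field_similarity(obj_a[f], obj_b[f]) for f in sorted(shared)}


# Hypothetical records; real Europeana metadata has many more fields.
painting_a = {
    "title": "Portrait of a woman",
    "creator": "Unknown painter",
    "subject": "portrait painting",
}
painting_b = {
    "title": "Portrait of a man",
    "creator": "Unknown painter",
    "subject": "portrait",
}

scores = typed_similarities(painting_a, painting_b)
for field, score in scores.items():
    print(f"{field}: {score:.2f}")
```

Keeping the scores separate per field, rather than merging them into one number, lets an interface say why two objects are related (same creator, similar subject).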

Language challenges for Digital Libraries

Typical queries are very short

Average < 2 terms

Identification of query language is not easy, even manually

39% of queries may belong to several languages

Plenty of named entities

60% of queries are for persons & places

Not only is it hard for queries: the same issues apply to the descriptive metadata

Studies by Humboldt University on Europeana and The European Library

http://www.clef-initiative.eu/documents/71612/86374/CLEF2010wn-LogCLEF-StillerEt2010.pdf
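A toy sketch of why very short queries are hard to assign to one language: a single term, especially a named entity, is often valid in several languages at once. The vocabularies below are tiny hypothetical samples made up for illustration, not data from the studies cited above, and a real system would rely on full lexicons or a trained language identifier.

```python
# Hypothetical toy vocabularies, for illustration only.
VOCAB = {
    "en": {"art", "map", "paris", "portrait", "war"},
    "fr": {"art", "carte", "paris", "portrait", "guerre"},
    "de": {"kunst", "karte", "paris", "krieg"},
}


def candidate_languages(query):
    """Return the languages whose vocabulary contains every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    return sorted(
        lang for lang, words in VOCAB.items() if all(t in words for t in terms)
    )


for q in ["Paris", "portrait", "guerre", "karte paris"]:
    print(q, "->", candidate_languages(q))
```

With these sample vocabularies, the one-term queries "Paris" and "portrait" match several languages, while adding a second, language-specific term ("karte paris") narrows the candidates to one.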

Language issues at the scale of Europe

Very diverse domains, probably with few training corpora available

Tools, UCL Museums, CC-BY-NC-SA

Paris, nouvelle machine à paver : [photographie de presse] / [Agence Rol], National Library of France, Public Domain

St. Philip holding a book and St. James (the Less?) holding a book, National Library of the Netherlands, Public Domain

La paloma / O sole mio, Dalane Folkemuseum, CC0

Relevant LT can come from everywhere in Europe, raising interoperability issues

Resource problem

Both for us and our partners - libraries, archives, museums

Not much money

Few technical experts

Emphasis on open source technology

We can provide interesting challenges for the industry in terms of (open) data availability, users and scenarios.

But we're not (yet) a market of the size of others

Thank you!

Antoine Isaac

antoine.isaac@europeana.eu

@EuropeanaEU
