www.europeanaconnect.eu
Multilingual Access to Online Content - the Europeana Experience
Vivien Petras (Humboldt-Universität zu Berlin)
With the help of many people involved in Europeana (referenced in the slides)
Eurovoc Conference, 18-19 November 2010
Vivien Petras, Humboldt-Universität zu Berlin, Eurovoc Conference, 18-19 November 2010
Outline
• Europeana – a brief introduction
• Multilingual access to Europeana – approaches
• Europeana Semantic Data Layer
• Multilingual Alignments of Vocabularies
• Semantic Search Engine Prototype
Europeana
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.”
European Parliament, 27 September 2007
Europeana Today
• 13 million objects
• 28 data aggregators
• 1500 participating institutions
• 200 partners
• 35 FTEs
• 21 projects
• 1 million visits in 2010
• 30,000 My Europeana signees
• 2008: Prototype
• 2010: Operational Service
• Stable portal
• Open Source Code
• EuropeanaLabs
• Public Domain Charter
From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, 14-15 October Amsterdam
Europeana Contributions by Country
From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, 14-15 October Amsterdam
Different languages!(?)
Europeana Content Types
Books, Articles, Postcards, Folklore objects, Photography, Art
Example record: Goethe, Johann Wolfgang von
Title: Goethe, Johann Wolfgang von
Date: unknown
Creator: Goethe, Johann Wolfgang von
Description: Goethe, Johann Wolfgang von
Language: de-DE
Format: image/jpeg
Source: SLUB/Deutsche Fotothek
Rights: Deutsche Fotothek
Provider: Deutsche Fotothek ; Germany
Identifier: http://www.deutschefotothek.de/obj70226592.html
Subject: Bildnis; Bildniskatalog; Foto; Fotos; Portrait
Type: image
Multilingual Access to Europeana
• Interface
  • static pages
• Search
  • query translation
  • (document translation)
• Subject Browse (& Search)
  • Controlled vocabularies
  • Semantic Data Layer
French, English, Spanish, German, Italian, Polish, Dutch, Portuguese, Hungarian, Swedish
Europeana Semantic Data Layer
Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM).
Europeana Semantic Data Layer
Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM).
library
archive
museum
Bridging "isles of information" by connecting objects from different domains via cross-vocabulary links.
Semantic Data Layer Alignment Example
Figure: Irish vocabulary and Norwegian vocabulary linked by a SKOS mapping (skos:exactMatch)
From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, 14-15 October Amsterdam
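As a minimal sketch of what such a mapping looks like as data, the snippet below writes one skos:exactMatch link as an N-Triples statement. The two concept URIs are hypothetical placeholders, not identifiers from the actual Irish or Norwegian vocabularies; only the SKOS namespace is the real W3C one.

```python
# SKOS namespace as published by the W3C.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical concept URIs, invented for illustration only.
IRISH_CONCEPT = "http://example.org/irish-vocabulary/concept/123"
NORWEGIAN_CONCEPT = "http://example.org/norwegian-vocabulary/concept/456"


def exact_match_triple(source_uri: str, target_uri: str) -> str:
    """Return one N-Triples line stating: source skos:exactMatch target."""
    return f"<{source_uri}> <{SKOS}exactMatch> <{target_uri}> ."


triple = exact_match_triple(IRISH_CONCEPT, NORWEGIAN_CONCEPT)
print(triple)
```

Storing the link as a separate statement leaves both source vocabularies unchanged, which fits the idea of a semantic layer on top of provider data.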
Multilingual Alignment: Approach
• Identify and convert relevant semantic resources
• Pivot vocabularies for relevant categories (subject, persons, places…) = multilingual and with wide coverage
  • E.g. UDC, DDC, VIAF, TGN, Geonames, Wordnets, DBpedia
From: Isaac, Antoine; Schreiber, Guus (2010). Vrije Universiteit Amsterdam Approach to Multilingual Mapping of Vocabularies.
Multilingual Alignment: Approach
• Align more specific vocabularies to the pivots = anchoring mappings
• Finding instances of skos:exactMatch mappings
• Vocabulary characteristics important for matching:
  • Lexical variance of labels (e.g. plural/singular, diacritics, multilinguality)
  • Preferred / alternative labels
  • Nature of hierarchy
From: EuropeanaConnect Milestone 1.2.1 (2010). Specification of preferred terms identification methodology.
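The label characteristics listed above can be illustrated with a small sketch of lexical matching: labels are normalized (case, diacritics, a deliberately naive English plural rule) and concepts are paired when any preferred or alternative label coincides. This is an assumption-laden toy, not the methodology from the milestone document; the function names and the sample vocabularies are hypothetical.

```python
import unicodedata


def normalize_label(label: str) -> str:
    """Lowercase, strip diacritics, and apply a naive English plural rule."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    text = stripped.lower().strip()
    # Very naive singularisation: an illustrative assumption, English only.
    if len(text) > 3 and text.endswith("s") and not text.endswith("ss"):
        text = text[:-1]
    return text


def lexical_candidates(source: dict, target: dict) -> list:
    """Pair concepts whose normalized labels coincide.

    Both arguments map a concept id to a list of labels
    (preferred and alternative labels together).
    """
    index = {}
    for concept_id, labels in target.items():
        for label in labels:
            index.setdefault(normalize_label(label), set()).add(concept_id)
    pairs = set()
    for concept_id, labels in source.items():
        for label in labels:
            for match in index.get(normalize_label(label), ()):
                pairs.add((concept_id, match))
    return sorted(pairs)


# Hypothetical toy vocabularies.
source_vocab = {"s1": ["Portraits"], "s2": ["Cafés", "Coffee houses"]}
pivot_vocab = {"p1": ["portrait"], "p2": ["cafe"], "p3": ["glass"]}
candidates = lexical_candidates(source_vocab, pivot_vocab)
print(candidates)
```

A real pipeline would need language-specific normalization; the single plural rule here only shows where such variance enters the matching step.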
Multilingual Alignment: Approach
• Methodology:
  • Conversion to SKOS/RDF
  • Application of different alignment methods:
    • Lexical matching
    • Structure-based matching
    • Instance-based matching
  • Filtering / disambiguation of matching candidates:
    • Analyzing children / parent matches
  • Combining alignments
From: EuropeanaConnect Milestone 1.2.1 (2010). Specification of preferred terms identification methodology.
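The filtering and combining steps can be sketched as follows, assuming each concept has at most one parent and that an ambiguous candidate is kept only when the parents of both concepts are themselves candidate matches. This rule, the function names, and the sample data are illustrative assumptions rather than the procedure specified in the milestone.

```python
def filter_by_parent(candidates, source_parent, target_parent):
    """Keep a candidate pair when it is unambiguous, or when the parents
    of the two concepts are themselves a candidate pair."""
    candidate_set = set(candidates)
    per_source = {}
    for s, t in candidates:
        per_source.setdefault(s, []).append(t)
    kept = []
    for s, targets in per_source.items():
        if len(targets) == 1:
            # Only one candidate target: nothing to disambiguate.
            kept.append((s, targets[0]))
            continue
        for t in targets:
            parents = (source_parent.get(s), target_parent.get(t))
            if parents in candidate_set:
                kept.append((s, t))
    return sorted(kept)


def combine(*alignments):
    """Union of several alignments, with duplicates removed."""
    merged = set()
    for alignment in alignments:
        merged.update(alignment)
    return sorted(merged)


# Hypothetical data: "s_bank" has two lexical candidates in the target.
lexical = [
    ("s_finance", "t_finance"),
    ("s_bank", "t_bank_fin"),
    ("s_bank", "t_bank_river"),
]
source_parent = {"s_bank": "s_finance"}
target_parent = {"t_bank_fin": "t_finance", "t_bank_river": "t_geo"}

filtered = filter_by_parent(lexical, source_parent, target_parent)
instance_based = [("s_art", "t_art")]  # pretend output of another method
final = combine(filtered, instance_based)
print(final)
```

Here the hierarchy resolves the ambiguous label: only the target concept whose parent also matches survives, and the result is then merged with the output of a second method.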
VUA Vocabulary Alignment Tool Amalgame
• AMsterdam ALignment GenerAtion MEtatool
• Uses EDOAL (Expressive and Declarative Ontology Alignment Language) or SKOS
• Also provides pre- / post-mapping statistics and an evaluation tool
From: EuropeanaConnect Milestone 1.2.2 (2010). Semantics of descriptions aligned (intermediary).
VUA Vocabulary Alignment Tool Amalgame
• Skosified: en, fr, de, nl, hu
• Mappings (>500,000): en, fr, nl
• Mostly label matches
http://semanticweb.cs.vu.nl/beta/amalgame/list_alignments
Europeana Semantic Search Engine
http://eculture.cs.vu.nl/europeana/session/search
Europeana Semantic Search Engine
Disambiguation of search terms
Europeana Semantic Search Engine
Multilingual query expansion
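A minimal sketch of how aligned vocabularies could drive multilingual query expansion: the query term is looked up among concept labels, and the labels of directly aligned concepts in other languages are added to the query. The data structures, concept identifiers, and the function name are hypothetical; this is not a description of how the prototype search engine is implemented.

```python
def expand_query(term, labels, exact_matches):
    """Return the term plus labels of matching and directly aligned concepts.

    labels: concept id -> {language code: label}
    exact_matches: iterable of (concept id, concept id) pairs
    """
    # Treat skos:exactMatch as symmetric when collecting neighbours.
    neighbours = {}
    for a, b in exact_matches:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    wanted = term.casefold()
    hits = [
        cid for cid, by_lang in labels.items()
        if any(lbl.casefold() == wanted for lbl in by_lang.values())
    ]

    expanded = {term}
    for cid in hits:
        # Only direct neighbours are followed; no transitive closure.
        for related in {cid} | neighbours.get(cid, set()):
            expanded.update(labels.get(related, {}).values())
    return sorted(expanded)


# Hypothetical concepts from three vocabularies.
labels = {
    "en:portrait": {"en": "portrait"},
    "de:bildnis": {"de": "Bildnis"},
    "nl:portret": {"nl": "portret"},
    "en:landscape": {"en": "landscape"},
}
exact_matches = [("en:portrait", "de:bildnis"), ("nl:portret", "en:portrait")]
expanded_terms = expand_query("portrait", labels, exact_matches)
print(expanded_terms)
```

With this approach a record indexed only under the German subject "Bildnis" becomes reachable from an English query, without translating the record itself.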
Europeana Semantic Search Engine
• Works created by matching person
• Works related to matching person
• Works created by a teacher of matching person
• Works related to an artefact created by matching person
• Works created by an artist professionally related to matching person
• Works titled
• Works showing concept
• Works with matching Location
• ….
Clustering of search results
Next Steps
• Adding more vocabularies from the content providers:
  • VIAF
  • Spanish and Polish subject heading lists
• Switching metadata delivery to Europeana Data Model (EDM) format (2011)
• And: linking with the cloud…
Europeana & Linked Open Data
Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM).
Information Spaces
• DBpedia
• PND and SWD (prototype)
• Geonames
• LCSH
• …
www.europeana.eu
Thank you.