Roadmap for a multilingual BioPortal
Clement Jonquet ([email protected]), Vincent Emonet ([email protected]) & Mark A. Musen ([email protected])
4th workshop on the Multilingual Semantic Web
Portoroz, Slovenia – June 1st 2015
A few introductory words
Context:
increasing amount of biomedical data
+ multilingualism
Limits of keyword-based indexing
Biomedical community has turned to ontologies to describe their
data and turn them into structured and formalized knowledge
Ontologies are used by creating semantic annotations
Crucial need for tools & services for French biomedical data
Biomedical data integration challenge
New potential scientific discoveries hidden in data
Translational research
Biologists have adopted ontologies
To provide canonical representation of scientific knowledge
To annotate experimental data to enable interpretation, comparison, and discovery across databases
To facilitate knowledge-based applications for
Decision support
Natural language-processing
Data integration
But ontologies are: spread out, in different formats, of different size, with different structures
In different “languages”
Working with terminologies &
ontologies – a portal please!
You’ve built an ontology, how do you let the world know?
You need an ontology, where do you go to get it?
How do you know whether an ontology is any good?
How do you find resources that are relevant to the domain of the ontology (or to specific terms)?
How could you leverage your ontology to enable new science?
How could you use ontologies without managing them?
A few words about
BioPortal
BioPortal: A “one stop shop” for Biomedical Ontologies
Web repository for biomedical ontologies
Make ontologies accessible and usable – abstraction on format, locations, structure, etc.
Users can publish, download, browse, search, comment, align ontologies and use them for annotations both online and via a web services API.
Online support for ontology
Peer review
Notes (comments and discussion)
Versioning
Mapping
Search
Resources
http://bioportal.bioontology.org
BioPortal Ontology Repository
http://data.bioontology.org
Ontology Services: Search, Traverse, Comment, Download
Widgets: Tree-view, Auto-complete, Graph-view
Mapping Services: Create, Upload, Download
Annotation: Term recognition
Data Access: Search “data” annotated with a given term
Status of multilingualism in
BioPortal
Does accept (and parse) both multilingual ontologies and monolingual ontologies
sometimes represented as views
No leveraging of multilingual structure and content
no way to include/exclude labels in different languages when using the services the portal offers, e.g., the Annotator
Not capable of reconciling and dealing with multilingual mappings
Does not use a proper mechanism to identify the language property(ies) of an ontology
Does not support relationships between ontologies in different languages (or in general)
Does not support any internationalization
the whole UI exists only in English
A few words about words
multilingual
ontology
en:disease / fr:maladie
... en:cancer / fr:cancer
en:spindle cell sarcoma / fr:sarcome à cellules fusiformes
en:melanoma / fr:mélanome
language specific ontologies (monolingual)
disease
... cancer
spindle cell sarcoma, melanoma
maladie
... cancer
sarcome à cellules fusiformes, mélanome
Ontology language &
translation
Natural language = the language (French, English, Spanish,
etc.) used when building a language specific ontology
Format language = used to describe the ontology (OWL,
RDFS, RRF, etc.)
Translation = relation between two language specific
ontologies that represent mainly the same object (domain,
topics, set of concepts and relations)
Multilingual mappings
Mapping (or alignment) = a correspondence between concepts in different ontologies
Multilingual mapping = a concept mapping between 2 language specific ontologies
Multilingual translation mapping = additionally the 2 concerned language specific ontologies are a translation of one another
For instance,
MeSH/melanoma has a mapping to DOID/melanoma
MeSH-fr/mélanome has a multilingual mapping to DOID/melanoma
MeSH/melanoma has a multilingual translation mapping to MeSH-fr/mélanome
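The three definitions above can be sketched as a small classification function. This is only an illustration, not BioPortal code: the `Ontology` and `Mapping` classes, the acronyms, and the `translation_of` field are hypothetical names, and the sketch assumes that a multilingual mapping connects ontologies in two different natural languages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ontology:
    acronym: str                          # hypothetical acronym, e.g. "MESH-FR"
    language: str                         # natural language of the ontology
    translation_of: Optional[str] = None  # acronym of the ontology it translates


@dataclass(frozen=True)
class Mapping:
    source: Ontology
    source_term: str
    target: Ontology
    target_term: str


def are_translations(a: Ontology, b: Ontology) -> bool:
    """True when one ontology is declared a translation of the other."""
    return a.translation_of == b.acronym or b.translation_of == a.acronym


def mapping_kind(m: Mapping) -> str:
    """Classify a mapping following the three definitions above."""
    if m.source.language == m.target.language:
        return "mapping"
    if are_translations(m.source, m.target):
        return "multilingual translation mapping"
    return "multilingual mapping"


mesh = Ontology("MESH", "en")
mesh_fr = Ontology("MESH-FR", "fr", translation_of="MESH")
doid = Ontology("DOID", "en")

for m in (
    Mapping(mesh, "melanoma", doid, "melanoma"),
    Mapping(mesh_fr, "mélanome", doid, "melanoma"),
    Mapping(mesh, "melanoma", mesh_fr, "mélanome"),
):
    print(m.source.acronym, "->", m.target.acronym, ":", mapping_kind(m))
```

The translation check runs after the language check, so a multilingual translation mapping is always also a multilingual mapping.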
What is being multilingual?
Interface internationalization = displaying static elements of
the user interface (e.g., menu names, help, etc.) in
different languages
Content internationalization = displaying BioPortal content
(e.g., ontology labels, mappings, etc.) in different languages
Multilingual = internationalization (display) + enabling complete use of
the functionalities and services of BioPortal for multilingual or
monolingual ontologies, completely and properly addressed
(languages, translations, multilingual mappings, etc.)
rich semantic description
A few propositions for
multilingual BioPortal
Representation of natural
language property for an ontology
Reuse OMV (http://omv2.sourceforge.net), which is already
imported and used in the BioPortal Metadata ontology
(http://bioportal.bioontology.org/ontologies/BP-METADATA)
omv:naturalLanguage
Representation of the distinction
between ontologies
Extend OMV within BioPortal Metadata to include and
formalize the distinction
meta:MultilingualOntology
rdfs:subClassOf omv:Ontology
omv:naturalLanguage some Literal
meta:LanguageSpecificOntology
rdfs:subClassOf omv:Ontology
omv:naturalLanguage exactly 1 Literal
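A minimal sketch of how this distinction could be applied to an ontology's metadata. The function name is hypothetical, and the rule used here is an assumption made for illustration: exactly one omv:naturalLanguage value gives a language specific ontology, more than one gives a multilingual ontology, and no value leaves the ontology as a plain omv:Ontology.

```python
from typing import Iterable


def ontology_class(natural_languages: Iterable[str]) -> str:
    """Pick the most specific class for a list of omv:naturalLanguage values."""
    # Normalise the values so that "EN" and "en " count as the same language.
    langs = {lang.strip().lower() for lang in natural_languages if lang.strip()}
    if len(langs) == 1:
        return "meta:LanguageSpecificOntology"
    if len(langs) > 1:
        return "meta:MultilingualOntology"
    return "omv:Ontology"  # no language declared: nothing more specific applies


print(ontology_class(["fr"]))
print(ontology_class(["en", "fr"]))
print(ontology_class([]))
```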
Representation of relation
between ontologies
Extend the DOOR ontology (http://kannel.kmi.open.ac.uk)
A translated ontology is a specific evolution of the ontology with
a different syntax (an equivalent ontology but in another
language)
new property in BioPortal metadata
meta:isTranslationOf
Representation of
multilingual mappings
Keep a single, simple model, like the one BioPortal already
provides to represent any mapping
multilingual mappings are represented as any other mapping, but with a specific (non-exclusive) relation
Reuse standard properties to represent translations
the LEMON translation module (direct|cultural|lexicalEquivalent)
the GOLD ontology (free|literalTranslation)
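One way to read "a specific relation (non exclusive)" is a mapping record that carries a set of relations, some of which may be translation relations. The sketch below is an assumption, not BioPortal's mapping model: the `TaggedMapping` class, the `lemon-tr:` prefix, and the use of skos:exactMatch next to a translation relation are placeholders for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Relation names come from the bullets above; the "lemon-tr:" prefix is a placeholder.
TRANSLATION_RELATIONS = frozenset({
    "lemon-tr:directEquivalent",
    "lemon-tr:culturalEquivalent",
    "lemon-tr:lexicalEquivalent",
    "gold:freeTranslation",
    "gold:literalTranslation",
})


@dataclass(frozen=True)
class TaggedMapping:
    source: str                # hypothetical identifier, e.g. "MESH-FR/mélanome"
    target: str
    relations: FrozenSet[str]  # non-exclusive: several relations may hold at once


def translation_relations(mapping: TaggedMapping) -> List[str]:
    """Return the translation relations carried by a mapping, sorted by name."""
    return sorted(mapping.relations & TRANSLATION_RELATIONS)


m = TaggedMapping(
    "MESH-FR/mélanome",
    "MESH/melanoma",
    frozenset({"skos:exactMatch", "gold:literalTranslation"}),
)
print(translation_relations(m))
```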
[Figure: the English hierarchy (disease, ... cancer, spindle cell sarcoma, melanoma) and the French hierarchy (maladie, ... cancer, sarcome à cellules fusiformes, mélanome), linked by gold:freeTranslation and gold:literalTranslation mappings]
Reconciliation of multilingual
mappings
Methods to extract multilingual (translation) mappings
between (translated) ontologies and then reconcile them
into the BioPortal mapping repository
Approaches
Via term code when they are the same
Extraction from a meta-thesaurus such as UMLS
Extraction from external mapping databases e.g. CISMEF
Using existing monolingual mappings
Using language parallel data resources
Etc.
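The first approach, via shared term codes, can be sketched as follows. The function name, the term codes, and the dictionaries are hypothetical illustration data; real ontologies would first need to be parsed into such code-to-label tables.

```python
from typing import Dict, List, Tuple


def mappings_by_code(
    source_terms: Dict[str, str], target_terms: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Pair the labels of two ontologies whose terms share the same code."""
    shared = sorted(source_terms.keys() & target_terms.keys())
    return [(code, source_terms[code], target_terms[code]) for code in shared]


# Hypothetical codes, for illustration only.
terms_en = {"C001": "melanoma", "C002": "spindle cell sarcoma", "C003": "cancer"}
terms_fr = {"C001": "mélanome", "C002": "sarcome à cellules fusiformes"}

for code, en_label, fr_label in mappings_by_code(terms_en, terms_fr):
    print(code, en_label, "<->", fr_label)
```

Terms without a counterpart (here the code C003) are simply left unmapped, which is where the other approaches in the list could take over.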
Overall representation of
multilingual content
A few elements of
discussion
Important for the Web of
tomorrow
Multilingualism is an important issue in the explosion of data
being released and linked over the Web today
The vision of the semantic web is to be able to leverage and
interoperate data whatever natural language these data are
available in
Making ontology repositories multilingual, and thus making
the ontologies inside them multilingual
Language reflects cultural difference
An ontology corresponds to an interpretation of a certain
reality done by a group of people at a certain time
Language => cultural differences => conceptual differences
When the sociological and cultural differences are important,
the effect on the knowledge formalized is also important
traitement de données / data process
transfert de données / data transfer
téléchargement / upload, download, sideload
What is the challenge?
Multilingual translational discoveries
Potential discoveries that would become possible by crossing
large amounts of (clinical) data about populations of different
ethnic and continental origins, currently expressed in and
limited to a single natural language
e.g. multilingual crossing of genotype-phenotype distinction
studies to help better understand the role of the
environment in gene expression
Remaining open questions
How to deal with partially multilingual ontologies?
How to deal with more than one-to-one mappings?
download/upload vs. télécharger
Formalize entailment of these new classes and properties
e.g., a multilingual translation mapping is a multilingual mapping
connecting 2 ontologies that are translations of one another
Make the BioPortal ontology parser deal with lexical enrichment
vocabularies
SKOS-XL, LIR, LexInfo, Lexvo, Lingvoj => LEMON
LEMON translation module (Jan 2014)
Conclusions
The multilingual Semantic Web is crucial
Propositions to manage multilingualism in an ontology
repository such as BioPortal
Deal with monolingual ontologies and translation mappings
Deal with multilingual ontologies (from xml:lang to LEMON)
Within the SIFR project, we are implementing and testing those
propositions in a local instance of BioPortal deployed at
LIRMM
Thank you.
Any questions?