Copyright 2008 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

From OntoSelect to OntoSelect-SWSE

Paul Buitelaar, Andreas Harth
DERI – NUIG, Galway

February 2009

Outline

OntoSelect @ DFKI
– Recap of OntoSelect Functionality

OntoSelect @ DERI
– SWSE: Semantic Web Search Engine
– Architecture
– OntoSelect-SWSE

OntoSelect-SWSE Experiments
– Ranked List of Ontologies with Rich Metadata

OntoSelect

Ontology Library and Ontology Search Service

http://olp.dfki.de/OntoSelect

OntoSelect monitors the web for ontologies (indexing/updates)
Ontology browse and search (by keyword, topic, document)
Class, property and (multilingual) label browse and search
Ontology publishing (submit your ontology)
Statistics on
– Formats
– Human languages
– Frequently used labels
– Ontology publishing

Browse Ontologies

Ontology Search

Expand Search with Wikipedia Topic

Keyword Extraction & Ranked Ontologies

OntoSelect Statistics - Multilinguality

Languages: English, German, French, Spanish, Portuguese, Hungarian, Italian, Polish, Dutch, Russian, Japanese

Distribution of languages in 136 ontologies with multilingual labels - out of 1530 ontologies currently collected (~9%)

OntoSelect Statistics - Labels

Most frequently used labels (‘words’, ‘terms’) in 1530 ontologies

SWSE: Semantic Web Search Engine

SWSE Architecture

Distributed shared-nothing architecture
Implementation scales to billions of triples

[Architecture diagram]
Components: Crawler, Reasoner, Indexer, Query Processor, Ranker, User Interface
Labels: HTTP, Linked Data, Raw Data, Materialised Data, Complete Index, Ranks
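
The slide names a distributed shared-nothing architecture but does not say how data is split across machines. As a rough illustration only, the sketch below assumes triples are hash-partitioned by subject so that each node builds and queries its own index without shared state. The function names (`partition_for`, `build_local_indexes`, `lookup`) and the partitioning scheme are hypothetical and are not taken from SWSE.

```python
import zlib
from collections import defaultdict


def partition_for(subject, num_nodes):
    """Map a subject URI to a node number using a hash that is stable across runs."""
    return zlib.crc32(subject.encode("utf-8")) % num_nodes


def build_local_indexes(triples, num_nodes):
    """Give every node its own subject index; no state is shared between nodes."""
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for s, p, o in triples:
        nodes[partition_for(s, num_nodes)][s].append((p, o))
    return nodes


def lookup(nodes, subject):
    """A lookup by subject touches exactly one node."""
    return nodes[partition_for(subject, len(nodes))].get(subject, [])
```

Because each subject lives on exactly one node, capacity can in principle grow by adding nodes, which is one plausible reading of "scales to billions of triples".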

OntoSelect-SWSE

OntoSelect-SWSE Experiments

Experiment Onto
– Seed set from Google & Yahoo (*.owl, *.daml, *.rdfs)
– 27,519 data sources (ontologies)
– 6.5m statements

Experiment Web
– Crawling six degrees from seed URI http://www.w3.org/People/Berners-Lee/card
– 100,555 data sources (instance data + ontologies)
– 11.9m statements
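
The "six degrees from seed URI" crawl can be pictured as a breadth-first traversal with a hop limit. The sketch below assumes that "six degrees" means six link hops and takes a caller-supplied `get_links` function (hypothetical) that returns the URIs a document links to. It is an illustration of the idea, not the SWSE crawler.

```python
from collections import deque


def crawl(seed, get_links, max_depth=6):
    """Breadth-first crawl from seed; returns {uri: hop distance from seed}."""
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        uri = queue.popleft()
        if depth[uri] == max_depth:
            continue  # do not follow links beyond the hop limit
        for target in get_links(uri):
            if target not in depth:  # first visit is the shortest path in BFS
                depth[target] = depth[uri] + 1
                queue.append(target)
    return depth
```

In a real crawl, `get_links` would dereference the URI over HTTP and extract the URIs mentioned in the returned RDF. Here it is left abstract so the traversal logic can be tried on an in-memory link table.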

Ranked List of Ontologies - Onto

Ranked List of Ontologies - Web

Conclusion

SWSE framework facilitates web data experiments

Taking into account real-world instance usage improves ontology ranking

Web data is noisy (more data providers == more noise)

Ontology registry should include rating facility to “vote out” data sources that do not provide consensus view
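
The slides do not spell out the ranking method. As a simple stand-in for "taking into account real-world instance usage", the sketch below ranks ontology namespaces by the number of distinct data sources that use at least one of their terms. The helper names and the namespace-splitting rule (cut at the last `#` or `/`) are assumptions made for illustration.

```python
from collections import defaultdict


def namespace_of(term_uri):
    """Return the part of a term URI up to and including the last '#' or '/'."""
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    return term_uri[:cut + 1]


def rank_by_usage(usages):
    """Rank namespaces by how many distinct data sources use their terms.

    usages: iterable of (data_source, term_uri) pairs.
    Returns a list of (namespace, source_count), most used first.
    """
    sources = defaultdict(set)
    for source, term in usages:
        sources[namespace_of(term)].add(source)
    counts = [(ns, len(srcs)) for ns, srcs in sources.items()]
    # Sort by descending count, then by namespace for a stable order.
    return sorted(counts, key=lambda item: (-item[1], item[0]))
```

Counting distinct sources rather than raw statements keeps a single large data provider from dominating the ranking, which matters given the noise noted above.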