Implementing Semantic Web applications: reference architecture and challenges

Post on 28-Jan-2015


Best paper award at the Workshop on Semantic Web Enabled Software Engineering 2009, held at the International Semantic Web Conference 2009. Full paper at: http://ceur-ws.org/Vol-524/swese2009_2.pdf

Summary of the slides and the paper:
* an empirical analysis of 98 Semantic Web applications based on an architectural analysis and an application functionality questionnaire
* a reference architecture for Semantic Web applications
* the main challenges of implementing Semantic Web technologies and their effect on an example application
* approaches for mitigating the challenges

Transcript of Implementing Semantic Web applications: reference architecture and challenges

Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Implementing Semantic Web applications: reference architecture and challenges

Benjamin Heitmann, Sheila Kinsella, Conor Hayes, and Stefan Decker

Workshop on Semantic Web Enabled Software Engineering 2009


Benjamin.Heitmann@deri.org slide of 14

Introduction

Focus of Semantic Web research until now: benefits of Semantic Web technology

Less research on: costs, effort, challenges of Semantic Web technology

Result:
* estimating cost/benefit offset for Semantic Web technologies is difficult
* obstacle for uptake of Semantic Web technologies by real-world projects

Our contributions: identify main challenges and outline Software Engineering solutions


Overview


* Empirical analysis of 98 Semantic Web applications: architectural analysis + app functionality questionnaire
* Reference architecture for Semantic Web applications
* Main challenges of implementing Semantic Web technologies and their effect on an example application
* Approaches for mitigating the challenges


Empirical analysis - Architectural

* Goal: identify common functionality
* Result: components, allow comparison between apps
* 98 papers about apps from SemWeb challenge 2003-2008 & Scripting for SemWeb challenge 2006-2008


Reference Architecture for Semantic Web applications

* Empirical basis: architectural analysis
* provides standard decomposition criteria
* allows comparing of functionality


Empirical analysis - Functionality

* Goal: characterise capabilities of components
* Result: statistics about the range of variations for each component
* Results for 37 apps validated by authors
* Survey covers 27 properties in 7 areas of functionality


Empirical analysis - Functionality

Functionality variations (examples)

* Data Interface: data sources used (external/decentralised/evolving?)
* Persistent Storage: Semantic Web standards supported (e.g. RDF, OWL, SPARQL?)
* User Interface: generic/domain specific
* Data Integration: manual/automatic
* Search Service: structured/unstructured data
* Authoring: read-only/edit/create new data
* Crawling: one-time/continuous


Implementation challenges (1)

1. Integrating noisy and heterogeneous data

* integration service is very common (72%)
* expensive: 80% require manual intervention
* 76% allow updating data after initial integration
* Reasons:
  * use of non-standard terms
  * incorrect usage of vocabularies
  * multiple URIs for the same objects and incorrect merging
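One of the reasons named on this slide, multiple URIs for the same object, can be illustrated with a minimal sketch. It assumes that equivalences between URIs are already known (for example from owl:sameAs statements) and are given as plain pairs; the function names and the triple-as-tuple representation are hypothetical and not taken from the paper or from any surveyed application.

```python
def canonical_uris(same_as_pairs):
    """Map every URI to one representative of its equivalence class (union-find)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps lookups short
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI so the result is deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep
    return {uri: find(uri) for uri in list(parent)}


def merge_triples(triples, same_as_pairs):
    """Rewrite subjects and objects to their canonical URI and drop duplicates."""
    canon = canonical_uris(same_as_pairs)
    return {(canon.get(s, s), p, canon.get(o, o)) for s, p, o in triples}
```

Even in this simplified form, the outcome depends entirely on the quality of the equivalence pairs: one wrong pair merges two unrelated objects, which is the "incorrect merging" problem the slide refers to.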


Implementation challenges (2)

2. Missing or belated conventions and standards

* 70% allow access or importing of external data
* 60% can export data or are reusable as source
* only 1/3 allow creation of new data
* Reason: standards are just emerging:
  * Linked Data principles: 2006, ~8 years after RDF (1999)
  * RDFa for embedding RDF in HTML: finalised 2008
  * GRDDL for converting (X)HTML to RDF: finalised 2007
  * SPARQL update: not finalised
  * RDF forms and RDF pushback: not finalised


Implementation challenges (3)

3. Mismatch of data models and APIs between components:

* components have different data models (majority): object oriented (92%), relational database, graph based
* slow, non-native APIs between components

4. Distribution of application logic across multiple components

* Logic included not just in code but queries, rules, formal vocabularies
* 58% using inferencing, 24% using queries

Result of 3+4: higher maintenance costs, performance loss due to non-native API overhead
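The data model mismatch in challenge 3 can be made concrete with a small sketch that loads graph data (triples) into an object-oriented model. The `Post` class, the `load_posts` function and the prefixed predicate strings are illustrative assumptions; only dc:title and sioc:has_reply are real vocabulary terms, and no particular RDF library is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), a simplified stand-in for RDF


@dataclass
class Post:
    """Object-oriented view of a post: a fixed set of attributes."""
    uri: str
    title: str = ""
    reply_uris: List[str] = field(default_factory=list)


def load_posts(triples: Iterable[Triple]):
    """Map triples onto Post objects; statements without an attribute are set aside."""
    posts: Dict[str, Post] = {}
    unmapped: List[Triple] = []
    for subject, predicate, obj in triples:
        if predicate == "dc:title":
            posts.setdefault(subject, Post(uri=subject)).title = obj
        elif predicate == "sioc:has_reply":
            posts.setdefault(subject, Post(uri=subject)).reply_uris.append(obj)
        else:
            # The graph can carry arbitrary predicates; the class cannot.
            unmapped.append((subject, predicate, obj))
    return posts, unmapped
```

Every predicate needs a hand-written mapping, and anything the class does not anticipate ends up in `unmapped`. This mapping code is also a place where application logic accumulates outside the primary application, which relates to challenge 4.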


Example Application: SIOC explorer

* 1 - Integration: all data is RDF+SIOC, still 2 integration steps required
* 2 - Unclear best practices: every SIOC exporter requires different crawling
* 3 - Mismatched data models: graph/relational/OO; mismatched APIs: Ruby <-> Java, SPARQL (slow)
* 4 - Distributed app logic: crawler, integration, primary app logic


Mitigating the challenges (1)

1. Delegating generic functionality to external providers

* 72% implement integration, 3 components required
* Delegating generic integration simplifies architecture
* Drawback: application specific integration may still be necessary


Mitigating the challenges (2)

2. Assembling applications from components:

* most apps in survey created on case-by-case basis:
  * multiple libraries
  * multiple programming languages
  * mismatch of native APIs
  * distributed application logic
* provide frameworks / software factories to assemble and customise complete applications:
  * provide generic data integration
  * implement best practices and guidelines
  * centralise application logic
  * allow app specific customisation
* inspiration: Ruby on Rails, CakePHP (PHP), Django (Python), Struts (Java)
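As a rough illustration of what assembling an application from components could look like, the sketch below wires generic and app-specific steps into one pipeline that lives in a single place. The `Application` class and both step functions are hypothetical; they are not part of any framework named on the slide, and a real framework would cover far more (storage, crawling, user interface).

```python
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Step = Callable[[List[Triple]], List[Triple]]


class Application:
    """Holds the whole processing chain, so application logic stays in one place."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, Step]] = []

    def add(self, name: str, step: Step) -> "Application":
        self.steps.append((name, step))
        return self  # allow chained calls

    def run(self, triples: Iterable[Triple]) -> List[Triple]:
        data = list(triples)
        for _name, step in self.steps:
            data = step(data)
        return data


def deduplicate(triples: List[Triple]) -> List[Triple]:
    """Generic step a framework could ship: drop repeated statements, keep order."""
    return list(dict.fromkeys(triples))


def keep_sioc_statements(triples: List[Triple]) -> List[Triple]:
    """App-specific customisation: keep only statements using SIOC predicates."""
    return [t for t in triples if t[1].startswith("sioc:")]
```

A usage example under the same assumptions: `Application().add("dedupe", deduplicate).add("sioc only", keep_sioc_statements).run(triples)` applies the generic step first and the customisation second, all in one language and one API.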


Summary

* main challenges of implementing SemWeb tech:
  * cost of integrating noisy or heterogeneous data (non-RDF and RDF data)
  * missing or belated standards and conventions
  * mismatch of data models and APIs between components
  * distribution of application logic across components
* approaches to mitigate the challenges:
  * delegate generic functionality to external services
  * support assembly of complete applications with frameworks
* empirical foundation: analysis of 98 Semantic Web applications