Programming the Semantic Web

42
Steffen Staab [email protected] 1 WeST Web Science & Technologies University of Koblenz ▪ Landau, Germany Programming the Semantic Web Steffen Staab & Team@WeST

description

Keynote at ESWC 2014, Crete, May 27, 2014

Transcript of Programming the Semantic Web

Page 1: Programming the Semantic Web

Steffen [email protected]

1WeST

Web Science & Technologies

University of Koblenz ▪ Landau, Germany

Programming the Semantic Web

Steffen Staab & Team@WeST

Page 2: Programming the Semantic Web

Steffen [email protected]

2WeST

This presentation contains program code, engineering, speculation and worse.It may be viewed as offensive by some viewers.

Page 3: Programming the Semantic Web

Steffen [email protected]

3WeST

Semantic Web: Great History, Still Long Way to Go!

Big Systems

Practice

Research influence

Wanted: mainstream adoption!

Target: ProgrammingTarget: Programming

Page 4: Programming the Semantic Web

Steffen [email protected]

4WeST

SEMANTIC WEB PROGRAMMING: BIRD‘S EYES VIEW

Page 5: Programming the Semantic Web

Steffen [email protected]

5WeST

Semantic Web Programming

Linked Data

SPARQL endpoint

RDF fileHypothesis:Semantic Web data is wonderful, but programming with Semantic Web has changed too little since 2000 and still is a mess.

We promise flexibility, but code

still hard to maintain.

Ontology

Page 6: Programming the Semantic Web

Steffen [email protected]

6WeST

Semantic Web Programming

Linked Data

SPARQL endpoint

RDF file

„Inside“Data Mgmt

Eclipse

Visual Studio

....

Ontology

Page 7: Programming the Semantic Web

Steffen [email protected]

7WeST

Semantic Web Impedance Mismatch

„Inside“http getDereferencingQuery executionUpdating IndexingTriplification IntegrationStreaming

„Inside“Data Mgmt

Eclipse

Visual Studio

....

Mind the

Gap!

Page 8: Programming the Semantic Web

Steffen [email protected]

8WeST

Semantic Web

Linked Data

SPARQL endpoint

RDF file

„Inside“Data Mgmt

„Outside“

Data Mgmt

Eclipse

Visual Studio

....

Ontology

Page 9: Programming the Semantic Web

Steffen [email protected]

9WeST

Semantic Web

„Inside“Data Mgmt

„Outside“

Data Mgmt

Eclipse

Visual Studio

....

„Outside“ Understanding

by the developer „Search + Code“ „Browse + Code“

Triple-object mapping Code generation Process of federation

LD vs endpoint Models of federation

Abstraction layers that facilitate a developer‘s life

Page 10: Programming the Semantic Web

Steffen [email protected]

10WeST

Programming with Data: What does it cost?

Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode

Ctool : Costs for building t many tools, shared; almost free

Clearn: Costs for learning how to use technology per developer dCdeu: Costs for data engineering/understanding s sources

Cmap : Costs for mapping data structure for s sources to objects

Ccode : Actual costs for accessing/manipulating data n times

„Inside“Both „Outside“

Page 11: Programming the Semantic Web

Steffen [email protected]

11WeST

SWOT of Semantic Web Programming

Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode

threat

opportunity

opportunity

weakness

Not a strength, yet!

strength/weakness

Strong in flexibilitySomewhat weak in

performance

„Outside“

as good as RelDB is not good enough!

Page 12: Programming the Semantic Web

Steffen [email protected]

12WeST

Target cost scenario 1: Automated setup

Ctotal

n (as in n*Ccode )

SemWeb coding efforts

RelDB/XML coding efforts

Applies to small nD

iff.

cost

s fo

r le

arni

ngS

etup

co

st

0

Page 13: Programming the Semantic Web

Steffen [email protected]

13WeST

Target cost scenario 2: Manually perfected setup

Ctotal

n (as in n*Ccode )

SemWeb coding efforts

RelDB/XML coding efforts

„Perfect“ object model shields

developer from database

ideosyncracies

Set

up c

osts

0

Page 14: Programming the Semantic Web

Steffen [email protected]

14WeST

Intermediate conclusion

Minimize costs for setup:

Clearn: Costs for learning how to use technology per developer dCde: Costs for data engineering/understanding s sources

Cmap : Costs for mapping data structure for s sources to objects

Minimize costs for core programming:

Ccode : Actual costs for accessing/manipulating data n times

Costs for learning and understanding constitute a threat!

Need to be overcome!

Page 15: Programming the Semantic Web

Steffen [email protected]

15WeST

SEMANTIC WEB PROGRAMMING: IDENTIFYING SOME PITFALLS

Page 16: Programming the Semantic Web

Steffen [email protected]

16WeST

Example scenario: Jamendo

Data about license free music ~ 1 Million triples classes and predicates

from 18 different ontologies FOAF, Tag ontology,

music ontology, …

Simple programming task: List for every music artist,

all the records they made

Page 17: Programming the Semantic Web

Steffen [email protected]

17WeST

Software Development Process Overview

data model design

revised data model design

data model prototype

data queries

final data model

Creation of initial data

model

Exploration of the data

source

Creation of model in

code

Query design / implementation

Mapping of query results

Page 18: Programming the Semantic Web

Steffen [email protected]

18WeST

Accessing Artists Using Apache Jena

Page 19: Programming the Semantic Web

Steffen [email protected]

19WeST

From artists to songs

Observations SPARQL queries are strings Results are strings Requires good understanding of the data source

RDF Typing is lost

Page 20: Programming the Semantic Web

Steffen [email protected]

20WeST

Related Work on RDF Access

Static Typing Errors detected before

execution Misspelling discovered

by compiler! Anectode: 2nd place

because of misspelt var.

Static types are form of documentation Less knowledge about

data source required

Better IDE integration / autocompletion

Code generation Sommer Winter OntoMDE

Dynamic Typing E.g. ActiveRDF

(Oren et al 2007)) “convention over

configuration”

dynamic metaprogramming allows for slick code

Criticism

Page 21: Programming the Semantic Web

Steffen [email protected]

21WeST

SEMANTIC WEB PROGRAMMING:

LITEQ – OUR OUTSIDE APPROACH

Page 22: Programming the Semantic Web

Steffen [email protected]

22WeST

Programming against Jamendo

c1

Page 23: Programming the Semantic Web

Steffen [email protected]

23WeST

Node Path Query Language Using Autocompletion

Exploration of classes

Exploration of relations

Querying for instances

Page 24: Programming the Semantic Web

Steffen [email protected]

24WeST

Node Path Query Language Using Autocompletion

Exploration of classes

Exploration of relations

Page 25: Programming the Semantic Web

Steffen [email protected]

25WeST

Node Path Query Language: Query Formulation

Exploration of classes

Exploration of relations

Querying for instances

Type set of mo:MusicArtist

No definition or declaration needed

Page 26: Programming the Semantic Web

Steffen [email protected]

26WeST

Node Path Query Language for Code Development

Exploration of classes

Exploration of relations

Querying for instances

Developing code with queries

All translated into SPARQL queries at• Development time• Type inference at compile time

(but also as part of IDE)• Querying again at run time

One language to bind them all

Page 27: Programming the Semantic Web

Steffen [email protected]

27WeST

Node Path Query Language for Code Development

Exploration of classes

Exploration of relations

Querying for instances

Developing code with queries

Developing code with new classes

All translated into SPARQL queries at• Development time• Run time update• Persistence!

Page 28: Programming the Semantic Web

Steffen [email protected]

28WeST

NPQL

NPQL (Node Path Query Language) Intensional Queries

Describing RDF classes and properties for reuse in IDE and in host language metaprogramming

Extensional Queries Class instances and property instances

Compilation to SPARQL for reuse of existing endpoints

Ongoing discussion about details of NPQL

Page 29: Programming the Semantic Web

Steffen [email protected]

29WeST

LITEQ

NPQL (Node Path Query Language) Intensional Queries Extensional Queries Compilation to SPARQL

LITEQ (Language Integrated Types, Extensions and Queries) Implementation of NPQL as F# Type Provider in Visual Studio Autocompletion using NPQL queries Automatic typing

of extensional query resultsby intensional queries

Page 30: Programming the Semantic Web

Steffen [email protected]

30WeST

Cost savings

Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode

Ctool : open source

Clearn: not free – though autocompletion reduces cognitive load

Cdeu: not free – understanding the RDF schema from your IDE

Cmap : 0

Ccode : a lot less than for dotNet RDF (Apache Jena?!!)little bit more than for a fictitious perfect object model

Page 31: Programming the Semantic Web

Steffen [email protected]

31WeST

Halstead metrics for different tasks:

Conventional Semantic Web programming approaches waste up to 50% of your efforts!

Page 32: Programming the Semantic Web

Steffen [email protected]

32WeST

Speculation 1

Ctotal

n (as in n*Ccode )

SemWeb coding efforts

RelDB/XML coding efforts

Applies to small n

Diff

. co

sts

for

lear

ning

Speculation 1: Using ontologies and RDF schemata, we can develop more efficiently

using the right tools!

Page 33: Programming the Semantic Web

Steffen [email protected]

33WeST

Halstead metrics for different tasks:

If someone gives me a perfect RDF-to-OO mapping for freethen I will not care about whether it is RDF or RelDB

underneath!

Page 34: Programming the Semantic Web

Steffen [email protected]

34WeST

Speculation 2

Ctotal

n (as in n*Ccode )

SemWeb coding efforts

RelDB/XML coding efforts

„Perfect“ object model shields

developer from database

ideosyncracies

Diff

. co

sts

for

setu

pSpeculation 2: For large programmes, our tools need to offer better support to reduce

setup costs!

Page 35: Programming the Semantic Web

Steffen [email protected]

35WeST

Issues

No schema many LOD sources do not have schema

Page 36: Programming the Semantic Web

Steffen [email protected]

36WeST

Issues

No schema many LOD sources do not have schema

Useless schema

New kinds of (?) schema induction!

Page 37: Programming the Semantic Web

Steffen [email protected]

37WeST

Issues

No schema many LOD sources do not have schema

Useless schema Property-based schemata

Cf. Homoceanu et al ISWC 2013• ask for all movies – not possible by asking for class movie,

but only for different property combinations

Page 38: Programming the Semantic Web

Steffen [email protected]

38WeST

Outlook wrt LITEQ

Integration of schema induction

Programming the Semantic Web as a whole Federated queries link traversal-based queries

More expressive query language Composed data types in tractable DL

More precise integrated type inference (composed) types from RDF data source type inference in host language

Page 39: Programming the Semantic Web

Steffen [email protected]

39WeST

CONCLUSION

Page 40: Programming the Semantic Web

Steffen [email protected]

40WeST

Semantic Web: Make Developers More Productive

Linked Data

SPARQL endpoint

RDF file

„Inside“Data Mgmt

„Outside“

Data Mgmt

Eclipse

Visual Studio

....

Ontology

Page 41: Programming the Semantic Web

Steffen [email protected]

41WeST

Semantic Web

Social Web & Web Retrieval

Interactive Web & Human Computing

Web & Economy

Software & Services

Web Science & Technologies Team & Research

Computational Social Science

Thank You!

Ralf Lämmel

EvelyneViegas

Page 42: Programming the Semantic Web

Steffen [email protected]

42WeST

References

S. Scheglmann, A. Scherp, S. Staab. Declarative Representation of Programming Access to Ontologies. In: 9th Extended Semantic Web Conference (ESWC2012), Heraklion, Greece, May 27-31, 2012.

Daniel Oberle. How Ontologies Benefit Enterprise Applications. Semantic Web Journal. http://www.semantic-web-journal.net/content/how-ontologies-benefit-enterprise-applications-0

S. Homoceanu, P. Wille, W.-T. Balke. ProSWIP: Property-Based Data Access for Semantic Web Interactive Programming. In: ISWC-2013

C. Saathoff, S. Scheglmann, S. Schenk. Winter: Mapping RDF to POJOs revisited.

E. Oren, R. Delbru, S. Gerke, A. Haller, S. Decker: ActiveRDF: object-oriented semantic web programming. WWW 2007: 817-824

S. Scheglmann, S. Staab, M. Thimm, G. Gröner. Locking for Concurrent Transactions on Ontologies. In: 10th Extended Semantic Web Conference (ESWC2013), Montpellier, France, May 26-30, 2013.

M. Konrath, T. Gottron, S. Staab, A. Scherp. SchemEX – Efficient Construction of a Data Catalogue by Stream-based Indexing of Linked Data. In: Journal of Web Semantics. Special issue on Semantic Web Challenge Winners 2011. Elsevier, Volume 16, November 2012, pp. 52-58.

W. Cook, A. Ibrahim. Integrating Programming Languages & Databases: What’s the Problem?? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.66.7169&rep=rep1&type=pdf