TripleWave: a step towards RDF Stream Processing on the Web

Post on 16-Apr-2017


Department of Informatics

TripleWave: a step towards RDF Stream Processing on the Web

Daniele Dell’Aglio

dellaglio@ifi.uzh.ch http://dellaglio.org @dandellaglio

Galway, 16.12.2016

An (incomplete) overview of the RSP/SR research

IMaRS

STARQL

DynamiTE

TROWL

[Figure: map of Europe, from Wikipedia]

StreamRule

StarCITY

SPARKWAVE

LARS

Key

Daniele Dell'Aglio - TripleWave 2/28

Connecting RSPs on the Web

MorphStreams

CSPARQL

Etalis

TrOWLStream

Rule

CQELS

How far are we?

• Working prototypes/systems

• Formal models for RSPs and reasoning

• Minimal agreements: standards, serialization, interfaces


Looking for minimal agreements: the RSP Community Group

Research work

Many Papers

PhD Theses

Datasets

Prototypes

Benchmarks

RDF Streams

Stream Reasoning

Complex Event Processing

Stream Query Processing

Stream Compression

Semantic Sensor Web

Many topics

Tons of work

W3C RSP Community Group

http://www.w3.org/community/rsp

An effort to discuss, standardize, combine, formalize, and evangelize our work on RDF stream processing


W3C RSP Documents

https://www.w3.org/community/rsp/

http://streamreasoning.github.io/RSP-QL/RSP_Requirements_Design_Document/

Use cases

Implementations

State of the Art

Challenges & issues

Requirements

Design principles

• RDF Stream model

• RDF Stream query language

Abstract syntax RDF Streams


But...

W3C RSP set some foundations and requirements, but:

• Standard protocols and exchange mechanisms for RDF streams are missing.

• We need generic and flexible solutions for making RDF streams available and exchangeable on the Web.


TripleWave

TripleWave is an open-source framework for creating and publishing

RDF streams over the Web.

[Diagram: input? → TripleWave (how?) → RDF Stream (what is it?)]

TripleWave’s RDF streams

TripleWave should exploit and be compatible with existing standards and recommendations

• The data model should be compatible with the abstract model defined by the W3C RSP CG

• The output format should be compatible with RDF


TripleWave serialization format

In TripleWave, an RDF stream is an (infinite) ordered sequence of time-annotated data items (RDF graphs)…

... serialized in JSON-LD:

[
  {
    "@id": "http://.../G1",
    "generatedAt": "2016-12-16T00:01:00",
    "@graph": [
      { "@id": "http://.../a",
        "http://.../isIn": { "@id": "http://.../rRoom" } }
    ]
  },
  {
    "@id": "http://.../G2",
    "generatedAt": "2016-12-16T00:03:00",
    "@graph": [
      { "@id": "http://.../b",
        "http://.../isIn": { "@id": "http://.../bRoom" } }
    ]
  },
  …
]

Stream S (data items placed on the time axis t):

• G1 at t = 1: {:a :isIn :rRoom}

• G2 at t = 3: {:b :isIn :bRoom}

• G3 at t = 5: {:c :talksIn :rRoom, :d :talksIn :bRoom}
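As a rough illustration of the serialization idea on this slide, the following Python sketch reads a small array of time-annotated graphs and turns each one into a timestamp, a graph identifier and a list of triples. It is only a sketch under assumptions: the `http://example.org/` IRIs, the bare `generatedAt` key and the `parse_items` helper are placeholders chosen for the example, not TripleWave's actual output or API.

```python
import json
from datetime import datetime

# Hypothetical sample: two items shaped like the slide's example.
# The example.org IRIs and the bare "generatedAt" key are placeholders.
SAMPLE = """
[
  {"@id": "http://example.org/G2",
   "generatedAt": "2016-12-16T00:03:00",
   "@graph": [{"@id": "http://example.org/b",
               "http://example.org/isIn": {"@id": "http://example.org/bRoom"}}]},
  {"@id": "http://example.org/G1",
   "generatedAt": "2016-12-16T00:01:00",
   "@graph": [{"@id": "http://example.org/a",
               "http://example.org/isIn": {"@id": "http://example.org/rRoom"}}]}
]
"""


def parse_items(text):
    """Return (timestamp, graph_id, triples) tuples sorted by timestamp."""
    items = []
    for item in json.loads(text):
        timestamp = datetime.fromisoformat(item["generatedAt"])
        triples = []
        for node in item["@graph"]:
            subject = node["@id"]
            for predicate, value in node.items():
                if predicate.startswith("@"):
                    continue  # skip JSON-LD keywords such as "@id"
                obj = value["@id"] if isinstance(value, dict) else value
                triples.append((subject, predicate, obj))
        items.append((timestamp, item["@id"], triples))
    # An RDF stream is an ordered sequence, so sort by the time annotation.
    return sorted(items, key=lambda entry: entry[0])


stream = parse_items(SAMPLE)
for timestamp, graph_id, triples in stream:
    print(timestamp.isoformat(), graph_id, triples)
```

The sample is deliberately given out of order, so the sort shows how the time annotation alone is enough to recover the stream order.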

Building TripleWave

[Diagram: input? → TripleWave (how?) → RDF Streams]

Spreading the RDF streams

TripleWave must be able to provide the stream to RDF Stream Processing engines (query processors and reasoners) through the Web.

• HTTP

• HTTP chunked transfer encoding

• WebSockets

• MQTT (upcoming)
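Whatever the transport, a consumer receives the stream in pieces whose boundaries need not match the data items. The Python sketch below shows one way a client could rebuild items from arbitrary chunks. It assumes, purely for illustration, that each item is a JSON document terminated by a newline; the slides do not state TripleWave's actual framing, and `ItemAssembler` is a hypothetical helper.

```python
import json


class ItemAssembler:
    """Rebuilds JSON items from byte chunks that may split an item anywhere.

    Assumption for this sketch: items are newline-delimited JSON documents.
    """

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add a chunk and return the list of items it completes."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest is kept.
        *complete, self._buffer = self._buffer.split(b"\n")
        return [json.loads(line) for line in complete if line.strip()]


assembler = ItemAssembler()
chunks = [
    b'{"@id": "http://exam',
    b'ple.org/G1"}\n{"@id": ',
    b'"http://example.org/G2"}\n',
]
received = []
for chunk in chunks:
    received.extend(assembler.feed(chunk))
print([item["@id"] for item in received])
```

Working on bytes rather than decoded text means a chunk boundary inside a multi-byte character does no harm, since decoding happens only once a full line is available.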


TripleWave Stream Descriptor

TripleWave must provide information about how to access the stream

• TripleWave exposes an RDF description of the RDF stream

• RDF Stream Descriptor (sGraph)

• It contains:

• The identifier of the stream

• Data item samples (see next slide)

• A description of the schema

• The location of the stream endpoint (e.g. WebSocket URL)
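To make the role of the descriptor concrete, here is a small Python sketch of a client that reads a descriptor and works out where and how to connect. The descriptor shown is invented for the example: the `ex:` property names, the example.org URLs and the `stream_access` helper are all assumptions, not the vocabulary TripleWave actually uses.

```python
import json
from urllib.parse import urlparse

# Hypothetical descriptor; every "ex:" property name is a placeholder.
DESCRIPTOR_TEXT = """
{
  "@context": {"ex": "http://example.org/vocab#"},
  "@id": "http://example.org/stream/sGraph",
  "ex:streamLocation": "ws://example.org/stream",
  "ex:schemaLocation": "http://example.org/schema",
  "ex:samples": [{"@id": "http://example.org/G1"},
                 {"@id": "http://example.org/G2"}]
}
"""

# Map URL schemes to the transports listed for spreading the stream.
TRANSPORTS = {"ws": "WebSocket", "wss": "WebSocket",
              "http": "HTTP", "https": "HTTP"}


def stream_access(descriptor, location_key="ex:streamLocation"):
    """Return (endpoint URL, transport name) read from a descriptor."""
    location = descriptor[location_key]
    scheme = urlparse(location).scheme
    return location, TRANSPORTS.get(scheme, "unknown")


descriptor = json.loads(DESCRIPTOR_TEXT)
location, transport = stream_access(descriptor)
print(location, transport)
```

The point of the design is that a consumer needs only the descriptor URL up front; the endpoint and transport are discovered from it.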


TripleWave Stream Descriptor - Example


Building TripleWave

[Diagram: input? → TripleWave → RDF Streams (WebSocket | HTTP-chunk | etc.) and RDF Stream Descriptor]

Feeding TripleWave

TripleWave should support a variety of data sources.

• RDF dumps with temporal information

• RDF with temporal information exposed through SPARQL endpoints

• Streams available on the Web


From RDF to RDF streams

Converts RDF stored in files/SPARQL endpoints

• Containing some time information

… into an RDF stream

• continuous flow of RDF data

• ordered according to the original timestamps

• the time between two items is preserved

Use Cases

• Evaluation, testing and benchmarking

• Simulation systems
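The replay behaviour described on this slide (items ordered by their original timestamps, with the time between two items preserved) can be sketched in a few lines of Python. This is an illustration rather than TripleWave's scheduler: `replay_schedule` and its `speedup` parameter are hypothetical, and the sketch only computes the waiting times instead of actually sleeping.

```python
from datetime import datetime


def replay_schedule(items, speedup=1.0):
    """Yield (delay_seconds, payload) pairs in timestamp order.

    The delay is the wait before emitting the payload, equal to the gap
    from the previous item's original timestamp divided by `speedup`
    (a hypothetical knob; 1.0 keeps the original pace).
    """
    previous = None
    for timestamp, payload in sorted(items, key=lambda item: item[0]):
        if previous is None:
            delay = 0.0  # the first item is emitted immediately
        else:
            delay = (timestamp - previous).total_seconds() / speedup
        previous = timestamp
        yield delay, payload


# Time-annotated items, deliberately out of order.
dataset = [
    (datetime(2016, 12, 16, 0, 5), "G3"),
    (datetime(2016, 12, 16, 0, 1), "G1"),
    (datetime(2016, 12, 16, 0, 3), "G2"),
]
schedule = list(replay_schedule(dataset))
print(schedule)
```

A real replayer would call `time.sleep(delay)` (or an asynchronous equivalent) before pushing each payload to the consumers.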


From Web stream to RDF stream

Consumes an existing Web stream…

• through connectors

… and converts it into an RDF Stream

• Each data item is lifted to RDF

Use Cases

• Querying and reasoning

• Data integration

[Diagram: Web Service (Web Service API) → Connector → TW Core]

From Web stream to RDF stream

Conversion is done through R2RML

• Mappings convert each data item into RDF

Example: map a field

Input:

{
  "userUrl": "foo"
}

Mapping:

rr:predicateObjectMap [
  rr:predicate schema:agent;
  rr:objectMap [ rr:column "userUrl" ] ];

Output:

{
  "https://schema.org/agent": {"@id": "foo"}
}

From Web stream to RDF stream

Conversion is done through R2RML

• Mappings convert each data item into RDF

Example: map a field with a template

Input:

{
  "time": "value"
}

Mapping:

rr:subjectMap [
  rr:template "something {time}" ];

Output:

{
  "@id": "something value"
}

From Web stream to RDF stream

Conversion is done through R2RML

• Mappings convert each data item into RDF

Example: add a new constant field

Mapping:

rr:predicateObjectMap [
  rr:predicate rdf:type;
  rr:objectMap [ rr:constant schema:UpdateAction ] ];

Output:

{
  "http://www.w3.org/1999/02/22-rdf-syntax-ns#type":
    {"@id": "https://schema.org/UpdateAction"}
}
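The three kinds of mapping rule shown in these examples (a field copied from the input, a subject built from a template, and a constant) can be imitated with a short Python sketch. It does not parse R2RML: `MAPPING` is a hypothetical dictionary that mirrors the rules, and the `lift` and `fill_template` helpers, the example.org IRIs and the sample event are all assumptions made for illustration.

```python
import re

# Hypothetical mapping mirroring the three rule kinds; this is not R2RML syntax.
MAPPING = {
    "subject_template": "http://example.org/change/{time}",
    "predicate_objects": [
        {"predicate": "https://schema.org/agent", "column": "userUrl"},
        {"predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
         "constant": "https://schema.org/UpdateAction"},
    ],
}


def fill_template(template, item):
    """Replace each {field} placeholder with the item's value for that field."""
    return re.sub(r"\{(\w+)\}", lambda match: str(item[match.group(1)]), template)


def lift(item, mapping):
    """Turn one JSON data item into a JSON-LD node following the mapping."""
    node = {"@id": fill_template(mapping["subject_template"], item)}
    for rule in mapping["predicate_objects"]:
        if "constant" in rule:
            value = rule["constant"]      # same value for every item
        else:
            value = item[rule["column"]]  # value copied from the input field
        node[rule["predicate"]] = {"@id": value}
    return node


event = {"time": "1481846460", "userUrl": "http://example.org/user/foo"}
lifted = lift(event, MAPPING)
print(lifted)
```

Because the mapping is data rather than code, supporting a new Web stream would mean writing a new mapping, not a new converter.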


Building TripleWave

[Diagram: two ways of feeding TripleWave]

• Conversion to RDF stream: live non-RDF streams (JSON) are converted to RDF through R2RML mappings

• Replay: time-annotated RDF datasets are replayed as finite RDF substreams, optionally in a replay loop

• Output: RDF streams (WebSocket | HTTP-chunk | etc.) and the RDF Stream Descriptor

Implementing TripleWave

TripleWave is a NodeJS Web Application

• NodeJS is a JavaScript runtime built on Chrome's V8 JavaScript engine.

TripleWave is open source

• Released under the Apache 2.0 licence

• Source code available at:

https://github.com/streamreasoning/TripleWave


TripleWave Architecture

[Diagram: TripleWave components, with the conversion path highlighted]

• Components: Web API, Transform Stream, Graph Stream, Connector Stream, Datagen Stream, Scheduler Stream

• Inputs: Web Service, SPARQL Endpoint, File, R2RML Mapping

TripleWave Architecture

[Diagram: TripleWave components, with the replay path and the replay loop highlighted]

• Components: Web API, Transform Stream, Graph Stream, Connector Stream, Datagen Stream, Scheduler Stream

• Inputs: Web Service, SPARQL Endpoint, File, R2RML Mapping

Consuming TripleWave RDF Stream - Push

The TripleWave stream can be consumed via push by extending the RSP service framework [1].

[1] https://github.com/streamreasoning/rsp-services

[Diagram: interaction among TripleWave, RSP-Service and C-SPARQL]

• Register the stream, the query and the observers

• Connect to the RDF stream descriptor

• Connect to the RDF stream endpoint

• Declare the stream, the query and the observers

• Inject the stream

Showcases

Three demos have been deployed to show the capabilities of the system.

• Conversion of the Wikipedia changes stream:

http://131.175.141.249/TripleWave-transform/sgraph

• Endless replay of the Linked Sensor Data dataset as a stream:

http://131.175.141.249/TripleWave-endless/sgraph

• Endless replay of the LDBC social graph dataset as a stream:

http://131.175.141.249/TripleWave-ldbc/sgraph

Find more...

• Andrea Mauri, Jean-Paul Calbimonte, Daniele Dell’Aglio, Marco Balduini, Marco Brambilla, Emanuele Della Valle, Karl Aberer: TripleWave: Spreading RDF Streams on the Web. Resource Paper at International Semantic Web Conference 2016.

• Andrea Mauri, Jean-Paul Calbimonte, Daniele Dell’Aglio, Marco Balduini, Emanuele Della Valle, Karl Aberer: Where Are the RDF Streams?: On Deploying RDF Streams on the Web of Data with TripleWave. Poster at International Semantic Web Conference 2015.

• Special thanks to Jean-Paul Calbimonte and Andrea Mauri for supplying parts of today's slides

Conclusions

RDF streams are gaining momentum

• Several active research groups

• Prototypes, methods and applications

TripleWave shows that it is possible to exchange RDF streams over the Web

• It uses standard technologies

• It feeds C-SPARQL (and soon CQELS)

There is potentially huge value in combining the results we are obtaining

Thank you! Questions?

TripleWave: a step towards RDF Stream Processing on the Web

http://streamreasoning.github.io/TripleWave

Daniele Dell’Aglio

dellaglio@ifi.uzh.ch

http://dellaglio.org

@dandellaglio
