ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Post on 30-Jul-2015


Publishing and Interlinking Linked Geospatial Data

Tutorial in Conjunction with the

12th Extended Semantic Web Conference

http://event.cwi.nl/eswc2015-geo/

Tutorial organization

9:00-9:15 Introduction
9:15-10:30 Background in geospatial data modeling, representing geospatial information in the Semantic Web, and querying linked geospatial data
10:30-11:00 Coffee break
11:00-12:00 Publishing geospatial information as RDF graphs
12:00-12:30 Discovering Spatial and Temporal Links among RDF graphs
12:30-14:00 Lunch break
14:00-14:30 Discovering Spatial and Temporal Links among RDF graphs
14:30-15:30 Hands-on session: Publishing geospatial information as RDF graphs
15:30-16:00 Coffee break
16:00-17:00 Hands-on session: Discovering Spatial and Temporal Links among RDF graphs
17:00-17:10 Conclusions


Part 1:

Background in geospatial data

modeling

ESWC 2015 Tutorial

Publishing and Interlinking Linked Geospatial Data

Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens


Outline

• Basic GIS concepts and terminology

• Representing geometries

• Representing topological information

• Geospatial data standards


Basic GIS Concepts and Terminology

• Theme: the information corresponding to a particular domain

that we want to model. A theme is a set of geographic

features.

• Example: the countries of Europe


Basic GIS Concepts (cont’d)

• Geographic feature or geographic object: a domain entity

that can have various attributes that describe spatial and non-

spatial characteristics.

• Example: the country Greece with attributes

• Population

• Flag

• Capital

• Geographical area

• Coastline

• Bordering countries


Basic GIS Concepts (cont’d)

• Geographic features can be atomic or complex.

• Example: According to the Kallikratis administrative reform of

2010, Greece consists of:

• 13 regions (e.g., Crete)

• Each region consists of regional units (e.g., Heraklion)

• Each regional unit consists of municipalities (e.g.,

Dimos Chersonisou)

• …


Basic GIS Concepts (cont’d)

• The spatial characteristics of a feature can involve:

• Geometric information (location in the underlying

geographic space, shape etc.)

• Topological information (containment, adjacency etc.).

Municipalities of the regional unit of Heraklion:
1. Dimos Irakliou
2. Dimos Archanon-Asterousion
3. Dimos Viannou
4. Dimos Gortynas
5. Dimos Maleviziou
6. Dimos Minoa Pediadas
7. Dimos Festou
8. Dimos Chersonisou


Geometric Information

• Geometric information can be captured by using geometric primitives

(points, lines, polygons, etc.) to approximate the spatial attributes of

the real world feature that we want to model.

• Geometries are associated with a coordinate reference system which

describes the coordinate space in which the geometry is defined.


Encoding Geometries: Vector Representation

• In this encoding objects in space are represented using points as

primitives as follows:

• A point is represented by a tuple of coordinates.

• A line segment is represented by a pair with its beginning

and ending point.

• More complex objects such as arbitrary lines, curves, surfaces etc. are built recursively from the basic primitives using constructs such as lists, sets etc.

• This is the approach used in all GIS and other popular

systems today. It has also been standardized by various

international bodies.


Example

[(1,2) (2,2) (5,3) (3,1) (2,1) (1,2)]
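The ring above can be held directly as a list of coordinate tuples. The following is a minimal Python sketch, not taken from the slides: the helper name `ring_area` is illustrative, and the shoelace formula is used only to show that the vertex list is enough to compute with.

```python
# Vector representation: a polygon ring is a closed list of (x, y) points.
ring = [(1, 2), (2, 2), (5, 3), (3, 1), (2, 1), (1, 2)]


def ring_area(points):
    """Unsigned area of a closed ring via the shoelace formula."""
    if points[0] != points[-1]:
        raise ValueError("ring must be closed (first point == last point)")
    # Sum the cross products of consecutive vertices.
    twice_signed = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )
    return abs(twice_signed) / 2


area = ring_area(ring)
```

The sign of the intermediate sum depends on the orientation of the ring, which is why the sketch takes the absolute value.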


Encoding Geometries: Constraint Representation

• In this case objects in space are represented by quantifier free

formulas in a constraint language (e.g., linear constraints).

[Formula: a Boolean combination of linear constraints over x and y that defines a polygonal region]
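As an illustration only, the polygon from the vector-representation example, with vertices (1,2), (2,2), (5,3), (3,1), (2,1), can be written as a disjunction of two conjunctions of linear constraints. The decomposition below (a triangle and a convex quadrilateral, split along x = 2) is one possible choice worked out for this sketch; it is not necessarily the formula shown on the original slide, and the name `in_region` is hypothetical.

```python
def in_region(x, y):
    """Membership test for the polygon (1,2) (2,2) (5,3) (3,1) (2,1).

    The region is a quantifier-free formula: a disjunction of two
    conjunctions of linear constraints, one per convex piece.
    """
    # Convex piece 1: triangle (1,2), (2,2), (2,1).
    triangle = (x + y >= 3) and (x <= 2) and (y <= 2)
    # Convex piece 2: quadrilateral (2,2), (5,3), (3,1), (2,1).
    quad = (x >= 2) and (y >= 1) and (y >= x - 2) and (3 * y <= x + 4)
    return triangle or quad
```

Each conjunct corresponds to the half-plane on the inner side of one edge, so a point belongs to the region exactly when it satisfies every constraint of at least one piece.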


Constraint Databases

• The constraint representation of spatial data was the focus of

much work in databases, logic programming and AI after the

paper by Kanellakis, Kuper and Revesz (PODS, 1990).

• The approach was very fruitful theoretically but was not adopted

in practice.


Topological Information

• Topological information is inherently qualitative and it is

expressed in terms of topological relations (e.g., containment,

adjacency, overlap etc.).

• Topological information can be derived from geometric

information or it might be captured by asserting explicitly the

topological relations between features.


Topological Relations

• The study of topological relations has produced

a lot of interesting results by researchers in:

• GIS

• Spatial databases

• Artificial Intelligence (qualitative reasoning

and knowledge representation)


DE-9IM

• The dimensionally extended 9-intersection model

(DE-9IM) of Clementini and Di Felice.

• It is based on the point-set topology of R².

• It deals with simple, closed and connected

geometries (areas, lines, points).

• It is an extension of earlier approaches: the 4-

intersection (4IM) and 9-intersection (9IM)

models by Egenhofer and colleagues.


Topological Relations in DE-9IM

• It captures topological relationships between two

geometries a and b in R2 by considering the

dimensions of the intersections of the

boundaries, interiors and exteriors of the two

geometries:

• The dimension can be 2, 1, 0 and -1 (dimension of

the empty set).


Example

        I(C)   B(C)   E(C)
I(A)     -1     -1      2
B(A)     -1     -1      1
E(A)      2      1      2

[Figure: the two geometries A and C]


Topological Relations in DE-9IM

• The following five named relationships between two different

geometries can be distinguished: disjoint, touches, crosses,

within and overlaps.

• The named relationships have a reasonably intuitive meaning

for users. They are jointly exhaustive and pairwise disjoint

(JEPD).

• The model can also be defined using an appropriate calculus of

geometries that uses these 5 binary relations and boundary

operators.


Example: A disjoint C

        I(C)   B(C)   E(C)
I(A)      F      F      *
B(A)      F      F      *
E(A)      *      *      *

[Figure: the two geometries A and C]

Notation:
• T = { 0, 1, 2 }
• F = -1
• * = don’t care = { -1, 0, 1, 2 }


Example: A within C

        I(C)   B(C)   E(C)
I(A)      T      *      F
B(A)      *      *      F
E(A)      *      *      *

[Figure: geometry A lying inside geometry C]

Notation equivalent to 3x3 matrix:

• String of 9 characters representing the above matrix in row major order.

• In this case: T*F**F***
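The 9-character notation makes pattern checks easy to mechanise. The sketch below is a hedged illustration rather than any library's API: the function name `matches` is hypothetical, and a computed matrix is assumed to be written with F for the empty set (dimension -1) and 0, 1 or 2 otherwise.

```python
def matches(matrix, pattern):
    """Check a 9-character DE-9IM matrix against a 9-character pattern.

    matrix uses 'F' (empty) or '0', '1', '2' (dimension of the intersection);
    pattern may additionally use 'T' (non-empty) and '*' (don't care).
    """
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("both strings must have exactly 9 characters")
    for m, p in zip(matrix, pattern):
        if p == "*":
            continue  # any value is accepted
        if p == "T":
            if m == "F":
                return False  # T requires a non-empty intersection
        elif m != p:
            return False  # F, 0, 1, 2 must match exactly
    return True


# Matrix of two disjoint areas in row-major order, with -1 written as F.
disjoint_areas = "FF2FF1212"
```

With this helper, the disjoint pattern FF*FF**** accepts `disjoint_areas`, while the within pattern T*F**F*** rejects it.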


DE-9IM Relation Definitions


The Region Connection Calculus (RCC)

• The primitives of the calculus are spatial regions. These are

non-empty, regular closed subsets of a topological space.

• The calculus is based on a single binary predicate C that

formalizes the “connectedness” relation.

• C(a,b) is true when the closure of a is connected to the

closure of b i.e., they have at least one point in common.

• It is axiomatized using first order logic.


RCC-8

• This is a set of eight JEPD binary relations that can

be defined in terms of predicate C.


RCC-5

• The RCC-5 subset has also been studied. The

granularity here is coarser. The boundary of a region is

not taken into consideration:

• No distinction between DC and EC; both are called DR.

• No distinction between TPP and NTPP; both are called PP.

• RCC-8 and RCC-5 relations can also be defined

using point-set topology, and there are very close

connections to the models of Egenhofer and others.
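The coarsening from RCC-8 to RCC-5 can be sketched as a lookup table. The DR and PP rows follow the slide; mapping the inverse relations TPPi and NTPPi to PPi and keeping PO and EQ unchanged follows the usual RCC definitions and is an addition of this sketch, as are the names `RCC8_TO_RCC5` and `coarsen`.

```python
# RCC-8 relation name -> RCC-5 relation name.
RCC8_TO_RCC5 = {
    "DC": "DR", "EC": "DR",          # boundary contact is ignored
    "PO": "PO",
    "TPP": "PP", "NTPP": "PP",       # tangential or not, still a proper part
    "TPPi": "PPi", "NTPPi": "PPi",   # inverses of the two rows above
    "EQ": "EQ",
}


def coarsen(rcc8_relation):
    """Return the RCC-5 relation that subsumes the given RCC-8 relation."""
    return RCC8_TO_RCC5[rcc8_relation]
```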


More Qualitative Spatial Relations

• Orientation/Cardinal directions (left of, right of,

north of, south of, northeast of etc.)

• Distance (close to, far from etc.). This information

can also be quantitative.


Coordinate Systems

• Coordinate: one of n scalar values that determines the position

of a point in an n-dimensional space.

• Coordinate system: a set of mathematical rules for specifying

how coordinates are to be assigned to points.

• Example: the Cartesian coordinate system


Coordinate Reference Systems

• Coordinate reference system: a coordinate system

that is related to an object (e.g., the Earth, a planar

projection of the Earth, a three dimensional

mathematical space such as R3) through a datum

which specifies its origin, scale, and orientation.

• The term spatial reference system is also used.


Geographic Coordinate Reference Systems

• These are 3-dimensional coordinate systems that utilize latitude

(φ), longitude (λ) , and optionally geodetic height (i.e.,

elevation), to capture geographic locations on Earth.


The World Geodetic System

• The World Geodetic System (WGS) is the most well-known

geographic coordinate reference system and its latest revision is

WGS84.

• Applications: cartography, geodesy, navigation (GPS), etc.


Projected Coordinate Reference Systems

• Projected coordinate reference systems: they transform the

3-dimensional approximation of the Earth into a 2-dimensional

surface (distortions!)

• Example: the Universal Transverse Mercator (UTM) system
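UTM divides the Earth into 60 longitudinal zones, each 6° wide, and applies a separate projection per zone. A small sketch of the standard zone-number rule follows; it ignores the special zones around Norway and Svalbard, and the function name and sample longitudes are illustrative.

```python
import math


def utm_zone(longitude_deg):
    """UTM zone number (1-60) for a longitude in degrees, -180 <= lon < 180."""
    if not -180 <= longitude_deg < 180:
        raise ValueError("longitude must be in [-180, 180)")
    # Zone 1 starts at 180 degrees west; each zone spans 6 degrees.
    return int(math.floor((longitude_deg + 180) / 6)) + 1


heraklion_zone = utm_zone(25.13)  # approximate longitude of Heraklion
athens_zone = utm_zone(23.73)     # approximate longitude of Athens
```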


Coordinate Reference Systems (cont’d)

• There are well-known ways to translate between coordinate reference systems.

• See the list of coordinate reference systems of the European Petroleum Survey Group: http://www.epsg-registry.org/


Geospatial Data Standards

• The Open Geospatial Consortium (OGC) and the

International Organization for Standardization (ISO) have

developed many geospatial data standards that are in wide use

today. In this tutorial we will cover:

• Well-Known Text

• Geography Markup Language

• OpenGIS Simple Features Access


Well-Known Text (WKT)

• WKT is an OGC and ISO standard for representing geometries,

coordinate reference systems, and transformations between

coordinate reference systems.

• WKT is specified in OpenGIS Simple Feature Access - Part 1:

Common Architecture standard which is the same as the ISO 19125-1

standard. Download from

http://portal.opengeospatial.org/files/?artifact_id=25355 .

• This standard concentrates on simple features: features with all

spatial attributes described piecewise by a straight line or a

planar interpolation between sets of points.


WKT Class Hierarchy


Example

WKT representation:

GeometryCollection(

Point(5 35),

LineString(3 10,5 25,15 35,20 37,30 40),

Polygon((5 5,28 7,44 14,47 35,40 40,20 30,5 5),

(28 29,14.5 11,26.5 12,37.5 20,28 29))

)
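To make the structure of WKT concrete, here is a deliberately small parser sketch. It handles only the Point and LineString forms used above and is not a full WKT implementation; the function names are hypothetical.

```python
import re


def parse_coords(text):
    """Turn '3 10,5 25' into [(3.0, 10.0), (5.0, 25.0)]."""
    return [tuple(float(v) for v in pair.split()) for pair in text.split(",")]


def parse_simple_wkt(wkt):
    """Parse a WKT Point or LineString into (type name, list of points)."""
    match = re.fullmatch(
        r"\s*(Point|LineString)\s*\((.*)\)\s*", wkt, re.IGNORECASE
    )
    if match is None:
        raise ValueError("only Point and LineString are handled in this sketch")
    return match.group(1).upper(), parse_coords(match.group(2))


kind, points = parse_simple_wkt("LineString(3 10,5 25,15 35,20 37,30 40)")
```

Polygons and collections nest parentheses, so a real parser needs a proper grammar or an existing geometry library rather than a single regular expression.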


Geography Markup Language (GML)

• GML is an XML-based encoding standard for the

representation of geospatial data.

• GML provides XML schemas for defining a variety of concepts:

geographic features, geometry, coordinate reference

systems, topology, time and units of measurement.

• GML profiles are subsets of GML that target particular

applications.

• Examples: Point Profile, GML Simple Features Profile etc.


GML Simple Features: Class Hierarchy


Example

GML representation:

<gml:Polygon gml:id="p3" srsName="urn:ogc:def:crs:EPSG:6.6:4326">

<gml:exterior>

<gml:LinearRing>

<gml:coordinates>

5,5 28,7 44,14 47,35 40,40 20,30 5,5

</gml:coordinates>

</gml:LinearRing>

</gml:exterior>

</gml:Polygon>


OpenGIS Simple Features Access

• OGC has also specified a standard for the storage, retrieval,

query and update of sets of simple features using

relational DBMS and SQL.

• This standard is “OpenGIS Simple Feature Access - Part 2: SQL

Option” and it is the same as the ISO 19125-2 standard. Download from

http://portal.opengeospatial.org/files/?artifact_id=25354.

• Related standard: ISO 13249 SQL/MM - Part 3.


OpenGIS Simple Features Access (cont’d)

• The standard covers two implementation options: (i) using only

the SQL predefined data types and (ii) using SQL with

geometry types.

• SQL with geometry types:

• We use the WKT geometry class hierarchy presented earlier

to define new geometric data types for SQL

• We define new SQL functions on those types.


SQL with Geometry Types – Functions

• Functions that request or check properties of a geometry:

• ST_Dimension(A:Geometry):Integer

• ST_GeometryType(A:Geometry):Character Varying

• ST_AsText(A:Geometry): Character Large Object

• ST_AsBinary(A:Geometry): Binary Large Object

• ST_SRID(A:Geometry): Integer

• ST_IsEmpty(A:Geometry): Boolean

• ST_IsSimple(A:Geometry): Boolean


SQL with Geometry Types – Functions (cont’d)

• Functions that test topological relations between two geometries

using the DE-9IM:

• ST_Equals(A:Geometry, B:Geometry):Boolean

• ST_Disjoint(A:Geometry, B:Geometry):Boolean

• ST_Intersects(A:Geometry, B:Geometry):Boolean

• ST_Touches(A:Geometry, B:Geometry):Boolean

• ST_Crosses(A:Geometry, B:Geometry):Boolean

• ST_Within(A:Geometry, B:Geometry):Boolean

• ST_Contains(A:Geometry, B:Geometry):Boolean

• ST_Overlaps(A:Geometry, B:Geometry):Boolean

• ST_Relate(A:Geometry, B:Geometry, Matrix: Char(9)):Boolean


DE-9IM Relation Definitions

• A equals B can also be defined by the pattern TFFFTFFFT.

• A intersects B is the negation of A disjoint B

• A contains B is equivalent to B within A


SQL with Geometry Types – Functions (cont’d)

• Functions for constructing new geometries out of existing

ones:

• ST_Boundary(A:Geometry):Geometry

• ST_Envelope(A:Geometry):Geometry

• ST_Intersection(A:Geometry, B:Geometry):Geometry

• ST_Union(A:Geometry, B:Geometry):Geometry

• ST_Difference(A:Geometry, B:Geometry):Geometry

• ST_SymDifference(A:Geometry, B:Geometry):Geometry

• ST_Buffer(A:Geometry, distance:Double):Geometry


Geospatial Relational DBMS

• The OpenGIS Simple Features Access standard is used today in all relational DBMS with a geospatial extension.

• The abstract data type mechanism of the DBMS allows

the representation of all kinds of geospatial data types

supported by the standard.

• The query language (SQL) offers the functions of the

standard for querying data of these types.

Readings

• The book Geographic Information Systems and Science is a nice introduction to GIS. See: http://eu.wiley.com/WileyCDA/WileyTitle/productCd-EHEP001475.html

• The following papers present the DE-9IM model:

Eliseo Clementini, Paolino Di Felice and Peter van Oosterom. A Small Set of Formal Topological Relationships Suitable for End-User Interaction. SSD 1993: 277-295. http://link.springer.com/chapter/10.1007%2F3-540-56869-7_16

E. Clementini and P. Di Felice. A Comparison of Methods for Representing Topological Relationships. Information Sciences 80 (1994), pp. 1-34. http://www.sciencedirect.com/science/article/pii/106901159400033X

• The paper below surveys a lot of interesting results on the RCC calculus:

J. Renz, B. Nebel. Qualitative Spatial Reasoning using Constraint Calculi. In: M. Aiello, I. Pratt-Hartmann and J. van Benthem (eds.), Handbook of Spatial Logics, pp. 161–215, 2007, Springer. http://users.cecs.anu.edu.au/~jrenz/papers/renz-nebel-los.pdf

• The two OGC standards mentioned in the slides.

Part 2:

Spatial and Temporal Data in RDF:

stRDF/stSPARQL and GeoSPARQL


Common Approach

• The two proposals (stRDF/stSPARQL and GeoSPARQL) offer constructs for:
o Developing ontologies for spatial and temporal data.
o Encoding spatial and temporal data that use these ontologies in RDF.
o Extending SPARQL to query spatial and temporal data.


Two Proposals

• stRDF/stSPARQL

• GeoSPARQL


The data model stRDF

An extension of RDF for the representation of geospatial information that changes over time.

Geospatial dimension:

Spatial data types are introduced.

Geospatial information is represented using spatial literals of these datatypes.

OGC standards WKT and GML are used for the serialization of spatial literals.

Temporal dimension (later)

Proposed independently and around the same time as GeoSPARQL (starting with an ESWC 2010 paper by Koubarakis and Kyzirakos).

[Kyzirakos, Karpathiotakis & Koubarakis 2012]

Spatial Datatypes

strdf:geometry rdf:type rdfs:Datatype;
    rdfs:subClassOf rdfs:Literal.

strdf:WKT rdf:type rdfs:Datatype;
    rdfs:subClassOf strdf:geometry.

strdf:GML rdf:type rdfs:Datatype;
    rdfs:subClassOf strdf:geometry.

Example Ontology: Administrative Geography of Greece

[Figure: ontology diagram in which the geometry property takes values of the datatype strdf:geometry, with subtypes strdf:WKT and strdf:GML]

Example Data in stRDF

gag:Olympia
    gag:name "Ancient Olympia";
    rdf:type gag:MunicipalCommunity .

gag:Olympia gag:hasGeometry
    "POLYGON((21.5 18.5, 23.5 18.5, 23.5 21, 21.5 21, 21.5 18.5)); <http://www.opengis.net/def/crs/EPSG/0/4326>"^^strdf:WKT .

Slide annotations: gag:hasGeometry is the geometry property; the quoted string is the spatial literal, consisting of the WKT geometry and, after the semicolon, the coordinate reference system; strdf:WKT is the spatial data type.
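The lexical form of an strdf:WKT literal, as used in the example, is a WKT string optionally followed by a semicolon and a CRS URI in angle brackets. A minimal sketch of splitting the two parts follows. The function name is hypothetical, and falling back to EPSG:4326 when no CRS is given is an assumption of this sketch.

```python
# Assumed default CRS for literals that carry no CRS URI (sketch only).
DEFAULT_CRS = "http://www.opengis.net/def/crs/EPSG/0/4326"


def split_strdf_wkt(lexical):
    """Split 'WKT; <crs-uri>' into (wkt, crs_uri)."""
    wkt, sep, crs = lexical.partition(";")
    wkt = wkt.strip()
    crs = crs.strip()
    if not sep or not crs:
        return wkt, DEFAULT_CRS
    if not (crs.startswith("<") and crs.endswith(">")):
        raise ValueError("CRS must be written as <URI>")
    return wkt, crs[1:-1]


olympia = (
    "POLYGON((21.5 18.5, 23.5 18.5, 23.5 21, 21.5 21, 21.5 18.5)); "
    "<http://www.opengis.net/def/crs/EPSG/0/4326>"
)
wkt, crs = split_strdf_wkt(olympia)
```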


Example (cont’d)

gag:Olympia
    rdf:type gag:MunicipalCommunity;
    gag:name "Ancient Olympia";
    gag:population "184"^^xsd:int;
    gag:hasGeometry "POLYGON(((25.37 35.34,…)))"^^strdf:WKT.

gag:OlympiaMUnit
    rdf:type gag:MunicipalityUnit;
    gag:name "Municipality Unit of Ancient Olympia".

gag:OlympiaMunicipality
    rdf:type gag:Municipality;
    gag:name "Municipality of Ancient Olympia".

gag:Olympia gag:belongsTo gag:OlympiaMUnit .
gag:OlympiaMUnit gag:belongsTo gag:OlympiaMunicipality.


More Examples

Corine Land Use/Land Cover (http://www.eea.europa.eu/publications/COR0-landcover )

Burnt Area Products (project TELEIOS,

http://www.earthobservatory.eu/ )


Corine Land Use/Land Cover


Corine Land Use/Land Cover in stRDF (http://www.linkedopendata.gr)

clc:Area_24015134

rdf:type clc:Area ;

clc:hasCode "312"^^xsd:decimal;

clc:hasID "EU-203497"^^xsd:string;

clc:hasArea_ha "255.5807904"^^xsd:double;

clc:hasGeometry "POLYGON((15.53 62.54,

…))"^^strdf:WKT;

clc:hasLandUse clc:ConiferousForest .

Geometry Property


Burnt Area Products (http://www.earthobservatory.eu/ontologies/noaOntology.owl)


Burnt Area Products

noa:ba_15

rdf:type noa:BurntArea;

noa:isProducedByProcessingChain

"static thresholds"^^xsd:string;

noa:hasAcquisitionTime

"2010-08-24T13:00:00"^^xsd:dateTime;

noa:hasGeometry "MULTIPOLYGON(((393801.42 4198827.92, ..., 393008 424131))); <http://www.opengis.net/def/crs/EPSG/0/2100>"^^strdf:WKT.

Geometry Property


stSPARQL: Geospatial SPARQL 1.1

We define a SPARQL extension function for each function defined in the OpenGIS Simple Features Access standard

Basic functions

Get a property of a geometry
xsd:int strdf:dimension(strdf:geometry A)
xsd:string strdf:geometryType(strdf:geometry A)
xsd:int strdf:srid(strdf:geometry A)

Get the desired representation of a geometry
xsd:string strdf:asText(strdf:geometry A)
xsd:string strdf:asGML(strdf:geometry A)

Test whether a certain condition holds
xsd:boolean strdf:isEmpty(strdf:geometry A)
xsd:boolean strdf:isSimple(strdf:geometry A)


stSPARQL: Geospatial SPARQL 1.1

Functions for testing topological spatial relationships

OGC Simple Features Access

xsd:boolean strdf:equals(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:disjoint(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:intersects(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:touches(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:crosses(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:within(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:contains(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:overlaps(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:relate(strdf:geometry A, strdf:geometry B,

xsd:string intersectionPatternMatrix)

Other relation families: Egenhofer, RCC-8

stSPARQL: Geospatial SPARQL 1.1

Spatial analysis functions

Construct new geometric objects from existing geometric objects

strdf:geometry strdf:boundary(strdf:geometry A)

strdf:geometry strdf:envelope(strdf:geometry A)

strdf:geometry strdf:convexHull(strdf:geometry A)

strdf:geometry strdf:intersection(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:union(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:difference(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:symDifference(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:buffer(strdf:geometry A, xsd:double distance, xsd:anyURI units)

Spatial metric functions

xsd:float strdf:distance(strdf:geometry A, strdf:geometry B, xsd:anyURI units)

xsd:float strdf:area(strdf:geometry A)

Spatial aggregate functions

strdf:geometry strdf:union(set of strdf:geometry A)

strdf:geometry strdf:intersection(set of strdf:geometry A)

strdf:geometry strdf:extent(set of strdf:geometry A)


stSPARQL: Geospatial SPARQL 1.1

Select clause

Construction of new geometries (e.g., strdf:buffer(?geo, 0.1, uom:metre))

Spatial aggregate functions (e.g., strdf:union(?geo))

Metric functions (e.g., strdf:area(?geo))

Filter clause

Functions for testing topological spatial relationships between spatial terms (e.g.,

strdf:contains(?G1, strdf:union(?G2, ?G3)))

Numeric expressions involving spatial metric functions

(e.g., strdf:area(?G1) ≤ 2*strdf:area(?G2)+1)

Boolean combinations

Having clause

Boolean expressions involving spatial aggregate functions and spatial metric

functions or functions testing for topological relationships between spatial terms

(e.g., strdf:area(strdf:union(?geo))>1)


stSPARQL: An example (1/3)

SELECT ?name

WHERE {

?comm rdf:type gag:LocalCommunity;

gag:name ?name;

gag:hasGeometry ?commGeo .

?ba rdf:type noa:BurntArea;

noa:hasGeometry ?baGeo .

FILTER(strdf:overlaps(?commGeo,?baGeo))

}

Spatial function: strdf:overlaps

Return the names of local communities that have been affected by fires

stSPARQL: An example (2/3)

SELECT ?ba ?baGeom

WHERE {

?r rdf:type clc:Region;

clc:hasGeometry ?rGeom;

clc:hasCorineLandUse ?f.

?f rdfs:subClassOf clc:Forest.

?c rdf:type gag:LocalCommunity;

gag:hasGeometry ?cGeom.

?ba rdf:type noa:BurntArea;

noa:hasGeometry ?baGeom.

FILTER( strdf:intersects(?rGeom,?baGeom) &&

strdf:distance(?baGeom,?cGeom,uom:metre) < 200)}

Spatial Functions

Find all burnt forests near local communities

stSPARQL: An example (3/3)

Compute the parts of burnt areas that lie in coniferous forests.

SELECT ?burntArea
       (strdf:intersection(?baGeom, strdf:union(?fGeom)) AS ?burntForest)
WHERE {
  ?burntArea rdf:type noa:BurntArea;
             noa:hasGeometry ?baGeom.
  ?forest rdf:type clc:Region;
          clc:hasLandCover clc:ConiferousForest;
          clc:hasGeometry ?fGeom.
  FILTER(strdf:intersects(?baGeom,?fGeom))
}
GROUP BY ?burntArea ?baGeom

Spatial function: strdf:intersection. Spatial aggregate: strdf:union.

Time dimensions in Linked Data

User-defined time: A time value (literal) with no special semantics.

Valid time: The time when a fact (represented by a triple) is true in the modeled reality.

Transaction time: The time when the triple is current in the database.


The time dimension of stRDF: The valid time of triples

The following extensions are introduced in stRDF:
• Timeline: the (discrete) value space of the datatype xsd:dateTime of XML Schema.
• Two kinds of time primitives are supported: time instants and time periods.
  • A time instant is an element of the timeline.
  • A time period is an expression of the form [B, E) or [B, E] or (B, E] or (B, E) where B and E are time instants called the beginning and ending time of the period.
• The new datatype strdf:period is introduced.

[Figure: datatype hierarchy with strdf:geometry (subtypes strdf:WKT and strdf:GML) and strdf:period under rdfs:Literal]

The time dimension of stRDF (cont’d)

• Triples are extended to quads.

• A temporal triple (quad) is an expression of the form s p o t.

where s p o. is an RDF triple and t is a time instant or time

period called the valid time of the triple.

• The temporal constants NOW and UC (“until changed”) are

introduced.
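A valid-time check over a half-open period [B, E) might be sketched as follows. Modelling the constant UC as `None` (an open end) and the helper name `valid_at` are assumptions of this sketch, not stRDF definitions, and the sample instants are made up for illustration.

```python
from datetime import datetime, timezone

UC = None  # stands for the stRDF constant UC ("until changed") in this sketch


def valid_at(begin, end, instant):
    """True if instant lies in the half-open period [begin, end)."""
    return begin <= instant and (end is UC or instant < end)


# Hypothetical instants used only to exercise the helper.
t2006 = datetime(2006, 8, 25, 9, 0, tzinfo=timezone.utc)
t2007 = datetime(2007, 8, 25, 9, 0, tzinfo=timezone.utc)
t2010 = datetime(2010, 1, 1, 0, 0, tzinfo=timezone.utc)

still_valid = valid_at(t2006, UC, t2010)     # open-ended period
closed_out = valid_at(t2006, t2007, t2010)   # period ended in 2007
```

Because the period is half-open, the ending instant itself is not part of the period, so a triple whose period ends at E and another whose period begins at E never overlap.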

An example with valid time

[Figure: timeline of a region that is first forest, then burnt area, then agricultural area]

Initially the region is recorded as forest, with an open-ended valid time:

clc:region1 clc:hasLandCover clc:Forest
    "[2006-08-25T11:00:00+02, UC)"^^strdf:period .

When a burnt area is recorded, the valid time of the forest triple is closed:

noa:ba1 rdf:type noa:BurntArea
    "[2007-08-25T11:00:00+02, UC)"^^strdf:period .

clc:region1 clc:hasLandCover clc:Forest
    "[2006-08-25T11:00:00+02, 2007-08-25T11:00:00+02)"^^strdf:period .

When the region becomes an agricultural area, the valid time of the burnt area triple is closed in turn:

clc:region1 clc:hasLandCover clc:AgriculturalArea
    "[2009-08-25T11:00:00+02, UC)"^^strdf:period .

noa:ba1 rdf:type noa:BurntArea
    "[2007-08-25T11:00:00+02, 2009-08-25T11:00:00+02)"^^strdf:period .

clc:region1 clc:hasLandCover clc:Forest
    "[2006-08-25T11:00:00+02, 2007-08-25T11:00:00+02)"^^strdf:period .


The time dimension of stSPARQL

The following extensions are introduced:

• Triple patterns are extended to quad patterns (the last component is a temporal

term: variable or constant)

• Temporal extension functions are introduced:

• Allen's temporal relations (e.g., strdf:after)

• Period constructors (e.g., strdf:period_intersect)

• Temporal aggregates (e.g., strdf:maximalPeriod)

Example Query

• Find the current land cover of all areas in the dataset

SELECT ?clc
WHERE {
  ?R rdf:type clc:Region .
  ?R clc:hasLandCover ?clc ?t1 .
  FILTER(strdf:during("NOW", ?t1))
}

Slide annotations: ?R clc:hasLandCover ?clc ?t1 is a quad pattern, strdf:during is a temporal extension function, and "NOW" is a temporal constant.

Two Proposals

• stRDF/stSPARQL

• GeoSPARQL


GeoSPARQL

GeoSPARQL is an OGC standard.

Functionalities similar to stRDF/stSPARQL:

Geometries are represented using literals of spatial datatypes.

Literals are serialized using WKT and GML.

The same families of functions are offered for querying geometries.

Functionalities beyond stSPARQL:

High-level ontologies inspired by GIS terminology.

Topological relations can now be asserted as well so that reasoning and querying on them is possible.

A query rewriting mechanism.

Functionalities of stSPARQL that are not included in GeoSPARQL:

• Geospatial aggregate functions

• Temporal dimension

GeoSPARQL Components

• Core
• Topology Vocabulary Extension (parameter: relation family)
• Geometry Extension (parameters: serialization, version)
• Geometry Topology Extension (parameters: serialization, version, relation family)
• Query Rewrite Extension (parameters: serialization, version, relation family)
• RDFS Entailment Extension (parameters: serialization, version, relation family)

Parameters

• Serialization: WKT, GML
• Relation family: Simple Features, RCC-8, Egenhofer

GeoSPARQL Core

Defines two top-level classes, geo:SpatialObject and geo:Feature, that can be used to organize geospatial data.

GeoSPARQL Geometry Extension

Provides vocabulary for asserting and querying data about the geometric attributes of a feature.

Example Ontology: Greek Administrative Geography

[Figures: ontology diagrams of the Greek administrative geography]

Example Data

gag:Olympia
    rdf:type gag:MunicipalCommunity;
    gag:name "Ancient Olympia";
    gag:population "184"^^xsd:int;
    geo:hasGeometry ex:polygon1.

ex:polygon1
    rdf:type geo:Geometry;
    geo:asWKT "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POLYGON((21.5 18.5,23.5 18.5,23.5 21,21.5 21,21.5 18.5))"^^geo:wktLiteral.

Slide annotations: geo:hasGeometry and geo:asWKT are properties from the Geometry extension, geo:Geometry is a class from the Geometry extension, the quoted string is the geometry literal, and geo:wktLiteral is a datatype from the Geometry extension.
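In a GeoSPARQL WKT literal the optional CRS URI comes first, followed by the WKT text. The GeoSPARQL standard writes the URI in angle brackets and assumes CRS84 when it is absent. The sketch below splits such a literal; the function name is hypothetical, and accepting a bare `http://` prefix without angle brackets is a leniency added for this sketch, not something the standard prescribes.

```python
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def split_wkt_literal(lexical):
    """Split a GeoSPARQL WKT literal into (crs_uri, wkt)."""
    text = lexical.strip()
    if text.startswith("<"):
        uri, sep, rest = text[1:].partition(">")
        if not sep:
            raise ValueError("unterminated CRS URI")
        return uri, rest.strip()
    if text.startswith(("http://", "https://")):
        # Lenient form: URI without angle brackets, then whitespace, then WKT.
        parts = text.split(None, 1)
        if len(parts) != 2:
            raise ValueError("CRS URI must be followed by WKT text")
        return parts[0], parts[1].strip()
    return CRS84, text  # no CRS given: CRS84 is the default


crs, wkt = split_wkt_literal(
    "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> "
    "POLYGON((21.5 18.5,23.5 18.5,23.5 21,21.5 21,21.5 18.5))"
)
```

Note that the order is the reverse of the stRDF convention, where the WKT text comes first and the CRS URI follows a semicolon.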


Non-Topological Query Functions of the Geometry Extension

The following non-topological query functions are also offered:

geof:distance

geof:buffer

geof:convexHull

geof:intersection

geof:union

geof:difference

geof:symDifference

geof:envelope

geof:boundary


GeoSPARQL Topology Vocabulary Extension

The extension is parameterized by the family of topological relations supported.

Topological relations for simple features

The Egenhofer relations e.g., geo:ehMeet

The RCC-8 relations e.g., geo:rcc8ec

Greek Administrative Geography

gag:Olympia
    rdf:type gag:MunicipalCommunity;
    gag:name "Ancient Olympia".

gag:OlympiaMUnit
    rdf:type gag:MunicipalityUnit;
    gag:name "Municipality Unit of Ancient Olympia".

gag:OlympiaMunicipality
    rdf:type gag:Municipality;
    gag:name "Municipality of Ancient Olympia".

gag:Olympia geo:sfWithin gag:OlympiaMUnit .
gag:OlympiaMUnit geo:sfWithin gag:OlympiaMunicipality.

geo:sfWithin is a Simple Features topological relation.

GeoSPARQL: An example

SELECT ?m

WHERE {

?m rdf:type gag:MunicipalityUnit.

?m geo:sfContains gag:Olympia.

}

Find the municipality unit that contains the community of Ancient Olympia

geo:sfContains is a Simple Features topological relation.

Answer: ?m = gag:OlympiaMUnit

GeoSPARQL: An example

SELECT ?m

WHERE {

?m rdf:type gag:Municipality.

?m geo:sfContains gag:Olympia.

}

Find the municipality that contains the community of Ancient Olympia

Answer?


Example (cont’d)

The answer to the previous query is

?m = gag:OlympiaMunicipality

GeoSPARQL does not tell you how to compute this answer which needs reasoning about the transitivity of relation geo:sfContains.

Options:
• Use rules
• Use constraint-based techniques
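The rule-based option can be sketched outside any triple store. The code below computes the transitive closure of the asserted geo:sfWithin pairs and reads geo:sfContains as the inverse relation. It is a hedged illustration in plain Python; the names `within_closure`, `containers`, `ASSERTED` and `TYPES` are hypothetical, and the data is copied from the Greek Administrative Geography example.

```python
# Asserted (inner, outer) pairs for geo:sfWithin, taken from the example data.
ASSERTED = {
    ("gag:Olympia", "gag:OlympiaMUnit"),
    ("gag:OlympiaMUnit", "gag:OlympiaMunicipality"),
}

TYPES = {
    "gag:OlympiaMUnit": "gag:MunicipalityUnit",
    "gag:OlympiaMunicipality": "gag:Municipality",
}


def within_closure(asserted):
    """Transitive closure of a set of (inner, outer) sfWithin assertions."""
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # Rule: a within b and b within d implies a within d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def containers(feature, rdf_type):
    """Features of the given type that contain `feature` (inverse of sfWithin)."""
    return sorted(
        outer
        for inner, outer in within_closure(ASSERTED)
        if inner == feature and TYPES.get(outer) == rdf_type
    )


answer = containers("gag:Olympia", "gag:Municipality")
```

The naive fixpoint loop is adequate for a handful of triples; a store holding many features would need an indexed or incremental closure instead.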


The Geometry Topology Extension

• Offers vocabulary for querying topological properties of geometry literals.

• Simple Features

• geof:relate

• geof:sfEquals

• geof:sfDisjoint

• geof:sfIntersects

• geof:sfTouches

• geof:sfCrosses

• geof:sfWithin

• geof:sfContains

• geof:sfOverlaps

• Egenhofer (e.g., geof:ehDisjoint)

• RCC-8 (e.g., geof:rcc8dc)

Example Query

SELECT ?name

WHERE {

?comm rdf:type gag:LocalCommunity;

gag:name ?name;

geo:hasGeometry ?commGeo .

?ba rdf:type noa:BurntArea;

geo:hasGeometry ?baGeo .

FILTER(geof:sfOverlaps(?commGeo,?baGeo))

}

Return the names of local communities that have been affected by fires

Slide annotations: geo:hasGeometry is a Geometry Extension property; geof:sfOverlaps is a Geometry Topology Extension function.

GeoSPARQL Query Rewrite Extension

Provides a collection of RIF rules that use topological extension functions to establish the existence of topological predicates.

Example: given the RIF rule named geor:sfWithin, the serializations of the geometries of gag:Athens and gag:Greece named AthensWKT and GreeceWKT and the fact that

geof:sfWithin(AthensWKT, GreeceWKT)

returns true from the computation of the two geometries, we can derive the triple

gag:Athens geo:sfWithin gag:Greece

One possible implementation is to re-write a given SPARQL query.


RIF Rule

Forall ?f1 ?f2 ?g1 ?g2 ?g1Serial ?g2Serial

(?f1[geo:sfWithin->?f2] :-

Or(

And (?f1[geo:hasDefaultGeometry->?g1]

?f2[geo:hasDefaultGeometry->?g2]

?g1[ogc:asGeomLiteral->?g1Serial]

?g2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f1[geo:hasDefaultGeometry->?g1]

?g1[ogc:asGeomLiteral->?g1Serial]

?f2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f2[geo:hasDefaultGeometry->?g2]

?f1[ogc:asGeomLiteral->?g1Serial]

?g2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f1[ogc:asGeomLiteral->?g1Serial]

?f2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

))

The four And branches cover, in order, the Feature-Feature, Feature-Geometry, Geometry-Feature and Geometry-Geometry cases.

Example

SELECT ?feature

WHERE {

?feature geo:sfWithin

geonames:OlympiaMunicipality.

}

Find all features that are inside the municipality of Ancient Olympia


Rewritten Query

SELECT ?feature

WHERE {
  { ?feature geo:sfWithin geonames:OlympiaMunicipality }
  UNION
  { ?feature geo:hasDefaultGeometry ?featureGeom .
    ?featureGeom geo:asWKT ?featureSerial .
    geonames:OlympiaMunicipality geo:hasDefaultGeometry ?olGeom .
    ?olGeom geo:asWKT ?olSerial .
    FILTER (geof:sfWithin(?featureSerial, ?olSerial)) }
  UNION
  { ?feature geo:hasDefaultGeometry ?featureGeom .
    ?featureGeom geo:asWKT ?featureSerial .
    geonames:OlympiaMunicipality geo:asWKT ?olSerial .
    FILTER (geof:sfWithin(?featureSerial, ?olSerial)) }
  UNION
  { ?feature geo:asWKT ?featureSerial .
    geonames:OlympiaMunicipality geo:hasDefaultGeometry ?olGeom .
    ?olGeom geo:asWKT ?olSerial .
    FILTER (geof:sfWithin(?featureSerial, ?olSerial)) }
  UNION
  { ?feature geo:asWKT ?featureSerial .
    geonames:OlympiaMunicipality geo:asWKT ?olSerial .
    FILTER (geof:sfWithin(?featureSerial, ?olSerial)) }
}

GeoSPARQL RDFS Entailment Extension

Specifies the RDFS entailments that follow from the class and property hierarchies defined in the other components, e.g., the Geometry Extension.

Systems should use an implementation of RDFS entailment to allow the derivation of new triples from those already in a graph.

Example

Given the triples

ex:f1 geo:hasGeometry ex:g1 .

geo:hasGeometry rdfs:domain geo:Feature .

and the GeoSPARQL vocabulary axiom

geo:Feature rdfs:subClassOf geo:SpatialObject .

we can infer the following triples:

ex:f1 rdf:type geo:Feature .

ex:f1 rdf:type geo:SpatialObject .
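The inference in this example can be reproduced with a small forward-chaining loop. The sketch below is only an illustration: it implements just two RDFS rules (rdfs2 for rdfs:domain and rdfs9 for rdfs:subClassOf), represents triples as plain string tuples, and the function name rdfs_closure is made up for this example. The subclass axiom between geo:Feature and geo:SpatialObject is supplied explicitly as input.

```python
# Forward-chaining sketch of two RDFS entailment rules (an assumed subset):
#   rdfs2: (p rdfs:domain C) and (s p o)             => (s rdf:type C)
#   rdfs9: (C rdfs:subClassOf D) and (s rdf:type C)  => (s rdf:type D)

def rdfs_closure(triples):
    """Return the input triples plus everything derivable by rdfs2 and rdfs9."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # rdfs2: the predicate of (s p o) has a declared domain
                if p2 == "rdfs:domain" and s2 == p:
                    derived.add((s, "rdf:type", o2))
                # rdfs9: the class of s has a declared superclass
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    derived.add((s, "rdf:type", o2))
        if not derived <= closure:
            closure |= derived
            changed = True
    return closure

graph = {
    ("ex:f1", "geo:hasGeometry", "ex:g1"),
    ("geo:hasGeometry", "rdfs:domain", "geo:Feature"),
    ("geo:Feature", "rdfs:subClassOf", "geo:SpatialObject"),
}
inferred = rdfs_closure(graph) - graph
```

A production system would rely on the RDFS reasoner of its triple store rather than on a hand-written loop like this one.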

Readings

• Material from the Strabon web site (http://strabon.di.uoa.gr).

• The following tutorial paper, which introduces the topic of linked geospatial data: M. Koubarakis, M. Karpathiotakis, K. Kyzirakos, C. Nikolaou and M. Sioutis. Data Models and Query Languages for Linked Geospatial Data. Reasoning Web Summer School 2012. http://strabon.di.uoa.gr/files/survey.pdf

• The following paper, which introduces stSPARQL and Strabon: K. Kyzirakos, M. Karpathiotakis and M. Koubarakis. Strabon: A Semantic Geospatial DBMS. 11th International Semantic Web Conference (ISWC 2012). November 11-15, 2012. Boston, USA. http://iswc2012.semanticweb.org/sites/default/files/76490289.pdf

• The following paper, which introduces the temporal features of stSPARQL and Strabon: K. Bereta, P. Smeros and M. Koubarakis. Representing and Querying the Valid Time of Triples for Linked Geospatial Data. In the 10th Extended Semantic Web Conference (ESWC 2013). Montpellier, France. May 26-30, 2013. http://www.strabon.di.uoa.gr/files/eswc2013.pdf

• The GeoSPARQL standard, found at http://www.opengeospatial.org/standards/geosparql

• The following paper, which introduces the RDFi framework: Charalampos Nikolaou and Manolis Koubarakis. Incomplete Information in RDF. In the 7th International Conference on Web Reasoning and Rule Systems (RR 2013). Mannheim, Germany. July 27-29, 2013. http://cgi.di.uoa.gr/~koubarak/publications/rr2013.pdf

• The following paper, which introduces the benchmark Geographica: G. Garbis, K. Kyzirakos and M. Koubarakis. Geographica: A Benchmark for Geospatial RDF Stores. In the 12th International Semantic Web Conference (ISWC 2013). Sydney, Australia. October 21-25, 2013. http://cgi.di.uoa.gr/~koubarak/publications/Geographica.pdf

Publishing geospatial information as RDF graphs

Kostis Kyzirakos, Dimitrianos Savva

Outline

Mapping relational data to RDF graphs

Mapping non-relational data to RDF graphs

Geospatial Extensions for mapping geospatial data to RDF graphs

Implemented Systems

Demonstration


Mapping relational data to RDF graphs

Table ProtectedArea:

Sitecode  | Sitename   | ReleaseDate | …
DE0916391 | NTP S-H W  | 2011-01-27  |
DE1003301 | DOGGERBANK | 2011-01-27  |

?

Natura 2000 is an ecological network

designated under the Birds Directive and

the Habitats Directive which form the

cornerstone of the nature conservation

policy of the European Union.

http://ec.europa.eu/environment/nature/natura2000/index_en.htm

http://www.eea.europa.eu/data-and-maps/data/natura-6

Direct Mapping

W3C Recommendation from 2012

http://www.w3.org/TR/rdb-direct-mapping/

Relational tables are mapped to classes defined by an RDF vocabulary.

Attributes of each table are mapped to RDF properties that represent the relation between subject and object resources.

Identifiers, class names, properties and instances are generated automatically following the labels of the input data.
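The rules above can be sketched in a few lines of Python. This is a simplified illustration, not a conforming Direct Mapping processor: it assumes a single table with a single-column primary key, builds IRIs by plain string concatenation without percent-encoding, and takes column datatypes from a hypothetical datatypes argument. The function name direct_map is made up for this example.

```python
# Sketch of the Direct Mapping idea for one table with a single-column key.
# Assumptions: IRIs are built by concatenation (no percent-encoding) and
# datatypes are passed in by the caller instead of being read from the schema.

BASE = "http://foo.example/DB/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def direct_map(table, pk, rows, datatypes=None):
    """Return (subject, predicate, object) strings for every row of the table."""
    datatypes = datatypes or {}
    triples = []
    for row in rows:
        # The row node is identified by table name and primary-key value.
        subject = f"<{BASE}{table}/{pk}={row[pk]}>"
        # The table name becomes the class of the row node.
        triples.append((subject, f"<{RDF_TYPE}>", f"<{BASE}{table}>"))
        # Each column becomes a property named <table#column>.
        for column, value in row.items():
            predicate = f"<{BASE}{table}#{column}>"
            literal = f'"{value}"'
            if column in datatypes:
                literal += f"^^<{datatypes[column]}>"
            triples.append((subject, predicate, literal))
    return triples

rows = [{"Sitecode": "DE0916391", "Sitename": "NTP S-H W", "ReleaseDate": "2011-01-27"}]
triples = direct_map(
    "ProtectedArea", "Sitecode", rows,
    datatypes={"ReleaseDate": "http://www.w3.org/2001/XMLSchema#date"},
)
```

Note that this sketch emits a literal triple for every column, including the key column.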

Direct Mapping - Example

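Table ProtectedArea:

Sitecode  | Sitename   | ReleaseDate | …
DE0916391 | NTP S-H W  | 2011-01-27  |
DE1003301 | DOGGERBANK | 2011-01-27  |

[Figure: resulting graph. Two nodes of class ProtectedArea, each with a Sitename property (xsd:string) and a ReleaseDate property (xsd:date).]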

@base <http://foo.example/DB/> .

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<ProtectedArea/Sitecode=DE0916391> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#Sitename> "NTP S-H W" .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#ReleaseDate>

"2011-01-27"^^xsd:date .

<ProtectedArea/Sitecode=DE1003301> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#Sitename> "DOGGERBANK" .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#ReleaseDate>

"2011-01-27"^^xsd:date .

The language R2RML

R2RML is a language for expressing customized mappings from relational databases to RDF graphs.

R2RML is a W3C Recommendation from 2012.

http://www.w3.org/TR/r2rml/

R2RML mappings provide the user with the ability to express the desired transformation of existing relational data into the RDF data model, following a structure and a target vocabulary that is chosen by the user.


The language R2RML (cont’d)

[Figure: R2RML overview diagram with the components LogicalTable, TriplesMap, SubjectMap, PredicateObjectMap, PredicateMap, ObjectMap, RefObjectMap (Join: Child, Parent), GraphMap and TermMap (Constant, Column, Template).]

The language R2RML (cont’d)

A logical table can be

• a relational table that is explicitly stored in the database

• an SQL view

• an SQL select query

A triples map is a rule that defines how each tuple of the logical table will be mapped to a set of RDF triples. It consists of

• a subject map

• zero or more predicate-object maps.

The language R2RML (cont’d)

A subject map is a rule that defines how to generate the URI that will be the subject of each generated RDF triple.

A predicate-object map consists of predicate maps and object maps.

A predicate map defines the RDF property to be used to relate the subject and the object of the generated triple.

An object map defines how to generate the object of the triple which originates from the current row of the logical table.


The language R2RML (cont’d)

Subject, predicate, object and graph maps are term maps. A term map is a function that generates an RDF term from a row of a logical table. Three types of term maps are defined:

• constant-valued term maps

• column-valued term maps

• template-valued term maps
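A minimal sketch of the three kinds of term maps, each modelled as a function from a row to a value. The helper names (constant_term, column_term, template_term) are illustrative and do not belong to any R2RML library, and unlike a real processor the template expansion does not percent-encode values.

```python
import re

# Each term map is modelled as a function that takes a row (column -> value)
# and returns the generated term as a string.

def constant_term(value):
    """Constant-valued term map: ignores the row."""
    return lambda row: value

def column_term(column):
    """Column-valued term map: returns the value of one column."""
    return lambda row: row[column]

def template_term(template):
    """Template-valued term map: fills each {column} placeholder from the row."""
    return lambda row: re.sub(
        r"\{([^{}]+)\}", lambda m: str(row[m.group(1)]), template
    )

row = {"Sitecode": "DE0916391", "Sitename": "NTP S-H W"}
subject = template_term("http://foo.example/DB/ProtectedArea/Sitecode={Sitecode}")(row)
predicate = constant_term("http://foo.example/DB/ProtectedArea#Sitename")(row)
obj = column_term("Sitename")(row)
```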

The language R2RML (cont’d)

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map. Optionally, it has one or more join condition properties.

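[Figure: a PredicateObjectMap points via rr:objectMap to a RefObjectMap, which points via rr:parentTriplesMap to a TriplesMap and via rr:joinCondition (zero or more) to a JoinCondition; each JoinCondition has an rr:child column name and an rr:parent column name.]

source: http://www.w3.org/TR/r2rml/#dfn-predicate-map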

The language R2RML – Example

Table ProtectedArea:

Sitecode  | Sitename   | ReleaseDate | …
DE0916391 | NTP S-H W  | 2011-01-27  |
DE1003301 | DOGGERBANK | 2011-01-27  |

[Figure: target graph. Nodes of class ProtectedArea, each with a Sitename property (xsd:string).]

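@base <http://foo.example/DB/> .

# Column names match the table header (Sitecode, Sitename) so that the
# mapping produces exactly the triples listed below.
<NaturaMapping>
  rr:subjectMap [
    rr:template "ProtectedArea/Sitecode={Sitecode}";
    rr:class <ProtectedArea> ];
  rr:predicateObjectMap [
    rr:predicate <ProtectedArea#Sitename>;
    rr:objectMap [ rr:column "Sitename"; ]; ] .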

<ProtectedArea/Sitecode=DE0916391> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#Sitename> "NTP S-H W" .

<ProtectedArea/Sitecode=DE1003301> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#Sitename> "DOGGERBANK" .

Mapping non-relational data to RDF graphs

OpenStreetMap is a collaborative project for publishing free maps of the world. OSM maintains a community-driven global editable map that gathers map data in a crowdsourcing fashion.

http://www.openstreetmap.org/

<ogr:FeatureCollection>
  <gml:featureMember>
    <ogr:waterways fid="waterways.128">
      <ogr:osm_id>8108139</ogr:osm_id>
      <ogr:name>Lech</ogr:name>
      <ogr:type>river</ogr:type>
      <ogr:geometryProperty>
        <gml:LineString>
          <gml:coordinates>
            10.9034096,47.7996669
            10.9037025,47.8003338 …
          </gml:coordinates>
        </gml:LineString>
      </ogr:geometryProperty>
    </ogr:waterways>
  </gml:featureMember>
</ogr:FeatureCollection>

?

RDF Mapping Language (RML)

RML is a recently proposed mapping language that defines how to map heterogeneous sources into RDF.

http://semweb.mmlab.be/rml/spec.html

RML is defined as a superset of the W3C-standard R2RML.

R2RML                           | RML
Logical Table (rr:logicalTable) | Logical Source (rml:logicalSource)
Table Name (rr:tableName)       | URI (rml:source)
Column (rr:column)              | Reference (rml:reference)
(SQL)                           | Reference Formulation (rml:referenceFormulation)
Per-row iteration               | Defined iterator (rml:iterator)

source: http://semweb.mmlab.be/rml/RML_R2RML.html

RML Overview

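[Figure: RML overview diagram. Same components as the R2RML overview (TriplesMap, SubjectMap, PredicateObjectMap, PredicateMap, ObjectMap, RefObjectMap with Join: Child, Parent, GraphMap, TermMap), with LogicalSource (Source, Iterator, Reference Formulation) in place of LogicalTable and the term map kinds Constant, Reference and Template.]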

RML extensions

A logical source refers to the input dataset that will be converted to an RDF graph.

Each logical source has

• a source property pointing to the input data

• a logical iterator that defines the iteration pattern over the input data source

• an optional reference formulation property that defines the query language that may be used (e.g., SQL2008, XPath, JSONPath)

An RML reference is a term map that refers to a column name (SQL, CSV), an XML element or attribute, or a JSON object.

<ogr:FeatureCollection>

<gml:featureMember>

<ogr:waterways fid="waterways.128">

<ogr:osm_id>8108139</ogr:osm_id>

<ogr:name>Lech</ogr:name>

<ogr:type>river</ogr:type>

<ogr:geometryProperty>

<gml:LineString>

<gml:coordinates>

10.9034096,47.7996669

10.9037025,47.8003338 …

</gml:coordinates>

</gml:LineString>

</ogr:geometryProperty>

</ogr:waterways>

</gml:featureMember>

</ogr:FeatureCollection>

RML Example

<#waterways>
  rml:logicalSource [
    rml:source "/home/leo/osm.gml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/ogr:FeatureCollection/gml:featureMember/ogr:waterways";
  ];
  rr:subjectMap [
    rr:template "http://www.example.com/id/{@fid}";
    rr:class onto:waterways;
  ];
  rr:predicateObjectMap [
    rr:predicate onto:hasOgr-Name;
    rr:objectMap [
      rr:datatype xsd:string;
      rml:reference "ogr:name";
    ]; ] .

ex_id:waterways.128 rdf:type onto:waterways ;

onto:hasOgr-Name "Lech" ;

onto:hasFid "waterways.128"^^xsd:ID ;

onto:hasOgr-Osm_id "8108139" ;

onto:hasOgr-Type "river" .
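What an RML processor does with such a mapping can be approximated with the Python standard library. The sketch below is an illustration only: the namespace URIs for ogr and gml are assumptions (the slide does not show the xmlns declarations), the function name map_waterways is made up, and only the subject map and the ogr:name predicate-object map are emulated.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the sample document on the slide omits them.
NS = {"ogr": "http://ogr.maptools.org/", "gml": "http://www.opengis.net/gml"}

GML = """<ogr:FeatureCollection xmlns:ogr="http://ogr.maptools.org/"
                       xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <ogr:waterways fid="waterways.128">
      <ogr:osm_id>8108139</ogr:osm_id>
      <ogr:name>Lech</ogr:name>
      <ogr:type>river</ogr:type>
    </ogr:waterways>
  </gml:featureMember>
</ogr:FeatureCollection>"""

def map_waterways(xml_text):
    """Emulate the iterator, subject map and one predicate-object map."""
    root = ET.fromstring(xml_text)
    triples = []
    # Iterator: /ogr:FeatureCollection/gml:featureMember/ogr:waterways,
    # written relative to the root element.
    for node in root.findall("gml:featureMember/ogr:waterways", NS):
        # Subject template "http://www.example.com/id/{@fid}"
        subject = "http://www.example.com/id/" + node.get("fid")
        triples.append((subject, "rdf:type", "onto:waterways"))
        # Reference "ogr:name" evaluated relative to the current node
        name = node.findtext("ogr:name", namespaces=NS)
        if name is not None:
            triples.append((subject, "onto:hasOgr-Name", name))
    return triples

result = map_waterways(GML)
```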

Mapping geospatial data to RDF graphs

Geospatial data are available in formats such as:

• ESRI shape files

• KML documents

• GeoJSON documents

• XML documents

Geospatial data may also be stored in spatially-enabled relational databases.

Extending R2RML with transformation-valued term maps

[Figure: the R2RML overview diagram (LogicalTable, TriplesMap, SubjectMap, PredicateObjectMap, PredicateMap, ObjectMap, RefObjectMap with Join: Child, Parent, GraphMap, TermMap with Constant, Column, Template) extended with Function and ArgumentMap, both for term maps and for joins.]

Extending RML with transformation-valued term maps

[Figure: the RML overview diagram (LogicalSource with Source, Iterator, Reference Formulation; TriplesMap, SubjectMap, PredicateObjectMap, PredicateMap, ObjectMap, RefObjectMap with Join: Child, Parent, GraphMap, TermMap with Constant, Column, Template) extended with Function and ArgumentMap, both for term maps and for joins.]

Transformation-valued term maps

A transformation-valued term map is a term map that generates an RDF term by applying a SPARQL extension function on one or more term maps.

A transformation-valued term map has exactly

• one rrx:function property that defines a SPARQL extension function that performs the desired transformation

• one rrx:argumentMap property that has as range an rdf:List of term maps that define the arguments to be passed to the transformation function

Transformation-valued term maps (cont’d)

Extending join conditions

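[Figure: extended join conditions. A PredicateObjectMap points via rr:objectMap to a RefObjectMap, which points via rr:parentTriplesMap to a TriplesMap and via rr:joinCondition (zero or more) to a JoinCondition. Besides rr:child and rr:parent column names, a JoinCondition may have an rrx:function (IRIref or Function) and an rrx:argumentMap (rdf:List).]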

Example

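Table ProtectedArea:

Sitecode  | Sitename   | Geom         | …
DE0916391 | NTP S-H W  | POLYGON((…)) |
DE1003301 | DOGGERBANK | POLYGON((…)) |

[Figure: target graph. A ProtectedArea node with an xsd:string attribute, linked by geo:hasGeometry to a geometry node.]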

<NaturaGeometryMapping>
  rr:subjectMap [
    rr:template "ProtectedArea/Geometry/Sitecode={Sitecode}";
    rr:class geo:Geometry ];
  rr:predicateObjectMap [
    rr:predicate geo:dimension;
    rr:objectMap [
      rrx:function strdf:dimension;
      rrx:argumentMap ( [rr:column "`Geom`"] ); ]; ] .

<ProtectedArea/Geometry/Sitecode=DE0916391>
  rdf:type geo:Geometry ;
  geo:dimension "2"^^xsd:integer .
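The idea of a transformation-valued term map, a function applied to the values produced by its argument term maps, can be sketched as follows. The names wkt_dimension and transformation_term are hypothetical, and wkt_dimension is only a stand-in for a real dimension function: it inspects the WKT type keyword and handles neither empty geometries nor geometry collections.

```python
# Stand-in for a dimension function over WKT literals (assumption: only the
# geometry type keyword is inspected).
WKT_DIMENSION = {
    "POINT": 0, "MULTIPOINT": 0,
    "LINESTRING": 1, "MULTILINESTRING": 1,
    "POLYGON": 2, "MULTIPOLYGON": 2,
}

def wkt_dimension(wkt):
    keyword = wkt.strip().split("(", 1)[0].strip().upper()
    return WKT_DIMENSION[keyword]

def transformation_term(function, argument_columns):
    """Term map that applies `function` to the listed columns of a row."""
    return lambda row: function(*(row[c] for c in argument_columns))

row = {"Sitecode": "DE0916391", "Geom": "POLYGON((0 0, 4 0, 4 3, 0 0))"}
dimension = transformation_term(wkt_dimension, ["Geom"])(row)
```

Here the list ["Geom"] plays the role of the rrx:argumentMap list, and wkt_dimension plays the role of the rrx:function.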

[Figure: target graph (cont’d). The geo:Geometry node has the properties geo:dimension and geo:asWKT (geo:wktLiteral).]

Example

Table ProtectedArea:

Sitecode  | Sitename   | Geom         | …
DE0916391 | NTP S-H W  | POLYGON((…)) |
DE1003301 | DOGGERBANK | POLYGON((…)) |

<NaturaGeometryMapping>
  rr:subjectMap [
    rr:template "ProtectedArea/Geometry/Sitecode={Sitecode}";
    rr:class geo:Geometry ];
  rr:predicateObjectMap [
    rr:predicate geo:sfIntersects;
    rr:objectMap [
      rr:parentTriplesMap <#waterwaysGeom> ;
      rr:joinCondition [
        # The join function must return a Boolean, hence geof:sfIntersects.
        rrx:function geof:sfIntersects;
        rrx:argumentMap (
          [ rr:column "`Geom`" ]
          [ rml:reference "ogr:geometryProperty";
            rr:parentTriplesMap <#waterwaysGeom> ]
        ); ] ; ]; ] .

natura:DE0916391 geo:sfIntersects osm-id:waterways.128 .

<ogr:FeatureCollection>

<gml:featureMember>

<ogr:waterways fid="waterways.128">

<ogr:osm_id>8108139</ogr:osm_id>

<ogr:geometryProperty>

<gml:LineString>

<gml:coordinates>

10.9034096,47.7996669 …

</gml:coordinates>

</gml:LineString>

</ogr:geometryProperty>

</ogr:waterways>

</gml:featureMember>

</ogr:FeatureCollection>

OSM Waterways

<#waterwaysGeom>
  rml:logicalSource [
    rml:source "/home/leo/osm.gml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/ogr:FeatureCollection/gml:featureMember/ogr:waterways";
  ];
  rr:subjectMap [
    rr:template "http://www.osm.org/id/{@fid}";
    rr:class onto:waterways;
  ].

Implemented Systems

Direct Mapping processors: SquirrelRDF

R2RML processors: D2RQ Platform, OpenLink Virtuoso, Ultrawrap, Morph, Ontop, Oracle

RML processor: processor by iMinds Lab, Ghent University

Other mapping language: Triplify

Geospatial capabilities so far: Geometry2RDF, Sparqlify, TripleGeo, GeoTriples

Tool                     | Custom Mapping Language | Direct Mapping | R2RML | RML | SPARQL query evaluation | Automatic Mapping Generation | Geospatial support
OpenLink Virtuoso        | ✔  | ✖ | ✔* | ✖   | ✔  | ✖ | ✖
RDF-RDB2RDF              | ✖  | ✔ | ✔  | ✖   | ✖  | ✖ | ✖
D2RQ Platform            | ✔  | ✖ | ✔  | ✖   | ✔  | ✔ | ✖
Db2triples               | ✖  | ✔ | ✔  | ✖   | ✖  | ✖ | ✖
Morph                    | ✔  | ✖ | ✔  | ✖   | ✔  | ? | ✖
Sparqlify                | ✔  | ✖ | ✖* | ✖   | ✔  | ✖ | ✔
Ontop                    | ✔  | ✖ | ✔  | ✖   | ✔  | ✖ | ✖*
Ultrawrap                | ✔* | ✔ | ✔  | ✖   | ✔  | ✖ | ✖
Oracle                   | ✖  | ✔ | ✔  | ✖   | ✔  | ✔ | ✖
Geometry2RDF             | ✖  | ✔ | ✖  | ✖   | ✖  | ✖ | ✔*
TripleGeo                | ✖  | ✔ | ✖  | ✖   | ✖  | ✖ | ✔*
iMinds lab RML processor | ✖  | ✖ | ✔  | ✔   | ✖  | ✖ | ✖
GeoTriples               | ✖  | ✖ | ✔  | (✔) | ✖* | ✔ | ✔

Comparison of Geo2RDF tools

Tool         | Direct Mapping | R2RML | RML | Automatic Mapping Generation | GeoSPARQL compliance | RDBMS | ESRI Shapefile | GML | GeoJSON
Geometry2RDF | ✔ | ✖ | ✖ | ✖ | ✖   | ✔ | ✖ | ✖ | ✖
TripleGeo    | ✔ | ✖ | ✖ | ✖ | (✔) | ✔ | ✔ | ✖ | ✖
GeoTriples   | ✖ | ✔ | ✔ | ✔ | ✔   | ✔ | ✔ | ✔ | ✔

GeoTriples

Open Source software

Released under Mozilla Public Licence v2.0

Available at: https://github.com/LinkedEOData/GeoTriples

Extends the D2RQ Platform

Extends the iMinds lab RML processor

Provides both a graphical user interface and a command line interface


Architecture of GeoTriples

[Figure: GeoTriples architecture diagram; recoverable label: Earth Observation Acquisitions.]

Automatic generation of R2RML mappings (cont’d)

Generate two triples maps for each table that has a geometry column:

• Thematic triples map for the non-geometric information

• Spatial triples map for the geometric information

The spatial triples map contains multiple transformation functions over the input geometries in order to generate a GeoSPARQL compliant dataset.

[Figure: the NaturaArea triples map is linked to the NaturaGeometry triples map by geo:hasGeometry (rr:joinCondition).]

Automatic generation of RML mappings for GML documents

Each geometric object is mapped to a geo:Geometry instance

For each geometric object we generate a set of predicate object maps that use the appropriate transformation functions for producing a GeoSPARQL compliant dataset

Each simple element is mapped to a predicate object map

Each non simple element is mapped to a triples map

Appropriate mappings are generated for linking nested elements

[Figure: XSD → Mapping Generator → RML mapping.]

Demonstration


Discovering Spatial and Temporal Links

among RDF Graphs

Publishing and Interlinking Linked Geospatial Data In Conjunction with the 12th Extended Semantic Web Conference

Portoroz, Slovenia, 1st June 2015

Presenter: Panayiotis Smeros

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 2

Outline

• Introduction to Entity Resolution and Link

Discovery

– Examples, Definitions, Common Problems

• Spatial Entity Resolution

• Spatial and Temporal Link Discovery

– Background and Developed Methods

– Extensions to the Silk Framework

– Hands-on


Entities in Real-World

Most of our knowledge about the world is based on entities and their relations.

Entities in Data-World

Many names, descriptions or IDs (URIs) are used for the same real-world entity:

• Names: Portoroz, Portorož, بورتوروز, Порторож, Πορτορόζ, Portorose, Портороз, Порторожу, Portorožu

• Description: Portorož (Italian: Portorose, literally "Port of Roses") is an Adriatic-Mediterranean coastal settlement in the Municipality of Piran in southwestern Slovenia. Its modern development began in the late 19th century with the appearance of the first health resorts.

• URIs: http://www.geonames.org/3192682/portoroz.html http://en.wikipedia.org/wiki/Portoroz http://www.portoroz.si/en/ …

Content Providers

Many applications provide valuable information about each of these entities:

• News about Portoroz

• Reviews of hotels in Portoroz

• Pictures about Portoroz

• Videos for Portoroz

• Wiki pages about Portoroz

• Social networks in Portoroz


Solution?


Entity Resolution

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]


Entity Resolution (Example)

DBpedia entity: name = PORTOROZ, population = 2,849

GeoNames entity: name = Portorose, population = 2,851

The two entities are linked by sameAs.

Spatial Entity Resolution (Example)

DBpedia entity: name = PORTOROZ, population = 2,849, location = 45.51663, 13.57996

GeoNames entity: name = Portorose, population = 2,851, location = 45.51661, 13.57998

The two entities are linked by sameAs.

Entity Resolution (Definition)

Let $S$ and $T$ be two sets of entities. We define a distance (similarity) function $d_{similarity}$ and a distance (similarity) threshold $\theta_{d_{similarity}}$ as follows:

$$d_{similarity}: S \times T \to [0,1], \qquad \theta_{d_{similarity}} \in [0,1]$$

We define the set of discovered similarity links $DL_{similarity}$ as follows:

$$DL_{similarity} = \{ (s, \mathrm{sameAs}, t) \mid s \in S \wedge t \in T \wedge d_{similarity}(s,t) < \theta_{d_{similarity}} \}$$
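The definition translates almost directly into code. The sketch below is one possible instantiation, not the method of any particular system: it assumes a normalised edit distance on case-folded names as the distance function and an arbitrary threshold of 0.3. The entity identifier geonames:other and the helper names are made up for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming over two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def name_distance(s, t):
    """Normalised edit distance in [0, 1] on case-folded names."""
    a, b = s["name"].lower(), t["name"].lower()
    return edit_distance(a, b) / max(len(a), len(b), 1)

def discover_links(S, T, distance, theta, relation="sameAs"):
    """Emit (s, relation, t) for every pair whose distance is below theta."""
    return {(s["id"], relation, t["id"])
            for s in S for t in T if distance(s, t) < theta}

S = [{"id": "dbpedia:Portoroz", "name": "PORTOROZ"}]
T = [{"id": "geonames:3192682", "name": "Portorose"},
     {"id": "geonames:other", "name": "Ljubljana"}]
links = discover_links(S, T, name_distance, theta=0.3)
```

The nested loop compares every pair in $S \times T$; the optimisations discussed later in the tutorial aim at avoiding exactly this quadratic cost.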

Link Discovery


Link Discovery is the fourth and the most important Linked Data

Principle.

Establish semantic relations between entities in order to enrich the

information that is known about them. [Bizer et al., IJSWIS’06]

Link Discovery (Definition)

Let $S$ and $T$ be two sets of entities and $R$ the set of relations that can be discovered between entities. For a relation $r \in R$, w.l.o.g., we define a distance function $d_r$ and a distance threshold $\theta_{d_r}$ as follows:

$$d_r: S \times T \to [0,1], \qquad \theta_{d_r} \in [0,1]$$

We define the set of discovered links for relation $r$ ($DL_r$) as follows:

$$DL_r = \{ (s, r, t) \mid s \in S \wedge t \in T \wedge d_r(s,t) < \theta_{d_r} \}$$

Link Discovery (Example)

[Figure: two maps of discovered links. Natura 2000 areas and fields are linked by "contains"; fields and OSM water bodies are linked by "intersects".]

Main Problem: Heterogeneity

• Different Data Providers create Heterogeneous Datasets

– Example: Literal Heterogeneity (case, language, etc.), e.g. name = PORTOROZ vs. name = Portorose

• We focus on:

– Heterogeneity in the Representation of Geospatial Information in RDF

– Heterogeneity in the Representation of Temporal Information in RDF

Heterogeneity in the Representation of

Geospatial Information in RDF

_:1 rdf:type geo:Geometry .
_:1 geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(10 20)"^^geo:wktLiteral .

_:1 rdf:type strdf:Geometry .
_:1 strdf:hasGeometry "<gml:Point crsName='EPSG:2100'><gml:coordinates>10,20</gml:coordinates></gml:Point>"^^strdf:GML .

_:1 rdf:type wgs84Geo:Point .
_:1 wgs84Geo:lat "10"^^xsd:double .
_:1 wgs84Geo:long "20"^^xsd:double .


These representations differ in several ways:

• Different Vocabularies

• Different Serializations of Geometries

• Geometries expressed in Different Coordinate Reference Systems (CRS)


Heterogeneity in the Representation of

Geospatial Information in RDF

• Different Sampling Values

• Different Granularity

• Different Rounding Effects


Heterogeneity in the Representation of

Temporal Information in RDF

_:1 ex:hasBirthday "1989-09-24T11:05:00+01:00"^^xsd:dateTime .

_:1 ex:hasAffiliation ex:UoA "[2007-10-15T00:00:00+03:00, 2013-10-15T00:00:00+04:00)"^^strdf:Period .


These representations differ in several ways:

• Different Vocabularies

• Different Time Zones

• Time Instants and Periods



Spatial Entity Resolution (1/4)

• Location Name Similarity

– Edit, Jaccard distance

• Location Similarity

– Euclidean distance

• Location Type Similarity

– (e.g. type “river” is similar to type “stream”)

Combines the above similarities to compute the

overall similarity between entities


Spatial Entity Resolution (2/4)

• Similarity measure: Hausdorff Distance

– Intuitively, Hausdorff Distance is defined as the largest distance between the closest points of two geometric shapes

• Handling Geospatial Heterogeneity

– Converts geometries to a common vocabulary (NeoGeo)

– Assumes WGS-84 CRS

• Optimization

– Simplifies Geometries with the Ramer-Douglas-Peucker algorithm
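The intuition behind the Hausdorff distance can be made concrete with a short sketch. As a simplifying assumption, each geometry is represented only by its list of vertices and distances are Euclidean in the plane, so the result approximates the Hausdorff distance between the full shapes. The function names are illustrative.

```python
import math

def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

line_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
line_b = [(0.0, 1.0), (2.0, 1.0)]
d = hausdorff(line_a, line_b)
```

Every vertex of one geometry is compared with every vertex of the other, which is why simplifying the geometries first reduces the cost of each comparison.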

Spatial Entity Resolution (3/4)

• Heuristic Combination of:

– URI Similarity

– Label Similarity

• Considering the language of the labels

– Location Similarity

• Assuming the W3C Geo vocabulary

– Geometric Similarity

• Minimum Distance between two Geometries


Spatial Entity Resolution (4/4)

• Non-Spatial Criteria

– Implemented within the LIMES framework

• Geometric Similarity

– Hausdorff Distance

– Optimizations

• Bounding Circle: Avoids useless comparisons

μ(s, t) = δ(ζ(s), ζ(t)) − r (s) − r (t) > θ ⇒ δ(s, t) > θ

• Space tiling: Reduces the quadratic number of comparisons


Spatial Entity Resolution

• [Sehgal et al., GIS’06]

– Spatial and non-Spatial Criteria

– Only Location Similarity

• [Salas et al., TerraCognita’11]

– Only Spatial Criteria

– Complex Geometric Similarity Methods

• [Vilches-Blázquez et al., AGILE’12]

– Spatial and non-Spatial Criteria

– Simple Geometric Similarity Methods

• [Ngonga Ngomo, ISWC’13]

– Spatial and non-Spatial Criteria

– Complex Geometric Similarity Methods

– Reduced number of comparisons


Background on Spatial Relations (1/2)

• Dimensionally Extended 9-Intersection Model [Clementini et al., SSD'93]

– Captures topological relations in ℝ2, by considering the

dimension (dim) of the intersections involving the

interior (I), the boundary (B) and the exterior (E) of the

two geometries.

– Examples: Intersects, Equals, Touches, Disjoint,

Contains, Crosses, Covers, CoveredBy and Within


Background on Spatial Relations (2/2)

• Region Connection Calculus [Randell et al., KR’92]

– RCC-8: a well-known subset of RCC, which is based on eight topological relations

– DC stands for DisConnected, EC for Externally Connected, EQ for EQual, PO for Partially Overlapping, TPP for Tangential Proper Part, NTPP for Non-Tangential Proper Part, and TPPi and NTPPi are the inverse relations of TPP and NTPP

Background on Temporal Relations

• Allen’s Interval Calculus [Allen, Commun. ACM’83]

– thirteen jointly exhaustive and pairwise disjoint qualitative relations
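Because the thirteen relations are jointly exhaustive and pairwise disjoint, any two proper intervals stand in exactly one of them. A minimal sketch of such a classifier follows; it assumes intervals are (start, end) pairs of comparable values with start < end, and the relation names are the usual English labels rather than identifiers from any specific vocabulary.

```python
def allen_relation(a, b):
    """Return the single Allen relation that holds between intervals a and b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Remaining case: the intervals overlap partially with distinct endpoints.
    return "overlaps" if a1 < b1 else "overlapped-by"
```

For example, allen_relation((1, 3), (2, 5)) yields "overlaps" and allen_relation((1, 2), (2, 3)) yields "meets".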

Spatial and Temporal Relations

• We consider the previous Spatial (𝑅𝑠) and Temporal (𝑅𝑡)

relations as Boolean relations (𝑅𝐵) i.e., either they hold or

they do not:

𝑅𝑠, 𝑅𝑡 ⊂ 𝑅𝐵

• 𝑅𝐵 constitutes a special subset of 𝑅. The distance function

𝑑𝑟 and the distance threshold 𝜃𝑑𝑟 for a relation 𝑟 ∈ 𝑅𝐵 are

defined as follows:

𝑑𝑟(s,t) = 0 𝑖𝑓 𝑟 ℎ𝑜𝑙𝑑𝑠1 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒

, 𝜃𝑑𝑟= 1


Spatial and Temporal Transformations (1/2)

• CRS Transformation. The geometries of a dataset can be expressed in a Coordinate Reference System that is more precise for the geographic area that they describe (e.g., the GGRS87 for Greece). This transformation converts the CRS of a geometry to the World Geodetic System (WGS 84)

• Vocabulary Transformation. This transformation converts geometry literals from GeoSPARQL, stRDF or W3C GEO to a common vocabulary (GeoSPARQL)

• Serialization Transformation. This transformation converts the geometries of a dataset to a common serialization (WKT)

• Time-Zone Transformation. This transformation converts the time zone of a given time interval to Coordinated Universal Time (UTC)

• Period Transformation. This transformation converts a time instant to a period with the same starting and ending point
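The two temporal transformations can be sketched with the Python standard library. The function names and the fixed UTC+3 offset used for the sample interval are assumptions for this example; they do not reflect how the Silk plugins are implemented.

```python
from datetime import datetime, timedelta, timezone


def to_utc(start, end):
    """Time-zone transformation: express both ends of an interval in UTC."""
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


def instant_to_period(instant):
    """Period transformation: a period whose start and end equal the instant."""
    return instant, instant


# Assumed fixed offset (UTC+3, summer time in Greece), for illustration only.
athens = timezone(timedelta(hours=3))
start = datetime(2015, 6, 1, 9, 0, tzinfo=athens)
end = datetime(2015, 6, 1, 17, 10, tzinfo=athens)

utc_start, utc_end = to_utc(start, end)
print(utc_start.isoformat(), utc_end.isoformat())
```

After both transformations every temporal value is a UTC period, so a single implementation of each temporal relation suffices.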


Spatial and Temporal Transformations (2/2)

• Simplification Transformation. Some datasets have very complex geometries, which makes the computation of spatial relations inefficient. This transformation simplifies a geometry according to a given distance tolerance, ensuring that the result is a valid geometry having the same dimension and number of components as the input

• Envelope Transformation. This transformation computes the envelope (i.e., the minimum bounding rectangle) of a geometry. It is useful when we want to compute approximate spatial relations between two datasets

• Area Transformation. In some cases it is enough to compare just the areas of two geometries to infer whether they are the same or not. This transformation computes the area of a given geometry in square metres

• Points-To-Centroid Transformation. In crowdsourcing datasets like OpenStreetMap, multiple users can define the position of the same placemark. As a better approximation of the real position of this placemark we can compute the centroid of these positions. This transformation computes the centroid of a cluster of points
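A minimal sketch of the envelope and points-to-centroid transformations on plain coordinate lists. The function names are made up for this example, and treating coordinates as planar (a simple arithmetic mean) is an assumption; a real implementation would rely on a geometry library.

```python
def envelope(coords):
    """Envelope transformation: (min_x, min_y, max_x, max_y) of a coordinate list."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def envelopes_intersect(a, b):
    """Approximate 'intersects' check that only compares two envelopes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def points_to_centroid(points):
    """Points-to-centroid transformation: arithmetic mean of a cluster of points."""
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


triangle = [(0, 0), (2, 1), (1, 3)]
print(envelope(triangle))
print(points_to_centroid([(0, 0), (2, 0), (2, 2), (0, 2)]))
```

Comparing envelopes is cheap, but it can report an intersection for geometries that do not actually intersect, which is why the result is only approximate.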


Techniques for Checking the Relations

• Cartesian Product Technique (Naive)

– Performs exhaustive checks between the pairs of entities of the datasets

– Complete

– Complexity: O(|S||T|) checks

• Blocking Technique [Isele et al., WebDB’11, Papadakis et al, TKDE’13]

– Divides the entities into blocks

– Decreases the number of checks

– Complete

– Complexity: O(|S||T|) checks (worst case), O(|L|) checks (best case)

* |S|, |T|: number of entities in datasets S and T; |L|: number of links between datasets S and T


Blocking Technique for Spatial Relations

• Divide the surface of the earth into curved rectangles (blocks)

• Adjust the area of the blocks with a blocking factor (bf) (blockArea: 1/bf² square degrees)

• If the MBB of a geometry spatially intersects with a block, then insert it in this block

• Check for a spatial relation only within each block (independently)

• Construct the set of discovered links (𝐷𝐿𝑟) by aggregating the respective links that have been discovered within each block
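The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Silk implementation: entities are represented only by their MBBs as (min_x, min_y, max_x, max_y) in degrees, blocks are grid cells with side 1/bf degrees, and an MBB intersection check stands in for the exact spatial relation. All names are hypothetical.

```python
from collections import defaultdict
from math import floor


def block_keys(mbb, bf):
    """Indices of every grid block (side 1/bf degrees) that the MBB intersects."""
    min_x, min_y, max_x, max_y = mbb
    for i in range(floor(min_x * bf), floor(max_x * bf) + 1):
        for j in range(floor(min_y * bf), floor(max_y * bf) + 1):
            yield i, j


def mbb_intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def discover_links(source, target, relation, bf):
    """Check the relation only inside each block, then aggregate the links."""
    blocks = defaultdict(lambda: ([], []))
    for sid, mbb in source.items():
        for key in block_keys(mbb, bf):
            blocks[key][0].append(sid)
    for tid, mbb in target.items():
        for key in block_keys(mbb, bf):
            blocks[key][1].append(tid)

    links = set()  # a set removes pairs that were found in several blocks
    for s_ids, t_ids in blocks.values():
        for sid in s_ids:
            for tid in t_ids:
                if relation(source[sid], target[tid]):
                    links.add((sid, tid))
    return links


source = {"s1": (0.1, 0.1, 0.4, 0.4), "s2": (5.0, 5.0, 5.2, 5.2)}
target = {"t1": (0.3, 0.3, 1.6, 1.6), "t2": (9.0, 9.0, 9.5, 9.5)}
print(discover_links(source, target, mbb_intersects, bf=2))
```

Two geometries that satisfy a relation implying intersection share at least one point, so they are always placed in at least one common block. This is why no link is lost for such relations; Disjoint is the exception and would need separate handling.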


Blocking Technique for Temporal Relations

• Divide the time into intervals (blocks)

• Adjust the length of the blocks with a blocking factor (bf) (blockLength: 1/bf time units)

• If a time period or instant temporally intersects with a block, then insert it in this block

• Check for a temporal relation only within each block (independently)

• Construct the set of discovered links (𝐷𝐿𝑟) by aggregating the respective links that have been discovered within each block
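The temporal case is the one-dimensional analogue. The sketch below also counts the relation checks, to show the saving over the Cartesian product. Intervals are (start, end) numbers in arbitrary time units; the names and the "during" relation used in the example are assumptions, not part of Silk.

```python
from collections import defaultdict
from math import floor


def temporal_blocks(intervals, bf):
    """Map each block index to the ids of the intervals intersecting that block.

    Blocks have length 1/bf time units.
    """
    blocks = defaultdict(list)
    for key, (start, end) in intervals.items():
        for b in range(floor(start * bf), floor(end * bf) + 1):
            blocks[b].append(key)
    return blocks


def discover_temporal_links(source, target, relation, bf):
    """Return (links, number of relation checks performed)."""
    s_blocks = temporal_blocks(source, bf)
    t_blocks = temporal_blocks(target, bf)
    links, checks = set(), 0
    for b, s_ids in s_blocks.items():
        for sid in s_ids:
            for tid in t_blocks.get(b, []):
                checks += 1
                if relation(source[sid], target[tid]):
                    links.add((sid, tid))
    return links, checks


def during(a, b):
    return b[0] < a[0] and a[1] < b[1]


source = {"a": (2, 3), "b": (40, 41)}
target = {"x": (1, 5), "y": (100, 120)}
links, checks = discover_temporal_links(source, target, during, bf=0.1)
print(links, checks)
```

In this toy input the Cartesian product would need four checks, whereas blocking with blocks of length 10 performs a single check because only "a" and "x" share a block.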


Blocking Technique

• Fully parallelizable with respect to the blocks

• Proven sound and complete

• 100% accurate links

• 100% precision, recall, F-measure

Extensions to the Silk Framework: Spatial and Temporal Relations

Extensions to the Silk Framework: Spatial and Temporal Transformations

Extensions to the Silk Framework

• Spatial and Temporal Extensions for Silk implemented as Plugins

• Transparent to all the applications of Silk

– Single Machine

– MapReduce

– Workbench


Hands-on Silk

• Download: https://github.com/silk-framework/silk

• Workbench application pre-installed in the VM

• Discover the following links:

Source Dataset      Relation     Target Dataset
Field Boundaries    Contains     Raster Cells
OSM Water Bodies    Intersects   Natura (2000)
Natura (2000)       Within       Federal States of Germany

All the datasets will be first converted to RDF with GeoTriples!

References (1/3)

• [Bizer et al., IJSWIS’06]

Bizer, C., Heath, T., Berners-Lee, T.: Linked Data - The Story So Far. International

Journal on Semantic Web and Information Systems 5(3), 1–22 (2009)

• [Christen, TKDE’11]

Christen, P.: A survey of indexing techniques for scalable record linkage and deduplication. IEEE TKDE (2011)

• [Auer, RW’13]

Auer, S., Lehmann, J., Ngomo, A.C.N., Zaveri, A.: Introduction to Linked Data and Its

Lifecycle on the Web. In: Rudolph, S., Gottlob, G., Horrocks, I., van Harmelen, F. (eds.)

Reasoning Web. Lecture Notes in Computer Science, vol. 8067, pp. 1–90. Springer

(2013)

• [Salas et al., TerraCognita’11]

Salas, J., Harth, A.: Finding spatial equivalences accross multiple RDF datasets. In:

Proceedings of the Terra Cognita Workshop on Foundations, Technologies and

Applications of the Geospatial Web. pp. 114–126. Citeseer (2011)

• [Sehgal et al. GIS’06]

Sehgal, V., Getoor, L., Viechnicki, P.D.: Entity resolution in geospatial data integration. In:

Proceedings of the 14th annual ACM international symposium on Advances in

geographic information systems. pp. 83–90. ACM (2006)


References (2/3)

• [Vilches-Blázquez et al., AGILE’12]

Vilches-Blázquez, L.M., Saquicela, V., Corcho, O.: Interlinking geospatial information in

the web of data. In: Bridging the Geographic Information Sciences, pp. 119–139.

Springer (2012)

• [Ngonga Ngomo, ISWC’13]

Ngonga Ngomo, A.C.: Orchid - reduction-ratio-optimal computation of geo-spatial

distances for link discovery. In: Proceedings of ISWC 2013 (2013)

• [Clementini et al., SSD'93]

Clementini, E., Di Felice, P., van Oosterom, P.: A small set of formal topological

relationships suitable for end-user interaction. In: Abel, D., Chin Ooi, B. (eds.) Advances

in Spatial Databases, Lecture Notes in Computer Science, vol. 692, pp. 277–295.

Springer Berlin Heidelberg (1993), http://dx.doi.org/10.1007/3-540-56869-7_16

• [Randell et al. KR’92]

Randell, D.A., Cui, Z., Cohn, A.G.: A spatial logic based on regions and connection. In:

KR. pp. 165–176 (1992)

• [Allen, Commun. ACM’83]

Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–

843 (Nov 1983)


References (3/3)

• [Isele et al., WebDB’11]

Isele, R., Jentzsch, A., Bizer, C.: Efficient multidimensional blocking for link discovery

without losing recall. In: WebDB. Citeseer (2011)

• [Papadakis et al, TKDE’13]

Papadakis, G., Ioannou, E., Palpanas, T., Niederée, C., Nejdl, W.: A blocking framework

for entity resolution in highly heterogeneous information spaces. Knowledge and Data

Engineering, IEEE Transactions on 25(12), 2665–2682 (2013)


Thanks for your attention! Questions?

Transforming Natura2000 Shapefile into RDF

Kostis Kyzirakos and Dimitrianos Savva

Natura2000 (South Germany)

GeoTriples GUI

• From terminal execute: geotriples-gui

1. Connect to Natura2000 Shapefile

2. Adjust class/predicate names to ontology

3. Generate mapping

4. Dump RDF

Connect to Shapefile

GeoTriples Layout

Mapping Builder (Left)
1. Adjust triples maps
2. Change DataTypes
3. Change Predicates

Bottom Toolbar
1. Generate Mapping
2. Select your preferred geo-vocabulary
3. Define CRS
4. Select output format
5. Dump RDF

Mapping Editor (Right)
1. Change the mapping by hand

Adjust to Ontology

Generate Mapping

Dump RDF

/home/leo/datasets/naturatriples.n3

RDF graph

Store RDF graph to Strabon

# endpoint store http://localhost:8080/strabonendpoint N-Triples -t /home/leo/datasets/naturatriples.n3

Transforming an OpenStreetMap GML document into an RDF graph (1/4)

# cd ~/DEMO_ESWC15

# ./osmmapping.sh

--

geotriples-cmd generate_mapping

-o OSM/automatic-mapping.rml.ttl

-b http://data.linkedeodata.eu/waterways

-r waterways

-rp /ogr:FeatureCollection/gml:featureMember

-ns "gml|http://www.opengis.net/gml,

ogr|http://ogr.maptools.org/"

-null -onto OSM/automatic-ontology.txt

-x OSM/osm_waterways.xsd OSM/osm_waterways.gml

Transforming an OpenStreetMap GML document into an RDF graph (2/4)

# cp OSM/automatic-mapping.rml.ttl OSM/altered-mapping.rml.ttl

# gedit OSM/altered-mapping.rml.ttl

Transforming an OpenStreetMap GML document into an RDF graph (3/4)

1. Change the class definition for the triples map <#ogr:waterwaysogr:geometryProperty>

   1. Replace the class onto:LineStringPropertyType with ogc:Geometry

2. Change the predicate that will link the thematic data with the geometric data.

   1. Find the triples map <#waterways>

   2. Replace the text onto:has_geometryProperty with ogc:hasGeometry

Transforming an OpenStreetMap GML document into an RDF graph (4/4)

# ./osmdump.sh

--

geotriples-cmd dump_rdf -rml

-o OSM/osmtriples.n3

-ns osm-namespaces.ns

OSM/altered-mapping.ttl

--

# endpoint store http://localhost:8080/strabonendpoint N-Triples -t /home/leo/DEMO_ESWC15/OSM/osmtriples.n3

Store TalkingFields datasets to Strabon

# endpoint store http://localhost:8080/strabonendpoint N-Triples -t /home/leo/datasets/fb.n3

# endpoint store http://localhost:8080/strabonendpoint N-Triples -t /home/leo/datasets/rc.n3


Start the Silk Workbench

Open Workspace

Import the project that you will find on the Desktop of the VM

Open the Linkage Rule

Modify the Linkage Rule

Start the Link Generation

Examining Generated Links

$ less /home/leo/Desktop/FieldBoundariesRasterCellsLinks.nt
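Besides paging through the file, a quick summary of the generated links can be obtained with a short script. This is a sketch that assumes one triple per line with IRI subjects and predicates (as in N-Triples); the function name and the sample triples are invented for illustration and are not actual tutorial output.

```python
from collections import Counter


def count_links_by_predicate(lines):
    """Count triples per predicate in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        _subject, predicate, _rest = line.split(None, 2)
        counts[predicate] += 1
    return counts


# Invented sample data; a real run would iterate over the opened links file.
sample = [
    "<http://example.org/field/1> <http://www.opengis.net/ont/geosparql#sfContains> <http://example.org/cell/7> .",
    "<http://example.org/field/1> <http://www.opengis.net/ont/geosparql#sfContains> <http://example.org/cell/8> .",
    "",
]
print(count_links_by_predicate(sample))
```

To summarise the real output, the function can be passed the open file object for the links file shown above instead of the sample list.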
