High Quality Linked Data Generation
Dr Anastasia Dimou, Post-Doc researcher
imec.be - [email protected] - @natadimou
High Quality Linked Data Generation
from Heterogeneous Data
best practices for semantically annotating & connecting structured data on the (Semantic) Web
Linked Data is (still) not intuitively available on the Web
Linked Data is derived from heterogeneous data sources, structures & formats
Linked Data needs to be consistent
Human agents do not want to put in effort to provide Linked Data
until there are software agents that use it
Why is Linked Data generation such a difficult task?
[Diagram: the data owner's data ("my data") is turned into Linked Data by a custom implementation]
[Diagram: data owner with XML, CSV, JSON and DB sources; one format-specific implementation per source, each generating Linked Data]
facilitate Linked Data generation by reducing data & semantic heterogeneity
& increasing Linked Data quality
[Diagram: data owner with multiple data sources (DB, CSV, XML, JSON), each generating Linked Data]
[Diagram: uniform execution over DB, CSV, XML and JSON sources, generating Linked Data]
[Diagram: uniform declaration (?) over DB, CSV, XML and JSON sources, generating Linked Data]
declaration, execution, assessment
[Diagram: data owner with DB, CSV, XML and JSON sources; first a "?" and then R2RML sits between the sources and the generated Linked Data]
Extending R2RML to a source-independent mapping language for RDF. A. Dimou et al.
RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. A. Dimou et al.
http://RML.io
Mapping rules: <http://ex.com/paper/{id}> (bibo:Document) -- bibo:presentedAt --> <http://ex.com/conf/{venue}> (bibo:Event)

id | title | venue
1 | Assessing & Refining Mappings to RDF to improve Dataset Quality | ISWC 2015
2 | RMLEditor: a graph-based editor for Linked Data mappings | ESWC 2016
3 | An ontology to semantically declare and describe functions | ESWC 2016
4 | Modeling, Generating, Publishing Knowledge as Linked Data | EKAW 2017
5 | Semi-automatic example-driven linked data mapping creation | ISWC 2017
Applied to row 1: <http://ex.com/paper/1> (bibo:Document) -- bibo:presentedAt --> <http://ex.com/conf/ISWC2015> (bibo:Event)
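How the rules above turn table rows into triples can be sketched in a few lines of Python. This is only an illustration, not the RML processor: the helper names `expand` and `generate` are hypothetical, and dropping the space in "ISWC 2015" simply mirrors the IRI shown on the slide (an actual R2RML/RML processor would percent-encode the value instead).

```python
import re

# Two rows of the example table; the column names come from the slide.
ROWS = [
    {"id": "1", "title": "Assessing & Refining Mappings to RDF to improve Dataset Quality", "venue": "ISWC 2015"},
    {"id": "2", "title": "RMLEditor: a graph-based editor for Linked Data mappings", "venue": "ESWC 2016"},
]


def expand(template, row):
    """Fill every {column} placeholder with the row's value.

    Assumption: spaces are dropped to reproduce the slide's
    <http://ex.com/conf/ISWC2015>; real processors percent-encode.
    """
    return re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)].replace(" ", ""), template)


def generate(rows):
    """Apply the three rules of the example to each row."""
    triples = []
    for row in rows:
        paper = expand("http://ex.com/paper/{id}", row)
        conf = expand("http://ex.com/conf/{venue}", row)
        triples.append((paper, "rdf:type", "bibo:Document"))
        triples.append((conf, "rdf:type", "bibo:Event"))
        triples.append((paper, "bibo:presentedAt", conf))
    return triples


for s, p, o in generate(ROWS):
    print(s, p, o)
```

The point of the sketch is that the rules are declared once and applied to every row, so the same declaration works for 5 rows or 5 million.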
Who defines those rules?
<datafield tag="001#"><subfield code="0">2,192</subfield></datafield>
<datafield tag="001A"><subfield code="0">1961:22-06-88</subfield></datafield>
<datafield tag="001U"><subfield code="0">utf8</subfield></datafield>
<datafield tag="002C"><subfield code="a">tekst</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent/dut</subfield></datafield>
<datafield tag="002D"><subfield code="a">zonder medium</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia/dut</subfield></datafield>
<datafield tag="003#"><subfield code="0">047268573</subfield></datafield>
<datafield tag="003O"><subfield code="a">OCoLC</subfield><subfield code="0">64827916</subfield></datafield>
Semantic Web developer
rr:subjectMap [ rr:template "http://data.kb.nl/ppn/{repox:metadata/record/datafield[@tag=\"003@\"]/subfield[@code=\"0\"]}";
    rr:class ex:Record ];
rr:predicateObjectMap [ rr:predicate rdfs:label;
    rr:objectMap [ rml:reference "repox:metadata/record/datafield[@tag=\"021A\"]/subfield[@code=\"a\"]"; rr:language "nl" ] ];
rr:predicateObjectMap [ rr:predicate dcterms:type;
    rr:objectMap [ rml:reference "repox:metadata/record/datafield[@tag=\"002@\"]/subfield[@code=\"0\"]" ] ];
rr:predicateObjectMap [ rr:predicate dcterms:publisher;
    rr:objectMap [ rml:reference "repox:metadata/record/datafield[@tag=\"033A\"]/subfield[@code=\"n\"]" ] ];
rr:predicateObjectMap [ rr:predicate rdfs:seeAlso;
    rr:objectMap [ rr:template "http://www.worldcat.org/oclc/{metadata/record/datafield[@tag=\"003O\"]/subfield[@code=\"0\"]}" ] ].
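What the references in these rules select can be tried out with Python's standard library. The sketch below is an assumption-laden simplification: the `<record>` wrapper element and the helper `reference` are hypothetical, the `repox:` namespace prefix is left out, and only the limited XPath subset of `xml.etree.ElementTree` is used. It evaluates the last rule (rdfs:seeAlso) against the 003O datafield of the record shown earlier.

```python
import xml.etree.ElementTree as ET

# Two datafields copied from the record on the slide, wrapped in a
# hypothetical <record> root so that the snippet parses on its own.
RECORD = """<record>
  <datafield tag="002C"><subfield code="a">tekst</subfield><subfield code="b">txt</subfield></datafield>
  <datafield tag="003O"><subfield code="a">OCoLC</subfield><subfield code="0">64827916</subfield></datafield>
</record>"""


def reference(record, tag, code):
    """Return the text of datafield[@tag]/subfield[@code], or None if absent."""
    node = record.find(f"datafield[@tag='{tag}']/subfield[@code='{code}']")
    return None if node is None else node.text


record = ET.fromstring(RECORD)
oclc = reference(record, "003O", "0")

# Equivalent of the rr:template in the rdfs:seeAlso rule.
see_also = f"http://www.worldcat.org/oclc/{oclc}"
print(see_also)
```

Writing such path expressions by hand for every field is exactly the effort that makes the rules hard to define for anyone who is not a Semantic Web developer.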
data owner
RMLEditor: A Graph-Based Mapping Editor for Linked Data Mappings. P. Heyvaert et al.
declaration, execution, assessment
declare mapping rules
RML Language
[Diagram: data owner with DB, CSV, XML and JSON sources; the RML rules are handled by the RML handler of the RML Processor, which generates Linked Data]
Machine-interpretable dataset & service descriptions for heterogeneous data access and retrieval. A. Dimou et al.
[Diagram: the RML Processor gains a data retrieval handler; source descriptions (source desc) describe how to access the table, CSV, XML and JSON sources]
“15th October 2017” becomes “2017-10-15”^^xsd:date
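The value transformation on this slide is the kind of step a function has to perform during generation. A minimal sketch in Python, assuming English month names and a "day month year" layout; the function name `to_xsd_date` is hypothetical and not part of RML or any of the tools named in this deck.

```python
import re
from datetime import datetime


def to_xsd_date(text):
    """Turn a date like '15th October 2017' into an xsd:date literal.

    Assumptions: the input is 'day month year' with an English month
    name, and %B is resolved in an English/C locale.
    """
    # Drop the ordinal suffix (st, nd, rd, th) that follows the day number.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text.strip())
    parsed = datetime.strptime(cleaned, "%d %B %Y")
    return f'"{parsed.date().isoformat()}"^^xsd:date'


print(to_xsd_date("15th October 2017"))
```

Hard-coding such a function inside a processor ties the rules to one implementation, which is the motivation for declaring functions separately from the mapping rules.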
[Diagram: the RML Processor gains a function handler next to the RML handler and the data retrieval handler]
An Ontology to Semantically Declare & Describe Functions. B. De Meester, A. Dimou, R. Verborgh, E. Mannens & R. Van De Walle
[Diagram: RML rules are combined with FnO; the function handler connects to GREL and DBpedia functions]
Automated Metadata Generation for Linked Data Generation and Publishing Workflows. A. Dimou et al.
[Diagram: a metadata handler is added to the RML Processor, which outputs metadata alongside the Linked Data]
[Diagram: the RML Mapper: an RML Processor with RML handler, data retrieval handler, function handler and metadata handler, taking RML + FnO rules and source descriptions and generating Linked Data and metadata]
RML Mapper: a tool for uniform Linked Data generation from heterogeneous data. A. Dimou et al.
declaration, execution, assessment
declare RML rules: RML Language
execute RML rules: RMLMapper (data and RML rules in, Linked Data out)
Intended rules (as before): <http://ex.com/paper/{id}> (bibo:Document) -- bibo:presentedAt --> <http://ex.com/conf/{venue}> (bibo:Event)
Rules with the wrong classes: <http://ex.com/paper/{id}> (foaf:Person) -- bibo:presentedAt --> <http://ex.com/conf/{venue}> (bibo:Proceedings)
Applied to row 1: <http://ex.com/paper/1> (foaf:Person) -- bibo:presentedAt --> <http://ex.com/conf/ISWC2015> (bibo:Proceedings)
What happens then?
100 triples × 2 violations/triple = 200 violations!
1,000,000 triples × 2 violations/triple = 2,000,000 violations!
You think this doesn’t happen?!
Conference
Year 2014 2015
Solution 1.1 swc:OrganizedEvent swc:OrganizedEvent
Solution 1.2 swc:Event bibo:Conference
Solution 1.3 swrc:Event swrc:Event
Solution 1.4 swrc:Event
Challenges as enablers for high quality Linked Data: insights from the Semantic Publishing Challenge. A. Dimou et al.
Semantic Publishing Challenge 2014 - 2016 statistics
Workshop
Year 2014 2015
Solution 1.1 bibo:Workshop bibo:Workshop
Solution 1.2 swc:Event bibo:Workshop
Solution 1.3 swrc:Event swrc:Workshop
Solution 1.4 swrc:Section
Paper
Year 2014 2015
Solution 1.1 swrc:InProceedings foaf:Document
Solution 1.2 bibo:Article swrc:InProceedings
Solution 1.3 swrc:Publication swrc:Publication
Solution 1.4 swc:Paper
Can we prevent violations?
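[Diagram: data and RML rules enter the RML Mapper, which generates Linked Data; DQA on the Linked Data reports violations]
[Diagram: MDQA on the RML rules reports violations; data and RML rules enter the RML Mapper, which generates Linked Data]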
Assessing and Refining Mappings to RDF to Improve Dataset Quality. A. Dimou et al.
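Why assessing the rules pays off can be sketched as follows: a wrong rule is checked once, whereas the same mistake is repeated in every triple it generates. The code is a simplified illustration, not the assessment method of the cited paper. The schema entry (domain bibo:Document, range bibo:Event for bibo:presentedAt) is an assumption inferred from the example, the names `SCHEMA`, `rule_violations` and `dataset_violations` are hypothetical, and classes are compared by exact match, ignoring subclass reasoning.

```python
# Assumed schema axioms for the example property (not loaded from BIBO).
SCHEMA = {
    "bibo:presentedAt": {"domain": "bibo:Document", "range": "bibo:Event"},
}

# The rule with the wrong classes from the running example.
WRONG_RULE = {
    "subject_class": "foaf:Person",
    "predicate": "bibo:presentedAt",
    "object_class": "bibo:Proceedings",
}


def rule_violations(rule, schema):
    """Check one mapping rule against the schema; return violation messages."""
    axioms = schema.get(rule["predicate"], {})
    found = []
    if "domain" in axioms and rule["subject_class"] != axioms["domain"]:
        found.append(f"domain: expected {axioms['domain']}, got {rule['subject_class']}")
    if "range" in axioms and rule["object_class"] != axioms["range"]:
        found.append(f"range: expected {axioms['range']}, got {rule['object_class']}")
    return found


def dataset_violations(rule, schema, n_triples):
    """Violations that end up in the data if the rule is executed unchanged."""
    return n_triples * len(rule_violations(rule, schema))


print(rule_violations(WRONG_RULE, SCHEMA))                # two messages, found once
print(dataset_violations(WRONG_RULE, SCHEMA, 1_000_000))  # repeated for every triple
```

Fixing the two classes in the rule removes all of the resulting violations at once, before any Linked Data is generated.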
[Diagram: MDQA on the RML rules reports violations; (optional) Mapping Refinements produce new RML, which the RML Mapper executes on the data to generate Linked Data]
DBpedia Use Case
Sustainable Linked Data Generation: The case of DBpedia. W. Maroy et al.
Certain test cases require a complete Linked Data set
DBpedia Quality Assessment
assessing the Linked Data: 16 h
assessing the RML rules: 32 s
declaration, execution, assessment
declare RML rules: RML Language
validate RML rules: RMLValidator (RML rules in, validated RML rules out)
execute RML rules: RMLMapper (data and validated RML rules in, Linked Data out)
administrate Linked Data generation workflow: RMLWorkbench
Human agents do not need to put in much effort to provide Linked Data
Intelligent software agents which function with Semantic Web technologies
will have enough Linked Data to work with