FAME.Q – A Formal approach to Master Quality in Enterprise Linked Data
Chemnitz University of Technology Prof. Dr.-Ing. Martin Gaedke & Team 29.11.2016
FAME.Q A Formal Approach to Master Quality in Enterprise Linked Data
André Langer and Martin Gaedke
Semantic Web and XML @ ICWI2016
VSR://IntelligentInformationManagement/LEDS
What we do
Linked Open Data, Corporate Data, Social Networks
Data Lake
Knowledge Graphs
Management of Background Knowledge
Data Quality and Coherence
Knowledge Extraction
Search in Linked Data
E-Commerce Applications
Main Focus
Initial Question
What is Data Quality?
Which common definitions exist?
How can DQ be measured?
Data Quality
• Is a multi-dimensional concept
• data that is "fit for use" by data consumers (Wang & Strong, 1996; Strong, Lee & Wang, 1997b)
• data that is "free of defects and possesses desired features" (Redman, 2001)
2.1 Simple Example
How can existing definitions be formalized?
Approach
Data Quality characterizes the degree to which data corresponds to specific requirements
Context
metrics
a percentage
Simplified Version
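The formulas shown on these slides are not part of the transcript. As a minimal sketch of the idea only (a context of requirements, one metric per requirement, a percentage as the result), the following Python assumes a weighted average of metric scores. The names `Requirement`, `completeness` and `quality_percentage`, the weights and the sample records are hypothetical and are not taken from FAME.Q.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Requirement:
    """One requirement of a context (hypothetical structure, not from the slides)."""
    name: str
    metric: Callable[[Sequence[dict]], float]  # returns a share in [0, 1]
    weight: float = 1.0


def completeness(field: str) -> Callable[[Sequence[dict]], float]:
    """Build a metric: share of records in which `field` is present and non-empty."""
    def metric(records: Sequence[dict]) -> float:
        if not records:
            return 0.0
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        return filled / len(records)
    return metric


def quality_percentage(records: Sequence[dict], context: Sequence[Requirement]) -> float:
    """Weighted average of all metric scores in the context, as a percentage.

    Assumption: the aggregation is a plain weighted mean; the slides may
    define it differently.
    """
    total_weight = sum(req.weight for req in context)
    if total_weight <= 0:
        raise ValueError("context needs at least one requirement with positive weight")
    score = sum(req.weight * req.metric(records) for req in context) / total_weight
    return 100.0 * score


# Illustrative data only.
records = [
    {"name": "A", "email": "a@example.org"},
    {"name": "B", "email": ""},
    {"name": "", "email": None},
    {"name": "D", "email": "d@example.org"},
]
context = [
    Requirement("name is filled", completeness("name")),
    Requirement("email is filled", completeness("email")),
]
print(quality_percentage(records, context))  # 62.5
```

With equal weights, the name completeness of 0.75 and the email completeness of 0.5 average to 62.5 %. A different context (other requirements or weights) would yield a different percentage for the same data.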
Common quality dimensions and appropriate metrics have already been extensively classified by other authors
• Wang & Strong, 1996
• Zaveri et al., 2014
Zaveri, A. et al., 2014. Quality Assessment for Linked Open Data: A Survey. Semantic Web Journal, 1, p. 22
FAME.Q Quality Assessment Levels
Data Quality
• Instance Level
• Schema Level
• Service Level
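One possible reading of the three assessment levels is that every measurement is tagged with its level and reported per level. The sketch below assumes a simple per-level mean expressed as a percentage. The function name `scores_by_level`, the metric names and their assignment to levels are hypothetical illustrations, not the FAME.Q definitions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# The three levels named on the slide.
LEVELS = ("instance", "schema", "service")


def scores_by_level(measurements: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Average the metric scores (each in [0, 1]) per level, as percentages.

    `measurements` holds (level, metric_name, score) triples.
    Assumption: an unweighted mean per level.
    """
    grouped = defaultdict(list)
    for level, metric_name, score in measurements:
        if level not in LEVELS:
            raise ValueError(f"unknown level {level!r} for metric {metric_name!r}")
        grouped[level].append(score)
    return {level: 100.0 * sum(v) / len(v) for level, v in grouped.items()}


# Illustrative values only; metric names and level assignment are assumptions.
measurements = [
    ("instance", "completeness", 0.5),
    ("instance", "syntactic_validity", 1.0),
    ("schema", "vocabulary_reuse", 0.25),
    ("service", "endpoint_availability", 1.0),
]
print(scores_by_level(measurements))
# {'instance': 75.0, 'schema': 25.0, 'service': 100.0}
```

Keeping the levels separate shows whether a low overall score stems from the data itself, from its schema, or from the service that delivers it.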
Example calculation 1
Example calculation 2
Summary: What is Data Quality?
"fatal" "perfect"
Conclusion
Data Quality can be interpreted as the degree to which data fits current requirements
• Build upon and reuse existing definitions
• Apply it to the field of the Semantic Web
• Set it in a formalized schema
4.1 Future Steps
Several related quality measurement frameworks already exist(ed), with different result output capabilities
• SWIQA (Fürber & Hepp, 2011a)
• Luzzu (Debattista et al., 2015)
• Roomba OpenData Checker (Assaf et al., 2015)
We output the results of our quality assessment tool by means of the W3C Data Quality Vocabulary (DQV)
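The transcript does not show the serialized output of the tool. As a hedged illustration of what a single result could look like in DQV, the following Python builds a Turtle snippet with the standard terms `dqv:QualityMeasurement`, `dqv:computedOn`, `dqv:isMeasurementOf` and `dqv:value`. The function name `dqv_measurement_turtle` and all `example.org` IRIs are hypothetical, and plain string formatting is used instead of an RDF library to keep the sketch dependency-free.

```python
DQV_NS = "http://www.w3.org/ns/dqv#"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def dqv_measurement_turtle(measurement_iri: str, dataset_iri: str,
                           metric_iri: str, value: float) -> str:
    """Serialize one quality measurement as Turtle using DQV terms.

    Assumption: the IRIs are already valid and need no escaping; a real
    tool would rather use an RDF library for serialization.
    """
    return (
        f"@prefix dqv: <{DQV_NS}> .\n"
        f"@prefix xsd: <{XSD_NS}> .\n"
        "\n"
        f"<{measurement_iri}> a dqv:QualityMeasurement ;\n"
        f"    dqv:computedOn <{dataset_iri}> ;\n"
        f"    dqv:isMeasurementOf <{metric_iri}> ;\n"
        f'    dqv:value "{float(value)!r}"^^xsd:double .\n'
    )


# Hypothetical IRIs and value, for illustration only.
turtle = dqv_measurement_turtle(
    measurement_iri="http://example.org/measurement/1",
    dataset_iri="http://example.org/dataset/customers",
    metric_iri="http://example.org/metric/completeness",
    value=0.625,
)
print(turtle)
```

Publishing results in a shared vocabulary lets other tools consume the measurements without knowing the internals of the assessment tool.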
Inspired and Interested?
VSR.Informatik.TU-Chemnitz.de
@andrelanger @myVSR /myVSR