Cost-based measurement of data FAIRness of public data sources in life sciences
Antoon Bronselaer, PhD [email protected]
Filip Pattyn, PhD [email protected]
Assessing FAIRness
?
Counting the principles?
Data source 1: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. (total count)
Data source 2: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. (total count)
Counting the principles?
Data source version 1: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. (total count)
Data source version 2: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. (total count)
Difficult to compare
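As a minimal sketch of why a bare count is difficult to compare, assume two hypothetical data sources that each satisfy six principles, but not the same six. The principle codes come from the slide; the two assessments (`source_1`, `source_2`) and the helper `count_score` are invented for illustration.

```python
# The 15 FAIR principles and sub-principles listed on the slide.
PRINCIPLES = ["F1", "F2", "F3", "F4", "A1", "A1.1", "A1.2", "A2",
              "I1", "I2", "I3", "R1", "R1.1", "R1.2", "R1.3"]

# Hypothetical assessments: the set of principles each source satisfies.
source_1 = {"F1", "F2", "F3", "F4", "A1", "A1.1"}
source_2 = {"I1", "I2", "I3", "R1", "R1.1", "R1.2"}


def count_score(satisfied):
    """Count how many of the listed principles are satisfied."""
    return sum(1 for p in PRINCIPLES if p in satisfied)


score_1 = count_score(source_1)
score_2 = count_score(source_2)
overlap = source_1 & source_2  # principles satisfied by both sources

print(score_1, score_2, sorted(overlap))
```

Both sources get the same total even though they have no satisfied principle in common, so the count alone says little about which source is easier to use.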
FAIRness
?
0% FAIR ? 100% FAIR ?
• F1. (meta)data are assigned a globally unique and persistent identifier
– Article ID
– Journal ID
– Author URIs
– Reference IDs
– Classification IDs
– Affiliation?
• F1. (meta)data are assigned a globally unique and persistent identifier
– Article ID
– Journal ID
– Reference IDs
– Classification IDs
– Affiliation?
– Author URIs?
When is a dataset FAIR or FAIR enough?
• Propagation of FAIRness
– I2. (meta)data use vocabularies that follow FAIR principles
How to measure FAIRness?
• Measuring FAIRness
–A clear definition of what is being measured and why one wants to measure it
–A description of what counts as a valid result and how to obtain it, so that the measurement is reproducible
• Qualities of a good measurement
–Clear: easy to understand
–Realistic: no over-engineering
–Objective: quantitative, machine-interpretable, scalable and reproducible
–Discriminating: able to distinguish differences
Thanks Michel Dumontier
What’s the rationale behind FAIR?
• (Re-)use data for multiple purposes
• What’s the impact for the end-user? Who’s the audience?
• More FAIRness should mean fewer hurdles to solve a use case
More FAIR means less effort
• What’s the effort needed to make a data source more FAIR so one can solve a use case?
• Effort quantified as a cost
–Time
–Human and Machine resources
• Unit of measure
–Price
• Potential to calculate the Return On Investment (ROI) on FAIR data
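The bullets above can be sketched in code under some loud assumptions: effort is priced as human hours times an hourly rate plus a machine cost, and ROI uses the standard (savings - investment) / investment formula. The slides do not prescribe either formula, and the class `Effort`, the function `roi`, and every figure below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Effort:
    """Effort in the slide's terms: time plus human and machine resources."""
    hours: float               # human time spent
    hourly_rate: float         # price of one hour of human time (assumed unit)
    machine_cost: float = 0.0  # price of machine resources (assumed unit)

    def cost(self) -> float:
        """Express the effort as a single price."""
        return self.hours * self.hourly_rate + self.machine_cost


def roi(investment: float, savings: float) -> float:
    """Standard return on investment: net gain divided by the investment."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return (savings - investment) / investment


# All figures below are hypothetical.
fairification = Effort(hours=40, hourly_rate=100, machine_cost=500)
use_case_before = Effort(hours=20, hourly_rate=100)  # per use case, less FAIR source
use_case_after = Effort(hours=5, hourly_rate=100)    # per use case, more FAIR source
n_use_cases = 4

savings = n_use_cases * (use_case_before.cost() - use_case_after.cost())
roi_value = roi(fairification.cost(), savings)

print(f"investment={fairification.cost():.0f} savings={savings:.0f} ROI={roi_value:.1%}")
```

With these invented numbers the investment is 4500, the savings over four use cases are 6000, and the ROI is about 33%. The point is only that once effort has a price, the return on making a source more FAIR becomes a calculation rather than a judgment.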
More FAIR means less effort
[Diagram: transformations … → more FAIR → application (graphical UI, API)]
Different types of effort
[Diagram: transformations … → more FAIR → application (graphical UI, API)]
FAIR enough means less effort
[Diagram: application (graphical UI, API)]
DISQOVER
>130 data sources
www.disqover.com
User friendliness
Usability / Transparency / Traceability
Food for thought
• Cost vs. Time of data transformations
– Faster with a more expensive, skilled data scientist
– Slower with a less expensive, junior data scientist
– Manual vs. automated
• Track the evolution of data source FAIRness
–More FAIR data generation
–Make legacy data more FAIR
–Technological advancements
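The cost versus time trade-off in the first bullet can be illustrated with a small sketch. The hours and hourly rates are hypothetical, as is the helper `transformation_cost`; the only assumption taken from the slide is that a skilled data scientist is faster but more expensive per hour than a junior one.

```python
def transformation_cost(hours, hourly_rate):
    """Price of a data transformation: time spent times the hourly rate."""
    return hours * hourly_rate


# Hypothetical staffing options for the same transformation.
options = {
    "skilled": {"hours": 10, "hourly_rate": 120},
    "junior": {"hours": 30, "hourly_rate": 30},
}

costs = {name: transformation_cost(**o) for name, o in options.items()}
fastest = min(options, key=lambda name: options[name]["hours"])
cheapest = min(costs, key=costs.get)

print(costs, fastest, cheapest)
```

Here the fastest option is not the cheapest one, which is why a cost-based measure needs an agreed way to weigh time against price.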
FAIRness as a cost-based measurement
Calculate ROI of FAIRness
Consensus units of cost
May the
ONTOFORCE …
Hans Constandt Bérénice Wulbrecht Ali Adiby Dries Schaumont Paulo Van Huffel Niels Vanneste Peter Verrykt Kenny Knecht, PhD Paul Vauterin, PhD
Faculty of Engineering and Architecture, Department of Telecommunications and Information Processing
Prof. Antoon Bronselaer
Yoram Timmerman
Filip Pattyn, PhD
[email protected] +32 486 739 129 www.disqover.com www.ontoforce.com