Page 1: Cost-based measurement of data FAIRness of public data ... · Cost-based measurement of data FAIRness of public data sources in life sciences Antoon Bronselaer, PhD antoon.bronselaer@ugent.be

Cost-based measurement of data FAIRness of public data sources in life sciences

Antoon Bronselaer, PhD antoon.bronselaer@ugent.be

Filip Pattyn, PhD

Page 2:

Assessing FAIRness

?

Page 3:

Counting the principles?

Data source 1: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. Total count

Data source 2: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. Total count

Page 4:

Counting the principles?

Data source version 1: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. Total count

Data source version 2: F1. F2. F3. F4. A1. A1.1. A1.2. A2. I1. I2. I3. R1. R1.1. R1.2. R1.3. Total count

Page 5:

Difficult to compare
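
A minimal Python sketch of why a raw total count is hard to compare. The principle labels are the ones listed on the slides; which principles each source satisfies is invented purely for illustration, and the names `total_count`, `source_1` and `source_2` are hypothetical.

```python
# The 15 FAIR principle labels as listed on the slides.
PRINCIPLES = [
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
]


def total_count(satisfied):
    """Count how many of the listed principles a source satisfies."""
    return sum(1 for p in PRINCIPLES if p in satisfied)


# Hypothetical compliance sets, invented for illustration only.
source_1 = {"F1", "F2", "A1", "A1.1", "I1", "R1"}
source_2 = {"F3", "F4", "A2", "I2", "I3", "R1.3"}

count_1 = total_count(source_1)
count_2 = total_count(source_2)
shared = source_1 & source_2  # principles satisfied by both sources

print(count_1, count_2, sorted(shared))
```

Under these assumptions both sources reach the same total while having no satisfied principle in common, so the count alone says little about which source is easier to use.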

Page 7:

FAIRness

?

0% FAIR ? 100% FAIR ?

Page 8:

• F1. (meta)data are assigned a globally unique and persistent identifier

Page 9:

• F1. (meta)data are assigned a globally unique and persistent identifier

– Article ID

– Journal ID

Page 10:

• F1. (meta)data are assigned a globally unique and persistent identifier

– Article ID

– Journal ID

– Author URIs

Page 11:

• F1. (meta)data are assigned a globally unique and persistent identifier

– Article ID

– Journal ID

– Author URIs

– Reference IDs

– Classification IDs

Page 12:

• F1. (meta)data are assigned a globally unique and persistent identifier

– Article ID

– Journal ID

– Author URIs

– Reference IDs

– Classification IDs

– Affiliation?

Page 14:

• F1. (meta)data are assigned a globally unique and persistent identifier

– Article ID

– Journal ID

– Reference IDs

– Classification IDs

– Affiliation?

– Author URIs?
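
The build-up above suggests that F1 is not a yes/no property of a whole record: some fields carry identifiers while others are questioned. A minimal sketch of that idea, assuming a hypothetical article record in which `None` marks a field without a globally unique, persistent identifier. All field names and identifier values are invented placeholders, not taken from any real data source.

```python
# Hypothetical article record: each field maps to its identifier,
# or to None when no persistent identifier is assigned.
record = {
    "article_id": "article:0001",
    "journal_id": "journal:0042",
    "reference_ids": "ref:0007",
    "classification_ids": "class:0003",
    "affiliation": None,   # marked with a question mark on the slides
    "author_uris": None,   # questioned on the last build of the slide
}


def f1_coverage(rec):
    """Return the fields lacking an identifier and the covered fraction."""
    missing = [field for field, ident in rec.items() if ident is None]
    covered = (len(rec) - len(missing)) / len(rec)
    return missing, covered


missing, covered = f1_coverage(record)
print(missing, round(covered, 2))
```

Reporting the missing fields alongside the fraction keeps the result actionable: it names where identifiers would still have to be added.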

Page 17:

When is a dataset FAIR or FAIR enough?

• Propagation of FAIRness

– I2. (meta)data use vocabularies that follow FAIR principles

Page 18:

How to measure FAIRness?

• Measuring FAIRness

– Clear definition of what is being measured and why one wants to measure it

– Description of what counts as a valid result and how to obtain it, so that the measurement is reproducible

• Qualities of a good measurement

– Clear: easy to understand

– Realistic: no over-engineering

– Objective: quantitative, machine-interpretable, scalable and reproducible

– Discriminating: able to distinguish differences

Thanks Michel Dumontier

Page 19:

What’s the rationale behind FAIR?

• (Re-)use data for multiple purposes

• What’s the impact for the end-user? Who’s the audience?

• More FAIRness should mean fewer hurdles to solve a use case

Page 20:

More FAIR means less effort

• What’s the effort needed to make a data source more FAIR so one can solve a use case?

• Effort quantified as a cost

– Time

– Human and machine resources

• Unit of measure

– Price

• Potential to calculate the Return On Investment (ROI) on FAIR data
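
The bullets above can be sketched as a small calculation. The slides do not give a formula, so this is only an illustration under stated assumptions: effort is priced as human hours times an hourly rate plus a machine cost, and ROI uses the conventional definition (gain minus investment, divided by investment). The class `Effort`, the function `roi` and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Effort:
    """Effort to get a data source ready for one use case (hypothetical values)."""
    hours: float         # human time spent
    hourly_rate: float   # price of one human hour
    machine_cost: float  # price of the machine resources used

    def cost(self) -> float:
        """Express the effort as a single price."""
        return self.hours * self.hourly_rate + self.machine_cost


def roi(gain: float, investment: float) -> float:
    """Conventional ROI: net gain relative to the investment."""
    return (gain - investment) / investment


# Assumed effort per use case before and after making the source more FAIR.
raw = Effort(hours=40, hourly_rate=50.0, machine_cost=100.0)
fairer = Effort(hours=10, hourly_rate=50.0, machine_cost=100.0)

saving_per_use_case = raw.cost() - fairer.cost()
fairification_investment = 3000.0  # assumed one-off cost of the improvement
use_cases = 4                      # assumed number of times the data is reused

result = roi(saving_per_use_case * use_cases, fairification_investment)
print(saving_per_use_case, result)
```

With these invented figures the saving per use case is 1500 and the investment pays back twice over four use cases, giving an ROI of 1.0. The point of the sketch is only that once effort is expressed as a price, reuse makes the return computable.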

Page 21:

More FAIR means less effort

transformations

more FAIR

application

graphical UI

API

Page 22:

Different types of effort

transformations

more FAIR

application

graphical UI

API

Page 23:

FAIR enough means less effort

application

graphical UI

API

Page 24:

DISQOVER

>130 data sources

www.disqover.com

Page 30:

User friendliness

Page 31:

Usability / Transparency / Traceability

Page 32:

Food for thought

• Cost vs. Time of data transformations

– Faster when done by a more expensive, skilled data scientist

– Slower when done by a less expensive, junior data scientist

– Manual vs. automated

• Track the evolution of data source FAIRness

– More FAIR data generation

– Make legacy data more FAIR

– Technological advancements
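
The cost vs. time trade-off for a data transformation can be made concrete with a short sketch. The hours and hourly rates below are invented for illustration, and `transformation_cost` is a hypothetical helper, not part of any tool mentioned in the slides.

```python
def transformation_cost(hours, hourly_rate):
    """Price of a transformation done by one person."""
    return hours * hourly_rate


# Assumed figures: the skilled data scientist is faster but charges more per hour.
senior = {"hours": 8, "hourly_rate": 120.0}
junior = {"hours": 20, "hourly_rate": 40.0}

senior_cost = transformation_cost(**senior)
junior_cost = transformation_cost(**junior)

fastest = "senior" if senior["hours"] < junior["hours"] else "junior"
cheapest = "senior" if senior_cost < junior_cost else "junior"

print(senior_cost, junior_cost, fastest, cheapest)
```

Under these assumptions the fastest option is not the cheapest one, which is why a cost-based measure needs an agreed unit before sources, or versions of one source, can be compared.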

Page 33:

FAIRness as a cost-based measurement

Calculate ROI of FAIRness

Consensus units of cost

Page 34:

May the

ONTOFORCE …

Hans Constandt Bérénice Wulbrecht Ali Adiby Dries Schaumont Paulo Van Huffel Niels Vanneste Peter Verrykt Kenny Knecht, PhD Paul Vauterin, PhD

Faculty of Engineering and Architecture, Department of Telecommunications and Information Processing

Prof. Antoon Bronselaer

Yoram Timmerman

Filip Pattyn, PhD

+32 486 739 129 www.disqover.com www.ontoforce.com