The Internet for Social Machines - Choose your...

34
The Internet for Social Machines The end of data sharing as we know it Barend Mons Feb. 2019

Transcript of The Internet for Social Machines - Choose your...

The Internet for Social MachinesThe end of data sharing as we know it

Barend MonsFeb. 2019

(2005)

Which Gene Did You Mean?why bury it first and then mine it again?

What does FAIR eventually entail?The Machine knows what I mean

(2018)

The Internet of FAIR data and Services

The Internet for Social MachinesThe end of data sharing as we know it

Lorentz

FAIRdICT

EOSC

IFDS

Birth Infancy Adolescence Maturity

2014 2015 2016 2017 2018…

FAIR and GO FAIR

http://ec.europa.eu/research/openscience/index.cfm? pg=open-science-cloud

Prof. Dr. Barend Mons

DirectorGO FAIR International Support and Coordination Office

Visiting address: Poortgebouw N-01, Rijnsburgerweg 10, 2333 AA Leiden,The Netherlands, E-mail: [email protected], Mobile: +31 6 24879779, Skype: dnerab

www.go-fair.org

The Internet for Social Machinesapplied to dovetailing of omics and clinical data

Primary use

Secondary use

Research and care data are not ‘technically’ different

Health care managers

other Health care

professionals

Health researchers

Goverment Commercial / innovation

Population

Health care professionals

Healthcare

Health research

Public health management

Management, financial, politics

Analytics and Statistics

Quality and process improvement

Innovation & commercial development

artifical barriersartifical barriers

Primary use

Secondary use

Care and research data should be dovetailed

Health care managers

other Health care

professionals

Health researchers Goverment

Commercial / innovation

Population

Health care professionals

Healthcare

Health research

Public health management

Management, financial, politics

Analytics and Statistics

Quality and process improvement

Innovation & commercial development

dovetailingdovetailing

12

Challenges with secondary use of health data

Access to data involves significant legal, organisational and technical hurdles

Burdensome registration for health care professionals - limited benefits from (multiple) registration

Low semantic and technical interoperability

Technical debt/legacies

Lack of basic, broad purpose common infrastructure and broadly accessible analytical capabilities

schemaof

required data

schemaof

required data

Formof

required data

datasteward

EHR 1

EHR 1

DICA

§

research

archive

biobank

savesave

reuseforany

purpose

schemaof

required data

schemaof

required data

Formof

required data

datasteward

EHR 1

EHR 1

DICA

§

research

archive

biobank

savesave

reuseforany

purposeFAIR

FAIR

Http://GOFAIR/UUID.??

object, entity, defined meaning

ss pp oo

UPRIUPRI

provenanceprovenance

SPO tripples as collections of connected DO’s

Knowledge discovery

Distributed learning by VM’s

ss pp oo

UPRIUPRI

provenanceprovenance

A nanopublication is the smallest meaningful assertion, minimally one Subject-Predicate-Object tripleS,P, & O are all concepts and thus all have Unique, Persistent and Resolvable Identifiers. Many nanopublications are small graphs with multiple triples forming the assertion

ss pp oo

UPRIUPRI

provenanceprovenance

Two nanopublications representing the same meaningful assertion, i.e. the Subject-Predicate-Object triples are identicalmay have different provenance (they come from different sources)They each have their Persistent and resolvable Identifier. and different provenance graphs

ss pp oo

UPRIUPRI

provenanceprovenanceUPRIUPRI

provenanceprovenance

A Cardinal Assertion is one assertion that is linked to 1-n provenance graphs (up to thousands in some cases)

A

B

C

ss pp oo

UPRIUPRI

provenanceprovenance

ss pp oo

UPRIUPRI

provenanceprovenance

similarity measuresimilarity measure

Variant-TTS-

protein-Pathology

Disease Causing Variants

Combine FANTOM5

&LOVD

> 20% of knowndisease-causing

Variants map to TSS

5 objects are shared between all three knowlets (in this case: metabolic syndrome, diabetes, and e.o Alzheimer)

Aconceptual similarity

(hypothesis generation)

Cconceptual drift

(meta-data/blockchain)

DQUA’s

(semantic bias)

The value of knowlets in dynamic ontological graphs

BNear sameness

(semantic lenses)

UPRI

UPRI

URI

Hospital 1Hospital 1UPRI

Hospital 2Hospital 2

Hospital nHospital n

Thyroid Gland <> COPD?

Pubmed: no

Thyroid Gland <> COPD

196.425 K patients

Thyroid Gland <> COPD?

EURETOS: YES

Thyroid Gland <> COPD?

DICA: measure!Measure CLCX13

Measure CLCX13

Measure CLCX13

predispositon? CLCX13

Thyroid Gland <> COPD?

doctors: no

UPRI

UPRI U

PRI

UPRI

UPRI

UPRI

UPRI

UPRI

UPRI

UPRI U

PRI

UPRI

UPRI

UPRI

pathwaypathwayworkflowworkflow

The Internet of Data and Services (PHT model)controlled at two levels: UPRI’s and a web of associative Knowletsrepresenting abstract concepts, digital objects (including distributed VMs) and physical objects, devices & things in a universal way with

scalable, unique, resolvable identifiers at is core.

UPRI

UPRI MEGA

UPRIUPRI

other metadata

data set

UPRIUPRI

other metadata

curated database

UPRIUPRI

other metadata

workflow

UPRIUPRI

other metadata

smartwatch

similarity measuresimilarity measure

UPRI

preferred term

definition

symbols used

The Internet….The end of data sharing a s we know it

from data sharing to data visiting in

The Internet for Social Machines

Thank you!

and

‘see you at your data’