Force11 JDDCP workshop presentation, @ Force2015, Oxford


Page 1: Force11 JDDCP workshop presentation, @ Force2015, Oxford

EU Lead: Mark Wilkinson

Fundacion BBVA Chair in Biological Informatics

Isaac Peral Distinguished Researcher, CBGP-UPM

USA Lead: Michel Dumontier

Associate Professor, Biomedical Informatics, Stanford

FAIRport Project Lead: Barend Mons

Professor, Leiden University Medical Centre

FAIRport Skunkworks

Page 2: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“Skunkworks”

Team Update

Objectives and Outcomes

(...so far...)

Page 3: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What is a FAIRport?

● Findable - (meta)data should be uniquely and persistently identifiable

● Accessible - identifiers should provide a mechanism for (meta)data access, including authentication, access protocol, license, etc.

● Interoperable - (meta)data should be machine-accessible, using a machine-parseable syntax and, where possible, shared common vocabularies.

● Reusable - there should be sufficient machine-readable metadata that it is possible to “integrate like-with-like”, and that component data objects can be precisely and comprehensively cited post-integration.

Page 4: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“Skunkworks”

“...a group within an organization given a high degree of autonomy and unhampered by bureaucracy, tasked with working on advanced or secret projects.” -- Wikipedia: http://en.wikipedia.org/wiki/Skunk_Works

Page 5: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“Skunkworks” FAIRport group

Objective (ongoing) - explore existing technologies and attempt to build prototype FAIRport code components using, whenever possible, existing standards. Once desirable FAIR behaviors have been achieved, hand off to a professional coding team to ensure production-quality outcomes.

● Self-selected “hackers”

● Self-identified tasks (next few slides)

● Led to a series of Web meetings, and a joint Hackathon, with participants at venues in the Netherlands and the USA.

Page 6: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Typical Problem

I’m looking for microarray data of human liver cells on a time-course following liver transplant.

What repositories *could* contain this data?

● GEO? EUDat? NPG Scientific Data?

● What fields in those repositories would I need to search, using what vocabularies, to find what I need?

Page 7: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“Skunkworks” - initial observations

There are a lot of repositories out there!

General Purpose: Dryad, EUDat, Figshare, DataVerse, etc.

Special Purpose: PDB, UniProt, NCBI, EnsEMBL

Lack of rich, machine-readable descriptions of the contents of these repositories hinders us from (for example):

● Knowing where we can look for certain types of data

● Knowing if two repositories contain records about the same thing

● Cross-referencing or “joining” across repositories to integrate disparate data about the same thing

● Knowing which repository I could/should deposit my data to (and how)

Page 8: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Challenge

If we wanted to enable this kind of FAIR discovery and integration over myriad repositories, what infrastructure (existing/new) would we need?

Page 9: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Task: harmonized cross-repository meta-descriptors

Though self-selected as a FAIRport Skunkworks task, this significantly overlaps with the Force11 Data Citation Implementation Working Group Team 4 - “Common repository interfaces”.

...so we joined forces :-)

Page 10: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Task: harmonized cross-repository meta-descriptors

Exemplar use-cases:

● A piece of software that can generate a “sensible” query form/interface for any repository

● A piece of software that can generate a “sensible” and comprehensive data submission form for any repository

Page 11: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Prior Art? DCAT Data Catalog Vocabulary

W3C Recommendation 16 January 2014

“DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web…. By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.”

http://www.w3.org/TR/vocab-dcat/

Page 12: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Prior Art? DCAT Data Catalog Vocabulary

DCAT is an RDF Schema that defines core metadata elements describing dataset collections and the datasets within those collections, e.g.

:dataset-001
    a dcat:Dataset ;
    dct:title "Imaginary dataset" ;
    dcat:keyword "accountability", "transparency", "payments" ;
    dct:issued "2011-12-05"^^xsd:date ;
    dct:modified "2011-12-05"^^xsd:date ;
    dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ;
    dct:spatial <http://www.geonames.org/6695072> ;
    dct:publisher :finance-ministry ;
    dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
    dcat:distribution :dataset-001-csv .
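To make the idea of "consuming" such a record concrete, here is a minimal sketch in plain Python. It copies a few statements from the DCAT example into (subject, predicate, object) tuples and answers two simple questions about them. The tuple layout and the helper names `objects` and `datasets_with_keyword` are assumptions made for illustration; a real consumer would parse the Turtle with an RDF library rather than hand-copy it.

```python
# Hypothetical in-memory triples mirroring part of the DCAT example.
# The tuple layout and helper names are illustrative assumptions.
TRIPLES = [
    (":dataset-001", "rdf:type", "dcat:Dataset"),
    (":dataset-001", "dct:title", "Imaginary dataset"),
    (":dataset-001", "dcat:keyword", "accountability"),
    (":dataset-001", "dcat:keyword", "transparency"),
    (":dataset-001", "dcat:keyword", "payments"),
    (":dataset-001", "dct:issued", "2011-12-05"),
    (":dataset-001", "dcat:distribution", ":dataset-001-csv"),
]


def objects(triples, subject, predicate):
    """Return every object matching a (subject, predicate) pattern."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def datasets_with_keyword(triples, keyword):
    """Find subjects typed dcat:Dataset that carry the given dcat:keyword."""
    datasets = {s for s, p, o in triples
                if p == "rdf:type" and o == "dcat:Dataset"}
    return sorted(s for s in datasets
                  if keyword in objects(triples, s, "dcat:keyword"))
```

Because every catalog that uses DCAT shares the same predicates, the same two helpers would work unchanged over records from any of them, which is the interoperability benefit the DCAT quote describes.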

Page 16: Force11 JDDCP workshop presentation, @ Force2015, Oxford

So the core metadata of a repository’s collections could be described in DCAT...

...if the repositories used DCAT…

...generally speaking, they don’t...

...and we need more than just core metadata to enable cross-repository search anyway…

Page 17: Force11 JDDCP workshop presentation, @ Force2015, Oxford

So DCAT itself isn’t the solution to our problem because, among other things, it does not provide sufficiently rich descriptors

Page 22: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram]

● Data Schema (e.g. XMLS, RDFS) -- Defines --> Data Record (e.g. XML, RDF)

● Metadata Record (e.g. DCAT-compliant RDF) -- Describes --> Data Record

● DCAT RDFS Schema -- Defines --> Metadata Record

Page 23: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: Data Record, Data Schema, Metadata Record and DCAT RDFS Schema, with arrows marking two of the boxes]

If everyone were using all elements of the DCAT schema to define their core metadata, then (that part of) the problem would be solved at this point.

We could use THIS to build queries about THIS

Page 26: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

REALITY

[Diagram: three repositories, each with its own stack of Data Record / Data Schema / Metadata Record / Metadata Schema]

● XML Data Record / XMLS Data Schema / DCAT RDF Metadata Record / DCAT RDFS Schema

● RDF Data Record / RDFS Data Schema / UniProt RDF Metadata Record / UniProt RDFS Metadata Schema

● ACEDB Data Record / ACEDB Data Schema / DragonDB Form Metadata Record / DragonDB Form Metadata Schema

Page 27: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

Repositories don’t all use DCAT Schema

Page 28: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

Those that use DCAT Schema, use only parts of it

Page 29: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

Those that don’t use DCAT use a myriad of alternatives (some very loosely defined)

Page 30: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

And don’t necessarily use all elements of those alternatives either

Page 31: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

So how are we going to do RICH queries over all of these?

Page 32: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What exactly *is* our problem?

[Diagram: the three repository stacks from the “REALITY” slide]

We need a way to describe the descriptors...

Page 34: Force11 JDDCP workshop presentation, @ Force2015, Oxford

The DCAT WG suggested the same thing

They said there was a need for “DCAT Profiles”

A DCAT Profile is a specification for data catalogs that adds additional constraints to DCAT. Additional constraints in a profile MAY include:

● A minimum set of required metadata fields

● Classes and properties for additional metadata fields not covered in DCAT

● Controlled vocabularies or URI sets as acceptable values for properties

● Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog's RDF description

http://www.w3.org/TR/vocab-dcat/

A DCAT Profile is: a generic way to describe what metadata fields a repository has and what the constraints on those fields are
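As a rough illustration of "what fields a repository has and what the constraints on those fields are", the sketch below expresses a profile as a plain Python dictionary and checks a record against it. Everything here is hypothetical: the field names, the required flags, the vocabularies and the `validate` helper are invented for the example and are not taken from DCAT or from any repository.

```python
# A hypothetical profile: field names, required flags and vocabularies are
# invented for illustration and do not come from DCAT or any repository.
PROFILE = {
    "title":    {"required": True,  "allowed": None},
    "language": {"required": True,  "allowed": {"en", "nl", "es"}},
    "format":   {"required": False, "allowed": {"csv", "rdf", "xml"}},
}


def validate(record, profile):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, rule in profile.items():
        if field not in record:
            if rule["required"]:
                problems.append(f"missing required field: {field}")
            continue
        allowed = rule["allowed"]
        if allowed is not None and record[field] not in allowed:
            problems.append(f"value {record[field]!r} not allowed for {field}")
    # Fields the profile does not know about are reported rather than ignored.
    for field in record:
        if field not in profile:
            problems.append(f"unknown field: {field}")
    return problems
```

The three parts of each rule line up with the first three bullets of the DCAT Profile definition: a minimum set of required fields, additional fields beyond the core, and controlled vocabularies as acceptable values.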

Page 35: Force11 JDDCP workshop presentation, @ Force2015, Oxford

But the DCAT WG also suggested...

[The DCAT Profile definition quoted on the previous slide, from http://www.w3.org/TR/vocab-dcat/]

DCAT Profiles don’t exist!

Page 36: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“FAIR Profiles”

At the Hackathon, the “Skunkers” decided to invent the DCAT Profile technology.

Since they are intended to allow descriptions of

● Descriptor metadata fields not included in DCAT...

● ...in many cases, Descriptors with ZERO metadata fields from DCAT...

● ...and in many cases, Descriptors that are not even in RDF...

We call them “FAIR Profiles” rather than DCAT profiles

(However, clear acknowledgements to the DCAT Working Group for conceiving of the idea!)

Page 37: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What the FAIR profile technology accomplishes

[Diagram: the three repository stacks from the “REALITY” slide]

Page 38: Force11 JDDCP workshop presentation, @ Force2015, Oxford

What the FAIR profile technology accomplishes

[Diagram: the three repository stacks from the “REALITY” slide, each metadata schema now accompanied by a FAIR Profile]

● FAIR Profile: DCAT Schema

● FAIR Profile: UniProt Metadata Schema

● FAIR Profile: DragonDB Metadata Schema

Page 39: Force11 JDDCP workshop presentation, @ Force2015, Oxford

[Diagram: the three repository stacks from the “REALITY” slide, each metadata schema accompanied by a FAIR Profile]

Though they are potentially describing very different things (from Web FORM fields to OWL Ontologies!) all FAIR Profiles are written using the same vocabulary and structure, defined by...

Page 40: Force11 JDDCP workshop presentation, @ Force2015, Oxford

[Diagram: the three repository stacks from the “REALITY” slide, each metadata schema accompanied by a FAIR Profile]

● FAIR Profile of DCAT Schema

● FAIR Profile of UniProt Metadata Schema

● FAIR Profile of DragonDB Metadata Schema

Page 41: Force11 JDDCP workshop presentation, @ Force2015, Oxford

The FAIR Profile Schema

(the thing the Skunkworks team invented)

Page 42: Force11 JDDCP workshop presentation, @ Force2015, Oxford

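[Diagram]

● Repo. Data Schema (e.g. XMLS, RDFS) -- Defines --> Repo. Data Record (e.g. XML, RDF)

● Repository Metadata Record -- Describes --> Repo. Data Record

● Repository Metadata Schema -- Defines --> Repository Metadata Record

● Repository’s FAIR Profile -- Describes --> Repository Metadata Schema

● FAIR Profile Schema -- Defines --> Repository’s FAIR Profile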

Page 44: Force11 JDDCP workshop presentation, @ Force2015, Oxford

“All problems in computer science can be solved by another level of indirection” -- David Wheeler, inventor of the subroutine

“...But that usually will create another problem.” -- David Wheeler

Diomidis Spinellis. Another level of indirection. In Andy Oram and Greg Wilson, editors, Beautiful Code: Leading Programmers Explain How They Think, chapter 17, pages 279–291. O'Reilly and Associates, Sebastopol, CA, 2007.

Page 45: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Desiderata for FAIR Profile Schema

● Must describe legacy data (i.e. not just DCAT or other “modern” data)

● Must describe a multitude of data formats (XML, RDF, Key/Value, etc.)

● Must be capable of describing OWL-DL-governed data (still rare, but increasingly used… Classes, property-restrictions, etc.)

● Must be capable of describing any kind of value constraint, e.g. arbitrary CV, rdfs:range, or equivalent OWL construct

● Must be hierarchical (i.e. the value-constraint of a field can be set as an entirely separate FAIR Profile)

● Must be modular, identifiable, shareable, and reusable (to stem the proliferation of new formats)

● Must use standard technologies, and re-use existing vocabularies if possible

● Must be extremely lightweight

● Must NOT require the participation of the repository host (no buy-in required)

Page 47: Force11 JDDCP workshop presentation, @ Force2015, Oxford

FAIR Profile Schema

A very lightweight meta-meta-descriptor, in RDFS language

[Diagram]

● FAIR Profile -- hasClass --> FP Class

● FP Class -- hasProperty --> FP Property

● FP Property -- allowedValues --> Property Restriction Definition

● FP Class -- classType --> External Ontology or RDFS Class (optional)

● FP Property -- propertyType --> External Ontology or RDFS Predicate (optional)

Requirement Status? Cardinality? Other Constraint?

http://github.com/DataFairPort/DataFairPort/blob/Master/Schema/DCATProfile.rdfs
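The shape of the schema can be sketched as three small Python dataclasses. This is an illustrative model only, not the RDFS file linked above: the Python attribute names are assumptions that paraphrase hasClass, hasProperty, allowedValues, classType and propertyType, and the example instance picks dct:title and xsd:string purely for demonstration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FPProperty:
    label: str                            # label used by the target descriptor
    allowed_values: str                   # URI of the Property Restriction Definition
    property_type: Optional[str] = None   # optional external predicate URI


@dataclass
class FPClass:
    label: str
    class_type: Optional[str] = None      # optional external ontology/RDFS class URI
    has_property: List[FPProperty] = field(default_factory=list)


@dataclass
class FAIRProfile:
    label: str
    has_class: List[FPClass] = field(default_factory=list)

    def predicates(self):
        """List every external predicate the profile refers to."""
        return [p.property_type for c in self.has_class
                for p in c.has_property if p.property_type]


# Hypothetical instance; the property chosen here is for illustration only.
profile = FAIRProfile(
    label="Demo profile",
    has_class=[FPClass(
        label="CoreMicroarrayDistributionMetadata",
        class_type="http://www.w3.org/ns/dcat#Distribution",
        has_property=[FPProperty(
            label="title",
            property_type="http://purl.org/dc/terms/title",
            allowed_values="http://www.w3.org/2001/XMLSchema#string",
        )],
    )],
)
```

Keeping `class_type` and `property_type` optional mirrors the "(optional)" marks in the diagram: a profile can describe a legacy form field that has no ontology term behind it.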

Page 49: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Property Restriction Definition (XSD, FAIR Profile, SKOS)

Describes the constraints on the possible values for a predicate in the target repository’s metadata schema

NOTE: we cannot use rdfs:range because we are meta-modelling! The predicate is a CLASS at the meta-model level, so use of rdfs:range is not appropriate.

Page 50: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Property Restriction Definition (XSD, FAIR Profile, SKOS)

Describes the constraints on the possible values for a predicate in the target repository’s metadata schema

The possible values are:

● An XSD Datatype

● Another FAIR Profile (i.e. hierarchical profiles)

● A SKOS View on a set of ontology terms from one or more ontologies
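A minimal sketch of how a consumer might dispatch on these three kinds of restriction follows. It makes several simplifying assumptions: the restriction kind is passed in explicitly as a tag (a real implementation would have to discover it, for example from the type of the resource the restriction URI points to), only two XSD datatypes are handled, and the SKOS scheme is a hard-coded stand-in containing the two EDAM format concepts that appear later in this deck. All function names are hypothetical.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical stand-in for a dereferenced SKOS concept scheme:
# scheme URI -> member concept URIs.
CONCEPT_SCHEMES = {
    "http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format": {
        "http://edamontology.org/format_1641",
        "http://edamontology.org/format_2056",
    },
}


def is_iso_date(text):
    """True if text is a calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def value_allowed(value, restriction, nested_check=None):
    """Check one value against one of the three restriction kinds.

    `restriction` is a (kind, uri) pair where kind is "xsd", "skos" or
    "profile"; tagging the kind explicitly is a simplification.
    """
    kind, uri = restriction
    if kind == "xsd":
        if uri == XSD + "date":
            return is_iso_date(value)
        if uri == XSD + "string":
            return isinstance(value, str)
        raise ValueError(f"datatype not handled in this sketch: {uri}")
    if kind == "skos":
        return value in CONCEPT_SCHEMES.get(uri, set())
    if kind == "profile":
        # Hierarchical case: delegate to whatever validates the nested profile.
        return nested_check(value, uri)
    raise ValueError(f"unknown restriction kind: {kind}")
```

The "profile" branch is where the hierarchy comes from: the caller supplies `nested_check`, which can itself call `value_allowed` on the properties of the nested profile.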

Page 51: Force11 JDDCP workshop presentation, @ Force2015, Oxford

A FAIR Profile (an RDF document that follows the FAIR Profile Schema)

[Diagram: a stack of Metadata Record (e.g. DCAT-compliant RDF), DCAT RDFS Schema, FAIR Profile and FAIR Profile Schema, with “This!” marking the FAIR Profile]

Page 52: Force11 JDDCP workshop presentation, @ Force2015, Oxford

A FAIR Profile

[Diagram: FAIR Profile Schema elements: FAIR Profile, FP Class, FP Property, Property Restriction Definition; hasClass, hasProperty, allowedValues, classType, propertyType; External Class, External Predicate]

Page 53: Force11 JDDCP workshop presentation, @ Force2015, Oxford

A FAIR Profile

[Diagram: FAIR Profile Schema elements]

FAIR Profiles are FAIR!

(Identifiable, Re-usable, and Shareable)

Page 56: Force11 JDDCP workshop presentation, @ Force2015, Oxford

A FAIR Profile

The CoreMicroarrayDistributionMetadata Descriptor Class

[Diagram: FAIR Profile Schema elements]

Page 57: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Class Descriptor

[Diagram: FAIR Profile Schema elements]

Page 58: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor

The Class follows the “DCAT Distribution” Class model

[Diagram: FAIR Profile Schema elements]

Page 59: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor

It uses only 3 properties from the “DCAT Distribution” Class model

[Diagram: FAIR Profile Schema elements]

Page 60: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #1

It uses only 3 properties from the “DCAT Distribution” Class model... let’s look at one of them in detail

[Diagram: FAIR Profile Schema elements]

Page 61: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #1

This Meta-Descriptor element is a ‘FAIR Profile Property’ Class

[Diagram: FAIR Profile Schema elements]

Page 62: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #1

This is its label within that organization’s metadata descriptor

[Diagram: FAIR Profile Schema elements]

Page 63: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #1

This is the URL of the Predicate used by that descriptor

[Diagram: FAIR Profile Schema elements]

Page 64: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #1

This is the “range” of that Predicate within the organization’s descriptor

[Diagram: FAIR Profile Schema elements]

Page 65: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #2

Let’s look at a different property from the CoreMicroarrayDistributionMetadata Class

[Diagram: FAIR Profile Schema elements]

Page 66: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #2

[Diagram: FAIR Profile Schema elements]

Page 67: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #2

This is the label for that property

[Diagram: FAIR Profile Schema elements]

Page 68: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #2

The URL of the predicate of this Property

[Diagram: FAIR Profile Schema elements]

Page 69: Force11 JDDCP workshop presentation, @ Force2015, Oxford

CoreMicroarrayDistributionMetadata Descriptor: Property #2

In the Metadata Descriptor, this property is constrained by the set of ontology terms defined in the SKOS Concept Scheme EDAM_Microarray_Data_Format

[Diagram: FAIR Profile Schema elements]

Page 70: Force11 JDDCP workshop presentation, @ Force2015, Oxford

This is a “SKOSified” view of the EDAM Ontology

http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format

<rdf:Description xmlns:ns1="http://www.w3.org/2002/07/owl#"
    rdf:about="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
  <ns1:imports rdf:resource="http://purl.bioontology.org/ontology/EDAM"/>
</rdf:Description>

<rdf:Description
    xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:ns2="http://www.w3.org/2004/02/skos/core#"
    rdf:about="http://edamontology.org/format_1641">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <ns1:label>affymetrix-exp</ns1:label>
  <ns2:broader rdf:resource="http://edamontology.org/format_2056"/>
  <ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
</rdf:Description>

<rdf:Description
    xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:ns2="http://www.w3.org/2004/02/skos/core#"
    rdf:about="http://edamontology.org/format_2056">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <ns1:label>Microarray experiment data format</ns1:label>
  <ns2:broader rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
  <ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
</rdf:Description>

Jupp, et al., “Taking a view on bio-ontologies” ceur-ws.org/Vol-897/session4-paper22.pdf
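To show how a tool might turn such a concept scheme into a pick-list of allowed values, here is a small sketch using only Python's standard XML parser. The embedded document is an abbreviated copy of the fragment on this slide, wrapped in an `rdf:RDF` root (an addition made here so that it parses on its own). The helper name `concepts_in_scheme` is hypothetical, and the approach is an assumption that only works for RDF/XML written in this flat `rdf:Description` style; a general solution would use a proper RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SCHEME = "http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"

# Abbreviated copy of the slide's fragment, wrapped in an rdf:RDF root.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="http://edamontology.org/format_1641">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <rdfs:label>affymetrix-exp</rdfs:label>
    <skos:inScheme rdf:resource="{SCHEME}"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://edamontology.org/format_2056">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <rdfs:label>Microarray experiment data format</rdfs:label>
    <skos:inScheme rdf:resource="{SCHEME}"/>
  </rdf:Description>
</rdf:RDF>"""


def concepts_in_scheme(xml_text, scheme_uri):
    """Map concept URI -> label for each description in the given scheme."""
    found = {}
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        schemes = [e.get(f"{{{RDF}}}resource")
                   for e in desc.findall(f"{{{SKOS}}}inScheme")]
        if scheme_uri in schemes:
            label = desc.findtext(f"{{{RDFS}}}label")
            found[desc.get(f"{{{RDF}}}about")] = label
    return found
```

The returned mapping is exactly what a generated submission form needs for a drop-down: the URI to store and the human-readable label to display.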

Page 71: Force11 JDDCP workshop presentation, @ Force2015, Oxford

A FAIR Profile

Return to the very top of our FAIR Profile

Follow the ExtendedAuthorship Class

[Diagram: FAIR Profile Schema elements]

Page 72: Force11 JDDCP workshop presentation, @ Force2015, Oxford

ExtendedAuthorship

Follow one of the properties of the ExtendedAuthorship Class

[Diagram: FAIR Profile Schema elements]

Page 73: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Author ORCID

[Diagram: FAIR Profile Schema elements]

Page 74: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Author ORCID

The allowed values of this Property are constrained to be individuals that follow the FAIR Profile Schema “DemoORCIDProfileScheme”

[Diagram: FAIR Profile Schema elements]

Page 75: Force11 JDDCP workshop presentation, @ Force2015, Oxford

http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf

[Diagram: FAIR Profile Schema elements]

Page 76: Force11 JDDCP workshop presentation, @ Force2015, Oxford

http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf

[Diagram: FAIR Profile Schema elements, with a second, nested copy for the embedded profile]

This is parsed in exactly the same way as our original DemoMicroarrayProfileScheme, but is embedded within it as the value of the author_ORCID property.

…Arbitrary, hierarchical layers of complexity…

Page 77: Force11 JDDCP workshop presentation, @ Force2015, Oxford

So to build an interface (e.g. query or data-capture) from a FAIR Profile:

[1] Parse all FAIR Profile classes

    Parse the properties of each class

        Determine the target predicate

        Determine the target value-restrictions

            Call [1] if restriction is a FAIR Profile

        Create a metadata [capture/query] facet with that predicate and that restriction
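The steps above can be sketched as a short recursive function. This is one possible reading, not the Skunkworks implementation: profiles are represented as nested dictionaries held in a registry keyed by URI, the `urn:example:` identifiers and the `ORCIDRecord` / `orcid_id` names are invented placeholders, and the sketch emits a facet only for leaf restrictions (a nested profile is expanded rather than given a facet of its own). It also does not guard against profiles that refer to each other in a cycle.

```python
# Hypothetical registry of already-loaded profiles, keyed by URI. The nested
# dict layout and all labels other than author_ORCID are assumptions.
ORCID_PROFILE = "http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf"
PROFILES = {
    "urn:example:microarray-profile": {
        "classes": [{
            "label": "ExtendedAuthorship",
            "properties": [{
                "label": "author_ORCID",
                "predicate": "urn:example:author_ORCID",
                "restriction": ORCID_PROFILE,
            }],
        }],
    },
    ORCID_PROFILE: {
        "classes": [{
            "label": "ORCIDRecord",
            "properties": [{
                "label": "orcid_id",
                "predicate": "urn:example:orcid_id",
                "restriction": "http://www.w3.org/2001/XMLSchema#string",
            }],
        }],
    },
}


def build_facets(profile_uri, profiles, path=()):
    """[1] Walk classes and properties, recursing into nested profiles."""
    facets = []
    for cls in profiles[profile_uri]["classes"]:
        for prop in cls["properties"]:
            here = path + (prop["label"],)
            restriction = prop["restriction"]
            if restriction in profiles:
                # Call [1] again: the value is described by a nested profile.
                facets.extend(build_facets(restriction, profiles, here))
            else:
                facets.append({
                    "field": ".".join(here),
                    "predicate": prop["predicate"],
                    "restriction": restriction,
                })
    return facets
```

Each facet carries the predicate and the restriction, which is the information a form generator needs to render either a query field or a data-capture field; the dotted `field` path records how deeply the facet was nested.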

Page 79: Force11 JDDCP workshop presentation, @ Force2015, Oxford

DCAT Profile Class #1

DCAT Profile Class #2

DCAT Profile Class #3

DCAT Profile Class #4 (embedded)

Value constraints

Descriptor-specific labels associated with ontology predicates (if applicable)

“Classes” may be associated with an ontology to allow reasoning, or may just represent an “arbitrary” grouping of properties within the target metadata descriptor

Metadata Descriptor-specific details are captured, e.g. this field is required by this target Metadata Descriptor

Page 80: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Other features of FAIR profiles

● Do not require repository participation

● Provide a purpose-driven, potentially non-comprehensive “view” on a repository, of which there may be many, according to what the profile author needs to cross-query

● Profiles of any given repository facet are not required to be identical! e.g. a different profile might utilize a different controlled vocabulary over any given facet (e.g. a freetext facet)

● Anybody can define a profile (of course, the profile defined by the repository owner should be considered “canonical”... the rest are just purpose-built “best-guesses”)

● FAIR profiles can/should be indexed and shared, to facilitate cross-repository interoperability and integration

● There is no (obvious) reason why a FAIR profile could not be used to describe the DATA in the repository, not just the metadata...

Page 81: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Nothin’ ain’t worth nothin’, but it’s free! -- Kris Kristofferson

“All problems in computer science can be solved by another level of indirection... But that usually will create another problem.” -- David Wheeler

Page 82: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Nothin’ ain’t worth nothin’, but it’s free!

The FAIR profile isn’t “a magic bean”!

It DOES NOT ACCOMPLISH SEMANTIC MAPPING between one field in one repository, and a semantically-related field in another repository

Page 83: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Nothin’ ain’t worth nothin’, but it’s free!

The FAIR profile isn’t “a magic bean”!

It does give us a standard way to identify, describe, and meta-link these fields, and a predictable place where a mapping mechanism could be injected.

Page 84: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Nothin’ ain’t worth nothin’, but it’s free!

The FAIR profile isn’t “a magic bean”!

...we don’t inject it (yet!) because that would require invention of yet another “standard”, and we want to avoid that if possible!

Page 85: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Nothin’ ain’t worth nothin’, but it’s free!

The FAIR profile isn’t “a magic bean”!

There may be some in the audience who, like me, recognize that this problem is nearly identical to the problem faced by the WSDL -> SAWSDL community.

I will be looking at their solution for guidance in the next phase of FAIR Profiles...

… so we still have problems, but at least they are now re-defined as problems for which there are solutions!

Page 86: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Skunkworks Participants

● Mark Wilkinson

● Michel Dumontier

● Barend Mons

● Tim Clark

● Jun Zhao

● Paolo Ciccarese

● Paul Groth

● Erik van Mulligen

● Luiz Olavo Bonino da Silva Santos

● Matthew Gamble

● Carole Goble

● Joël Kuiper

● Morris Swertz

● Erik Schultes

● Mercè Crosas

● Adrian Garcia

● Philip Durbin

● Jeffrey Grethe

● Katy Wolstencroft

● Sudeshna Das

● M. Emily Merrill

Page 87: Force11 JDDCP workshop presentation, @ Force2015, Oxford

Post-presentation comments

We should look at ISO 11179 -> are we duplicating those efforts or are we creating something that is an implementation of those efforts?

See also Dublin Core’s similar initiative.