UKON 2014

Linked Experimental Data in Life Sciences Alejandra González-Beltrán, PhD Oxford e-Research Centre, University of Oxford [email protected] @alegonbel UKON 2014 April 24th, 2014 Birmingham, UK

description

Presentation given at the UK Ontology meeting 2014. See programme at: http://ced.aston.ac.uk/ukont2014/programme.html

Transcript of UKON 2014

Page 1: UKON 2014

Linked Experimental Data in Life Sciences

Alejandra González-Beltrán, PhD Oxford e-Research Centre, University of Oxford [email protected] @alegonbel

UKON 2014 April 24th, 2014 Birmingham, UK

Page 2: UKON 2014

Experimental workflow

[Figure: experimental workflow diagram. Labels: Data Scientist; Planning; Data Collection; Data Management; Analysis; Visualization; Publication; Use existing data; Perform new experiment.]

Page 3: UKON 2014

Experimental workflow

[Figure: experimental workflow diagram (Data Scientist; Planning; Data Collection; Data Management; Analysis; Visualization; Publication; Use existing data; Perform new experiment), annotated with "data + metadata".]

Page 4: UKON 2014

Experimental workflow

[Figure: experimental workflow diagram (Data Scientist; Planning; Data Collection; Data Management; Analysis; Visualization; Publication; Use existing data; Perform new experiment), annotated with "data + metadata" and "Data Interoperability".]

Page 5: UKON 2014

Experimental workflow

[Figure: two copies of the experimental workflow diagram (Data Scientist; Planning; Data Collection; Data Management; Analysis; Visualization; Publication; Use existing data; Perform new experiment), annotated with "Data Reusability".]

Page 6: UKON 2014

Formats & Database Fragmentation

Publication

Page 7: UKON 2014

The Investigation/Study/Assay (ISA) infrastructure

generic format for experimental description and data exchange

open source software tools

community engagement

Page 8: UKON 2014


Page 9: UKON 2014

Experimental workflow - graph representation

[Figure: workflow graph. Sources H1 and H2 (H. Sapiens; ages given as 33 and 35 Years) yield the samples H1.sample1, H1.sample2 and H2.sample1. The samples go through Labeling (H1.sample1.labeled, H2.sample1.labeled, ...) and Scanning, giving the files h1-s1.cel, h1-s2.cel and h2-s1.cel.]

Page 10: UKON 2014

Spreadsheets for end-users

vocabulary for the description of the experimental workflow

[Figure: workflow graph. Sources H1 and H2 (H. Sapiens; ages given as 33 and 35 Years) yield the samples H1.sample1, H1.sample2 and H2.sample1. The samples go through Labeling (H1.sample1.labeled, H2.sample1.labeled, ...) and Scanning, giving the files h1-s1.cel, h1-s2.cel and h2-s1.cel.]

[Table: the same workflow as a spreadsheet, one row per sample. Column headers are not recoverable from the transcript.]

H. Sapiens   H1   35 Years   H1.sample1   Labeling   H1.sample1.labeled   Scanning   h1-s1.cel
H. Sapiens   H1   35 Years   H1.sample2   ...                             Scanning   h1-s2.cel
H. Sapiens   H2   33 Years   H2.sample1   Labeling   H2.sample1.labeled   Scanning   h2-s1.cel

Experimental workflow - graph representation
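The correspondence between the spreadsheet rows and the workflow graph can be sketched in a few lines of Python. This is only an illustration, not the ISA tools' own code: the tuple layout, the "derives" edge label and the helper names are assumptions, and the name H1.sample2.labeled is taken from the semantic graph shown later in the deck.

```python
from collections import defaultdict

# One tuple per spreadsheet row: (source, sample, labeled material, raw file).
# The column layout is an assumption; the slide shows values only.
ROWS = [
    ("H1", "H1.sample1", "H1.sample1.labeled", "h1-s1.cel"),
    ("H1", "H1.sample2", "H1.sample2.labeled", "h1-s2.cel"),
    ("H2", "H2.sample1", "H2.sample1.labeled", "h2-s1.cel"),
]


def rows_to_edges(rows):
    """Turn each row into a chain of (node, process, node) edges."""
    edges = set()
    for source, sample, labeled, raw_file in rows:
        edges.add((source, "derives", sample))  # hypothetical edge label
        edges.add((sample, "Labeling", labeled))
        edges.add((labeled, "Scanning", raw_file))
    return edges


def descendants(edges, start):
    """Collect every node reachable from start by following edges forward."""
    children = defaultdict(list)
    for parent, _process, child in edges:
        children[parent].append(child)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


EDGES = rows_to_edges(ROWS)
# Raw data files that trace back to source H1.
H1_FILES = sorted(n for n in descendants(EDGES, "H1") if n.endswith(".cel"))
print(H1_FILES)
```

Each row contributes one path through the graph, so rows that share a source (here H1) merge into a single branching node once the edges are collected in a set.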

Page 11: UKON 2014

Spreadsheets for end-users

vocabulary for the description of the experimental workflow

syntactic interoperability across biological experiments of different types

[Figure: same workflow graph and spreadsheet table as on page 10.]

Experimental workflow - graph representation

Page 12: UKON 2014

Spreadsheets for end-users

vocabulary for the description of the experimental workflow

syntactic interoperability across biological experiments of different types

[Figure: same workflow graph and spreadsheet table as on page 10.]

Experimental workflow - graph representation

Support for Ontology Annotation

Page 13: UKON 2014

semantic interoperability across biological experiments of different types

Machine-readable representation: Graph + Semantics

[Figure: semantic graph of the workflow. H1 (obi:material entity, tax:homo sapiens); H1.sample1 and H1.sample2 (obi:material sample), each linked to H1 by bfo:derives_from; labeling1 and labeling2 (obi:material processing); H1.sample1.labeled and H1.sample2.labeled (obi:processed material); scanning1 and scanning2 (obi:planned process); h1-s1.cel and h1-s2.cel (isa:raw data file). Materials and files are connected to processes by obi:is_specified_input_of and obi:is_specified_output_of. A labeling protocol (obi:protocol) is attached through isa:executes.]
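A minimal sketch of how such a graph could be held and queried as subject/predicate/object triples, assuming plain strings in place of full IRIs. The prefixed names are copied from the slide; rdf:type for the class links, the direction of each link and the helper functions are assumptions made for illustration, not the ISA2OWL implementation.

```python
INPUT_OF = "obi:is_specified_input_of"
OUTPUT_OF = "obi:is_specified_output_of"

# (subject, predicate, object) triples; strings stand in for IRIs.
TRIPLES = {
    ("H1", "rdf:type", "obi:material entity"),
    ("H1.sample1", "rdf:type", "obi:material sample"),
    ("H1.sample2", "rdf:type", "obi:material sample"),
    ("H1.sample1", "bfo:derives_from", "H1"),
    ("H1.sample2", "bfo:derives_from", "H1"),
    ("H1.sample1", INPUT_OF, "labeling1"),
    ("H1.sample1.labeled", OUTPUT_OF, "labeling1"),
    ("H1.sample1.labeled", INPUT_OF, "scanning1"),
    ("h1-s1.cel", OUTPUT_OF, "scanning1"),
    ("h1-s1.cel", "rdf:type", "isa:raw data file"),
    ("H1.sample2", INPUT_OF, "labeling2"),
    ("H1.sample2.labeled", OUTPUT_OF, "labeling2"),
    ("H1.sample2.labeled", INPUT_OF, "scanning2"),
    ("h1-s2.cel", OUTPUT_OF, "scanning2"),
    ("h1-s2.cel", "rdf:type", "isa:raw data file"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def outputs_downstream(triples, material):
    """Follow input_of -> process -> output_of links starting at a material."""
    found, frontier = set(), [material]
    while frontier:
        current = frontier.pop()
        for _s, _p, process in match(triples, s=current, p=INPUT_OF):
            for out, _p2, _o in match(triples, p=OUTPUT_OF, o=process):
                if out not in found:
                    found.add(out)
                    frontier.append(out)
    return found


def raw_files_for(triples, material):
    """Raw data files produced downstream of the given material."""
    return sorted(m for m in outputs_downstream(triples, material)
                  if (m, "rdf:type", "isa:raw data file") in triples)


SAMPLES_OF_H1 = sorted(s for s, _p, _o in match(TRIPLES, p="bfo:derives_from", o="H1"))
print(SAMPLES_OF_H1)
print(raw_files_for(TRIPLES, "H1.sample1"))
```

Because every experiment type is expressed with the same predicates, a query such as `raw_files_for` does not need to know whether the process was a labeling, a scan or some other assay step.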

Page 14: UKON 2014

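[Figure: conversion architecture. Inputs: ISA-Tab files (editable in spreadsheet tools, e.g. MS Excel, OpenOffice & more), ISA config files and ISA mapping files. Components: ISA Mapping Parser, Graph Analyzer, IRI Generator, Ontology Lookup and Conversion Engine. Intermediate artefacts: ISA model, ISA graph, ISA mappings and IRIs.]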
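The idea of keeping the mapping separate from the conversion step can be illustrated with a small Python sketch. Everything here is hypothetical: the base IRI, the function names and the two mapping tables are placeholders, and in the architecture above the mappings would be read from ISA mapping files rather than hard-coded.

```python
from urllib.parse import quote

BASE_IRI = "http://example.org/isa/"  # placeholder namespace, not a real one

# Hypothetical mappings from ISA-Tab element names to ontology classes.
OBI_MAPPING = {
    "Sample Name": "obi:material sample",
    "Raw Data File": "isa:raw data file",
}
PROV_MAPPING = {
    "Sample Name": "prov:Entity",
    "Raw Data File": "prov:Entity",
}


def make_iri(kind, name):
    """Build a stable IRI for a node from its element kind and its name."""
    return BASE_IRI + quote(kind.lower().replace(" ", "_")) + "/" + quote(name)


def convert(nodes, mapping):
    """Type every node according to the supplied mapping.

    The conversion logic never mentions a specific ontology; swapping the
    mapping changes the vocabulary of the output without touching this code.
    """
    return [(make_iri(kind, name), "rdf:type", mapping[kind])
            for kind, name in nodes]


NODES = [("Sample Name", "H1.sample1"), ("Raw Data File", "h1-s1.cel")]
OBI_TRIPLES = convert(NODES, OBI_MAPPING)
PROV_TRIPLES = convert(NODES, PROV_MAPPING)
print(OBI_TRIPLES)
print(PROV_TRIPLES)
```

The node IRIs are identical in both outputs; only the class assigned to each node differs, which is what allows the same ISA-Tab content to be expressed against more than one semantic framework.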

Page 15: UKON 2014

- Ontology search and automated tagging (relying on NCBO BioPortal services and also LOV services in the 2nd version) on Google Spreadsheets

- Collaborative annotation; support for distributed users

- Version control & history

OntoMaton: a Bioportal powered Ontology widget for Google Spreadsheets. Maguire et al. (2013), Bioinformatics

Page 16: UKON 2014


Page 17: UKON 2014


Page 18: UKON 2014


Page 19: UKON 2014


Page 20: UKON 2014

ISA-OBI mapping

Ontology for Biomedical Investigations

Page 21: UKON 2014

ISA-OBI mapping

Ontology for Biomedical Investigations

Also mappings to SIO, PROV-O

Page 22: UKON 2014
Page 23: UKON 2014

investigation studies assays

measurement technology

Page 24: UKON 2014

investigation studies assays

measurement technology

Underlying RDF representation

Page 25: UKON 2014

Bio-GraphIIn web application

Page 26: UKON 2014

Bio-GraphIIn web application

Page 27: UKON 2014

http://isa-tools.github.io/soapdenovo2/

Page 28: UKON 2014


http://isa-tools.github.io/stato/

• General-purpose statistics ontology

• Coverage of processes (e.g. statistical tests and their conditions of application) and of the information needed by or resulting from statistical methods (e.g. probability distributions, variables, spread and variation metrics)

• STATO also benefits from: (i) extensive documentation, with textual and formal definitions; (ii) associated R code snippets, provided through the dedicated R-command metadata tag, to facilitate teaching and learning with the popular R language; (iii) documented query examples, highlighting how the ontology can be used by reviewers, tutors and students alike.

Developed in collaboration with Dr Burke, Senior Statistician, Nuffield Department of Population Health, University of Oxford

Page 29: UKON 2014


Page 30: UKON 2014

Linked Experimental Data

• Importance of data interoperability for experimental data

• Fragmentation of formats and databases

• ISA-TAB for syntactic interoperability

• ISA-OWL for semantic interoperability

• Decoupling conversion engine from semantic framework

• Support for data integration, uniform semantic queries across experiments enabled by a common semantic framework (provided by ISA2OWL)

• Application: Bio-GraphIIn, SOAPdenovo2 use case

• STATistical Ontology for annotation of statistical analysis results

Page 31: UKON 2014

funders

Page 32: UKON 2014

Questions?

You can email us... [email protected]

View our blog http://isatools.wordpress.com

Follow us on Twitter @isatools

View our website http://www.isa-tools.org

View our Git repo & contribute http://github.com/ISA-tools

Thanks for your attention!