Pragmatic Big Data View for Translational Medicine

Gary Berg-Cross

SOCoP, EarthCube S/O group

RDA WG on Data Foundations & Terminology

Scimaps Advisory Board

Source: scimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf

Opportunities in the Translational Research Cycle: Many Biological & Bedside/Medical (B&B) Areas of Big Data

Broad Biology Landscape

Medical Genomics, Large-scale Genomics, Network Biology

Opportunities to explore data and accelerate the process of discovery & treatment

New fields of related research & application

Lab Medical Practice (15Y)

Bench

Bedside Med-Concepts

SNOMED-CT

Patient

Encounters

Observations

Trials

Outline of Remaining Discussion

A few slides on general ontology issues

Import some strategies from the EarthCube Semantics and Ontology WG effort

Slides show some adaptation from earth science to medical domains

Keeping an eye on start-up work on data sharing from the Research Data Alliance

Sound Ontology Development Should Leverage Some Existing Semantic Theories

Theory of Parts: mereology or mereotopology. Is parthood transitive? Some part relations are, some are not.

Theory of Wholes: what is the difference between a part and a whole? What is the whole of a treatment?

Theory of Essence and Identity: what are essential, i.e. necessary, properties? If you lose a necessary property, you lose identity. If patient Jim loses an arm he's still Jim, but not if he dies in treatment. When is a treatment a different treatment? A genetic variant?

Theory of Dependence: some things and properties depend on others.

Theory of Qualities: features, attributes, qualia, quality spaces.

Theory of Composition and Constitution: what makes up diabetes, or a treatment?

Adapted from Guarino, N., multiple tutorials, 2002-2010. Also seen in BioPortal work.
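The transitivity question for parthood can be made concrete with a small sketch. The part-of assertions, the "structural" and "membership" labels, and the choice to treat only structural parthood as transitive are all illustrative assumptions, not taken from the slides or from any particular ontology.

```python
from collections import defaultdict

# Hypothetical part-of assertions: (part, whole, kind of part relation).
PART_OF = [
    ("finger", "hand", "structural"),
    ("hand", "arm", "structural"),
    ("arm", "patient", "structural"),
    ("patient", "trial_cohort", "membership"),
]

# Assumption for this sketch: only structural parthood chains transitively.
TRANSITIVE_KINDS = {"structural"}


def transitive_parts(assertions, transitive_kinds):
    """Return every (part, whole) pair implied by chaining transitive links."""
    graph = defaultdict(set)
    for part, whole, kind in assertions:
        if kind in transitive_kinds:
            graph[part].add(whole)

    closure = set()
    for start in list(graph):
        stack = list(graph[start])
        while stack:
            whole = stack.pop()
            if (start, whole) not in closure:
                closure.add((start, whole))
                # Follow further links without inserting new keys.
                stack.extend(graph.get(whole, ()))
    return closure


inferred = transitive_parts(PART_OF, TRANSITIVE_KINDS)
```

Under these assumptions a finger is inferred to be part of the patient, but not part of the trial cohort, because membership is not chained.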

There Are Additional Theories to Consider for Ontology Development

Theory of Participation & Roles (very important for the health delivery part): a conceptual framework for describing and analyzing communicative phenomena, agency and community problem-solving; it intersects formal pragmatics, speech acts, intents, etc.

Theory of Representation: how does one thing represent another? Models represent transcription influence of cell function; a treatment plan or specification represents real delivery acts.

Theory of Time, Spacetime and Events: cellular events and organ states; how to bridge these?

Adapted from Guarino, N., multiple tutorials, 2002-2010.

Opportunities in the Translational Research Cycle: Many Biological & Bedside/Medical (B&B) Areas of Big Data

Bridging Vocabularies & Ontologies

Lab Medical Practice

"EHR-driven genomic research" (EDGR): EHR data linked to DNA samples

Trials

Technology Insertion

Lightweight Methods

TM use cases, e.g. Alzheimer's, along the TM spectrum

Differing granular units emerge from interactions

Ontologies

TMO etc

Push

Semantics and Ontologies For Translational Medicine 7

Relevant Ideas from EarthCube Semantics & Ontology and RDA

For example:

Knowledge-based infrastructures using semantic annotation of metadata, realized using shared vocabularies, typically ontologies

Semantic technologies enable better searching of metadata catalogs

There are many Semantic Technology parts, but an important driver has been the Semantic Web & Linked Data framework (LOD), delivering a platform-agnostic variant of ODBC and JDBC Data Source Names (DSNs) via hyperlinks

Ontologies & KR languages that restrict the interpretation of domain vocabulary towards their intended meaning

Enable reasoning services

Semantic Approaches

Managing Scientific Data: From Data Integration to Scientific Workflows (Ludascher et al.), http://users.sdsc.edu/~ludaesch/Paper/gsa-sms.pdf

Semantic Automated Discovery and Integration (SADI), http://sadiframework.org: "Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels"

Linking Open Biomedical Data (twc-lobd), http://code.google.com/p/twc-lobd

TMO (Dumontier et al. 2010)
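To show what answering the creatinine question involves once the data are linked, here is a minimal sketch over plain in-memory records. It does not use SADI or any of its APIs. The record layout, the patient identifiers, the sample values, and the reading of "increasing over time" as "strictly increasing across all measurements" are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical observation records: (patient_id, test_name, date, value).
OBSERVATIONS = [
    ("p1", "creatinine", date(2013, 1, 5), 1.0),
    ("p1", "creatinine", date(2013, 2, 5), 1.3),
    ("p1", "creatinine", date(2013, 3, 5), 1.6),
    ("p1", "BUN", date(2013, 3, 5), 28.0),
    ("p2", "creatinine", date(2013, 1, 7), 1.1),
    ("p2", "creatinine", date(2013, 2, 7), 0.9),
    ("p2", "BUN", date(2013, 2, 7), 15.0),
]


def rising_creatinine(observations):
    """Patients whose creatinine rises over time, with latest BUN and creatinine."""
    series = defaultdict(lambda: defaultdict(list))
    for patient, test, when, value in observations:
        series[patient][test].append((when, value))

    result = {}
    for patient, tests in series.items():
        # Sort by date, then compare each value with the next one.
        creatinine = [v for _, v in sorted(tests.get("creatinine", []))]
        increasing = len(creatinine) >= 2 and all(
            a < b for a, b in zip(creatinine, creatinine[1:])
        )
        if increasing:
            bun = sorted(tests.get("BUN", []))
            result[patient] = {
                "latest_creatinine": creatinine[-1],
                "latest_BUN": bun[-1][1] if bun else None,
            }
    return result


report = rising_creatinine(OBSERVATIONS)
```

In a linked-data setting the same question would be posed against distributed sources. The sketch only illustrates the logic that shared terms such as "creatinine" and "BUN" make possible.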


Graphic Overview of Strategy: Semantics & Ontologies

Ideas from EarthCube S/O group

Knowledge Infrastructure Vision: Community Understanding of Semantic Role and Value

Guiding principles: Share Methods

1. Understand the Drivers (LoD)

2. Lightweight, opportunistic

   1. Modular patterns

   2. Reusable...

3. Semantic interoperability with semantic heterogeneity

4. Bottom-up & top-down approaches

5. Domain - ontology engineer teams

6. Formalized bodies of knowledge across TM domains

7. Reasoning services

"New tech insertion"

TM

Genomics, Proteomics, Disease


Linked B&B Data Applications: Horizontal & Vertical Integration

[Figure labels: applications (Sigma, Marbles, DBpedia Mobile, URIBurner, etc.); RDFS, OWL; domain DBs (chemical, genomic, proteomic, disease, treatment, EHR, BioPaths); chemical, genomic, proteomic, disease and treatment things; observations, longitudinal data, lab tests. After Christian Bizer, The Web of Linked Data (26.07.2009).]

TMO: ~75 classes for material entities (molecule, protein, cell lines, pharm preps), roles (subject, target, active ingredient), processes (diagnosis, study, intervention) & info entities (e.g. dosage, mechanism of action, sign/symptom)

Design Pattern Annotation Approach

Premise: improve discovery, re-use and integration of TM data from different sources by means of semantic annotation

Process:

Leverage ontology design patterns: flexible and self-contained building blocks, more suitable for simple annotation of interdisciplinary, multi-thematic and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies, with domain-specific aspects added on top of them

Example of a Semantic Trajectory axiom: an x (here, a Fix) is enforced to have a timestamp and a position associated with it. An x must belong to a trajectory; this is enforced by the axiom:

$$\text{Fix} \sqsubseteq \exists\,\text{atTime}.\text{OWL-Time:TemporalThing} \;\sqcap\; \exists\,\text{hasLocation}.\text{Position} \;\sqcap\; \exists\,\text{hasFix}^{-}.\text{SemanticTrajectory}$$

"... We automatize the creation of properties hasNext, hasSuccessor, hasPrevious and hasPredecessor, making use of DL axioms, etc. ..."

Schematic basis for annotation
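The trajectory pattern can be mimicked in ordinary code to see what the axiom demands. The sketch below is not an OWL implementation: it assumes a Fix is a record that cannot be built without a timestamp, a position and an owning trajectory, and it derives the next/previous links from time order. The class names follow the axiom; the method `add_fix`, the helper `successors`, and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Fix:
    at_time: datetime                       # atTime: mandatory timestamp
    has_location: Tuple[float, float]       # hasLocation: mandatory position
    trajectory: "SemanticTrajectory" = field(repr=False)  # inverse of hasFix
    has_previous: Optional["Fix"] = field(default=None, repr=False)
    has_next: Optional["Fix"] = field(default=None, repr=False)


@dataclass(eq=False)
class SemanticTrajectory:
    fixes: List[Fix] = field(default_factory=list)

    def add_fix(self, at_time, location):
        """Create a Fix bound to this trajectory and rebuild the ordering links."""
        fix = Fix(at_time, location, self)
        self.fixes.append(fix)
        self.fixes.sort(key=lambda f: f.at_time)
        for earlier, later in zip(self.fixes, self.fixes[1:]):
            earlier.has_next = later
            later.has_previous = earlier
        self.fixes[0].has_previous = None
        self.fixes[-1].has_next = None
        return fix


def successors(fix):
    """All later fixes, i.e. the transitive reading of has_next."""
    found = []
    while fix.has_next is not None:
        fix = fix.has_next
        found.append(fix)
    return found


traj = SemanticTrajectory()
b = traj.add_fix(datetime(2013, 3, 25, 10, 5), (38.90, -77.04))
a = traj.add_fix(datetime(2013, 3, 25, 10, 0), (38.89, -77.03))
c = traj.add_fix(datetime(2013, 3, 25, 10, 10), (38.91, -77.05))
```

Fixes can be added out of order and the links are still derived automatically, which is the point of automating hasNext and hasPrevious. One possible adaptation to the medical setting would treat a patient observation as the fix, but that mapping is an assumption here.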


Use Case and Competency Question Methods

Guide lightweight design and development from use cases and the Competency Questions needed for applications (these tell us what an ontology is good for)

Semantic Trajectory questions:

Show moving objects which stop at x and y (could be only x and y)

Show the objects which move at a ground speed of 0.4 m/s

Each moving object (x) has attributes (temperature, ground speed, heading, direction) which describe its status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest, available on maps by lat-lon
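Competency questions are useful because each one can be turned into a query that a candidate model must be able to answer. As a rough sketch, two of the trajectory questions above are written here as plain functions. The `FixRecord` layout, the place names, the object identifiers, and the reading of the speed value as 0.4 m/s are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixRecord:
    object_id: str
    place: str            # hypothetical named location of the fix
    ground_speed: float   # metres per second


FIXES = [
    FixRecord("bird1", "x", 0.0),
    FixRecord("bird1", "between", 0.4),
    FixRecord("bird1", "y", 0.0),
    FixRecord("bird2", "x", 0.0),
    FixRecord("bird2", "between", 1.2),
]


def objects_stopping_at(fixes, places, stop_speed=0.0):
    """Objects that have a stationary fix at every one of the given places."""
    stops = {}
    for f in fixes:
        if f.ground_speed <= stop_speed:
            stops.setdefault(f.object_id, set()).add(f.place)
    return {obj for obj, seen in stops.items() if set(places) <= seen}


def objects_moving_at(fixes, speed, tolerance=1e-9):
    """Objects with at least one fix at the given ground speed."""
    return {f.object_id for f in fixes if abs(f.ground_speed - speed) <= tolerance}


stoppers = objects_stopping_at(FIXES, ["x", "y"])
movers = objects_moving_at(FIXES, 0.4)
```

If a question cannot be expressed against the model at all, for example because there is no place or speed attribute, that points to a gap in the design, which is how competency questions guide lightweight development.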


Lightweight Building Block Illustration

Low-hanging fruit: leverage initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work

Reduced entry barrier for domain scientists to contribute data/LoD

Directly applicable to a variety of trajectory datasets and easily extensible, e.g. to align with existing ontologies, foundational ontologies or other domain-specific vocabularies

Simple parts/patterns & direct relations to data: triple-like parts

Semantics and Ontologies For EarthCube 13

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

Warm patient...

Too simple a triple

Richer schema


Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure vertical integration from the observations-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

Transcription Process

Integrating systems biology models & biomedical ontologies: R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook and George Gkoutos

Lab & Clinical Observations

From Systems Biology Markup Language

After: The Translational Medicine Ontology KB: driving personalized medicine by bridging the gap between bench and bedside. Joanne S. Luciano et al., J Biomed Semantics 2011, 2(Suppl 2):S1


Do It "Behind the Scenes"

Automate linking the B&B data via terms and correlated measurements (patient observations, lab tests)

Formalize B&B models & data relations

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists, so institutionalize what we can

"You mean I don't have to be able to read XML, RDF or OWL? Yea!"

[Figure labels: EPR; reengineered EHR (biochemical, haematological & SNP profile); acceptable diagnosis of AD with behavioral assessments, cognitive tests and appropriate brain scan; BioInformation models; tagging, annotation, etc.; treatment plan; SNP verdict (efficacious; disease receptor positive; - toxic metabolites); drug available; pharmacogenomics DB]


Communicating an Understandable Value Proposition

What: uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery and prevents users from mixing apples and oranges

How: promote common vocabularies for annotating and describing data using terms from formalized ontologies

Leverage the vast number of available repositories, ontologies, methods, standards and tools that support scientists in publishing, sharing and discovering data

Value > expected from annotation using simple metadata

But the community needs to understand the semantic technologies' vision, infrastructure and value in non-technical language

After: Patrick Maué & Marcell Roth (2012), Lost in Translation - Mediating between distributed environmental resources. 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

Translational Medicine objects/concepts

TM Ontology


Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration and reuse of heterogeneous Big Data

Linked Data argument: Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

Part of a knowledge infrastructure

Digital Medicine


Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Be driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services


Understand Requirements: Concrete Use Cases

Work should be driven by use cases generated by members of the TM community, e.g. Alzheimer's

Collect a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data, models and tools, and which have clear implications for data interoperability, ontology and semantics infrastructure


Foster Semantic Interoperability without Restricting Underlying Semantic Heterogeneity

Problem: heterogeneity is introduced by the diverse clinical & research communities

Solution: provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thus hindering their flexible reuse

Example: work from modular building blocks with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for interoperation

S. Duce & K. Janowicz, "Microtheories for SDI", 2010


Integrated KE Teams & Process: Domain Experts and Semantic Technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular educational workshops on how to do this, and also on how to publish, retrieve and integrate data, models and workflows

Practical model for the design & execution of translational informatics projects

From Philip Payne's Biomedical Knowledge Integration, Dec 2012

Illustrates major phases, exemplary input or output resources and data sets

Insert Semantics


Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e. base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g. RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models to help access data, domain models and their use in tools

This can also be used for educational applications, for learning about domain concepts and extracting information


Provide Reasoning Services for Products Developed by Our Methods

Behind the scenes: classical and non-classical reasoning services leveraging resources for

organizing and accessing data, models and tools

learning about them and

extracting information

Reasoning services can be used to develop

friendly user interfaces

dialog systems

scientist-assisting/associate services (chains) for discovering data, integrity constraint checking, generation of new knowledge and hypothesis testing

Research Data Alliance: Vision, Purpose, Partners

Vision: researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization, and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards and other deliverables

Partners: brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project, funded under the 7th Framework Program

In the US, through the RDA/US activity, funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view: need diversity, and many types of it, and often rely on 'weak' interactions

So ... partnerships, coordination networks, work dependencies, etc.

Super nodes: people, CI, roles. Roles, responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

People & Orgs

CI & Data

Roles

Sociological

Technical

We Need More than Ontologies: Joint Aspects of Socio-technical System(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts

(Manfred D. Laubichler)

Sociological: people and groups of people

Technical: semantics, semantic community


Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward:

Converge on and integrate more (BIG) TM data in an open, transparent and inclusive manner

Make adoption easier for clinicians, researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Page 2: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Opportunities in Translational Research Cycle Many

Biological amp BedsideMedical (BampB) Areas of Big Data

Broad Biology Landscape

Medical Genomics Large-scale Genomics

Network

Biology Opportunities to explore data

and accelerate process of

discovery amp treatment

New fields of related research amp application

Lab Medical Practice (15Y)

Bench

Bedside Med-Concepts

SNOMED-CT

Patient

Encounters

Observations

Trials

Outline of Remaining Discussion

A few slides on general ontology issues

Import some strategies from the EarthCube Semantics and

Ontology WG effort

Slides show some adaptation from earth science to medical

domains

Keeping an eye on start up work on data sharing from Research

Data Alliance

Sound Ontology Development should

Leverage some Existing Semantic Theories

Theory of Parts Mereology or mereotopology Is parthood transitive

Some are some not

Theory of Wholes what is the difference between a part and a whole

The whole of a treatment

Theory of Essence and Identity what are essential ie necessary properties

If you lose a necessary property you lose identity If patient Jim loses an arm he‟s still Jim but not if he dies in treatment

When is a treatment a different treatment a genetic variant

Theory of Dependence some things and properties depend on others

Theory of Qualities features attributes qualia quality spaces

Theory of Composition and Constitution What makes up Diabetes or a treatment

bull Adapted from Guarino N Multiple tutorials 2002-2010 But also seen

in BioPortal work

There are Additional Theories to

Consider for Ontology Development

ndashTheory of Participation amp Roles (very important for Health Delivery part) a conceptual framework for describing and analyzing communicative phenomena agency community problem-solving intersects formal pragmatics speech acts intents etc

ndashTheory of Representation how does one thing represent another Models represent transcription influence of cell function Treatment plan or specification represent real delivery acts

ndashTheory of Time Spacetime and Events Cellular Events and Organ States how to bridge these

Adapted from Guarino N Multiple tutorials 2002-2010

Opportunities in Translational Research Cycle Many

Biological amp BedsideMedical (BampB) Areas of Big Data

Bridging Vocabularies amp Ontologies

Lab Medical Practice

ldquoEHR-driven genomic researchrdquo

(EDGR) EHR data linked to DNA

samples

Trials

Technology Insertion

Lightweight Methods

TM use cases ndash eg Alzheimer along the TM spectrum

Differing granular units emerge from interactions

Ontologies

TMO etc

Push

Semantics and Ontologies For Translational Medicine 7

Relevant Ideas from EarthCube Semantics amp

Ontology and RDA

For Example

Knowledge-based infrastructures using semantic annotation of metadata realized using shared vocabularies typically ontologies

Semantic technologies enable better searching metadata catalogs

Many Semantic Technology parts but

an important driver has been the Semantic

Web amp Linked Data framework LOD delivering a platform agnostic variant of ODBC and

JDBC Data Source Names (DSNs) via hyperlinks

Ontologies amp KR languages that restrict the interpretation of domain vocabulary towards their intended meaning

Enable reasoning services

Semantic

Approaches

Managing Scientific Data From Data Integration to Scientific Workflows

httpuserssdscedu~ludaeschPapergsa-smspdf (Ludascher et al)

Semantic Automated Discovery

and Integration

httpsadiframeworkorg ldquoShow me patients whose

creatinine level is increasing

over time along with their

latest BUN and creatinine

levelsrdquo

Linking Open

Biomedical Data

(twc-lobd) httpcodegooglec

omptwc-lobd

TMO (Dumontier et

al 2010)

Semantics and Ontologies For Translational Medicine 8

Graphic Overview of Strategy Semantics amp Ontologies

-Ideas from EarthCube SO group

Knowledge Infrastructure Vision Community Understanding of

Semantic role and value

Guiding principles ndashShare Methods

1 Understand the Drivers (LoD)

2 Lightweight -opportunistic

1 Modular patterns

2 Resuablehellip

3 Semantic interoperability with

semantic heterogeneity

4 Bottom-up amp top-down approaches

5 Domain - ontology engineer teams

6 Formalized bodies of knowledge

across TM domains

7 Reasoning services

ldquonew tech

Insertionrdquo

TM

Genomics Proteomics Disease

Semantics and Ontologies For Translational Medicine 9

Linked B amp B Data Applications Horizontal amp Vertical Integration

chemical

DBs

Genomic

Domain

DBs

Proteomic

Domain

DBs

Disease

Domain

DBs

Treatment

Domain

DBs

Treatment

things

Chem

things

Proteomic

things Genomic

things

Disease

things

Sigma

Etc

Marbles

Etc

Dbpedia

Mobile

Etc

RDFs

OWL

After Christian Bizer

The Web of Linked

Data (26072009)

URIBurner

Etc

EHR

BioPaths

Domain

DBs

Observations

Longitudinal

data

Lab Testhellip

TMO ~ 75 classes for material entities (molecule protein cell lines

pharm preps)roles (subject target active ingredient) processes

(diagnosis study intervention) amp info entities(eg dosage

mechanism of action signsymptom

Design Pattern Annotation Approach Premise improve discovery re-usage and integration of TM data from

different sources by means of semantic annotation

Process

Leverage ontology design pattern flexible and self-contained building

blocks more suitable for simple annotation of interdisciplinary multi-

thematic and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies with domain-specific

aspects added on top of them Example of Semantic Trajectory

Axiom - an x is enforced to have a timestamp and a

position associated An x must belong to a trajectory

enforced by this Axiom

Fix le Exists atTimeOWL-TimeTemporal Thing and le

hasLocationPosition and hasFix_ SemanticTrajectory

helliphellipWe automatize the creation of proper-

ties hasNext hasSuccessor hasPrevious and

hasPredecessor making use of DL Axioms etchellip

Schematic Basis

for annotation

Semantics and Ontologies For Translational Medicine

11

Use Case and Competency Question Methods

Guide lightweight design development from use cases and

Competency Questions needed for applications (Tells us what an Ontology is good for)

Semantic Trajectory Qs

Show moving objects which stop at x and y(could be only x and y)

Show the objects which move at a ground speed of 04 ms

Each moving object(x) has attributes (temperature ground speed heading direction) which describe s status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest and available on maps by lat-lon

Semantics and Ontologies For Translational Medicine

12

Lightweight Building Block Illustration

Low hanging fruit leverages initial vocabularies amp existing

conceptual models to ensure that semantics models are available

for use in early stages of work

Reduced entry barrier for domain scientists to contribute dataLoD

directly applicable to a variety of trajectory datasets and

easily extensible eg to align with existing ontologies foundational

ontologies or other domain specific vocabularies

Simple partspatterns amp direct relations to data Triple like parts

Semantics and Ontologies For EarthCube 13

Incremental Approaches Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

Warm patienthellip

Too simple a triple

Richer schema

Semantics and Ontologies For EarthCube 14

Allow for Bottom-up amp Top-down Approaches to Semantics

This will ensure a vertical integration from the observations-based data level up

to the theory-driven formalization of key domain facts (such networks in System

Biology)

Transcription

Process

Integrating systems biology models amp biomedical ontologies RHoehndorf

MDumontier JGennari S Wimalaratne5 Bde Bono D Cook and Geo G Koutos

Lab amp Clinical

Observations

From Systems

Biology Markup

Language

After The Translational Medicine Ontology KB driving personalized

medicine by bridging the gap between bench and bedside Joanne S

Luciano et al J Biomed Semantics 2011 2(Suppl 2) S1

15

Do it ldquoBehind the Scenesrdquo

Automate linking the

BampB data via terms and

Correlated measurements

Patient Observations

Lab Tests

Formalize

BampB Models amp data

relations

Semantic technologies require knowledge of formal logic that is

unfamiliar to most BioMedical scientists So Institutionalize what we can

ldquoYou mean I donrsquot have to be able to read

XML RDF or OWL Yea

EPR

Reengineered

EHR biochemical

haematological

amp SNP profile

Acceptable

diagnosis of AD

with behavioral

assessments

cognitive tests

and appropriate

brain scan

BioInformation

Models

Tagging

Annotation etc

Treatment

Plan

SNP verdict

efficacious

disease receptor positive

ndash toxic metabolites

Drug available

Pharmacogenomics

DB

16

Communicating an Understandable Value Proposition

What Uncover hidden heterogeneities amp make them explicit

This affords incompatibility discovery prevent users from mixing apples and oranges

How Promote common vocabularies for annotating and describing data using terms formalized ontologies

Leverage vast number of available repositories ontologies methods standards and tools that support scientists in publishing sharing and discovering data

Value gt expected from annotation using simple metadata

But the community needs to understand the semantic technologies

vision-infrastructure-value in a non-technical language

After

Patrick Maueacute Roth Marcell

(2012) Lost in Translation ndash

Mediating between

distributed environmental

resources 6th

International Congress on

Environmental Modelling

and Software (iEMSs) 2012

Leipzig

Translational Medicine

Objectsconcepts

TM Ontology

Semantics and Ontologies For EarthCube

17

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange

integration and reuse of heterogeneous Big Data

Linked Data Argument ndash

Linked Data is an easily adoptable

and ready-to-use paradigm that

enables data integration and

interoperation by opening up data

silos

- Part of a knowledge infrastructure

Digital Medicine

Semantics and Ontologies For EarthCube 18

Seven (or so) Guiding Principles for

Facilitating Implementation and Application

Methods

1 Driven by concrete TM use cases needs

2 Use lightweight (semantic) approaches

3 Foster semantic interoperability without restricting extant semantic heterogeneity

4 Employ bottom-up and top-down semantics approaches

5 Involve amp enable domain experts assisted by ontology engineers

6 Continue work with S amp O to build a formal body of knowledge in the various health domains involved in TM

Technology

7 Employ classical and non-classical reasoning services

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of

Knowledge

Apply ontological engineeringKE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives ie base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance

(eg RDF in context)

Work to make ontologies first class citizens usable by statistic methods

After construction phase organize building blocks amp ontological models

To help access data domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information

Semantics and Ontologies Translational Medicine 24

Provide Reasoning Services for Products

Developed by our Methods Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data

models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assistingassociate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization, and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards, and other deliverables

Partners: brought into existence by an initial 3 research funding organizations
• The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative
• The European Commission, through the iCordi project funded under the 7th Framework Program
• In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view
• need diversity, and many types of it, and often rely on "weak" interactions
• So… partnerships, coordination networks, work dependencies, etc.
• Super nodes - people, CI, roles: roles, responsibilities, and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US, and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

[Figure labels: People & Orgs; CI & Data; Roles; Sociological; Technical]

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:
1. Changing relationships between scientists, institutions, organisms, methods, and technologies
2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts
3. Changing relationships between concepts
(Manfred D. Laubichler)

[Figure labels: Sociological - people and groups of people; Technical - semantics, semantic community]

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog, such as that afforded by VoCamps & TM domain conferences

Goals going forward:
• Converge on and integrate more (BIG) TM data in an open, transparent, and inclusive manner
• Make adoption easier for clinicians, researchers, (educators)
• Expose data and information to knowledge creation through data-enabled science
• Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Sound Ontology Development should Leverage some Existing Semantic Theories

• Theory of Parts: mereology or mereotopology. Is parthood transitive? Some parthood relations are, some are not
• Theory of Wholes: what is the difference between a part and a whole? The whole of a treatment?
• Theory of Essence and Identity: what are essential, i.e. necessary, properties?
  - If you lose a necessary property you lose identity. If patient Jim loses an arm he's still Jim, but not if he dies in treatment
  - When is a treatment a different treatment? A genetic variant?
• Theory of Dependence: some things and properties depend on others
• Theory of Qualities: features, attributes, qualia, quality spaces
• Theory of Composition and Constitution: what makes up diabetes or a treatment?

• Adapted from Guarino, N., multiple tutorials, 2002-2010; also seen in BioPortal work
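
The question "Is parthood transitive? Some are, some not" has a direct modeling consequence: each part-like relation has to be declared transitive or not before any inference is drawn from it. The sketch below is only an illustration with made-up relation names and facts; which real relations should be transitive is exactly the ontological question the slide raises.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical relations: (asserted pairs, declared transitive?)
RELATIONS = {
    "anatomical_part_of": (
        {("finger", "hand"), ("hand", "arm"), ("arm", "body")}, True),
    "member_of": (
        {("surgeon", "care team"), ("care team", "hospital network")}, False),
}


def holds(relation, a, b):
    """Check a statement, applying transitivity only where it was declared."""
    pairs, transitive = RELATIONS[relation]
    if transitive:
        pairs = transitive_closure(pairs)
    return (a, b) in pairs


print(holds("anatomical_part_of", "finger", "body"))
print(holds("member_of", "surgeon", "hospital network"))
```

Treating every part-like relation as transitive would license the second inference as well, which is the kind of silent error a theory of parts is meant to prevent.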

There are Additional Theories to Consider for Ontology Development

- Theory of Participation & Roles (very important for the Health Delivery part): a conceptual framework for describing and analyzing communicative phenomena, agency, community, and problem-solving; intersects formal pragmatics, speech acts, intents, etc.
- Theory of Representation: how does one thing represent another? Models represent transcription influence of cell function; a treatment plan or specification represents real delivery acts
- Theory of Time, Spacetime, and Events: cellular events and organ states, how to bridge these?

Adapted from Guarino, N., multiple tutorials, 2002-2010

Opportunities in Translational Research Cycle: Many Biological & Bedside/Medical (B&B) Areas of Big Data

Bridging Vocabularies & Ontologies

[Figure labels: Lab; Medical Practice; "EHR-driven genomic research" (EDGR): EHR data linked to DNA samples; Trials; Technology Insertion; Lightweight Methods; TM use cases, e.g. Alzheimer, along the TM spectrum; Differing granular units emerge from interactions; Ontologies (TMO etc.); Push]

Relevant Ideas from EarthCube Semantics & Ontology and RDA

For example:
• Knowledge-based infrastructures using semantic annotation of metadata, realized using shared vocabularies, typically ontologies
• Semantic technologies enable better searching of metadata catalogs
• Many Semantic Technology parts, but an important driver has been the Semantic Web & Linked Data framework (LOD), delivering a platform-agnostic variant of ODBC and JDBC Data Source Names (DSNs) via hyperlinks
• Ontologies & KR languages that restrict the interpretation of domain vocabulary towards their intended meaning
• Enable reasoning services

Semantic Approaches

• Managing Scientific Data: From Data Integration to Scientific Workflows (Ludascher et al.), http://users.sdsc.edu/~ludaesch/Paper/gsa-sms.pdf
• Semantic Automated Discovery and Integration, http://sadiframework.org: "Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels"
• Linking Open Biomedical Data (twc-lobd), http://code.google.com/p/twc-lobd
• TMO (Dumontier et al. 2010)
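
To make the example query above concrete ("patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels"), here is a plain-Python sketch over toy records. It does not use SADI or SPARQL. The data layout is hypothetical, and "increasing" is assumed to mean strictly increasing across all recorded values, which a real clinical definition may not share.

```python
from collections import defaultdict

# Toy observations: (patient, test, day, value). All values are made up.
OBSERVATIONS = [
    ("p1", "creatinine", 1, 0.9),
    ("p1", "creatinine", 5, 1.2),
    ("p1", "creatinine", 9, 1.6),
    ("p1", "BUN", 2, 18.0),
    ("p1", "BUN", 9, 31.0),
    ("p2", "creatinine", 1, 1.1),
    ("p2", "creatinine", 6, 1.0),
    ("p2", "BUN", 6, 15.0),
]


def rising_creatinine(observations):
    """Map each patient with strictly rising creatinine to latest values."""
    series = defaultdict(list)
    for patient, test, day, value in observations:
        series[(patient, test)].append((day, value))

    result = {}
    for (patient, test), points in series.items():
        if test != "creatinine":
            continue
        values = [v for _, v in sorted(points)]
        rising = len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))
        if rising:
            bun = sorted(series.get((patient, "BUN"), []))
            result[patient] = {
                "creatinine": values[-1],
                "BUN": bun[-1][1] if bun else None,
            }
    return result


print(rising_creatinine(OBSERVATIONS))
```

The semantic approaches listed above aim to answer the same question without the user knowing where the records live or how "increasing" is computed.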

Graphic Overview of Strategy: Semantics & Ontologies
- Ideas from EarthCube S/O group

Knowledge Infrastructure Vision; Community Understanding of Semantic role and value

Guiding principles - Share Methods
1. Understand the Drivers (LoD)
2. Lightweight, opportunistic
   1. Modular patterns
   2. Reusable…
3. Semantic interoperability with semantic heterogeneity
4. Bottom-up & top-down approaches
5. Domain - ontology engineer teams
6. Formalized bodies of knowledge across TM domains
7. Reasoning services

[Figure labels: "new tech insertion"; TM: Genomics, Proteomics, Disease]

Linked B & B Data Applications: Horizontal & Vertical Integration

[Figure, after Christian Bizer, The Web of Linked Data (26/07/2009). Data sources: Chemical DBs; Genomic Domain DBs; Proteomic Domain DBs; Disease Domain DBs; Treatment Domain DBs; BioPaths Domain DBs; EHR; Observations; Longitudinal data; Lab Test… Linked entities: Treatment things; Chem things; Proteomic things; Genomic things; Disease things. Applications: Sigma, Marbles, DBpedia Mobile, URIBurner, etc. Languages: RDFs, OWL]

TMO: ~75 classes for material entities (molecule, protein, cell lines, pharm preps), roles (subject, target, active ingredient), processes (diagnosis, study, intervention) & info entities (e.g. dosage, mechanism of action, sign/symptom)

Design Pattern Annotation Approach

Premise: improve discovery, re-usage, and integration of TM data from different sources by means of semantic annotation

Process:
• Leverage ontology design patterns: flexible and self-contained building blocks, more suitable for simple annotation of interdisciplinary, multi-thematic, and multi-perspective data than foundational ontologies alone
• Annotate based on the common vocabularies, with domain-specific aspects added on top of them

Example of Semantic Trajectory
Axiom: a Fix x is required to have a timestamp and a position associated with it, and must belong to a trajectory:
$Fix \sqsubseteq \exists\, atTime.\text{OWL-Time:TemporalThing} \sqcap \exists\, hasLocation.Position \sqcap \exists\, hasFix^{-}.SemanticTrajectory$
"… We automatize the creation of properties hasNext, hasSuccessor, hasPrevious and hasPredecessor making use of DL axioms, etc. …"

Schematic Basis for annotation
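
The Semantic Trajectory axiom above can be read operationally: every fix needs a timestamp and a position, every fix lives inside a trajectory, and the ordering links can be generated rather than hand-entered. The sketch below is a minimal in-memory illustration of that reading, not the pattern's OWL implementation; the class and attribute names simply mirror the property names on the slide.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(eq=False)  # identity comparison; avoids recursing through links
class Fix:
    at_time: Optional[float]
    location: Optional[Tuple[float, float]]
    has_next: Optional["Fix"] = field(default=None, repr=False)
    has_previous: Optional["Fix"] = field(default=None, repr=False)


class SemanticTrajectory:
    """Owns its fixes, so each accepted fix belongs to a trajectory."""

    def __init__(self, fixes):
        for f in fixes:
            # Enforce the axiom: a fix must have a time and a position.
            if f.at_time is None or f.location is None:
                raise ValueError("a Fix needs both a timestamp and a position")
        self.fixes = sorted(fixes, key=lambda f: f.at_time)
        # Generate hasNext / hasPrevious from temporal order.
        for earlier, later in zip(self.fixes, self.fixes[1:]):
            earlier.has_next = later
            later.has_previous = earlier

    def successors(self, fix):
        """All later fixes (the transitive reading of hasNext)."""
        out = []
        node = fix.has_next
        while node is not None:
            out.append(node)
            node = node.has_next
        return out


trajectory = SemanticTrajectory([
    Fix(20.0, (1.0, 1.0)),
    Fix(10.0, (0.0, 0.0)),
    Fix(30.0, (2.0, 2.0)),
])
print([f.at_time for f in trajectory.successors(trajectory.fixes[0])])
```

In a medical adaptation the "position" could be a care setting or an anatomical site rather than a coordinate, but that mapping is an assumption and not something the slide specifies.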

Use Case and Competency Question Methods

Guide lightweight design and development from use cases and the Competency Questions needed for applications (these tell us what an ontology is good for)

Semantic Trajectory Qs:
• Show moving objects which stop at x and y (could be only x and y)
• Show the objects which move at a ground speed of 0.4 m/s
  - Each moving object (x) has attributes (temperature, ground speed, heading direction) which describe its status and the environment at that x
• Show the trajectories which cross national parks
  - Parks as Points of Interest, available on maps by lat-lon
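
Competency questions double as test cases: if the model cannot answer them, it is missing something. The sketch below phrases two of the trajectory questions above as functions over toy fixes. The record layout is hypothetical, and a "stop" is assumed to be any fix with a ground speed of zero.

```python
# Toy fixes: (object id, time in s, place label, ground speed in m/s).
FIXES = [
    ("tag1", 0, "x", 0.0),
    ("tag1", 60, "road", 0.4),
    ("tag1", 120, "y", 0.0),
    ("tag2", 0, "x", 0.0),
    ("tag2", 60, "road", 1.2),
]


def objects_stopping_at(fixes, places):
    """Objects that stop (speed 0, by assumption) at every listed place."""
    stops = {}
    for obj, _, place, speed in fixes:
        if speed == 0.0:
            stops.setdefault(obj, set()).add(place)
    return sorted(o for o, visited in stops.items() if set(places) <= visited)


def objects_moving_at(fixes, speed, tolerance=1e-9):
    """Objects with at least one fix at the given ground speed."""
    return sorted({o for o, _, _, s in fixes if abs(s - speed) <= tolerance})


print(objects_stopping_at(FIXES, ["x", "y"]))
print(objects_moving_at(FIXES, 0.4))
```

The third question (trajectories crossing national parks) needs geometry for the park boundaries, which is why the slide notes that parks must be available as points of interest with coordinates.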

Lightweight Building Block Illustration

• Low-hanging fruit: leverages initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work
• Reduced entry barrier for domain scientists to contribute data/LoD
• Directly applicable to a variety of trajectory datasets, and easily extensible, e.g. to align with existing ontologies, foundational ontologies, or other domain-specific vocabularies
• Simple parts/patterns & direct relations to data; triple-like parts

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

[Figure labels: "Warm patient…": too simple a triple; Richer schema]

Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure a vertical integration from the observations-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

[Figure labels: Transcription Process, from Systems Biology Markup Language; Lab & Clinical Observations]

Integrating systems biology models & biomedical ontologies, R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook and Geo. Gkoutos

After: The Translational Medicine Ontology KB: driving personalized medicine by bridging the gap between bench and bedside, Joanne S. Luciano et al., J Biomed Semantics 2011, 2(Suppl 2): S1

Do it "Behind the Scenes"

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists. So: institutionalize what we can

• Automate linking the B&B data via terms and correlated measurements (Patient Observations, Lab Tests)
• Formalize B&B models & data relations

"You mean I don't have to be able to read XML, RDF or OWL? Yea"

[Figure labels: EPR; Reengineered EHR: biochemical, haematological & SNP profile; Acceptable diagnosis of AD with behavioral assessments, cognitive tests and appropriate brain scan; BioInformation Models; Tagging, Annotation etc.; Treatment Plan; SNP verdict: efficacious, disease receptor positive - toxic metabolites; Drug available; Pharmacogenomics DB]

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit
• This affords incompatibility discovery: prevent users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data using terms from formalized ontologies
• Leverage the vast number of available repositories, ontologies, methods, standards, and tools that support scientists in publishing, sharing, and discovering data

Value: greater than that expected from annotation using simple metadata

But the community needs to understand the semantic technologies vision-infrastructure-value in non-technical language

After: Patrick Maué, Marcell Roth (2012), Lost in Translation - Mediating between distributed environmental resources, 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

[Figure labels: Translational Medicine; Objects/concepts; TM Ontology]

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration, and reuse of heterogeneous Big Data

Linked Data Argument - Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos
- Part of a knowledge infrastructure

Digital Medicine

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods
1. Be driven by concrete TM use cases and needs
2. Use lightweight (semantic) approaches
3. Foster semantic interoperability without restricting extant semantic heterogeneity
4. Employ bottom-up and top-down semantics approaches
5. Involve & enable domain experts, assisted by ontology engineers
6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology
7. Employ classical and non-classical reasoning services

Understand Requirements: Concrete Use Cases

• Work should be driven by use cases generated by members of the TM community, e.g. Alzheimer
• Collect a set of use cases from groups along the TM spectrum
• Need a substantial study of interconnected use cases that expose requirements related to data, models, and tools, and that have clear implications for data interoperability, ontology, and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thus hindering their flexible reusability

Example: Work from modular building blocks with microtheories of locally valid semantics
• Manage multiple small, internally consistent ontologies and focus on interrelations as needed for inter-operation

S. Duce & K. Janowicz, "Microtheories for SDI", 2010
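
The microtheory idea above (small, internally consistent vocabularies that are loaded and combined on demand) can be sketched with plain dictionaries. This is a toy illustration, not the approach of the cited paper: the two contexts and their term meanings are hypothetical, chosen only to show a term that is valid locally but ambiguous once contexts are combined.

```python
# Each microtheory is a small vocabulary that is consistent on its own.
# Contexts and meanings are hypothetical examples.
MICROTHEORIES = {
    "clinic": {
        "MI": "myocardial infarction",
        "heart attack": "myocardial infarction",
    },
    "lab": {
        "MI": "mitotic index",
    },
}


def interpret(term, context):
    """Look a term up inside one microtheory only."""
    return MICROTHEORIES[context].get(term)


def load(contexts):
    """Combine the chosen microtheories and report conflicting terms."""
    combined = {}
    for ctx in contexts:
        for term, meaning in MICROTHEORIES[ctx].items():
            combined.setdefault(term, {})[ctx] = meaning
    conflicts = {t: m for t, m in combined.items() if len(set(m.values())) > 1}
    return combined, conflicts


combined, conflicts = load(["clinic", "lab"])
print(conflicts)
```

Nothing is hardwired: data annotated with "MI" stays valid in its own context, and the conflict only surfaces when a user chooses to combine the two, which is where explicit interrelations would be needed.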

Integrated KE Teams & Process - domain experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products
• Use consistent strategies & methods
• Facilitate good documentation
• May need regular educational workshops on how to do this, and also on how to publish, retrieve, and integrate data, models, and workflows


and Australia have

formally launched a

collaboration called

the Research Data

Alliance in

Gothenburg Sweden

People

amp Orgs

CIamp Data

Roles

Sociological

Technical

We need more than Ontologies - Joint Aspects of

Socio-technical system(s)

Innovation is driven by several things

1 Changing relationships between scientists institutions

organisms methods and technologies

2 Changing topology of research literature dominant

topics questions problem areas research fronts

3 Changing relationships between concepts Manfred D Laubichler

Sociological

ndash people and

groups of

people

Technical

ndash

Semantics Semantic

community

Semantics and Ontologies For Translational Medicine 30

Closing Remarks and Comments

While many details need to be added these should come from continued dialog such as afforded by VoCamps amp TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open transparent and inclusive manner

Make adoption easier for clinicians researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance Interworkability of data and information

All of these of course depends on our tools and techniques scaling up to the magnitude of the problems at hand

There are Additional Theories to Consider for Ontology Development

- Theory of Participation & Roles (very important for the Health Delivery part): a conceptual framework for describing and analyzing communicative phenomena, agency, community and problem-solving; intersects formal pragmatics, speech acts, intents, etc.

- Theory of Representation: how does one thing represent another? Models represent transcription influence of cell function; a treatment plan or specification represents real delivery acts

- Theory of Time, Spacetime and Events: cellular events and organ states, and how to bridge these

Adapted from N. Guarino, multiple tutorials, 2002-2010

Opportunities in Translational Research Cycle: Many Biological & Bedside/Medical (B&B) Areas of Big Data

Bridging Vocabularies & Ontologies

Lab, Medical Practice

"EHR-driven genomic research" (EDGR): EHR data linked to DNA samples

Trials

Technology Insertion

Lightweight Methods

TM use cases, e.g. Alzheimer, along the TM spectrum

Differing granular units emerge from interactions

Ontologies: TMO etc.

Push

Semantics and Ontologies For Translational Medicine 7

Relevant Ideas from EarthCube Semantics & Ontology and RDA

For example:

Knowledge-based infrastructures use semantic annotation of metadata, realized with shared vocabularies, typically ontologies

Semantic technologies enable better searching of metadata catalogs

There are many Semantic Technology parts, but an important driver has been the Semantic Web & Linked Data framework (LOD), delivering a platform-agnostic variant of ODBC and JDBC Data Source Names (DSNs) via hyperlinks

Ontologies & KR languages restrict the interpretation of domain vocabulary towards its intended meaning

Enable reasoning services

Semantic Approaches:

Managing Scientific Data: From Data Integration to Scientific Workflows, http://users.sdsc.edu/~ludaesch/Paper/gsa-sms.pdf (Ludascher et al.)

Semantic Automated Discovery and Integration, http://sadiframework.org: "Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels"

Linking Open Biomedical Data (twc-lobd), http://code.google.com/p/twc-lobd

TMO (Dumontier et al. 2010)
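The example question quoted above can be illustrated with a minimal sketch. It is plain Python over in-memory records, not SADI or SPARQL; the record layout, patient identifiers and values are hypothetical, and "increasing" is read as strictly increasing over all recorded creatinine values.

```python
from collections import defaultdict

# Hypothetical lab observations: (patient_id, test_name, ISO date, value in mg/dL).
OBSERVATIONS = [
    ("p1", "creatinine", "2012-01-10", 0.9),
    ("p1", "creatinine", "2012-06-10", 1.4),
    ("p1", "BUN", "2012-06-10", 28.0),
    ("p2", "creatinine", "2012-02-01", 1.1),
    ("p2", "creatinine", "2012-07-01", 1.0),
    ("p2", "BUN", "2012-07-01", 15.0),
]


def rising_creatinine(observations):
    """Return latest creatinine and BUN for patients whose creatinine strictly increases."""
    series = defaultdict(lambda: defaultdict(list))
    for patient, test, date, value in observations:
        series[patient][test].append((date, value))

    result = {}
    for patient, tests in series.items():
        # ISO dates sort chronologically as plain strings.
        creatinine = [v for _, v in sorted(tests.get("creatinine", []))]
        increasing = len(creatinine) >= 2 and all(
            a < b for a, b in zip(creatinine, creatinine[1:])
        )
        if increasing:
            bun = sorted(tests.get("BUN", []))
            result[patient] = {
                "creatinine": creatinine[-1],
                "BUN": bun[-1][1] if bun else None,
            }
    return result
```

In a Linked Data setting the same question would be posed against distributed sources; the sketch only shows the logic the question implies.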

Graphic Overview of Strategy: Semantics & Ontologies - Ideas from EarthCube S/O group

Knowledge Infrastructure Vision: Community Understanding of Semantic role and value

Guiding principles - Share Methods

1. Understand the Drivers (LoD)

2. Lightweight, opportunistic

2.1 Modular patterns

2.2 Reusable…

3. Semantic interoperability with semantic heterogeneity

4. Bottom-up & top-down approaches

5. Domain - ontology engineer teams

6. Formalized bodies of knowledge across TM domains

7. Reasoning services

"New tech insertion"

TM: Genomics, Proteomics, Disease

Linked B & B Data Applications: Horizontal & Vertical Integration

[Diagram, after Christian Bizer, The Web of Linked Data (26 July 2009). Labels: chemical DBs; genomic, proteomic, disease, treatment and BioPaths domain DBs; EHR observations, longitudinal data, lab tests…; treatment, chem, proteomic, genomic and disease "things"; RDFS, OWL; Sigma, Marbles, DBpedia Mobile, URIBurner, etc.]

TMO: ~75 classes for material entities (molecule, protein, cell lines, pharm preps), roles (subject, target, active ingredient), processes (diagnosis, study, intervention) & info entities (e.g. dosage, mechanism of action, sign/symptom)

Design Pattern Annotation Approach

Premise: improve discovery, re-use and integration of TM data from different sources by means of semantic annotation

Process:

Leverage ontology design patterns: flexible and self-contained building blocks, more suitable for simple annotation of interdisciplinary, multi-thematic and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies, with domain-specific aspects added on top of them

Example of a Semantic Trajectory axiom: a fix x is required to have an associated timestamp and position, and x must belong to a trajectory:

$\text{Fix} \sqsubseteq \exists\,\text{atTime}.\text{OWL-Time:TemporalThing} \sqcap \exists\,\text{hasLocation}.\text{Position} \sqcap \exists\,\text{hasFix}^{-}.\text{SemanticTrajectory}$

……We automatize the creation of properties hasNext, hasSuccessor, hasPrevious and hasPredecessor making use of DL axioms, etc.…

Schematic basis for annotation
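A minimal sketch of what the Fix axiom asks for, in plain Python rather than OWL. The field names are illustrative stand-ins for atTime, hasLocation and the inverse of hasFix. The check is a closed-world reading of the axiom (missing data counts as a violation), which is an assumption of this sketch rather than how a DL reasoner treats it. Only hasNext is derived here, by ordering the valid fixes of each trajectory in time.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Fix:
    """One fix of a trajectory; field names are illustrative, not the pattern's IRIs."""
    at_time: Optional[str]                   # ISO 8601 timestamp (atTime)
    position: Optional[Tuple[float, float]]  # (lat, lon) (hasLocation)
    trajectory: Optional[str]                # owning trajectory (inverse of hasFix)


def violations(fix):
    """List which parts of the Fix axiom a record fails to satisfy."""
    missing = []
    if fix.at_time is None:
        missing.append("atTime")
    if fix.position is None:
        missing.append("hasLocation")
    if fix.trajectory is None:
        missing.append("hasFix^-")
    return missing


def link_fixes(fixes):
    """Derive hasNext pairs by ordering the valid fixes of each trajectory in time."""
    by_trajectory = {}
    for fix in fixes:
        if not violations(fix):
            by_trajectory.setdefault(fix.trajectory, []).append(fix)
    has_next = []
    for members in by_trajectory.values():
        members.sort(key=lambda f: f.at_time)
        has_next.extend(zip(members, members[1:]))
    return has_next
```

hasPrevious would follow by swapping each pair.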

Use Case and Competency Question Methods

Guide lightweight design and development from use cases and the Competency Questions needed for applications (these tell us what an ontology is good for)

Semantic Trajectory Qs:

Show moving objects which stop at x and y (could be only x and y)

Show the objects which move at a ground speed of 0.4 m/s

Each moving object (x) has attributes (temperature, ground speed, heading direction) which describe its status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest, available on maps by lat-lon
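As a rough illustration, competency questions like these can be turned into executable checks against sample data before any ontology is finalized. The fixes, object names and park bounding box below are hypothetical. "Crossing" is simplified to "has at least one fix inside the box", and "stopping" to "has a fix with ground speed 0".

```python
import math

# Hypothetical fixes: (object_id, lat, lon, ground_speed in m/s).
FIXES = [
    ("bird-1", 44.60, -110.50, 0.4),
    ("bird-1", 44.61, -110.49, 0.0),
    ("bird-2", 40.00, -100.00, 0.4),
    ("bird-3", 44.40, -110.60, 2.5),
]

# Hypothetical point of interest: a park approximated by a lat/lon bounding box.
PARK = {"south": 44.1, "north": 45.1, "west": -111.2, "east": -109.8}


def objects_moving_at(fixes, speed):
    """CQ: show the objects which move at a given ground speed."""
    return sorted({obj for obj, _, _, s in fixes if math.isclose(s, speed, abs_tol=1e-9)})


def objects_stopping(fixes):
    """CQ: show moving objects which stop somewhere (speed 0 at some fix)."""
    return sorted({obj for obj, _, _, s in fixes if s == 0.0})


def objects_crossing(fixes, park):
    """CQ: show objects whose trajectory has at least one fix inside the park box."""
    return sorted({
        obj for obj, lat, lon, _ in fixes
        if park["south"] <= lat <= park["north"] and park["west"] <= lon <= park["east"]
    })
```

If a question cannot be answered from the modelled attributes, that is a signal the pattern needs extending.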

Lightweight Building Block Illustration

Low-hanging fruit: leverage initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work

Reduced entry barrier for domain scientists to contribute data/LoD

Directly applicable to a variety of trajectory datasets and easily extensible, e.g. to align with existing ontologies, foundational ontologies or other domain-specific vocabularies

Simple parts/patterns & direct relations to data; triple-like parts

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

Warm patient…

Too simple a triple

Richer schema
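One possible reading of the step from "too simple a triple" to a richer schema, sketched in Python. The field names, the unit code and the 37.5 threshold are assumptions made for illustration only (not GRAIL's model and not clinical guidance). The point is that the coarse state can be derived from the richer record, but not the other way round.

```python
from dataclasses import dataclass

# The "too simple" form: a bare feature-state triple.
simple_triple = ("patient-17", "hasState", "warm")


@dataclass(frozen=True)
class Observation:
    """A richer record; every field name here is an assumption for illustration."""
    subject: str      # who was observed
    feature: str      # what was measured
    value: float      # the measured quantity
    unit: str         # unit of measure
    observed_at: str  # ISO 8601 timestamp
    method: str       # how it was measured


def to_state(obs, warm_threshold_c=37.5):
    """Derive the coarse state triple from the richer observation."""
    if obs.feature != "body temperature" or obs.unit != "Cel":
        raise ValueError("only Celsius body temperature is handled in this sketch")
    state = "warm" if obs.value >= warm_threshold_c else "not warm"
    return (obs.subject, "hasState", state)
```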

Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure a vertical integration from the observation-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

[Diagram labels: Transcription Process, from Systems Biology Markup Language; Lab & Clinical Observations]

Integrating systems biology models & biomedical ontologies, R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook and G. Gkoutos

After: The Translational Medicine Ontology and KB: driving personalized medicine by bridging the gap between bench and bedside, Joanne S. Luciano et al., J Biomed Semantics 2011, 2(Suppl 2): S1

Do it "Behind the Scenes"

Automate linking the B&B data via terms and correlated measurements (patient observations, lab tests)

Formalize B&B models & data relations

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists, so institutionalize what we can

"You mean I don't have to be able to read XML, RDF or OWL? Yea"

[Diagram labels: EPR; re-engineered EHR with biochemical, haematological & SNP profile; acceptable diagnosis of AD with behavioral assessments, cognitive tests and appropriate brain scan; BioInformation Models; tagging, annotation etc.; treatment plan; SNP verdict: efficacious, disease receptor positive - toxic metabolites; drug available; Pharmacogenomics DB]

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery and prevents users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data, using terms formalized in ontologies

Leverage the vast number of available repositories, ontologies, methods, standards and tools that support scientists in publishing, sharing and discovering data

Value: greater than expected from annotation using simple metadata

But the community needs to understand the semantic technologies vision-infrastructure-value in non-technical language

After Patrick Maué, Marcell Roth (2012) Lost in Translation - Mediating between distributed environmental resources. 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

[Diagram labels: Translational Medicine objects/concepts; TM Ontology]
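The "apples and oranges" point can be made concrete with a small sketch. Two hypothetical datasets carry explicit annotations for what they measure and in which unit, and a merge is refused when the annotations disagree. The dataset names, annotation keys and values are assumptions for illustration.

```python
# Hypothetical annotated datasets: each states what it measures and in which unit.
LAB_A = {"concept": "serum creatinine", "unit": "mg/dL", "values": [0.9, 1.4]}
LAB_B = {"concept": "serum creatinine", "unit": "umol/L", "values": [80.0, 124.0]}


def incompatibilities(left, right):
    """List the annotation fields on which two datasets disagree."""
    return [key for key in ("concept", "unit") if left[key] != right[key]]


def merge(left, right):
    """Concatenate values only when the annotations agree."""
    diff = incompatibilities(left, right)
    if diff:
        raise ValueError("cannot merge, annotations differ on: " + ", ".join(diff))
    return {**left, "values": left["values"] + right["values"]}
```

Without the annotations the two value lists would look equally valid and could be pooled silently. With them, the unit mismatch surfaces before any analysis is run.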

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration and reuse of heterogeneous Big Data

Linked Data argument: Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

- Part of a knowledge infrastructure

Digital Medicine

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services

Understand Requirements: Concrete Use Cases

Work should be driven by use cases generated by members of the TM community, e.g. Alzheimer

Collect a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data, models and tools, and which have clear implications for data interoperability, ontology and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thus hindering their flexible reuse

Example: Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for inter-operation

S. Duce & K. Janowicz, "Microtheories for SDI", 2010
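A minimal sketch of the microtheory idea, assuming two small hypothetical vocabularies and one explicit alignment between them. Each microtheory keeps its own locally valid subclass chain, and interrelations are stated separately and consulted only when inter-operation needs them. All terms and names below are made up for illustration.

```python
# Two hypothetical microtheories: small vocabularies that are each locally consistent.
CLINICAL = {"name": "clinical", "subclass_of": {"MI": "HeartDisease", "HeartDisease": "Disease"}}
RESEARCH = {"name": "research", "subclass_of": {"MyocardialInfarction": "CardiovascularPhenotype"}}

# Interrelations are kept apart from the theories and added only as needed.
ALIGNMENT = {("clinical", "MI"): ("research", "MyocardialInfarction")}


def ancestors(theory, term):
    """Walk the subclass chain inside a single microtheory."""
    found = []
    while term in theory["subclass_of"]:
        term = theory["subclass_of"][term]
        found.append(term)
    return found


def combined_view(term, source, theories, alignment):
    """Collect superclasses of a term from its own theory plus any aligned theory."""
    view = {source["name"]: ancestors(source, term)}
    target = alignment.get((source["name"], term))
    if target:
        other = next(t for t in theories if t["name"] == target[0])
        view[other["name"]] = ancestors(other, target[1])
    return view
```

Neither vocabulary is rewritten to fit the other, so each community keeps its own semantics while the alignment carries the inter-operation.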

Integrated KE Teams & Process: domain experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular educational workshops on how to do this, and also on how to publish, retrieve and integrate data, models and workflows

Practical model for the design & execution of translational informatics projects, from Philip Payne's Biomedical Knowledge Integration, Dec 2012: illustrates major phases, exemplary input or output resources and data sets

Insert Semantics

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e. base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g. RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models to help access data, domain models and their use in tools

This can also be used for educational applications, for learning about domain concepts and extracting information
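Tracking categorical data back to measurements can be sketched as follows. This is plain Python, not an RDF or provenance vocabulary. The measurement records, the property name creatinineCategory and the 1.3 cut-off are hypothetical and purely illustrative, not clinical guidance.

```python
# Hypothetical measurements keyed by identifier.
MEASUREMENTS = {
    "m1": {"patient": "p1", "analyte": "creatinine", "value": 1.6, "unit": "mg/dL"},
    "m2": {"patient": "p2", "analyte": "creatinine", "value": 0.9, "unit": "mg/dL"},
}


def categorize(measurement_id, measurements, upper_limit=1.3):
    """Turn a measurement into a categorical statement that remembers its source."""
    m = measurements[measurement_id]
    category = "elevated" if m["value"] > upper_limit else "normal"
    return {
        "statement": (m["patient"], "creatinineCategory", category),
        "derived_from": measurement_id,                # provenance link
        "rule": f"value > {upper_limit} {m['unit']}",  # how the category was assigned
    }


def trace(annotation, measurements):
    """Follow the provenance link from a category back to the raw measurement."""
    return measurements[annotation["derived_from"]]
```

Keeping the rule next to the link means a later change of threshold can be detected and the categories recomputed from the original values.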

Provide Reasoning Services for Products Developed by our Methods

Behind the scenes: classical and non-classical reasoning services leveraging resources for organizing and accessing data, models and tools, learning about them, and extracting information

Reasoning services can be used to:

Develop friendly user interfaces

Dialog systems

Scientist assisting/associate services (chains) for discovering data, integrity constraint checking, generation of new knowledge and hypothesis testing
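Of the services listed, integrity constraint checking is the simplest to sketch. The example below uses hypothetical triples and a made-up rule that every Prescription needs a drug and a dosage. It checks the data under a closed-world reading, which is an assumption of this sketch and not how an open-world OWL reasoner would treat missing values.

```python
# Hypothetical triples describing treatment records.
TRIPLES = [
    ("rx-1", "type", "Prescription"),
    ("rx-1", "hasDrug", "drug-A"),
    ("rx-1", "hasDosage", "10 mg"),
    ("rx-2", "type", "Prescription"),
    ("rx-2", "hasDrug", "drug-B"),
]

# Constraint: every instance of a class must have each listed property.
REQUIRED = {"Prescription": ["hasDrug", "hasDosage"]}


def check_constraints(triples, required):
    """Return (subject, missing_property) pairs under a closed-world reading."""
    types = {s: o for s, p, o in triples if p == "type"}
    present = {(s, p) for s, p, _ in triples}
    problems = []
    for subject, cls in sorted(types.items()):
        for prop in required.get(cls, []):
            if (subject, prop) not in present:
                problems.append((subject, prop))
    return problems
```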

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards and other deliverables

Partners: brought into existence by an initial 3 research funding organizations:

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project funded under the 7th Framework Program

In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view: need diversity, and many types of it, and often rely on 'weak' interactions

SO … partnerships, coordination networks, work dependencies, etc.

Super Nodes - people, CI roles: roles, responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

[Diagram labels: People & Orgs; CI & Data; Roles; Sociological; Technical]

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts

(Manfred D. Laubichler)

Sociological - people and groups of people

Technical - Semantics

Semantic community

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward:

Converge on and integrate more (BIG) TM data in an open, transparent and inclusive manner

Make adoption easier for clinicians, researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Page 6: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Opportunities in Translational Research Cycle Many

Biological amp BedsideMedical (BampB) Areas of Big Data

Bridging Vocabularies amp Ontologies

Lab Medical Practice

ldquoEHR-driven genomic researchrdquo

(EDGR) EHR data linked to DNA

samples

Trials

Technology Insertion

Lightweight Methods

TM use cases ndash eg Alzheimer along the TM spectrum

Differing granular units emerge from interactions

Ontologies

TMO etc

Push

Semantics and Ontologies For Translational Medicine 7

Relevant Ideas from EarthCube Semantics amp

Ontology and RDA

For Example

Knowledge-based infrastructures using semantic annotation of metadata realized using shared vocabularies typically ontologies

Semantic technologies enable better searching metadata catalogs

Many Semantic Technology parts but

an important driver has been the Semantic

Web amp Linked Data framework LOD delivering a platform agnostic variant of ODBC and

JDBC Data Source Names (DSNs) via hyperlinks

Ontologies amp KR languages that restrict the interpretation of domain vocabulary towards their intended meaning

Enable reasoning services

Semantic

Approaches

Managing Scientific Data From Data Integration to Scientific Workflows

httpuserssdscedu~ludaeschPapergsa-smspdf (Ludascher et al)

Semantic Automated Discovery

and Integration

httpsadiframeworkorg ldquoShow me patients whose

creatinine level is increasing

over time along with their

latest BUN and creatinine

levelsrdquo

Linking Open

Biomedical Data

(twc-lobd) httpcodegooglec

omptwc-lobd

TMO (Dumontier et

al 2010)

Semantics and Ontologies For Translational Medicine 8

Graphic Overview of Strategy Semantics amp Ontologies

-Ideas from EarthCube SO group

Knowledge Infrastructure Vision Community Understanding of

Semantic role and value

Guiding principles ndashShare Methods

1 Understand the Drivers (LoD)

2 Lightweight -opportunistic

1 Modular patterns

2 Resuablehellip

3 Semantic interoperability with

semantic heterogeneity

4 Bottom-up amp top-down approaches

5 Domain - ontology engineer teams

6 Formalized bodies of knowledge

across TM domains

7 Reasoning services

ldquonew tech

Insertionrdquo

TM

Genomics Proteomics Disease

Semantics and Ontologies For Translational Medicine 9

Linked B amp B Data Applications Horizontal amp Vertical Integration

chemical

DBs

Genomic

Domain

DBs

Proteomic

Domain

DBs

Disease

Domain

DBs

Treatment

Domain

DBs

Treatment

things

Chem

things

Proteomic

things Genomic

things

Disease

things

Sigma

Etc

Marbles

Etc

Dbpedia

Mobile

Etc

RDFs

OWL

After Christian Bizer

The Web of Linked

Data (26072009)

URIBurner

Etc

EHR

BioPaths

Domain

DBs

Observations

Longitudinal

data

Lab Testhellip

TMO ~ 75 classes for material entities (molecule protein cell lines

pharm preps)roles (subject target active ingredient) processes

(diagnosis study intervention) amp info entities(eg dosage

mechanism of action signsymptom

Design Pattern Annotation Approach Premise improve discovery re-usage and integration of TM data from

different sources by means of semantic annotation

Process

Leverage ontology design pattern flexible and self-contained building

blocks more suitable for simple annotation of interdisciplinary multi-

thematic and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies with domain-specific

aspects added on top of them Example of Semantic Trajectory

Axiom - an x is enforced to have a timestamp and a

position associated An x must belong to a trajectory

enforced by this Axiom

Fix le Exists atTimeOWL-TimeTemporal Thing and le

hasLocationPosition and hasFix_ SemanticTrajectory

helliphellipWe automatize the creation of proper-

ties hasNext hasSuccessor hasPrevious and

hasPredecessor making use of DL Axioms etchellip

Schematic Basis

for annotation

Semantics and Ontologies For Translational Medicine

11

Use Case and Competency Question Methods

Guide lightweight design development from use cases and

Competency Questions needed for applications (Tells us what an Ontology is good for)

Semantic Trajectory Qs

Show moving objects which stop at x and y(could be only x and y)

Show the objects which move at a ground speed of 04 ms

Each moving object(x) has attributes (temperature ground speed heading direction) which describe s status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest and available on maps by lat-lon

Semantics and Ontologies For Translational Medicine

12

Lightweight Building Block Illustration

Low hanging fruit leverages initial vocabularies amp existing

conceptual models to ensure that semantics models are available

for use in early stages of work

Reduced entry barrier for domain scientists to contribute dataLoD

directly applicable to a variety of trajectory datasets and

easily extensible eg to align with existing ontologies foundational

ontologies or other domain specific vocabularies

Simple partspatterns amp direct relations to data Triple like parts

Semantics and Ontologies For EarthCube 13

Incremental Approaches Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

Warm patienthellip

Too simple a triple

Richer schema

Semantics and Ontologies For EarthCube 14

Allow for Bottom-up amp Top-down Approaches to Semantics

This will ensure a vertical integration from the observations-based data level up

to the theory-driven formalization of key domain facts (such networks in System

Biology)

Transcription

Process

Integrating systems biology models amp biomedical ontologies RHoehndorf

MDumontier JGennari S Wimalaratne5 Bde Bono D Cook and Geo G Koutos

Lab amp Clinical

Observations

From Systems

Biology Markup

Language

After The Translational Medicine Ontology KB driving personalized

medicine by bridging the gap between bench and bedside Joanne S

Luciano et al J Biomed Semantics 2011 2(Suppl 2) S1

15

Do it ldquoBehind the Scenesrdquo

Automate linking the

BampB data via terms and

Correlated measurements

Patient Observations

Lab Tests

Formalize

BampB Models amp data

relations

Semantic technologies require knowledge of formal logic that is

unfamiliar to most BioMedical scientists So Institutionalize what we can

ldquoYou mean I donrsquot have to be able to read

XML RDF or OWL Yea

EPR

Reengineered

EHR biochemical

haematological

amp SNP profile

Acceptable

diagnosis of AD

with behavioral

assessments

cognitive tests

and appropriate

brain scan

BioInformation

Models

Tagging

Annotation etc

Treatment

Plan

SNP verdict

efficacious

disease receptor positive

ndash toxic metabolites

Drug available

Pharmacogenomics

DB

16

Communicating an Understandable Value Proposition

What Uncover hidden heterogeneities amp make them explicit

This affords incompatibility discovery prevent users from mixing apples and oranges

How Promote common vocabularies for annotating and describing data using terms formalized ontologies

Leverage vast number of available repositories ontologies methods standards and tools that support scientists in publishing sharing and discovering data

Value gt expected from annotation using simple metadata

But the community needs to understand the semantic technologies

vision-infrastructure-value in a non-technical language

After

Patrick Maueacute Roth Marcell

(2012) Lost in Translation ndash

Mediating between

distributed environmental

resources 6th

International Congress on

Environmental Modelling

and Software (iEMSs) 2012

Leipzig

Translational Medicine

Objectsconcepts

TM Ontology

Semantics and Ontologies For EarthCube

17

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange

integration and reuse of heterogeneous Big Data

Linked Data Argument ndash

Linked Data is an easily adoptable

and ready-to-use paradigm that

enables data integration and

interoperation by opening up data

silos

- Part of a knowledge infrastructure

Digital Medicine

Semantics and Ontologies For EarthCube 18

Seven (or so) Guiding Principles for

Facilitating Implementation and Application

Methods

1 Driven by concrete TM use cases needs

2 Use lightweight (semantic) approaches

3 Foster semantic interoperability without restricting extant semantic heterogeneity

4 Employ bottom-up and top-down semantics approaches

5 Involve amp enable domain experts assisted by ontology engineers

6 Continue work with S amp O to build a formal body of knowledge in the various health domains involved in TM

Technology

7 Employ classical and non-classical reasoning services

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of

Knowledge

Apply ontological engineeringKE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives ie base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance

(eg RDF in context)

Work to make ontologies first class citizens usable by statistic methods

After construction phase organize building blocks amp ontological models

To help access data domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information

Semantics and Ontologies Translational Medicine 24

Provide Reasoning Services for Products

Developed by our Methods Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data

models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assistingassociate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing

Research Data Alliance Vision Purpose Partners

Vision Researchers around the world share amp use research data without

barriers

Purpose accelerate international data-driven innovation and discovery by

facilitating research data sharing and exchange

use and re-use

standards harmonization and

discoverability

This will be achieved through the development and adoption of infrastructure policy practice standards and other deliverables

Partners brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government through the Australian National Data Service supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission through the iCordi project funded under the 7th Framework Program

In US through the RDAUS activity funded by the National Science Foundation

httprd-allianceorgorgindexhtml

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems / socio-technical network view

need diversity, and many types of it, and often rely on ‘weak’ interactions

So … partnerships, coordination networks, work dependencies, etc.

Super Nodes – people, CI, roles. Roles, responsibilities, and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US, and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

Diagram labels: People & Orgs; CI & Data; Roles; Sociological; Technical

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods, and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts (Manfred D. Laubichler)

Sociological – people and groups of people

Technical – semantics

Semantic community

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward:

Converge on and integrate more (BIG) TM data in an open, transparent, and inclusive manner

Make adoption easier for clinicians, researchers, (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand


Relevant Ideas from EarthCube Semantics & Ontology and RDA

For example:

Knowledge-based infrastructures using semantic annotation of metadata, realized using shared vocabularies, typically ontologies

Semantic technologies enable better searching of metadata catalogs

There are many Semantic Technology parts, but an important driver has been the Semantic Web & Linked Data (LOD) framework, delivering a platform-agnostic variant of ODBC and JDBC Data Source Names (DSNs) via hyperlinks

Ontologies & KR languages that restrict the interpretation of domain vocabulary towards its intended meaning

Enable reasoning services

Semantic Approaches

Managing Scientific Data: From Data Integration to Scientific Workflows, http://users.sdsc.edu/~ludaesch/Paper/gsa-sms.pdf (Ludäscher et al.)

Semantic Automated Discovery and Integration, http://sadiframework.org: “Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels”

Linking Open Biomedical Data (twc-lobd), http://code.google.com/p/twc-lobd

TMO (Dumontier et al. 2010)
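The SADI example query on this slide can be read as two steps: find patients whose creatinine series rises over time, then report each such patient's latest BUN and creatinine values. A minimal Python sketch of that logic over in-memory records follows. The record layout, the sample values, and the function names are illustrative assumptions; SADI itself answers such queries through semantic web services, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict

# Hypothetical lab observations: (patient_id, test_name, day, value).
# All values are invented for illustration only.
OBSERVATIONS = [
    ("p1", "creatinine", 1, 0.9), ("p1", "creatinine", 5, 1.2),
    ("p1", "creatinine", 9, 1.6), ("p1", "BUN", 9, 28.0),
    ("p2", "creatinine", 1, 1.0), ("p2", "creatinine", 6, 0.9),
    ("p2", "BUN", 6, 14.0),
]


def is_increasing(values):
    """True if there are at least two values and each exceeds the previous one."""
    return len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))


def rising_creatinine_patients(observations):
    """Return {patient: {test: latest value}} for patients whose creatinine
    values rise strictly over time."""
    series = defaultdict(lambda: defaultdict(list))
    for patient, test, day, value in observations:
        series[patient][test].append((day, value))

    result = {}
    for patient, tests in series.items():
        creatinine = [v for _, v in sorted(tests.get("creatinine", []))]
        if not is_increasing(creatinine):
            continue
        # Latest value of each requested test, chosen by day.
        result[patient] = {
            test: max(tests[test])[1]
            for test in ("BUN", "creatinine")
            if tests.get(test)
        }
    return result


if __name__ == "__main__":
    print(rising_creatinine_patients(OBSERVATIONS))
```

A strict "each value higher than the last" test is only one reading of "increasing over time"; a trend fit over the series would be another reasonable choice.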

Graphic Overview of Strategy: Semantics & Ontologies

- Ideas from EarthCube S&O group

Knowledge Infrastructure Vision: Community Understanding of Semantic role and value

Guiding principles – Share Methods

1. Understand the Drivers (LoD)

2. Lightweight, opportunistic

   1. Modular patterns

   2. Reusable…

3. Semantic interoperability with semantic heterogeneity

4. Bottom-up & top-down approaches

5. Domain - ontology engineer teams

6. Formalized bodies of knowledge across TM domains

7. Reasoning services

“new tech insertion”

TM

Genomics, Proteomics, Disease

Linked B&B Data Applications: Horizontal & Vertical Integration

Diagram labels (after Christian Bizer, The Web of Linked Data, 26/07/2009): chemical DBs; Genomic, Proteomic, Disease, Treatment, and BioPaths Domain DBs; EHR; Observations; Longitudinal data; Lab Test…; Treatment things; Chem things; Proteomic things; Genomic things; Disease things; applications such as Sigma, Marbles, DBpedia Mobile, URIBurner, etc.; RDFS; OWL

TMO: ~75 classes for material entities (molecule, protein, cell lines, pharm preps), roles (subject, target, active ingredient), processes (diagnosis, study, intervention) & info entities (e.g., dosage, mechanism of action, sign/symptom)

Design Pattern Annotation Approach

Premise: improve discovery, re-usage, and integration of TM data from different sources by means of semantic annotation

Process:

Leverage ontology design patterns: flexible and self-contained building blocks, more suitable for simple annotation of interdisciplinary, multi-thematic, and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies, with domain-specific aspects added on top of them

Example of Semantic Trajectory

Axiom - a fix x is required to have an associated timestamp and position. An x must belong to a trajectory, as enforced by this axiom:

$\mathit{Fix} \sqsubseteq \exists\,\mathit{atTime}.\text{OWL-Time:TemporalThing} \sqcap \exists\,\mathit{hasLocation}.\mathit{Position} \sqcap \exists\,\mathit{hasFix}^{-}.\mathit{SemanticTrajectory}$

……We automatize the creation of properties hasNext, hasSuccessor, hasPrevious, and hasPredecessor, making use of DL axioms, etc.…

Schematic basis for annotation
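The Fix axiom on the Semantic Trajectory slide says that every fix carries a time, a position, and membership in a trajectory. A minimal Python sketch of the same constraint follows, assuming a plain in-memory model rather than an OWL reasoner. The class names `Fix` and `SemanticTrajectory`, the `add_fix` method, `violates_axiom`, and the sample coordinates are illustrative; the `has_next`/`has_previous` links are derived from timestamp order as a stand-in for the DL axioms the slide mentions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Fix:
    """One fix: timestamp and position are mandatory constructor arguments."""
    at_time: datetime
    has_location: Tuple[float, float]  # (lat, lon), an assumed encoding
    trajectory: Optional["SemanticTrajectory"] = None
    has_next: Optional["Fix"] = None
    has_previous: Optional["Fix"] = None


@dataclass(eq=False)
class SemanticTrajectory:
    fixes: List[Fix] = field(default_factory=list)

    def add_fix(self, fix: Fix) -> None:
        """Attach a fix, then rebuild next/previous links by time order."""
        if fix.trajectory is not None and fix.trajectory is not self:
            raise ValueError("fix already belongs to another trajectory")
        fix.trajectory = self
        self.fixes.append(fix)
        self.fixes.sort(key=lambda f: f.at_time)
        for f in self.fixes:
            f.has_next = f.has_previous = None
        for earlier, later in zip(self.fixes, self.fixes[1:]):
            earlier.has_next = later
            later.has_previous = earlier


def violates_axiom(fix: Fix) -> bool:
    """A fix lacking a time, a position, or a trajectory breaks the constraint."""
    return fix.at_time is None or fix.has_location is None or fix.trajectory is None


if __name__ == "__main__":
    t = SemanticTrajectory()
    a = Fix(datetime(2013, 3, 25, 9, 0), (34.41, -119.85))
    b = Fix(datetime(2013, 3, 25, 9, 5), (34.42, -119.84))
    t.add_fix(b)  # insertion order does not matter; links follow time order
    t.add_fix(a)
    print(a.has_next is b, violates_axiom(a))
```

Note that this checks the constraint in a closed-world way, whereas an OWL reasoner would treat the axiom as a statement from which a trajectory's existence can be inferred.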

Use Case and Competency Question Methods

Guide lightweight design development from use cases and Competency Questions needed for applications (these tell us what an ontology is good for)

Semantic Trajectory Qs:

Show moving objects which stop at x and y (could be only x and y)

Show the objects which move at a ground speed of 0.4 m/s

Each moving object (x) has attributes (temperature, ground speed, heading direction) which describe its status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest, available on maps by lat-lon
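Competency questions like the Semantic Trajectory ones can double as executable checks on a candidate model. A small Python sketch follows, assuming fixes are plain records with an object, a time, a named place, and a ground speed. The record fields, sample data, and helper names are hypothetical, and a "stop" is assumed here to mean a fix with zero ground speed.

```python
from typing import Dict, List, NamedTuple, Set


class FixRecord(NamedTuple):
    obj: str             # moving object identifier
    time: int            # seconds since start (assumed encoding)
    place: str           # named location of the fix
    ground_speed: float  # metres per second


# Invented sample data for illustration only.
FIXES: List[FixRecord] = [
    FixRecord("obj1", 0, "x", 0.0),
    FixRecord("obj1", 60, "park", 0.4),
    FixRecord("obj1", 120, "y", 0.0),
    FixRecord("obj2", 0, "x", 0.0),
    FixRecord("obj2", 60, "z", 1.5),
]


def stops_of(fixes: List[FixRecord]) -> Dict[str, Set[str]]:
    """Map each object to the places where it stopped (ground speed 0)."""
    stops: Dict[str, Set[str]] = {}
    for f in fixes:
        if f.ground_speed == 0.0:
            stops.setdefault(f.obj, set()).add(f.place)
    return stops


def objects_stopping_at(fixes: List[FixRecord], places: Set[str],
                        only: bool = False) -> List[str]:
    """CQ: objects which stop at all given places (optionally only those)."""
    return sorted(
        obj for obj, s in stops_of(fixes).items()
        if places <= s and (not only or s == places)
    )


def objects_moving_at(fixes: List[FixRecord], speed: float,
                      tol: float = 1e-9) -> List[str]:
    """CQ: objects with at least one fix at the given ground speed."""
    return sorted({f.obj for f in fixes if abs(f.ground_speed - speed) <= tol})


def objects_crossing(fixes: List[FixRecord], place: str) -> List[str]:
    """CQ: objects whose trajectory has a fix at the given place."""
    return sorted({f.obj for f in fixes if f.place == place})


if __name__ == "__main__":
    print(objects_stopping_at(FIXES, {"x", "y"}))
    print(objects_moving_at(FIXES, 0.4))
    print(objects_crossing(FIXES, "park"))
```

If a question cannot be expressed against the model at all, that is a signal the pattern is missing a class or property.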

Lightweight Building Block Illustration

Low-hanging fruit: leverage initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work

Reduced entry barrier for domain scientists to contribute data/LoD

directly applicable to a variety of trajectory datasets, and

easily extensible, e.g., to align with existing ontologies, foundational ontologies, or other domain-specific vocabularies

Simple parts/patterns & direct relations to data; triple-like parts

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

Warm patienthellip

Too simple a triple

Richer schema

Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure a vertical integration from the observations-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

Diagram labels: Transcription Process; Lab & Clinical Observations; From Systems Biology Markup Language

Integrating systems biology models & biomedical ontologies, R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook, and G. Gkoutos

After: The Translational Medicine Ontology KB: driving personalized medicine by bridging the gap between bench and bedside, Joanne S. Luciano et al., J Biomed Semantics 2011; 2(Suppl 2): S1

Do it “Behind the Scenes”

Automate linking the B&B data via terms and correlated measurements

Patient Observations

Lab Tests

Formalize B&B models & data relations

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists, so institutionalize what we can

“You mean I don’t have to be able to read XML, RDF, or OWL? Yea”

Diagram labels: EPR; reengineered EHR (biochemical, haematological & SNP profile); acceptable diagnosis of AD with behavioral assessments, cognitive tests, and appropriate brain scan; BioInformation Models; tagging, annotation, etc.; Treatment Plan; SNP verdict (efficacious, disease receptor positive – toxic metabolites); drug available; Pharmacogenomics DB

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery: prevent users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data using terms from formalized ontologies

Leverage the vast number of available repositories, ontologies, methods, standards, and tools that support scientists in publishing, sharing, and discovering data

Value: greater than expected from annotation using simple metadata

But the community needs to understand the semantic technologies' vision, infrastructure, and value in non-technical language

After: Patrick Maué, Marcell Roth (2012), Lost in Translation – Mediating between distributed environmental resources, 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

Diagram labels: Translational Medicine objects/concepts; TM Ontology

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration, and reuse of heterogeneous Big Data

Linked Data Argument – Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

- Part of a knowledge infrastructure

Digital Medicine

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S&O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services

Understand Requirements: Concrete Use Cases

Work should be driven by use cases generated by members of the TM community – e.g., Alzheimer's

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data, models, and tools, and which have clear implications for data interoperability, ontology, and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thus hindering their flexible reusability

Example - Work from modular building blocks, with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for inter-operation

S. Duce & K. Janowicz, “Microtheories for SDI”, 2010
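One way to picture the microtheory idea is as a registry of small vocabularies that are combined only when a task needs them, with cross-links stated explicitly rather than hardwired. A hedged Python sketch follows. The microtheory names, terms, definitions, the bridge relation, and the functions `combine`, `heterogeneities`, and `bridges_for` are all invented for illustration and are not taken from Duce & Janowicz.

```python
from typing import Dict, Iterable, Set, Tuple

# Each microtheory is a small, locally valid vocabulary: term -> definition.
# Names and definitions are placeholders, not clinical content.
MICROTHEORIES: Dict[str, Dict[str, str]] = {
    "clinic": {"fever": "temperature above the clinic's local threshold"},
    "research": {
        "fever": "temperature above the study protocol threshold",
        "biomarker": "measurable indicator of a biological state",
    },
}

# Explicit interrelations between terms of different microtheories.
BRIDGES: Set[Tuple[str, str, str]] = {
    ("clinic:fever", "relatedTo", "research:fever"),
}


def combine(names: Iterable[str]) -> Dict[str, str]:
    """Load the chosen microtheories side by side, keeping terms namespaced
    so conflicting local definitions stay visible instead of being merged."""
    combined: Dict[str, str] = {}
    for name in names:
        for term, definition in MICROTHEORIES[name].items():
            combined[f"{name}:{term}"] = definition
    return combined


def heterogeneities(names: Iterable[str]) -> Set[str]:
    """Terms defined differently in at least two of the chosen microtheories."""
    seen: Dict[str, Set[str]] = {}
    for name in names:
        for term, definition in MICROTHEORIES[name].items():
            seen.setdefault(term, set()).add(definition)
    return {term for term, defs in seen.items() if len(defs) > 1}


def bridges_for(names: Iterable[str]) -> Set[Tuple[str, str, str]]:
    """Bridges whose two ends both lie in the chosen microtheories."""
    chosen = set(names)
    return {
        (s, p, o) for s, p, o in BRIDGES
        if s.split(":")[0] in chosen and o.split(":")[0] in chosen
    }


if __name__ == "__main__":
    print(sorted(combine(["clinic", "research"])))
    print(heterogeneities(["clinic", "research"]))
```

Keeping the namespaces intact is the design choice that matters here: interoperation happens through the bridges, while each community's local meaning is left untouched.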

Integrated KE Teams & Process - domain experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular Educational Workshops on how to do this, and also on how to publish, retrieve, and integrate data, models, and workflows

Practical model for the design & execution of translational informatics projects

From Philip Payne's Biomedical Knowledge Integration, Dec 2012

Illustrates major phases, exemplary input or output resources, and data sets

Insert Semantics

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e., base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g., RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models

To help access data, domain models, and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information
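The point about tracking categorical data back to measurements can be illustrated with a small provenance link. The Python sketch below assumes a category such as "warm" is derived from a numeric measurement by a stated rule. The class names, field names, the `categorize_temperature` and `trace` helpers, and the threshold passed in the usage example are hypothetical; the threshold is a caller-supplied parameter, not a clinical recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    measurement_id: str
    quantity: str
    value: float
    unit: str


@dataclass(frozen=True)
class CategoricalFact:
    subject: str
    category: str
    derived_from: Measurement  # provenance link back to the raw value
    rule: str                  # human-readable description of the rule used


def categorize_temperature(subject: str, m: Measurement,
                           warm_above: float) -> CategoricalFact:
    """Derive a category from a measurement and keep its provenance."""
    category = "warm" if m.value > warm_above else "not warm"
    rule = f"{m.quantity} > {warm_above} {m.unit} means warm"
    return CategoricalFact(subject, category, m, rule)


def trace(fact: CategoricalFact) -> str:
    """Walk the provenance link from a category back to its measurement."""
    m = fact.derived_from
    return (f"{fact.subject} is '{fact.category}' because {m.quantity} = "
            f"{m.value} {m.unit} ({m.measurement_id}); rule: {fact.rule}")


if __name__ == "__main__":
    reading = Measurement("obs-1", "body temperature", 38.2, "Cel")
    fact = categorize_temperature("patient-1", reading, warm_above=37.5)
    print(trace(fact))
```

In an RDF setting the same link would typically be carried by a named graph or a provenance vocabulary rather than an object reference.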

Page 9: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Semantics and Ontologies For Translational Medicine 9

Linked B amp B Data Applications Horizontal amp Vertical Integration

chemical

DBs

Genomic

Domain

DBs

Proteomic

Domain

DBs

Disease

Domain

DBs

Treatment

Domain

DBs

Treatment

things

Chem

things

Proteomic

things Genomic

things

Disease

things

Sigma

Etc

Marbles

Etc

Dbpedia

Mobile

Etc

RDFs

OWL

After Christian Bizer

The Web of Linked

Data (26072009)

URIBurner

Etc

EHR

BioPaths

Domain

DBs

Observations

Longitudinal

data

Lab Testhellip

TMO ~ 75 classes for material entities (molecule protein cell lines

pharm preps)roles (subject target active ingredient) processes

(diagnosis study intervention) amp info entities(eg dosage

mechanism of action signsymptom

Design Pattern Annotation Approach Premise improve discovery re-usage and integration of TM data from

different sources by means of semantic annotation

Process

Leverage ontology design pattern flexible and self-contained building

blocks more suitable for simple annotation of interdisciplinary multi-

thematic and multi-perspective data than foundational ontologies alone

Annotate based on the common vocabularies with domain-specific

aspects added on top of them Example of Semantic Trajectory

Axiom - an x is enforced to have a timestamp and a

position associated An x must belong to a trajectory

enforced by this Axiom

Fix le Exists atTimeOWL-TimeTemporal Thing and le

hasLocationPosition and hasFix_ SemanticTrajectory

helliphellipWe automatize the creation of proper-

ties hasNext hasSuccessor hasPrevious and

hasPredecessor making use of DL Axioms etchellip

Schematic Basis

for annotation

Semantics and Ontologies For Translational Medicine

11

Use Case and Competency Question Methods

Guide lightweight design development from use cases and

Competency Questions needed for applications (Tells us what an Ontology is good for)

Semantic Trajectory Qs

Show moving objects which stop at x and y(could be only x and y)

Show the objects which move at a ground speed of 04 ms

Each moving object(x) has attributes (temperature ground speed heading direction) which describe s status and the environment at that x

Show the trajectories which cross national parks

Parks as Points of Interest and available on maps by lat-lon

Lightweight Building Block Illustration

Low-hanging fruit: leverage initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work

Reduced entry barrier for domain scientists to contribute data/LoD

Directly applicable to a variety of trajectory datasets and easily extensible, e.g., to align with existing ontologies, foundational ontologies or other domain-specific vocabularies

Simple parts/patterns & direct relations to data: triple-like parts

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

[Figure: "Warm patient..." as too simple a triple, next to a richer schema]
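The step from a bare triple to a richer schema can be pictured as follows. The slide does not spell out the richer schema, so every field name, the unit code and the threshold below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# The simple triple: subject, predicate, object.
simple_triple = ("patient-17", "hasState", "warm")


@dataclass(frozen=True)
class FeatureState:
    """A richer record: which feature was observed, its value, when, and the state."""
    subject: str
    feature: str
    value: float
    unit: str
    observed_at: datetime
    state: str


def classify_temperature(celsius, threshold=37.5):
    """Map a measurement to a state label; the threshold is illustrative only."""
    return "warm" if celsius >= threshold else "normal"


reading = 38.2
richer = FeatureState(
    subject="patient-17",
    feature="body temperature",
    value=reading,
    unit="Cel",
    observed_at=datetime(2013, 3, 25, 9, 30),
    state=classify_temperature(reading),
)

# The richer record can still be flattened back to the simple triple.
flattened = (richer.subject, "hasState", richer.state)
print(flattened)
```

Because the richer record flattens back to the original triple, early lightweight annotations stay usable while the schema grows.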

Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure a vertical integration from the observations-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

[Figure labels: Transcription Process (from Systems Biology Markup Language); Lab & Clinical Observations]

"Integrating systems biology models & biomedical ontologies", R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook and G. Gkoutos

After "The Translational Medicine Ontology KB: driving personalized medicine by bridging the gap between bench and bedside", Joanne S. Luciano et al., J Biomed Semantics 2011, 2(Suppl 2): S1

Do it "Behind the Scenes"

Automate linking the B&B data via terms and correlated measurements (patient observations, lab tests)

Formalize B&B models & data relations

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists. So institutionalize what we can.

"You mean I don't have to be able to read XML, RDF or OWL?" Yea!

[Figure labels: EPR; Reengineered EHR: biochemical, haematological & SNP profile; Acceptable diagnosis of AD with behavioral assessments, cognitive tests and appropriate brain scan; BioInformation Models; Tagging, Annotation etc.; Treatment Plan; SNP verdict: efficacious, disease receptor positive - toxic metabolites; Drug available; Pharmacogenomics DB]

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery: it prevents users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data using terms from formalized ontologies

Leverage the vast number of available repositories, ontologies, methods, standards and tools that support scientists in publishing, sharing and discovering data

Value: greater than expected from annotation using simple metadata

But the community needs to understand the semantic technologies' vision, infrastructure and value in non-technical language
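A small sketch of "apples and oranges" detection: two datasets that look alike until their annotations are compared. The datasets and field names are hypothetical, and the glucose conversion factor is an approximation used only for illustration.

```python
# Two hypothetical glucose datasets; the unit annotations are what make the
# hidden heterogeneity explicit.
site_a = {"analyte": "glucose", "unit": "mg/dL", "values": [90.0, 126.0]}
site_b = {"analyte": "glucose", "unit": "mmol/L", "values": [5.0, 7.0]}

# Approximate conversion factor, valid for glucose only.
MG_DL_PER_MMOL_L = 18.0


def to_mg_dl(dataset):
    """Return the values in mg/dL, converting when the annotation requires it."""
    if dataset["unit"] == "mg/dL":
        return list(dataset["values"])
    if dataset["unit"] == "mmol/L":
        return [v * MG_DL_PER_MMOL_L for v in dataset["values"]]
    raise ValueError(f"unknown unit: {dataset['unit']}")


def combine(a, b):
    """Merge two datasets only after checking that they describe the same thing."""
    if a["analyte"] != b["analyte"]:
        raise ValueError("different analytes: refusing to mix apples and oranges")
    return {"analyte": a["analyte"], "unit": "mg/dL",
            "values": to_mg_dl(a) + to_mg_dl(b)}


merged = combine(site_a, site_b)
print(merged["values"])
```

Without the unit annotation the two value lists would simply be concatenated, and the mismatch would surface only as implausible statistics later on.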

After Patrick Maué, Roth Marcell (2012) "Lost in Translation - Mediating between distributed environmental resources", 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

[Figure labels: Translational Medicine Objects/concepts; TM Ontology]

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration and reuse of heterogeneous Big Data

Linked Data Argument: Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

- Part of a knowledge infrastructure

Digital Medicine

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Be driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services

Understand Requirements: Concrete Use Cases

Work should be driven by use cases generated by members of the TM community, e.g., Alzheimer's

Collect a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data, models and tools, and which have clear implications for data interoperability, ontology and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thereby hindering flexible reuse

Example: Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for inter-operation

S. Duce & K. Janowicz, "Microtheories for SDI", 2010
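The microtheory idea can be sketched with plain dictionaries: each community keeps its own locally valid definition, and bridges are stated only where inter-operation needs them. The terms, definitions and the `narrowerThan` relation name are invented for illustration and are not taken from the cited work.

```python
# Each microtheory is a small, internally consistent vocabulary.
MICROTHEORIES = {
    "clinic": {"fever": "body temperature >= 38.0 Cel"},
    "research": {"fever": "body temperature >= 37.5 Cel",
                 "hypothermia": "body temperature < 35.0 Cel"},
}

# Interrelations are stated only where inter-operation needs them.
BRIDGES = {("clinic", "fever"): ("research", "fever", "narrowerThan")}


def load(*names):
    """Combine the requested microtheories without merging their definitions."""
    return {(name, term): definition
            for name in names
            for term, definition in MICROTHEORIES[name].items()}


def lookup(context, term, loaded):
    """Resolve a term inside one context; other contexts stay untouched."""
    return loaded[(context, term)]


loaded = load("clinic", "research")
print(lookup("clinic", "fever", loaded))
print(lookup("research", "fever", loaded))
```

Keying every term by its context is what lets both definitions of "fever" coexist; data is bound to a context at load time rather than hardwired to one ontology.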

Integrated KE Teams & Process: domain experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular educational workshops on how to do this, and also on how to publish, retrieve and integrate data, models and workflows

Practical model for the design & execution of translational informatics projects

From Philip Payne's Biomedical Knowledge Integration, Dec 2012

[Figure: illustrates major phases, with exemplary input or output resources and data sets; annotated "Insert Semantics"]

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e., base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g., RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models to help access data, domain models and their use in tools

This can also be used for educational applications: learning about domain concepts and extracting information
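Tracking a category back to its measurements amounts to keeping a provenance link alongside the derived fact. The sketch below assumes hypothetical record types and an illustrative threshold; it is not clinical guidance and does not use any particular provenance vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    measurement_id: str
    quantity: str
    value: float
    unit: str


@dataclass(frozen=True)
class CategoricalFact:
    subject: str
    category: str
    derived_from: tuple   # ids of the measurements behind the category


MEASUREMENTS = {
    "m1": Measurement("m1", "systolic blood pressure", 152.0, "mm[Hg]"),
    "m2": Measurement("m2", "systolic blood pressure", 148.0, "mm[Hg]"),
}


def categorize(subject, measurement_ids, threshold=140.0):
    """Derive a category from the mean value and record which readings were used."""
    values = [MEASUREMENTS[m].value for m in measurement_ids]
    mean = sum(values) / len(values)
    category = "elevated" if mean >= threshold else "not elevated"
    return CategoricalFact(subject, category, tuple(measurement_ids))


def trace(fact):
    """Follow the provenance links from a category back to its measurements."""
    return [MEASUREMENTS[m] for m in fact.derived_from]


fact = categorize("patient-17", ["m1", "m2"])
print(fact.category, [m.value for m in trace(fact)])
```

With the link in place, a disputed category can be re-derived under a different threshold without going back to the original data provider.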

Provide Reasoning Services for Products Developed by our Methods

Behind the scenes: classical and non-classical reasoning services leveraging resources for

organizing and accessing data, models and tools,

learning about them, and

extracting information

Reasoning services can be used to:

Develop friendly user interfaces

Dialog systems

Scientist-assisting/associate services (chains) for

discovering data,

integrity constraint checking,

generation of new knowledge and hypothesis testing

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards and other deliverables

Partners: brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project funded under the 7th Framework Program

In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view: need diversity, and many types of it, and often rely on "weak" interactions

So ... partnerships, coordination networks, work dependencies, etc.

Super Nodes: people, CI roles. Roles, responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

[Figure labels: People & Orgs; CI & Data; Roles; Sociological; Technical]

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts (Manfred D. Laubichler)

[Figure labels: Sociological - people and groups of people; Technical - Semantics; Semantic community]

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward:

Converge on and integrate more (BIG) TM data in an open, transparent and inclusive manner

Make adoption easier for clinicians, researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Lightweight Building Block Illustration

Low-hanging fruit: leverage initial vocabularies & existing conceptual models to ensure that semantic models are available for use in early stages of work

Reduced entry barrier for domain scientists to contribute data/LoD: directly applicable to a variety of trajectory datasets and easily extensible, e.g., to align with existing ontologies, foundational ontologies or other domain-specific vocabularies

Simple parts/patterns & direct relations to data: triple-like parts
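
As a minimal sketch of what "triple-like parts" could look like in practice, the Python below stores a few subject-predicate-object statements and queries them by pattern. The identifiers (`patient:001`, `hasObservation`, and so on) and the `match` helper are assumptions made up for illustration; they do not come from the slides or from any published vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, invented purely for illustration.
triples = [
    Triple("patient:001", "hasObservation", "obs:temp-17"),
    Triple("obs:temp-17", "hasFeature", "BodyTemperature"),
    Triple("obs:temp-17", "hasValueCelsius", "38.4"),
    Triple("patient:001", "hasDiagnosis", "Fever"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the fields that were given."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


about_patient = match(triples, subject="patient:001")
observation_facts = match(triples, subject="obs:temp-17")
```

Because each statement is a small independent part, a domain scientist can add one more line to the list without touching a schema, which is the low entry barrier the slide argues for.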

Incremental Approaches: Richer Schemata

Simple Feature-State Model (from GRAIL) becomes a richer schema

"Warm patient…": too simple a triple

[Figure: richer schema]
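
The step from a flat "warm patient" triple to a richer schema can be sketched as follows. The slide's own richer schema was a figure that did not survive extraction, so the `Observation` fields, the `classify` helper and the 37.5 threshold are all illustrative assumptions, not the GRAIL model and not clinical guidance.

```python
from dataclasses import dataclass
from datetime import datetime

# The flat version: one feature-state triple, with no measurement context.
simple_triple = ("patient:001", "hasState", "warm")


@dataclass(frozen=True)
class Observation:
    """A richer record: the state is tied to a feature, a value and a time."""
    subject: str
    feature: str            # what was observed
    value: float            # the measured quantity
    unit: str               # unit of the value
    observed_at: datetime   # when it was observed
    method: str             # how the value was obtained


def classify(obs: Observation, warm_above: float = 37.5) -> str:
    """Derive the coarse state from the measurement (threshold is illustrative)."""
    if obs.feature != "body_temperature" or obs.unit != "Cel":
        raise ValueError("classify() only handles body temperature in Celsius")
    return "warm" if obs.value > warm_above else "not warm"


rich = Observation(
    subject="patient:001",
    feature="body_temperature",
    value=38.4,
    unit="Cel",
    observed_at=datetime(2013, 3, 25, 9, 30),
    method="tympanic thermometer",
)

# The simple triple can still be recovered from the richer record.
derived_triple = (rich.subject, "hasState", classify(rich))
```

The design point is that the richer record subsumes the simple one: the coarse statement is derivable from it, but not the other way round.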

Allow for Bottom-up & Top-down Approaches to Semantics

This will ensure vertical integration from the observation-based data level up to the theory-driven formalization of key domain facts (such as networks in Systems Biology)

[Figure: Transcription process (from Systems Biology Markup Language) linked to lab & clinical observations]

"Integrating systems biology models & biomedical ontologies", R. Hoehndorf, M. Dumontier, J. Gennari, S. Wimalaratne, B. de Bono, D. Cook and G. Gkoutos

After "The Translational Medicine Ontology & KB: driving personalized medicine by bridging the gap between bench and bedside", Joanne S. Luciano et al., J Biomed Semantics 2011; 2(Suppl 2): S1

Do it "Behind the Scenes"

Automate linking the B&B data via terms and correlated measurements (patient observations, lab tests)

Formalize B&B models & data relations

Semantic technologies require knowledge of formal logic that is unfamiliar to most biomedical scientists, so institutionalize what we can

"You mean I don't have to be able to read XML, RDF or OWL? Yea!"

[Figure: re-engineered EPR/EHR with biochemical, haematological & SNP profile; acceptable diagnosis of AD with behavioral assessments, cognitive tests and appropriate brain scan; bioinformation models, tagging, annotation, etc.; pharmacogenomics DB; treatment plan (SNP verdict: efficacious, disease receptor positive - toxic metabolites; drug available)]
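
One way the automated linking of bench and bedside records via shared terms might work is sketched below. The records, field names and `TERM:` codes are hypothetical placeholders (not real SNOMED CT or LOINC codes), and the two helper functions are assumptions introduced for illustration.

```python
from collections import defaultdict

# Hypothetical records; the term codes are placeholders, not real codes.
patient_observations = [
    {"patient": "p1", "term": "TERM:glucose", "value": 9.1, "source": "bedside"},
    {"patient": "p2", "term": "TERM:memory-score", "value": 21, "source": "bedside"},
]
lab_tests = [
    {"patient": "p1", "term": "TERM:glucose", "value": 8.7, "source": "lab"},
    {"patient": "p1", "term": "TERM:snp-profile", "value": "variant-A", "source": "lab"},
]


def link_by_term(*datasets):
    """Group records from any number of datasets under (patient, term) keys."""
    index = defaultdict(list)
    for dataset in datasets:
        for record in dataset:
            index[(record["patient"], record["term"])].append(record)
    return index


def cross_source_links(index):
    """Keep only the keys whose records come from more than one source."""
    return {
        key: records
        for key, records in index.items()
        if len({r["source"] for r in records}) > 1
    }


links = cross_source_links(link_by_term(patient_observations, lab_tests))
```

The scientist only sees the resulting links; the term matching happens behind the scenes, which is the point of the slide.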

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery: prevent users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data using terms from formalized ontologies

Leverage the vast number of available repositories, ontologies, methods, standards and tools that support scientists in publishing, sharing and discovering data

Value: greater than that expected from annotation using simple metadata

But the community needs to understand the semantic technologies' vision, infrastructure and value in non-technical language

After Patrick Maué, Roth Marcell (2012), "Lost in Translation - Mediating between distributed environmental resources", 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

[Figure: Translational Medicine objects/concepts and the TM Ontology]
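
The "apples and oranges" check can be illustrated with a short sketch: once datasets carry explicit annotations from a shared vocabulary, hidden differences become something a program can detect. The dataset names, annotation keys and values below are invented for illustration and assume a very simple key-value annotation scheme.

```python
# Hypothetical dataset descriptions; the vocabulary terms are invented.
datasets = {
    "clinic_a": {"quantity": "blood_glucose", "unit": "mg/dL", "specimen": "plasma"},
    "clinic_b": {"quantity": "blood_glucose", "unit": "mmol/L", "specimen": "plasma"},
    "lab_c": {"quantity": "blood_glucose", "unit": "mg/dL", "specimen": "plasma"},
}


def incompatibilities(a: dict, b: dict) -> dict:
    """Return the annotation keys on which two datasets disagree."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


def can_merge(a: dict, b: dict) -> bool:
    """Two datasets are directly comparable only if every annotation agrees."""
    return not incompatibilities(a, b)


report = incompatibilities(datasets["clinic_a"], datasets["clinic_b"])
```

Here `clinic_a` and `clinic_b` both report "blood glucose", yet the unit annotation exposes that their numbers cannot be pooled without conversion.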

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration and reuse of heterogeneous Big Data

Linked Data argument: Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

- Part of a knowledge infrastructure

Digital Medicine

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community, e.g., Alzheimer's disease

Collect a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data, models and tools, and which have clear implications for data interoperability, ontology and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thus hindering their flexible reuse

Example: Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for interoperation

S. Duce & K. Janowicz, "Microtheories for SDI", 2010
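
The microtheory idea can be sketched under simple assumptions: each community keeps its own small vocabulary, the vocabularies are loaded side by side rather than merged, and cross-links are declared only where interoperation needs them. The term names, the `alignments` table and the helper functions are hypothetical and are not taken from Duce & Janowicz.

```python
# Each microtheory is a small, internally consistent vocabulary: term -> broader term.
# All names are hypothetical.
clinical = {"MI": "HeartDisease", "HeartDisease": "Disease"}
research = {
    "MyocardialInfarction": "CardiovascularDisorder",
    "CardiovascularDisorder": "Disorder",
}

# Interrelations are stated separately, only where interoperation needs them.
alignments = {("clinical", "MI"): ("research", "MyocardialInfarction")}


def load(*named_theories):
    """Combine microtheories at run time, keeping each in its own namespace."""
    return {name: dict(theory) for name, theory in named_theories}


def broader_terms(workspace, theory, term):
    """Walk the broader-term chain inside one microtheory."""
    chain = []
    while term in workspace[theory]:
        term = workspace[theory][term]
        chain.append(term)
    return chain


def translate(theory, term):
    """Map a term into another microtheory if an alignment was declared."""
    return alignments.get((theory, term))


workspace = load(("clinical", clinical), ("research", research))
target = translate("clinical", "MI")
research_view = broader_terms(workspace, *target)
```

Neither vocabulary is forced to adopt the other's hierarchy; data annotated with `MI` can still be read through the research community's terms when needed.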

Integrated KE Teams & Process: domain experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular educational workshops on how to do this and also on how to publish, retrieve and integrate data, models and workflows

Practical model for the design & execution of translational informatics projects

From Philip Payne's Biomedical Knowledge Integration, Dec 2012

Illustrates major phases, exemplary input or output resources, and data sets

Insert Semantics

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e., base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g., RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models to help access data, domain models and their use in tools

This can also be used for educational applications, for learning about domain concepts and extracting information
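
Tracking categorical data back to measurements can be sketched with a plain provenance pointer. The slide mentions RDF in context; the version below uses ordinary Python records instead, as a simplification. The record ids, field names and the glucose threshold in the `rule` text are illustrative assumptions, not clinical values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    id: str
    quantity: str
    value: float
    unit: str


@dataclass(frozen=True)
class Category:
    """A categorical statement plus a pointer to what it was derived from."""
    id: str
    label: str
    derived_from: Optional[str]   # id of the source measurement, if recorded
    rule: str                     # human-readable description of the derivation


# Hypothetical records and an illustrative (not clinical) threshold.
measurements = {"m1": Measurement("m1", "fasting_glucose", 7.8, "mmol/L")}
categories = [
    Category("c1", "elevated glucose", "m1", "fasting_glucose > 7.0 mmol/L"),
    Category("c2", "family history", None, "reported by patient"),
]


def trace(category: Category) -> Optional[Measurement]:
    """Follow the provenance link from a category back to its measurement."""
    if category.derived_from is None:
        return None
    return measurements[category.derived_from]


origin = trace(categories[0])
untraceable = [c.id for c in categories if trace(c) is None]
```

Keeping the derivation rule next to the link lets a later user judge whether a category such as "elevated glucose" is grounded in a real observation or only asserted.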

Provide Reasoning Services for Products Developed by our Methods

Behind the scenes: classical and non-classical reasoning services leveraging resources for

organizing and accessing data, models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist-assisting/associate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing
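
Of the services listed, integrity constraint checking is the easiest to illustrate. The sketch below assumes a toy triple store and one kind of constraint (every instance of a class must carry certain predicates); the data, the `required` table and `check_integrity` are hypothetical and stand in for what a real reasoner would do.

```python
# Hypothetical triples and constraints, for illustration only.
triples = [
    ("obs:1", "type", "Observation"),
    ("obs:1", "aboutPatient", "patient:001"),
    ("obs:1", "hasUnit", "mmol/L"),
    ("obs:2", "type", "Observation"),
    ("obs:2", "hasUnit", "mg/dL"),
]

# Every instance of the class must carry each of the listed predicates.
required = {"Observation": ["aboutPatient", "hasUnit"]}


def check_integrity(store, constraints):
    """Return (subject, missing_predicate) pairs that violate the constraints."""
    predicates_of = {}
    for s, p, _ in store:
        predicates_of.setdefault(s, set()).add(p)

    violations = []
    for s, p, o in store:
        if p == "type" and o in constraints:
            for needed in constraints[o]:
                if needed not in predicates_of[s]:
                    violations.append((s, needed))
    return violations


violations = check_integrity(triples, required)
```

A service like this could run behind a data-entry form, so the scientist sees "observation 2 is missing its patient" rather than any logic notation.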

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: Accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards and other deliverables

Partners: Brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project funded under the 7th Framework Program

In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view: need diversity, and many types of it, and often rely on "weak" interactions

So… partnerships, coordination networks, work dependencies, etc.

Super Nodes - people, CI, roles: roles, responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

[Figure labels: People & Orgs; CI & Data; Roles; Sociological; Technical]

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts

(Manfred D. Laubichler)

Sociological - people and groups of people

Technical - semantics, semantic community

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open, transparent and inclusive manner

Make adoption easier for clinicians, researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Page 15: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

After The Translational Medicine Ontology KB driving personalized

medicine by bridging the gap between bench and bedside Joanne S

Luciano et al J Biomed Semantics 2011 2(Suppl 2) S1

15

Do it ldquoBehind the Scenesrdquo

Automate linking the

BampB data via terms and

Correlated measurements

Patient Observations

Lab Tests

Formalize

BampB Models amp data

relations

Semantic technologies require knowledge of formal logic that is

unfamiliar to most BioMedical scientists So Institutionalize what we can

ldquoYou mean I donrsquot have to be able to read

XML RDF or OWL Yea

EPR

Reengineered

EHR biochemical

haematological

amp SNP profile

Acceptable

diagnosis of AD

with behavioral

assessments

cognitive tests

and appropriate

brain scan

BioInformation

Models

Tagging

Annotation etc

Treatment

Plan

SNP verdict

efficacious

disease receptor positive

ndash toxic metabolites

Drug available

Pharmacogenomics

DB

Communicating an Understandable Value Proposition

What: Uncover hidden heterogeneities & make them explicit

This affords incompatibility discovery: prevent users from mixing apples and oranges

How: Promote common vocabularies for annotating and describing data, using terms formalized in ontologies

Leverage the vast number of available repositories, ontologies, methods, standards and tools that support scientists in publishing, sharing and discovering data

Value > that expected from annotation using simple metadata

But the community needs to understand the semantic technologies' vision-infrastructure-value in a non-technical language

After: Patrick Maué & Marcell Roth (2012), “Lost in Translation – Mediating between distributed environmental resources”, 6th International Congress on Environmental Modelling and Software (iEMSs) 2012, Leipzig

[Figure labels: Translational Medicine; Objects/concepts; TM Ontology]

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange, integration and reuse of heterogeneous Big Data

Linked Data Argument: Linked Data is an easily adoptable and ready-to-use paradigm that enables data integration and interoperation by opening up data silos

- Part of a knowledge infrastructure

Digital Medicine
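As a rough illustration of the Linked Data argument, the sketch below merges two data silos expressed as subject-predicate-object triples. Everything in it is an assumption made for illustration: the example.org URIs, the predicate names, the patient and SNP identifiers, and the use of plain Python sets in place of a real RDF store.

```python
# Minimal Linked Data sketch: triples are (subject, predicate, object) tuples.
# All URIs and values below are hypothetical, invented for illustration.

EX = "http://example.org/tm/"

# Silo 1: a clinical record system
clinical = {
    (EX + "patient/42", EX + "hasDiagnosis", EX + "disease/AD"),
    (EX + "patient/42", EX + "hasObservation", EX + "obs/mmse-2012"),
}

# Silo 2: a genomics lab that happens to use the same patient URI
genomics = {
    (EX + "patient/42", EX + "hasVariant", EX + "snp/rs0000"),
    (EX + "snp/rs0000", EX + "associatedWith", EX + "disease/AD"),
}


def merge(*graphs):
    """Integrating triple sets is a plain set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


linked = merge(clinical, genomics)
facts = describe(linked, EX + "patient/42")
for predicate, obj in facts:
    print(predicate, "->", obj)
```

Because both silos name the patient with the same URI, the union answers a question neither silo could answer alone, which is the sense in which shared identifiers open up data silos.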

Seven (or so) Guiding Principles for Facilitating Implementation and Application

Methods

1. Driven by concrete TM use cases and needs

2. Use lightweight (semantic) approaches

3. Foster semantic interoperability without restricting extant semantic heterogeneity

4. Employ bottom-up and top-down semantics approaches

5. Involve & enable domain experts, assisted by ontology engineers

6. Continue work with S & O to build a formal body of knowledge in the various health domains involved in TM

Technology

7. Employ classical and non-classical reasoning services

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community, e.g. Alzheimer's

Collect a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases that expose requirements related to data, models and tools, and that have clear implications for data interoperability, ontology and semantics infrastructure

Foster Semantic Interoperability without restricting underlying Semantic Heterogeneity

Problem: Heterogeneity is introduced by the diverse clinical & research communities

Solution: Provide methods that enable users to flexibly load and combine different ontologies, instead of hardwiring data to particular ontologies and thereby hindering their flexible reuse

Example: Work from modular building blocks, with microtheories of locally valid semantics

Manage multiple small, internally consistent ontologies and focus on interrelations as needed for inter-operation

S. Duce & K. Janowicz, “Microtheories for SDI”, 2010
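The idea of loading and combining small, locally valid vocabularies can be sketched as follows. This is only an illustration under stated assumptions: the three vocabularies, their terms and the single alignment are invented, and dictionaries stand in for real ontologies. It is not the method of the cited microtheories paper.

```python
# Sketch of "microtheories": small, internally consistent vocabularies that
# are loaded and combined on demand. All names and terms are hypothetical.

# Each microtheory maps a local term to its locally valid meaning.
CLINIC = {"AD": "Alzheimer's disease diagnosed by cognitive tests"}
GENETICS_LAB = {"AD": "autosomal dominant inheritance pattern"}
NEUROLOGY = {"Alzheimer": "Alzheimer's disease"}

# Interrelations are stated only where needed for inter-operation:
# pairs of (theory, term) that are declared to mean the same thing.
ALIGNMENTS = frozenset({
    (("clinic", "AD"), ("neurology", "Alzheimer")),
})


def load(**theories):
    """Combine microtheories, keeping each term qualified by its theory."""
    combined = {}
    for name, terms in theories.items():
        for term, meaning in terms.items():
            combined[(name, term)] = meaning
    return combined


def same_meaning(a, b, alignments=ALIGNMENTS):
    """Qualified terms match only if identical or explicitly aligned."""
    return a == b or (a, b) in alignments or (b, a) in alignments


kb = load(clinic=CLINIC, genetics=GENETICS_LAB, neurology=NEUROLOGY)

# The same string "AD" is not assumed to mean the same thing everywhere.
print(same_meaning(("clinic", "AD"), ("genetics", "AD")))
print(same_meaning(("neurology", "Alzheimer"), ("clinic", "AD")))
```

Qualifying every term by its source keeps the heterogeneity intact, while the explicit alignment set carries only the interrelations needed for a given integration task.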

Integrated KE Teams & Process: domain experts and semantic technologists

Projects must be structured so that domain experts are active participants in building semantic models, from use cases through conceptualization to validating final products

Use consistent strategies & methods

Facilitate good documentation

May need regular educational workshops on how to do this, and also on how to publish, retrieve and integrate data, models and workflows

Practical model for the design & execution of translational informatics projects

From Philip Payne's Biomedical Knowledge Integration, Dec 2012

[Figure: illustrates major phases and exemplary input or output resources and data sets, with the annotation “Insert Semantics”]

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e. base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g. RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models to help access data, domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information
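Tracking categorical data back to measurements can be pictured with a small sketch in which every derived category keeps a link to the measurement and rule that produced it. The class names, the patient identifier and the threshold are assumptions chosen for illustration only. The threshold is not clinical guidance, and the talk does not prescribe this data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A raw observation, the grounding for any derived category."""
    patient: str
    test: str
    value: float
    unit: str


@dataclass(frozen=True)
class Category:
    """A categorical value that remembers where it came from."""
    label: str
    derived_from: Measurement  # provenance link back to the observation
    rule: str                  # how the category was assigned


def categorize_glucose(m: Measurement) -> Category:
    # Hypothetical cut-off used only to make the example concrete.
    threshold = 7.0
    label = "elevated" if m.value >= threshold else "not elevated"
    return Category(label, m, f"value >= {threshold} {m.unit}")


m = Measurement("patient-42", "fasting glucose", 7.8, "mmol/L")
c = categorize_glucose(m)
print(c.label, "<-", c.derived_from.value, c.derived_from.unit, "via", c.rule)
```

Keeping the rule next to the link means a later reader can tell not only which measurement a category came from but also which cut-off was in force when it was assigned.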

Provide Reasoning Services for Products Developed by our Methods

Behind the scenes: classical and non-classical reasoning services leveraging resources for organizing and accessing data, models and tools, learning about them, and extracting information

Reasoning services can be used to:

Develop friendly user interfaces

Dialog systems

Scientist assisting/associate services (chains) for discovering data, integrity constraint checking, generation of new knowledge and hypothesis testing
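Integrity constraint checking, one of the services listed above, can be sketched as a simple type check over stated facts. The classes, properties and facts are hypothetical, and this closed-world check is a deliberate simplification: it is not how any particular OWL reasoner behaves.

```python
# Sketch of integrity-constraint checking over typed facts.
# Classes, properties and facts are hypothetical examples.

TYPES = {
    "jim": "Patient",
    "donepezil": "Drug",
    "mmse": "CognitiveTest",
}

# Each property declares the class expected for its subject and its object.
CONSTRAINTS = {
    "prescribed": ("Patient", "Drug"),
    "assessedBy": ("Patient", "CognitiveTest"),
}

FACTS = [
    ("jim", "prescribed", "donepezil"),
    ("jim", "assessedBy", "mmse"),
    ("donepezil", "prescribed", "jim"),  # subject and object swapped
]


def violations(facts, types, constraints):
    """Return the facts whose subject or object has the wrong declared type."""
    bad = []
    for subject, prop, obj in facts:
        expected_subject, expected_object = constraints[prop]
        if types.get(subject) != expected_subject or types.get(obj) != expected_object:
            bad.append((subject, prop, obj))
    return bad


errors = violations(FACTS, TYPES, CONSTRAINTS)
print(errors)
```

A check like this can run behind the scenes when data is loaded, so that a clinician or researcher sees only the flagged records rather than the logic that found them.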

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards and other deliverables

Partners: brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project funded under the 7th Framework Program

In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Cites in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view: need diversity, and many types of it, and often rely on “weak” interactions

SO … partnerships, coordination networks, work dependencies, etc.

Super Nodes: people, CI, roles. Roles, responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA: networking global data initiatives in the era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

[Figure labels: People & Orgs; CI & Data; Roles; Sociological; Technical]

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts (Manfred D. Laubichler)

Sociological: people and groups of people

Technical: semantics

[Figure label: Semantic community]

Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open, transparent and inclusive manner

Make adoption easier for clinicians, researchers (and educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand

Page 16: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

16

Communicating an Understandable Value Proposition

What Uncover hidden heterogeneities amp make them explicit

This affords incompatibility discovery prevent users from mixing apples and oranges

How Promote common vocabularies for annotating and describing data using terms formalized ontologies

Leverage vast number of available repositories ontologies methods standards and tools that support scientists in publishing sharing and discovering data

Value gt expected from annotation using simple metadata

But the community needs to understand the semantic technologies

vision-infrastructure-value in a non-technical language

After

Patrick Maueacute Roth Marcell

(2012) Lost in Translation ndash

Mediating between

distributed environmental

resources 6th

International Congress on

Environmental Modelling

and Software (iEMSs) 2012

Leipzig

Translational Medicine

Objectsconcepts

TM Ontology

Semantics and Ontologies For EarthCube

17

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange

integration and reuse of heterogeneous Big Data

Linked Data Argument ndash

Linked Data is an easily adoptable

and ready-to-use paradigm that

enables data integration and

interoperation by opening up data

silos

- Part of a knowledge infrastructure

Digital Medicine

Semantics and Ontologies For EarthCube 18

Seven (or so) Guiding Principles for

Facilitating Implementation and Application

Methods

1 Driven by concrete TM use cases needs

2 Use lightweight (semantic) approaches

3 Foster semantic interoperability without restricting extant semantic heterogeneity

4 Employ bottom-up and top-down semantics approaches

5 Involve amp enable domain experts assisted by ontology engineers

6 Continue work with S amp O to build a formal body of knowledge in the various health domains involved in TM

Technology

7 Employ classical and non-classical reasoning services

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of

Knowledge

Apply ontological engineeringKE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives ie base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance

(eg RDF in context)

Work to make ontologies first class citizens usable by statistic methods

After construction phase organize building blocks amp ontological models

To help access data domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information

Semantics and Ontologies Translational Medicine 24

Provide Reasoning Services for Products

Developed by our Methods Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data

models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assistingassociate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing

Research Data Alliance Vision Purpose Partners

Vision Researchers around the world share amp use research data without

barriers

Purpose accelerate international data-driven innovation and discovery by

facilitating research data sharing and exchange

use and re-use

standards harmonization and

discoverability

This will be achieved through the development and adoption of infrastructure policy practice standards and other deliverables

Partners brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government through the Australian National Data Service supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission through the iCordi project funded under the 7th Framework Program

In US through the RDAUS activity funded by the National Science Foundation

httprd-allianceorgorgindexhtml

RDA Cites Big Cites in HC as Opportunity

Why RDA For Coordinated Action

Bob Chen RDA is intersection of internet culture and science culture

EcosystemsSocio-Techical network view

need diversity and many types of it and often rely on bdquoweak‟ interactions

SO hellip partnerships coordination networks work dependencies etc

Super Nodes ndash people CI roles Roles responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA networking global data initiatives in era of

Open Science The European

Commission the US

and Australia have

formally launched a

collaboration called

the Research Data

Alliance in

Gothenburg Sweden

People

amp Orgs

CIamp Data

Roles

Sociological

Technical

We need more than Ontologies - Joint Aspects of

Socio-technical system(s)

Innovation is driven by several things

1 Changing relationships between scientists institutions

organisms methods and technologies

2 Changing topology of research literature dominant

topics questions problem areas research fronts

3 Changing relationships between concepts Manfred D Laubichler

Sociological

ndash people and

groups of

people

Technical

ndash

Semantics Semantic

community

Semantics and Ontologies For Translational Medicine 30

Closing Remarks and Comments

While many details need to be added these should come from continued dialog such as afforded by VoCamps amp TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open transparent and inclusive manner

Make adoption easier for clinicians researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance Interworkability of data and information

All of these of course depends on our tools and techniques scaling up to the magnitude of the problems at hand

Page 17: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Semantics and Ontologies For EarthCube

17

Next Generation TM Vision

Support distributed and interdisciplinary knowledge infrastructures to handle exchange

integration and reuse of heterogeneous Big Data

Linked Data Argument ndash

Linked Data is an easily adoptable

and ready-to-use paradigm that

enables data integration and

interoperation by opening up data

silos

- Part of a knowledge infrastructure

Digital Medicine

Semantics and Ontologies For EarthCube 18

Seven (or so) Guiding Principles for

Facilitating Implementation and Application

Methods

1 Driven by concrete TM use cases needs

2 Use lightweight (semantic) approaches

3 Foster semantic interoperability without restricting extant semantic heterogeneity

4 Employ bottom-up and top-down semantics approaches

5 Involve amp enable domain experts assisted by ontology engineers

6 Continue work with S amp O to build a formal body of knowledge in the various health domains involved in TM

Technology

7 Employ classical and non-classical reasoning services

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of

Knowledge

Apply ontological engineeringKE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives ie base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance

(eg RDF in context)

Work to make ontologies first class citizens usable by statistic methods

After construction phase organize building blocks amp ontological models

To help access data domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information

Semantics and Ontologies Translational Medicine 24

Provide Reasoning Services for Products

Developed by our Methods Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data

models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assistingassociate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing

Research Data Alliance Vision Purpose Partners

Vision Researchers around the world share amp use research data without

barriers

Purpose accelerate international data-driven innovation and discovery by

facilitating research data sharing and exchange

use and re-use

standards harmonization and

discoverability

This will be achieved through the development and adoption of infrastructure policy practice standards and other deliverables

Partners brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government through the Australian National Data Service supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission through the iCordi project funded under the 7th Framework Program

In US through the RDAUS activity funded by the National Science Foundation

httprd-allianceorgorgindexhtml

RDA Cites Big Cites in HC as Opportunity

Why RDA For Coordinated Action

Bob Chen RDA is intersection of internet culture and science culture

EcosystemsSocio-Techical network view

need diversity and many types of it and often rely on bdquoweak‟ interactions

SO hellip partnerships coordination networks work dependencies etc

Super Nodes ndash people CI roles Roles responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA networking global data initiatives in era of

Open Science The European

Commission the US

and Australia have

formally launched a

collaboration called

the Research Data

Alliance in

Gothenburg Sweden

People

amp Orgs

CIamp Data

Roles

Sociological

Technical

We need more than Ontologies - Joint Aspects of

Socio-technical system(s)

Innovation is driven by several things

1 Changing relationships between scientists institutions

organisms methods and technologies

2 Changing topology of research literature dominant

topics questions problem areas research fronts

3 Changing relationships between concepts Manfred D Laubichler

Sociological

ndash people and

groups of

people

Technical

ndash

Semantics Semantic

community

Semantics and Ontologies For Translational Medicine 30

Closing Remarks and Comments

While many details need to be added these should come from continued dialog such as afforded by VoCamps amp TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open transparent and inclusive manner

Make adoption easier for clinicians researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance Interworkability of data and information

All of these of course depends on our tools and techniques scaling up to the magnitude of the problems at hand

Page 18: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Semantics and Ontologies For EarthCube 18

Seven (or so) Guiding Principles for

Facilitating Implementation and Application

Methods

1 Driven by concrete TM use cases needs

2 Use lightweight (semantic) approaches

3 Foster semantic interoperability without restricting extant semantic heterogeneity

4 Employ bottom-up and top-down semantics approaches

5 Involve amp enable domain experts assisted by ontology engineers

6 Continue work with S amp O to build a formal body of knowledge in the various health domains involved in TM

Technology

7 Employ classical and non-classical reasoning services

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of

Knowledge

Apply ontological engineeringKE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives ie base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance

(eg RDF in context)

Work to make ontologies first class citizens usable by statistic methods

After construction phase organize building blocks amp ontological models

To help access data domain models and their use in tools

This can also be used for educational applications for learning about domain concepts and extracting information

Semantics and Ontologies Translational Medicine 24

Provide Reasoning Services for Products

Developed by our Methods Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data

models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assistingassociate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing

Research Data Alliance Vision Purpose Partners

Vision Researchers around the world share amp use research data without

barriers

Purpose accelerate international data-driven innovation and discovery by

facilitating research data sharing and exchange

use and re-use

standards harmonization and

discoverability

This will be achieved through the development and adoption of infrastructure policy practice standards and other deliverables

Partners brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government through the Australian National Data Service supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission through the iCordi project funded under the 7th Framework Program

In US through the RDAUS activity funded by the National Science Foundation

httprd-allianceorgorgindexhtml

RDA Cites Big Cites in HC as Opportunity

Why RDA For Coordinated Action

Bob Chen RDA is intersection of internet culture and science culture

EcosystemsSocio-Techical network view

need diversity and many types of it and often rely on bdquoweak‟ interactions

SO hellip partnerships coordination networks work dependencies etc

Super Nodes ndash people CI roles Roles responsibilities and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA networking global data initiatives in era of

Open Science The European

Commission the US

and Australia have

formally launched a

collaboration called

the Research Data

Alliance in

Gothenburg Sweden

People

amp Orgs

CIamp Data

Roles

Sociological

Technical

We need more than Ontologies - Joint Aspects of

Socio-technical system(s)

Innovation is driven by several things

1 Changing relationships between scientists institutions

organisms methods and technologies

2 Changing topology of research literature dominant

topics questions problem areas research fronts

3 Changing relationships between concepts Manfred D Laubichler

Sociological

ndash people and

groups of

people

Technical

ndash

Semantics Semantic

community

Semantics and Ontologies For Translational Medicine 30

Closing Remarks and Comments

While many details need to be added these should come from continued dialog such as afforded by VoCamps amp TM domain conferences

Goals going forward

Converge on and integrate more (BIG) TM data in an open transparent and inclusive manner

Make adoption easier for clinicians researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance Interworkability of data and information

All of these of course depends on our tools and techniques scaling up to the magnitude of the problems at hand

Page 19: Pragmatic Big Data View for Translational Medicinescimaps.org/exhibit/images/130325/intro-semantics-berg-cross.pdf · Web & Linked Data framework LOD delivering a platform agnostic

Semantics and Ontologies For EarthCube 19

Understand Requirements

Concrete Use Cases

Work should be driven by use cases generated by members of the TM community ndash eg Alzheimer

Collecting a set of use cases from groups along the TM spectrum

Need a substantial study of interconnected use cases which expose requirements related to data models and tools

which have clear implications for data interoperability ontology and semantics infrastructure

Semantics and Ontologies For EarthCube 20

Foster Semantic Interoperability without restricting

underlying Semantic Heterogeneity

Problem Heterogeneity is introduced by the diverse clinical amp research communities

Solution Provide methods that enable users to flexibly load and combine different ontologies instead of hardwiring data to particular ontologies and thus hinder their flexible reusability

Example - Work from modular building blocks with microtheories of locally valid semantics

Manage multiple small internally consistent ontologies and focus on interrelations as needed for inter-operation

S Duce amp K Janowicz

ldquoMicrotheories for SDIrdquo

2010

Semantics and Ontologies For Translational Medicine 21

Integrated KE Teams amp Process- domain

experts and semantic technologists

Projects must be structured so domain experts are active participants in building semantic models from use cases thru conceptualization to validating final products

Use

Consistent strategies amp methods

Facilitate good documentation and

May need regular Educational Workshops on how to do this and also publish retrieve and integrate data models and workflows

Practical model for the design amp execution of

translational informatics projects

From Philip Payne‟s Biomedical Knowledge Integration Dec 2012

Illustrates major phases

exemplary input or output

resources and data sets

Insert Semantics

Semantics and Ontologies For Translational Medicine

23

Methods for Useful Formalized Bodies of Knowledge

Apply ontological engineering/KE to capture the body of knowledge for various TM domains

Conceptualization of local models

Work on primitives, i.e. base symbols for such ontologies

Ground primitives in real observations and align them to knowledge patterns

Track categorical data back to measurements using provenance (e.g. RDF in context)

Work to make ontologies first-class citizens usable by statistical methods

After the construction phase, organize building blocks & ontological models

To help access data, domain models, and their use in tools

This can also be used for educational applications, for learning about domain concepts and extracting information
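Tracking a categorical value back to the measurements behind it might look like the sketch below. The slide suggests RDF in context; this example uses plain Python records instead, purely to show the linkage. The class names, the `elevated` category, and the threshold are hypothetical illustrations, not clinical criteria.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Measurement:
    """A raw observation; identifiers and units are made up for illustration."""
    measurement_id: str
    quantity: str
    value: float
    unit: str


@dataclass(frozen=True)
class CategoricalFact:
    """A category assigned to a subject, with links back to its evidence."""
    subject: str
    category: str
    derived_from: Tuple[Measurement, ...]  # provenance: the supporting measurements
    rule: str                              # provenance: how the category was derived


def categorize(subject: str, readings: Tuple[Measurement, ...],
               threshold: float) -> CategoricalFact:
    """Label the subject 'elevated' when the mean reading reaches the threshold."""
    if not readings:
        raise ValueError("at least one measurement is required")
    mean = sum(m.value for m in readings) / len(readings)
    category = "elevated" if mean >= threshold else "not elevated"
    return CategoricalFact(subject, category, readings, rule=f"mean >= {threshold}")


def trace(fact: CategoricalFact) -> List[str]:
    """Return the identifiers of the measurements a category was derived from."""
    return [m.measurement_id for m in fact.derived_from]
```

Because each categorical fact carries its source measurements and the rule that produced it, a statistical method can fall back to the underlying numbers rather than treating the label as opaque.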


Provide Reasoning Services for Products Developed by our Methods

Behind the scenes - classical and non-classical reasoning services leveraging resources for

organizing and accessing data, models and tools

learning about them and

extracting information

Reasoning services can be used to

Develop friendly user interfaces

Dialog systems

Scientist assisting/associate services (chains) for

discovering data

integrity constraint checking

generation of new knowledge and hypothesis testing
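As one concrete reading of integrity constraint checking, a service could run named constraints over records and report which ones fail. The sketch below is a minimal illustration, not a reasoner: the record fields (`patient_id`, `value`) and the two constraints are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Constraint = Tuple[str, Callable[[Record], bool]]

# Hypothetical constraints: each pairs a readable name with a predicate.
CONSTRAINTS: List[Constraint] = [
    ("encounter has a patient", lambda r: bool(r.get("patient_id"))),
    ("observation value is numeric", lambda r: isinstance(r.get("value"), (int, float))),
]


def check(records: List[Record], constraints: List[Constraint]) -> List[Tuple[int, str]]:
    """Return (record index, constraint name) for every violated constraint."""
    violations: List[Tuple[int, str]] = []
    for index, record in enumerate(records):
        for name, holds in constraints:
            if not holds(record):
                violations.append((index, name))
    return violations
```

Keeping constraints as data rather than hard-coded checks means domain experts can review and extend the list without touching the checking loop.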

Research Data Alliance: Vision, Purpose, Partners

Vision: Researchers around the world share & use research data without barriers

Purpose: accelerate international data-driven innovation and discovery by facilitating research data sharing and exchange, use and re-use, standards harmonization, and discoverability

This will be achieved through the development and adoption of infrastructure, policy, practice, standards, and other deliverables

Partners: brought into existence by an initial 3 research funding organizations

The Australian Commonwealth Government, through the Australian National Data Service, supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative

The European Commission, through the iCordi project funded under the 7th Framework Program

In the US, through the RDA/US activity funded by the National Science Foundation

http://rd-alliance.org/org/index.html

RDA Cites Big Data in HC as Opportunity

Why RDA? For Coordinated Action

Bob Chen: RDA is the intersection of internet culture and science culture

Ecosystems/Socio-Technical network view

need diversity, and many types of it, and often rely on 'weak' interactions

SO ... partnerships, coordination networks, work dependencies, etc.

Super Nodes - people, CI roles. Roles, responsibilities, and resulting delegation to the smaller nodes around the super nodes (Peter Fox)

RDA networking global data initiatives in era of Open Science

The European Commission, the US and Australia have formally launched a collaboration called the Research Data Alliance in Gothenburg, Sweden

Figure labels: People & Orgs; CI & Data; Roles; Sociological; Technical

We need more than Ontologies - Joint Aspects of Socio-technical system(s)

Innovation is driven by several things:

1. Changing relationships between scientists, institutions, organisms, methods and technologies

2. Changing topology of research literature: dominant topics, questions, problem areas, research fronts

3. Changing relationships between concepts

(Manfred D. Laubichler)

Sociological - people and groups of people

Technical - Semantics

Semantic community


Closing Remarks and Comments

While many details need to be added, these should come from continued dialog such as that afforded by VoCamps & TM domain conferences

Goals going forward:

Converge on and integrate more (BIG) TM data in an open, transparent, and inclusive manner

Make adoption easier for clinicians, researchers (educators)

Expose data and information to knowledge creation through data-enabled science

Enhance interworkability of data and information

All of these, of course, depend on our tools and techniques scaling up to the magnitude of the problems at hand







