OBO Foundry Principles, BFO, RO (Barry Smith)

Post on 19-Jan-2018


OBO Foundry Principles, BFO, RO

Barry Smith

1

OBO Foundry Principles

open

common formal language (OBO Format, OWL DL, CL)

commitment to collaboration

maintenance in light of scientific advance

unique identifier space (Alan)

naming conventions (Susanna / EBI) – metadata for changes

versioning

2

OBO Foundry Principles

common architecture (= RO + BFO)

clearly delineated content (redundant – overlaps with orthogonality)

the ontology is well-documented (– overlaps with rules for definitions; needs expanding, for developers, for users, minimal metadata)

plurality of independent users

single locus of authority, trackers, help desk

3

OBO Foundry Principles

textual definitions plus formal definitions

all definitions should be of the genus-species form, utilizing cross-products

therefore: single is_a inheritance (= each ontology should be conceived as consisting of a core of asserted single inheritance with further is_a relations inferred)

4
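The single-inheritance rule on the slide above can be sketched in code. This is a minimal illustration, not Foundry tooling: the `Term` class, the example terms, and the `part_of` / `has_part` differentiae are all hypothetical. Each term asserts exactly one is_a parent (its genus) plus differentiae (the cross-product part); further is_a links are inferred when another term's full definition is satisfied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    name: str
    genus: Optional[str]                  # the single asserted is_a parent; None for the root
    differentia: frozenset = frozenset()  # (relation, filler) pairs, the cross-product part


def asserted_ancestors(terms, name):
    """Follow the asserted single-inheritance chain up to the root."""
    chain = []
    parent = terms[name].genus
    while parent is not None:
        chain.append(parent)
        parent = terms[parent].genus
    return chain


def effective_differentia(terms, name):
    """Differentiae of a term together with those inherited from its ancestors."""
    acc = set(terms[name].differentia)
    for ancestor in asserted_ancestors(terms, name):
        acc |= terms[ancestor].differentia
    return acc


def inferred_is_a(terms):
    """Return (child, parent) links that follow from the definitions but are not asserted."""
    links = set()
    for a in terms:
        up = asserted_ancestors(terms, a)
        eff_a = effective_differentia(terms, a)
        for b, term_b in terms.items():
            # Skip the term itself, asserted ancestors, and terms without a formal definition.
            if b == a or b in up or not term_b.differentia:
                continue
            if term_b.genus in up and effective_differentia(terms, b) <= eff_a:
                links.add((a, b))
    return links


# Hypothetical example terms, for illustration only.
TERMS = {t.name: t for t in [
    Term("cell", None),
    Term("blood cell", "cell", frozenset({("part_of", "blood")})),
    Term("nucleate cell", "cell", frozenset({("has_part", "nucleus")})),
    Term("nucleate blood cell", "blood cell", frozenset({("has_part", "nucleus")})),
]}

print(inferred_is_a(TERMS))
```

Because `genus` is a single field, the asserted hierarchy is single-inheritance by construction. The second parent of "nucleate blood cell" ("nucleate cell") is never asserted; it is computed from the definitions.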

Orthogonality

• For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative

• Compare what happens in other parts of science: for each domain, there should be convergence upon a single theory

Preventing silos on the side of annotated data = preventing forking of the ontologies used for annotation

5

Strategy to ensure orthogonality

• If the Foundry already has an ontology O1 covering a domain D, and an outside group creates a second ontology O2 covering D (or part of D), we need to ask:

– is it in every respect better? (then replace O1 with O2)

– is it in some respects better? (then negotiate an improved synthesis, O3)

ASSUMPTION: ontologies are always comparable

PROBLEM: need better measures of ontology quality

6
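The decision rule on the slide above can be written as a small function. This is only a sketch under strong assumptions: the quality measures and their scores are hypothetical (the slide itself notes that better measures are needed), higher is assumed to mean better, and the "keep O1" outcome is an assumption for the case the slide does not cover.

```python
def decide(o1_scores, o2_scores):
    """Compare two ontologies covering the same domain on shared quality measures.

    Both arguments map a (hypothetical) measure name to a score, higher is better.
    """
    if o1_scores.keys() != o2_scores.keys():
        raise ValueError("ontologies must be scored on the same measures")
    better = [m for m in o1_scores if o2_scores[m] > o1_scores[m]]
    if better and len(better) == len(o1_scores):
        return "replace O1 with O2"                    # better in every respect
    if better:
        return "negotiate an improved synthesis, O3"   # better in some respects
    return "keep O1"  # assumption: this case is not covered on the slide


# Hypothetical scores, for illustration only.
o1 = {"coverage": 0.6, "definitions": 0.5}
print(decide(o1, {"coverage": 0.8, "definitions": 0.7}))
print(decide(o1, {"coverage": 0.8, "definitions": 0.4}))
```

The function makes the slide's assumption explicit: it only works if both ontologies can be scored on the same measures.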

Benefits of orthogonality

• Offers a solution to the problem of silos that is

– modular

– incremental

– empirically based

– incorporates a strategy for motivating potential developers and users

7

Orthogonality = non-redundancy for the reference ontologies inside the Foundry

• CARO-Mammal will not be orthogonal to CARO

• IDO-Malaria will not be orthogonal to IDO

• IDO will not be orthogonal to DO

• DO will be orthogonal to CL

8

Absolute redundancy for application ontologies

= all terms in application ontologies should be taken from orthogonal reference ontologies within the Foundry

9
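One way to check the rule on the slide above is to test that every term in an application ontology carries the identifier prefix of a reference ontology. The sketch below assumes prefixed identifiers of the form `PREFIX:number`; the registry of prefixes and the term IDs are hypothetical placeholders, not a statement about actual Foundry membership.

```python
# Hypothetical registry of reference ontology prefixes.
REFERENCE_PREFIXES = {"GO", "CL", "PATO", "CHEBI"}


def foreign_terms(term_ids, reference_prefixes=REFERENCE_PREFIXES):
    """Return the term IDs whose prefix is not a registered reference ontology."""
    return sorted(t for t in term_ids if t.split(":", 1)[0] not in reference_prefixes)


# Illustrative IDs only; "APP" stands for a term minted locally by the application ontology.
app_ontology = ["GO:0000001", "CL:0000002", "PATO:0000003", "APP:0000001"]
print(foreign_terms(app_ontology))
```

An empty result would mean the application ontology is fully redundant with the reference ontologies in the sense of the slide; any ID returned is a candidate for contributing back to the relevant reference ontology.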

Benefits of orthogonality

• Modularity brings the benefits of division of labor and division of authority, and minimizes redundancy

10

Benefits of orthogonality

• Scientists become motivated to commit themselves to developing an ontology falling within their domain of expertise because they themselves will need to use this ontology in their own work in the future.

• Forking would erode this motivation

11

Benefits of orthogonality

• Incrementality means that the strategy will still work even if ontologies are still only partial

• this allows adoption and application at early stages

12

Benefits of orthogonality

• Empirically based means that we can always go back and start again if some ontology module does not work (compare the problem of non-modular approaches like SNOMED CT, where it is all or nothing)

13

Benefits of orthogonality

• Modularity brings ownership, which motivates scientist-developers to commit themselves long term to developing the ontology

• This in turn motivates users to commit themselves to adoption

– they see strong positive network effects from use of the ontology

– they gain reassurance from long-term commitment

14

Benefits of orthogonality

• It helps those new to ontology who need to know where to look in finding an ontology relating to their subject-matter

• it obviates the need for ‘mappings’ between ontologies, which are

– difficult to create and use

– error-prone

– hard to keep up-to-date when mapped ontologies change

15

Benefits of orthogonality

• modularity (orthogonality) ensures the mutual consistency of ontologies, and thereby also the additivity of the annotations created with their aid by different groups of annotators describing common bodies of data.

• thereby contributes to the cumulativity of science and allows new forms of unmanaged collaboration.

16

Benefits of orthogonality

• brings grave responsibilities to those in charge of ensuring for each domain that the Foundry includes an ontology for that domain

• they must commit to perpetual striving for scientific accuracy and domain-completeness in their work

• orthogonality rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes

17

Benefits of orthogonality

• it supports the strategy of utilizing cross-products in composing terms and definitions

• this strategy will work only if we can

– minimize the degree of arbitrariness involved in selecting the terms to be composed

– and thereby maximize the degree to which the Foundry ontologies are networked together through the cross-product links

18

Misunderstandings of Orthogonality

• Orthogonality does not mean that all ontologies must be developed within the Foundry framework

• We welcome the development of competing approaches to open-access ontology development – which can only make the Foundry stronger

19

Problems with Orthogonality

• what if researchers need purpose-built ontologies to meet their own specific needs?

• OBO Foundry provides orthogonal reference ontologies, so that they can as far as possible build their application ontologies using terms composed as cross-products

• thereby avoiding silos

• and contributing new terms back to the Foundry in case of need

20

Problems with Orthogonality

• For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative

Q: WHAT DOES ORTHOGONALITY MEAN?

minimally: two ontologies are not orthogonal if they share a single term with the same meaning

Q: WHAT DOES DOMAIN MEAN?

21
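The minimal criterion on the slide above (two ontologies are not orthogonal if they share a single term with the same meaning) can be approximated in code. Deciding sameness of meaning is the hard part; this sketch assumes, purely for illustration, that two terms mean the same thing when their labels match after normalizing case and whitespace. The ontologies and term IDs are hypothetical.

```python
def normalize(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def shared_meanings(onto_a, onto_b):
    """Each ontology maps term ID -> label; return the normalized labels found in both."""
    labels_a = {normalize(label) for label in onto_a.values()}
    labels_b = {normalize(label) for label in onto_b.values()}
    return labels_a & labels_b


def are_orthogonal(onto_a, onto_b):
    """Orthogonal, in this approximation, means no shared label at all."""
    return not shared_meanings(onto_a, onto_b)


# Hypothetical ontologies, for illustration only.
onto_x = {"X:1": "dendritic cell", "X:2": "blood cell"}
onto_y = {"Y:1": "Dendritic  Cell", "Y:2": "infection"}

print(shared_meanings(onto_x, onto_y))
print(are_orthogonal(onto_x, onto_y))
```

Label matching misses synonyms and can be fooled by homonyms, so a real check would have to compare definitions as well.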

22

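Initial OBO Foundry Reference Ontologies (jigsaw)

[Table: rows give GRANULARITY; columns give RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT]

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)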

Homesteading

Recommendation: Ontology developers should register their claim on territory not yet occupied, as soon as possible, because the Foundry is designed to serve as an attractor for collaboration

23

24

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY (organ and organism; cell and cellular component; molecule) and RELATION TO TIME (continuant: independent, dependent; occurrent)]

Orthogonality = Westphalian principles of national sovereignty for reference ontologies: no shared territory

Varieties of application ontology

• cross-border national parks

• Slims

• Fractal ontologies

• Cross-product ontologies

– Template ontologies (CARO, IDO, GDO …)

25

26

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

cross-border national parks: an ontology for studying the effects of viral infection on cell function in shrimp

27

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

Slims = an ontology of dendritic cells

28

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

Slims = an ontology of dendritic cells, with definitions composed using terms from other ontologies

29

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

fractal ontologies, employing small portions of many ontologies (e.g. MSO Multiple Sclerosis Ontology)

30

rationale of OBO Foundry coverage + BFO

[Figure: grid of OBO Foundry reference ontologies. Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows (GRANULARITY): ORGAN AND ORGANISM; CELL AND CELLULAR COMPONENT; MOLECULE. In this version the occurrent column reads Organism-Level Process (GO), Cellular Process (GO), Molecular Process (GO).]

31

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

types plus instances

Fundamental Dichotomy

Continuants (aka endurants)

– have continuous existence in time

– preserve their identity through change

– exist in toto whenever they exist at all

Occurrents (aka processes)

– have temporal parts

– unfold themselves in successive phases

– exist only in their phases

Functions are continuants

Functionings are occurrents

[Figure sequence: successive revisions of the reference ontology grid, one per slide]

Revised grid:

ORGAN AND ORGANISM
– Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– Organ Function (placeholder); Phenotypic Quality (PATO); Disease (DO)
– Biological Process (GO)

CELL AND CELLULAR COMPONENT
– Cell (CL); Cellular Component (FMA, GO)
– Cellular Function (GO)

MOLECULE (ChEBI, SO, RNAO, PRO)
– Molecular Function (GO)
– Molecular Process (GO)

Variations shown on the successive slides:

• Biomedical Investigations (OBI) shown alongside the grid

• Organism (NCBI Taxonomy / placeholder)

• Cellular Pathology ???? added at the cell level

• Cellular Function ???? (GO???)

• MOLECULE split into Small Molecule (ChEBI), 1-D Sequence (SO), and 2- and 3-D Structure (RNAO) (PRO)

• Molecular Process (GO) ?????, with Molecular Pathway added

• Phenotypic Quality of Molecule ???? added, with Reactome shown where the previous slide had Molecular Pathway

Orthogonality can be preserved by expanding the territory (land reclamation)

42

43

[Figure: grid of OBO Foundry reference ontologies, arranged by GRANULARITY and RELATION TO TIME]

GO already started to deal with biological processes involving multiple organisms

44

[Figure: grid of OBO Foundry reference ontologies, with Family, Community, Deme, Population added above Organism (NCBI Taxonomy). http://obofoundry.org]

45

[Figure: grid of OBO Foundry reference ontologies extended with a new top row, COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Population Phenotype; Population Process. http://obofoundry.org]

46

[Figure: the same extended grid, with ENVIRONMENT added alongside. http://obofoundry.org]

47

[Figure: ENVIRONMENT grid, independent continuants only, by GRANULARITY. http://obofoundry.org]

COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Environment of population

ORGAN AND ORGANISM: Organism (NCBI Taxonomy), (FMA, CARO); Environment of single organism*

CELL AND CELLULAR COMPONENT: Cell (CL), Cell Component (FMA, GO); Environment of cell

MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular environment

48

[Figure: the same ENVIRONMENT grid, with a footnote on ‘Environment of single organism’]

* The sum total of the conditions and elements that make up the surroundings and influence the development and actions of an individual.

49

[Figure: ENVIRONMENT grid with examples of environments at each level of GRANULARITY. http://obofoundry.org]

COMPLEX OF ORGANISMS: biome / biotope, territory, habitat, neighborhood, ...

ORGAN AND ORGANISM: work environment, home environment; host/symbiont environment; ...

CELL AND CELLULAR COMPONENT: extracellular matrix; chemokine gradient; ...

MOLECULE: hydrophobic surface; virus localized to cellular substructure; active site on protein; pharmacophore ...

50

Template ontologies (CARO, IDO, CL?) X Organism Taxonomy

[Figure: grid of OBO Foundry reference ontologies, arranged by granularity and relation to time (continuant: independent, dependent; occurrent)]

51

The case of IDO

Human Disease Ontology

– unitary hierarchy with root node: human disease

– refers only to dependent realizable continuants

Infectious Disease Ontology

– draws terms from all BFO categories

– template exists in many copies: specializing to different hosts, pathogens, vectors, etc.

We have data

TBDB: Tuberculosis Database, including Microarray data

VFDB: Virulence Factor DB

TropNetEurop Dengue Case Data

ISD: Influenza Sequence Database at LANL

PathPort: Pathogen Portal Project

...

53

We need common controlled vocabularies to describe these data in ways that will assure comparability and cumulation

What content is needed to adequately cover the infectious domain?

– Host-related terms (e.g. carrier, susceptibility)

– Pathogen-related terms (e.g. virulence)

– Vector-related terms (e.g. reservoir, ...)

– Terms for the biology of disease pathogenesis (e.g. evasion of host defense)

– Population-level terms (e.g. epidemic, endemic, pandemic, ...)

54

We need to annotate this data

to allow retrieval and integration of

– sequence and protein data for pathogens

– case report data for patients

– clinical trial data for drugs, vaccines

– epidemiological data for surveillance, prevention

– ...

Goal: to make data deriving from different sources comparable and computable

55

IDO needs to work with

Disease Ontology (DO) + SNOMED CT

Gene Ontology Immunology Branch

Phenotypic Quality Ontology (PATO)

Protein Ontology (PRO)

Sequence Ontology (SO)

...

56

IDO provides a common template

IDO works like CARO.

It contains terms (like ‘pathogen’, ‘vector’, ‘host’) which apply to organisms of all species involved in infectious disease and its transmission

Disease- and organism-specific ontologies are then built as specifications of the IDO core

57

Proposed additions to list of OBO Foundry Principles

• INSTANTIABILITY: Terms in an ontology should correspond to instances in reality

Even disposition terms correspond to instances in reality

There are no absent nipples

There are no cancelled studies

Proposed additions to list of OBO Foundry Principles

INSTANTIABILITY: Terms in an ontology should represent types all of which have instances in reality

types = what are described in textbooks

instances = (roughly) what are described in data

59
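The instantiability principle above suggests a simple check against data: every type named in the ontology should have at least one instance. The sketch below assumes instance data arrives as `(instance_id, type_name)` pairs; the identifiers are hypothetical, and subtype reasoning is left out for brevity.

```python
def uninstantiated(types, instances):
    """Return the types that no instance in the data is recorded under.

    types: iterable of type names from the ontology.
    instances: iterable of (instance_id, type_name) pairs from the data.
    """
    seen = {type_name for _, type_name in instances}
    return sorted(t for t in types if t not in seen)


# Hypothetical terms and data, echoing the examples on the slide.
ontology_types = ["nipple", "absent nipple", "study", "cancelled study"]
data = [("patient-17-left-nipple", "nipple"), ("trial-042", "study")]

print(uninstantiated(ontology_types, data))
```

A type that turns up here is not necessarily illegitimate, since the data may simply be incomplete, but terms such as "absent nipple" can never be cleared by collecting more data.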

Proposed additions to list of OBO Foundry Principles

Ontologies consist of representations of types in reality – therefore, their terms should consist entirely of singular nouns (preferred terms blah blah)

Ontologies should use singular nouns and noun phrases belonging to ordinary English as extended by technical terms already established in the relevant discipline – they should not use phrases like ‘EV-EXP-IGI’, no lab slang, no ellipses

60
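Part of the naming rule above can be checked mechanically. The heuristic below is an assumption of this sketch, not a Foundry rule: it flags terms made up only of upper-case code segments joined by hyphens (like ‘EV-EXP-IGI’) and terms containing an ellipsis. It does not attempt to verify that a term is a singular noun phrase, which needs linguistic tooling.

```python
import re

# Upper-case/digit segments joined by hyphens, e.g. "EV-EXP-IGI" (heuristic, see lead-in).
CODE_LIKE = re.compile(r"[A-Z0-9]+(?:-[A-Z0-9]+)+")


def naming_issues(term):
    """Return a list of naming problems found in a term label (empty if none)."""
    issues = []
    if CODE_LIKE.fullmatch(term):
        issues.append("looks like a code or lab slang, not an English noun phrase")
    if "..." in term or "\u2026" in term:
        issues.append("contains an ellipsis")
    return issues


for label in ["EV-EXP-IGI", "dendritic cell", "cell ..."]:
    print(label, "->", naming_issues(label))
```

A check like this is cheap enough to run on every release, leaving the singular-noun judgment to specialist review.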

Proposed additions to list of OBO Foundry Principles

EVALUATION

• each ontology should be subject to evaluation (as far as possible quantitative):

• software (conversion OBO format OWL)

• specialist review (OWL natural language)

• when one version is used for a given purpose, later versions should be applied to the same purpose and results compared

61

Proposed additions to list of OBO Foundry Principles

each ontology should be built on the basis of BFO top-level distinctions (common top level):

• continuants vs. occurrents

• independent continuants (molecules, cells, organisms …)

• specifically dependent continuants (qualities, functions, roles …)

• generically dependent continuants (information artifacts, sequences …)

62
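The top-level distinctions listed above can be captured as a small lookup structure. This is a sketch for illustration, not the BFO specification: the enum and helper names are hypothetical, and the example entries are those named on the slides (with "process" taken from the earlier slide on occurrents).

```python
from enum import Enum


class BFOCategory(Enum):
    INDEPENDENT_CONTINUANT = "independent continuant"
    SPECIFICALLY_DEPENDENT_CONTINUANT = "specifically dependent continuant"
    GENERICALLY_DEPENDENT_CONTINUANT = "generically dependent continuant"
    OCCURRENT = "occurrent"


# Examples as given on the slides.
EXAMPLES = {
    "molecule": BFOCategory.INDEPENDENT_CONTINUANT,
    "cell": BFOCategory.INDEPENDENT_CONTINUANT,
    "organism": BFOCategory.INDEPENDENT_CONTINUANT,
    "quality": BFOCategory.SPECIFICALLY_DEPENDENT_CONTINUANT,
    "function": BFOCategory.SPECIFICALLY_DEPENDENT_CONTINUANT,
    "role": BFOCategory.SPECIFICALLY_DEPENDENT_CONTINUANT,
    "information artifact": BFOCategory.GENERICALLY_DEPENDENT_CONTINUANT,
    "sequence": BFOCategory.GENERICALLY_DEPENDENT_CONTINUANT,
    "process": BFOCategory.OCCURRENT,
}


def is_continuant(category):
    """Everything except an occurrent is a continuant in this four-way split."""
    return category is not BFOCategory.OCCURRENT


print(is_continuant(EXAMPLES["function"]))   # functions are continuants
print(is_continuant(EXAMPLES["process"]))    # functionings (processes) are occurrents
```

Placing each root term of a domain ontology under one of these categories is what gives the Foundry ontologies a common top level.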