An overview of the ontology landscape - BioMedBridges


Page 1: An overview of the ontology landscape - BioMedBridges. Robert Stevens, James Malone, robert.stevens@manchester.ac.uk, malone@ebi.ac.uk.

What ontologies exist, who builds them and what are they used for?

An overview of the ontology landscape

Robert Stevens, James Malone

robert.stevens@manchester.ac.uk, malone@ebi.ac.uk

Page 2

Outline

• What do we need to describe?

• What exists to describe it?

• Are they any good?

• Ontology organisations

Page 3

Dimensions of description

• The entities themselves – genes, proteins, processes, cells, properties

• The investigations that produced the entities

• The informational origins and history of those entities and their descriptions (data and provenance)

Page 4

What entities exist to be described

• The actual “concrete” biological entities themselves: proteins, genes, small molecules, cells, gross anatomy, etc.

• The devices used to produce and measure them

• Properties of those entities: size, shape, colour, function, role, etc.

• The biological processes in which those biological entities take part

• The measuring and analytical processes used on those biological entities

• Sites on those biological entities: shoulder region, a bit of the environment, the dorsal region of a mouse, etc.

• Information artefacts about all of the above: sequences, database records, who, what, when, where and how, lab protocols, etc.

Page 5

Dividing things up from the top

Page 6

Dividing things up from the top - process

• Gene Ontology (GO) biological process

• Gene Ontology (GO) molecular process

Page 7

Dividing things up from the top - information

• Information Artifact Ontology (IAO)

• Software Ontology (SWO)

• Unit Ontology (UO)

Page 8

Dividing things up from the top - material

• ChEBI

• Protein Ontology (PrO)

• Sequence Ontology (SO)

• Cell Type Ontology (CL)

• Uberon

• Foundational Model of Anatomy (FMA)

• NCBI Taxonomy

Page 9

Dividing things up from the top - property

• GO Molecular Function

• Phenotypic Quality (PATO)

• Human Disease Ontology (HDO)

Page 10

Dividing things up from the top - site

Gazetteer Ontology (GAZ)

Page 11

We’ve covered most of what there is…

• We’ve chosen bits from a simple upper-level ontology

• These are domain-neutral descriptions of the entities in any domain of interest

• Top-level or upper ontologies give a common view on what discriminations to make…

• … and what relationships to use between them

• BFO, Simple top Bio
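
As a rough illustration of "dividing things up from the top", the sketch below files some of the ontologies named on the preceding slides under the top-level category each slide placed them in. The dictionary layout and the `category_of` helper are assumptions made for this example, not part of any ontology tool, and only a subset of the ontologies is listed.

```python
from typing import Optional

# Top-level categories from the slides, each with some of the ontologies
# the slides list under it. The structure itself is illustrative only.
TOP_LEVEL_CATEGORIES = {
    "process": ["GO biological process"],
    "information": [
        "Information Artifact Ontology",
        "Software Ontology",
        "Unit Ontology",
    ],
    "material": [
        "ChEBI",
        "Protein Ontology",
        "Sequence Ontology",
        "Uberon",
        "Foundational Model of Anatomy",
        "NCBI Taxonomy",
    ],
    "property": ["GO molecular function", "PATO", "Human Disease Ontology"],
    "site": ["Gazetteer Ontology"],
}


def category_of(ontology: str) -> Optional[str]:
    """Return the top-level category an ontology is filed under, if any."""
    for category, members in TOP_LEVEL_CATEGORIES.items():
        if ontology in members:
            return category
    return None


print(category_of("ChEBI"))
print(category_of("Gazetteer Ontology"))
```

A shared top level of this kind is what lets two independently built ontologies agree that, say, a chemical entity and an anatomical structure are both "material" things.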

Page 12

Ontologies in these dimensions

• Here we want a “space” covering these dimensions with ontologies splattered about

• Dimension 1: genotype to phenotype

• Dimension 2: investigations

• Dimension 3: information – IAO, prov, etc.

Page 13

Reference vs Application Ontologies

• Ontologies developed for different uses

• Reference ontologies are built with the aim of becoming the authority on a given domain

• Application ontologies are built for specific application use cases, such as tooling or database needs

• Application ontologies often consume reference ontologies

Page 14

Things we describe in Biology - Genes

• Gene Ontology (GO) – the biological processes, cellular components and molecular functions of genes

• Seen as the benchmark of success in bio-ontology

• Many ‘best practices’ have fallen out of the GO’s development, such as evidence codes, an obsolescence policy and community development
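
To make two of those practices concrete, here is a minimal sketch of how evidence codes and obsolescence flags might be used when consuming annotations. The class names, the gene names and the retired term `GO:9999999` are hypothetical; the two root term IDs and the experimental evidence codes are real GO values. This is not how any particular GO tool represents annotations.

```python
from dataclasses import dataclass

# GO evidence codes that denote experimental support.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Term:
    term_id: str
    name: str
    obsolete: bool = False  # obsolete terms are kept but flagged, not deleted


@dataclass(frozen=True)
class Annotation:
    gene: str           # hypothetical gene identifier
    term: Term
    evidence_code: str  # e.g. "IDA" (direct assay), "IEA" (electronic)


def experimentally_supported(annotations):
    """Keep annotations to live terms that carry experimental evidence."""
    return [
        a for a in annotations
        if not a.term.obsolete and a.evidence_code in EXPERIMENTAL_CODES
    ]


biological_process = Term("GO:0008150", "biological_process")
molecular_function = Term("GO:0003674", "molecular_function")
retired = Term("GO:9999999", "hypothetical retired term", obsolete=True)

annotations = [
    Annotation("geneA", biological_process, "IDA"),
    Annotation("geneA", molecular_function, "IEA"),
    Annotation("geneB", retired, "IMP"),
]

kept = experimentally_supported(annotations)
print([(a.gene, a.term.term_id, a.evidence_code) for a in kept])
```

Because the evidence code travels with each annotation, a consumer can choose its own threshold of trust without the ontology itself changing.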

Page 15

Things we describe in Biology - Phenotypes

• PATO – ‘phenotypic qualities’, i.e. physical properties of organisms

• Extremely wide range of classes, examples include colour, size, shape, odour, behaviour

• Phenotypes are important in understanding how genes interact with the environment (in producing phenotypes)

Image: Matzke MA, Matzke AJM (2004) Planting the Seeds of a New Paradigm. PLoS Biol 2(5): e133

Page 16

Things we describe in Biology - Disease

• Majority of biomedical studies consider disease in some way

• Multiple terminologies for disease in biology

• SNOMED CT – Medical (clinical) terminology

• ICD-10 – Classification of disease and health problems

• NCI Thesaurus (not an ontology) – large, with many textual definitions but less axiomatisation; has a disease subpart

• UMLS – a set of controlled vocabularies describing medical concepts; very large, at >1 million biomedical concepts

• Human Disease Ontology – based on subset of UMLS, enriched with relationships and new concepts

Page 17

Things we describe in Biology - Anatomy

• Anatomy is important for many reasons including:

• Understanding how genes relate to anatomical regions

• Understanding how disease affects anatomical systems

• Comparative anatomy, i.e. comparing how structures in different species are related

• Model organism anatomies, e.g.

• Mouse adult gross anatomy

• Human anatomy – FMA

• Drosophila Anatomy

• Arabidopsis thaliana

• Zebrafish

• C. elegans

Page 18

Genes at work in the anatomy of different species

• DII gene orthologs implicated in the development of different anatomical parts in multiple species

Mungall, C. et al. (2012) Uberon, an integrative multi-species anatomy ontology. Genome Biology 13:R5

Page 19

Things we describe in Biology – Chemical Entities

• ChEBI - molecular entities focused on ‘small’ chemical compounds

• Janna will talk about this tomorrow

Page 20

Things we describe in Biology – Cells

• Cell Ontology is an ontology of cell types

• CL merges information contained in species-specific anatomical ontologies and also references other ontologies, such as:

• the Protein Ontology (PR) for uniquely expressed biomarkers

• Gene Ontology (GO) for the biological processes a cell type participates in.

Page 21

Things we describe in Biology – Pathways

• Reactome is a database of pathways

• Exports to the BioPAX ontology to describe pathway elements

• Connects many biological concepts including nucleic acids, genes, disease and GO terms

Page 22

OBO Foundry

• OBO = Open Biomedical Ontologies

• The OBO Foundry seeks to organise expertly human-curated ontologies in biomedicine

• Provides a set of principles for best practice

• Six OBO Foundry ontologies

• OBO library much bigger and there are many Foundry candidate ontologies

• Intrinsically, biology is interconnected yet many ontologies are not formally linked

• Ontology development is expensive – reducing overlap and improving collaboration would decrease the cost

• Modularity of domains would increase reusability

Page 23

Let 100 flowers bloom vs Centralised collaboration

• 100 flowers bloom:

• Competition driven

• Application and data driven (often to local use cases)

• Requires no commitment to upper ontology framework

• Mapping between efforts can be costly (potentially exponential)

• Duplication of effort

• Centralised collaboration:

• Encourages collaboration and openness

• Aim to produce consensus model of domain knowledge

• Reducing overlap reduces duplicated effort

• Interoperability part of methodology

• Requires upper ontology commitment

• Development by committee can be inhibiting
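
One simple way to see why mapping between independent efforts gets costly is to count the mappings needed. The sketch below assumes one mapping per pair of ontologies in the "100 flowers" case and one alignment per ontology to a shared upper ontology in the centralised case. Both function names and the counting model are assumptions made for illustration; the sketch counts mappings only and says nothing about the effort each mapping takes.

```python
def pairwise_mappings(n: int) -> int:
    """Mappings needed if every pair of n ontologies is mapped directly."""
    return n * (n - 1) // 2


def hub_mappings(n: int) -> int:
    """Alignments needed if each of n ontologies aligns to one shared hub."""
    return n


for n in (5, 10, 50):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

Under this counting model the pairwise total grows with the square of the number of ontologies, while the shared-hub total grows linearly.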