Bio-ontologies · 6/14/2007  · Bio-ontologies The cream in the Semantic Web layer cake NETTAB...

Bio-ontologies: The cream in the Semantic Web layer cake. NETTAB 2007 - A Semantic Web for Bioinformatics. Olivier Bodenreider, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA. University of Pisa, Italy, June 14, 2007


Page 1

Bio-ontologies
The cream in the Semantic Web layer cake

NETTAB 2007 - A Semantic Web for Bioinformatics

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland, USA

University of Pisa, Italy
June 14, 2007

Page 2

Semantic Web pastry

Page 3

Semantic Web layer cake

http://www.axis-of-aevil.net/img/2005_09/tyrnicake1.jpg

Page 4

Semantic Web layer cake

http://www.cookingwithkristina.com/uploaded_images/fudgy-702607.jpg

Page 5

Semantic Web layer cake

Page 6

Semantic Web layer cake

Page 7

Outline

Historical perspective
Modern bio-ontologies
Tools and formalisms
Institutionalization of bio-ontologies
Bio-ontologies and Semantic Web

Page 8

Briefings in Bioinformatics

http://bib.oxfordjournals.org/cgi/reprint/7/3/256?ijkey=1ejwW7ipyG1ASiI&keytype=ref

Page 9

Before we called them bio-ontologies
A brief history of biomedical terminologies

Page 10

Why biomedical terminologies?

To support a theory of diseases
To classify diseases
To support epidemiology
To index and retrieve information
To serve as a reference

Page 11

To support a theory of diseases

Hippocrates
  Dismisses superstition
  Four humors
    Blood
    Phlegm
    Yellow bile
    Black bile

Thomas Sydenham (1624-1689)
  Medical observations on the history and cure of acute diseases (1676)

Page 12

To classify diseases (and plants)

Carolus Linnaeus (1707-1778)
  Genera Plantarum (1737)
  Genera Morborum (1763)

François Boissier de La Croix, a.k.a. F. B. de Sauvages (1706-1767)
  Methodus Foliorum (1751)
  Nosologia Methodica (1763/68)

William Cullen (1710-1790)
  Synopsis Nosologiae Methodicae (1785)

Page 13

From plants…

Page 14

… to diseases

Four categories (W. Cullen)
  Fevers
  Nervous disorders
  Cachexias
  Local diseases

“The distinction of the genera of diseases, the distinction of the species of each, and often even that of the varieties, I hold to be a necessary foundation of every plan of physic, whether dogmatical or empirical.”
– William Cullen, Edinburgh, 1785, Synopsis Nosologiae Methodicae

(Cited by Chris Chute)

Page 15

To support epidemiology

John Graunt (1620-1674)
  Analyzes the vital statistics of the citizens of London

William Farr (1807-1883)
  Medical statistician
  Improves Cullen's classification
  Contributes to creating ICD

Jacques Bertillon (1851-1922)
  Chief of the statistical services (Paris)
  Classification of causes of death (161 rubrics)

Page 16

London Bills of Mortality

Page 17

Limitations of existing classifications

“The advantages of a uniform statistical nomenclature, however imperfect, are so obvious, that it is surprising no attention has been paid to its enforcement in Bills of Mortality. Each disease has, in many instances, been denoted by three or four terms, and each term has been applied to as many different diseases: vague, inconvenient names have been employed, or complications have been registered instead of primary diseases. The nomenclature is of as much importance in this department of inquiry as weights and measures in the physical sciences, and should be settled without delay.”
– William Farr, First annual report. London, Registrar General of England and Wales, 1839, p. 99.

Page 18

To index and retrieve information

Biomedical literature
  MEDLINE (15M citations from 4600 journals)
  Manually indexed
  Medical Subject Headings (MeSH)

Genome
  Model organisms (Fly, Mouse, Yeast, …)
  Manually / semi-automatically annotated
  Gene Ontology

Page 19

MEDLINE and MeSH

Page 20

Mouse Genome Database and GO

Page 21

To serve as a reference

Reference terminology/ontology
  Universally needed
  Developed independently of any specific purpose
  Reusable by many applications

Examples
  RxNorm
  Foundational Model of Anatomy (FMA)
  ChEBI
  SNOMED CT
  LOINC

Page 22

Administrative terminologies

Coding patient records
  International Classification of Primary Care (ICPC)
  SNOMED
  Read Codes

Reporting claims to health insurance companies
  Current Procedural Terminology (CPT)
  International Classification of Diseases (ICD-9-CM)
  Healthcare Common Procedure Coding System (HCPCS)

Page 23

Modern bio-ontologies

Page 24

Biomedical ontologies (and terminologies)

The OBO family
  Ontologies and terminologies
  Gene Ontology
  Mostly biological ontologies

UMLS
  Ontologies and terminologies
  MeSH, SNOMED CT
  Mostly clinical ontologies

Page 25

Open Biological Ontologies (OBO)

Extended family of the Gene Ontology (GO)

Collaborative development
  http://obo.sourceforge.net/

National Center for Biomedical Ontology
  http://bioontology.org/

OBO Foundry
  http://obofoundry.org/
  Promote best practices in ontology development
  10 inclusion criteria

Page 26

Open Biological Ontologies (OBO)

http://obo.sourceforge.net/

Page 27

Integrating subdomains

(Barry Smith)

Page 28

OBO ontologies: Examples

Gene Ontology
Cell types
Sequence Ontology
ChEBI
Foundational Model of Anatomy
PATO – phenotypic qualities
Relationship types
Ontology for Biomedical Investigations

Page 29

UMLS Source Vocabularies

139 source vocabularies
  17 languages

Broad coverage of biomedicine
  5.5M names
  1.4M concepts
  16M relations

Common presentation

(2007AA)

Page 30

Biomedical terminologies in UMLS

General vocabularies
  anatomy (UWDA, Neuronames)
  drugs (RxNorm, First DataBank, Micromedex, …)
  medical devices (UMD, SPN)

Several perspectives
  clinical terms (SNOMED CT)
  information sciences (MeSH, CRISP)
  administrative terminologies (ICD-9-CM, CPT-4)
  data exchange terminologies (HL7, LOINC)

Page 31

Biomedical terminologies in UMLS

Specialized vocabularies
  nursing (NIC, NOC, NANDA, Omaha, PCDS)
  dentistry (CDT)
  oncology (NCI Thesaurus, PDQ)
  psychiatry (DSM, APA)
  adverse reactions (COSTART, WHO ART, MedDRA)
  primary care (ICPC)
  genomics (Gene Ontology, HUGO, OMIM)

Terminology of knowledge bases (AI/Rheum, DXplain, QMR)

Page 32

Integrating subdomains

[Figure: subdomains and the vocabulary representing each, integrated through the UMLS]
Biomedical literature: MeSH
Genome annotations: GO
Model organisms: NCBI Taxonomy
Genetic knowledge bases: OMIM
Clinical repositories: SNOMED
Anatomy: UWDA
Other subdomains

Page 33

Tools and formalisms for bio-ontologies
Three examples

Page 34

Three examples

Foundational Model of Anatomy
  Protégé-frames

Gene Ontology
  OBO-Edit

NCI Thesaurus
  OWL DL

Conversions

Page 35

Foundational Model of Anatomy (FMA)

University of Washington
Canonical anatomy
75,000 anatomical entities
Synonyms
Relationships
  Isa
  Part of (5 subtypes)
  Topological, etc.
Frame-based / Protégé

http://sig.biostr.washington.edu/projects/fm/index.html

http://protege.stanford.edu/

Page 36

Explicit classificatory principle

[Figure: top of the Foundational Model of Anatomy hierarchy, with the property that distinguishes the children at each level]
Anatomical entity
  Physical anatomical entity (spatial dimension +)
    Material physical anatomical entity (mass +)
      Anatomical structure (inherent 3D shape +)
      Body substance (inherent 3D shape -)
    Non-material physical anatomical entity (mass -)
      Anatomical space (3D)
      Anatomical surface (2D)
      Anatomical line (1D)
      Anatomical point (0D)
  Non-physical anatomical entity (spatial dimension -)
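The classificatory principle on this slide amounts to a small decision tree: each level of the hierarchy is split by exactly one property. The sketch below is only an illustration of that idea in Python, based on one reading of the figure (spatial dimension, then mass, then inherent 3D shape or dimensionality). The function name and its parameters are hypothetical and are not part of the FMA or of Protégé.

```python
def classify_anatomical_entity(has_spatial_dimension, has_mass=False,
                               has_inherent_3d_shape=False, dimension=3):
    """Return the top-level class suggested by the figure (illustrative only)."""
    # First split: entities without spatial dimension are non-physical.
    if not has_spatial_dimension:
        return "Non-physical anatomical entity"
    # Second split: material entities have mass.
    if has_mass:
        # Third split for material entities: inherent 3D shape.
        if has_inherent_3d_shape:
            return "Anatomical structure"
        return "Body substance"
    # Non-material physical entities are split by their dimensionality.
    by_dimension = {
        3: "Anatomical space",
        2: "Anatomical surface",
        1: "Anatomical line",
        0: "Anatomical point",
    }
    return by_dimension[dimension]


# An entity with mass and an inherent 3D shape, e.g. an organ.
organ_class = classify_anatomical_entity(True, has_mass=True,
                                         has_inherent_3d_shape=True)
# A massless two-dimensional boundary.
surface_class = classify_anatomical_entity(True, has_mass=False, dimension=2)
```

Because every split depends on a single explicit property, two siblings can never both apply to the same entity, which is the point of making the principle explicit.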

Page 37

FMA: Conversions

OWL DL
  Golbreich et al., JWS 2006

OWL Full
  Noy and Rubin, SMI Tech Report 2007

OBO
  http://obofoundry.org/cgi-bin/detail.cgi?id=fma_lite

Page 38

Gene Ontology

GO Consortium
Annotation of gene products (Molecular functions, Cellular components, Biological processes)
24,000 terms
Synonyms
Isa and part of relations
OBO-Edit / OBO
Also available in RDF and OWL DL

http://oboedit.org/

http://www.geneontology.org/

Page 39

OBO format

Used to represent many ontologies in the OBO family (Open Biological Ontologies)

Essentially a subset of OWL DL

[Term]
id: GO:0019563
name: glycerol catabolism
namespace: biological_process
def: "The chemical reactions and pathways resulting in the breakdown of glycerol …
subset: gosubset_prok
exact_synonym: "glycerol breakdown" []
exact_synonym: "glycerol degradation" []
xref_analog: MetaCyc:PWY0-381
is_a: GO:0006071 ! glycerol metabolism
is_a: GO:0046174 ! polyol catabolism

http://www.godatabase.org/dev/doc/obo_format_spec.html
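To make the tag-value structure of an OBO stanza concrete, here is a minimal Python sketch that reads the stanza shown on this slide into a dictionary. It is not OBO-Edit's parser and handles only the simple "tag: value" lines that appear here; the function name parse_stanza is hypothetical, and the def: line is left out of the sample because it is truncated on the slide.

```python
from collections import defaultdict

# The stanza from the slide, without the truncated def: line.
OBO_STANZA = """\
[Term]
id: GO:0019563
name: glycerol catabolism
namespace: biological_process
subset: gosubset_prok
exact_synonym: "glycerol breakdown" []
exact_synonym: "glycerol degradation" []
xref_analog: MetaCyc:PWY0-381
is_a: GO:0006071 ! glycerol metabolism
is_a: GO:0046174 ! polyol catabolism
"""


def parse_stanza(text):
    """Parse one stanza into (stanza_type, {tag: [values]})."""
    stanza_type = None
    tags = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A bracketed line such as [Term] opens the stanza.
        if line.startswith("[") and line.endswith("]"):
            stanza_type = line[1:-1]
            continue
        tag, _, value = line.partition(": ")
        # Drop the human-readable "! comment" after a referenced identifier.
        value = value.split(" ! ")[0].strip()
        # Tags such as is_a and exact_synonym may repeat, so keep lists.
        tags[tag].append(value)
    return stanza_type, dict(tags)


stanza_type, term = parse_stanza(OBO_STANZA)
parents = term["is_a"]
```

Here parents holds the two is_a targets, which is the information a tool would follow to walk up the GO hierarchy from glycerol catabolism.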

Page 40

NCI Thesaurus

National Cancer Institute
Cancer research
54,000 concepts
150,000 concept names
Relations
  Isa
  Associative (87 relationship types)
OWL DL

http://protege.stanford.edu/

http://nciterms.nci.nih.gov/NCIBrowser/

Page 41

Institutionalization of bio-ontologies

Page 42

Bio-ontologies have become mainstream

[Figure: Number of articles on "ontology/ies" in PubMed/MEDLINE by year, 1996 to 2006 (vertical axis from 0 to 700); series: GO, others]

Page 43

Some institutions: Bio-ontologies

National Center for Biomedical Ontology
  http://bioontology.org/

OBO Foundry
  http://obofoundry.org/
  Promote best practices in ontology development

Other ontology centers
  NCOR – National Center for Ontology Research (US)
  ECOR – European Center for Ontology Research

Page 44

Some institutions: Semantic Web

W3C Health Care and Life Sciences Interest Group
  http://www.w3.org/2001/sw/hcls/
  BioRDF
  BioOnt

Page 45

Bio-ontologies and Semantic Web

Page 46

Use cases for a biomedical SW

Integration
  Data/Information
  E.g., translational research

Hypothesis generation
  Knowledge discovery

Clinical data
  Aggregation, sharing, exchange
  Support for clinical decision

Page 47

Some issues

Format
  RDF/S, OWL, SKOS vs. OBO, RRF, etc.
  Converters

Permanent identification of biomedical entities
  Syntax: URI vs. LSID
  Semantic: Trans-namespace identification

Availability, openness
Governance, trust
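The syntactic side of the identification issue can be illustrated with a short Python sketch that parses an LSID (urn:lsid:authority:namespace:object, with an optional revision) and rewrites it as an HTTP URI. This is a sketch under stated assumptions, not a recommended scheme: the example LSID, the resolver base http://example.org/resolve/, and the function names are all hypothetical.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError("not an LSID: %r" % text)
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


def to_http_uri(lsid, base="http://example.org/resolve/"):
    """Rewrite an LSID as an HTTP URI under a hypothetical resolver base."""
    segments = [lsid.authority, lsid.namespace, lsid.object_id]
    if lsid.revision is not None:
        segments.append(lsid.revision)
    return base + "/".join(segments)


# Hypothetical LSID for a GO term; the authority is made up.
example = parse_lsid("urn:lsid:example.org:go:0019563")
example_uri = to_http_uri(example)
```

The rewrite is purely syntactic. Deciding whether two identifiers from different namespaces denote the same biomedical entity is the semantic issue listed on the slide, and no string manipulation settles it.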

Page 48

Summary

Biomedical information integration is a good use case for the Semantic Web

Semantic Web technologies
  Ontologies

Ontologies
  Identification
  Mapping
  Reasoning

Page 49

Medical Ontology Research

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland, USA

Contact: [email protected]
Web: mor.nlm.nih.gov