Biomedical Informatics Introduction to Ontology Liqin Wang, MS SWE Workshop 2011 Aug 10 th, 2011.

Biomedical Informatics

Introduction to Ontology

Liqin Wang, MS

SWE Workshop 2011

Aug 10th, 2011

Outline
• What is ontology
• Ontology Language
• How to create an ontology
• Ontology Reasoning
• Evaluation of Ontology
• Application of Ontology
• Institutes & People
• Processing Tools & Triple Store
• Resources
• Practice

What is ontology?
• Definition in philosophy
  – A foundational discipline of philosophy, with origins in ancient Greece.
  – Concerned with “existence”

What is ontology? (Cont.)
• Definition in information science
  – A formal, explicit specification of a conceptualization.
  – A model for describing the world
    • Concepts
    • Properties
    • Constraints
    • Individuals
  – Also a domain-specific knowledge base that is machine-interpretable, reusable, and sharable

T. R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199-220, 1993.

With ontology, we are able…
• To share common understanding of the structure of information among people or software agents
• To enable reuse of domain knowledge
• To make domain assumptions explicit
• To analyze domain knowledge
• To separate domain knowledge from the operational knowledge

Natalya F. Noy, Deborah McGuinness. Ontology Development 101: A Guide to Creating Your First Ontology.

Ontology Language
• To facilitate creating, reading, writing, querying, and sharing ontologies, it is necessary to have ontology language(s).

[Figure: Semantic Web Stack, by Tim Berners-Lee]

Ontology Language (Cont.)
• By syntax:
  – RDF/XML
    • Interchange (can be written and read by all conformant OWL 2 software)
  – OWL/XML
    • Easier to process using XML tools
  – Functional Syntax
    • Easier to see the formal structure of ontologies
  – Manchester syntax
    • Easier to read/write DL ontologies
  – Turtle
    • Easier to read/write RDF triples

Go to Protégé and Notepad++

OWL
• OWL (Web Ontology Language)
  – W3C standard
  – OWL is built on top of RDF and is commonly written in XML (RDF/XML)
  – OWL has three sublanguages
    • DL, Lite, Full
  – Hard for people to read
  – Interpretable by computers

Three Variants of OWL
• OWL Full
  – an extension of RDF
  – allows for classes as instances, modification of RDF and OWL vocabularies
• OWL DL
  – the part of OWL Full that fits in the Description Logic framework
  – known to have decidable reasoning
• OWL Lite
  – a subset of OWL DL
  – easier for frame-based tools to transition to
  – easier reasoning

OWL Example

<owl:Class rdf:about="#firstYearCourse">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#isTaughtBy"/>
      <owl:allValuesFrom rdf:resource="#Professor"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
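
As a rough illustration of "interpretable by computers", the restriction above can be read with a generic XML parser. The following is a minimal sketch using only Python's standard library. The enclosing rdf:RDF element with its namespace declarations and the "#" prefixes on the local names are assumptions added so that the fragment parses on its own; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespace URIs for RDF, RDFS and OWL.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# The slide's fragment wrapped in an rdf:RDF root (assumption: the slide
# omits the root element and the namespace declarations).
OWL_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#firstYearCourse">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#isTaughtBy"/>
        <owl:allValuesFrom rdf:resource="#Professor"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def rdf_attr(name):
    """Return the ElementTree key for an attribute in the RDF namespace."""
    return "{%s}%s" % (NS["rdf"], name)


def read_restrictions(xml_text):
    """Yield (class, property, filler) for each allValuesFrom restriction."""
    root = ET.fromstring(xml_text)
    for cls in root.findall("owl:Class", NS):
        for restriction in cls.findall("rdfs:subClassOf/owl:Restriction", NS):
            prop = restriction.find("owl:onProperty", NS)
            filler = restriction.find("owl:allValuesFrom", NS)
            if prop is not None and filler is not None:
                yield (cls.get(rdf_attr("about")),
                       prop.get(rdf_attr("resource")),
                       filler.get(rdf_attr("resource")))


restrictions = list(read_restrictions(OWL_XML))
for cls, prop, filler in restrictions:
    # Printed in a Manchester-like form: class, property, "only", filler.
    print(f"{cls}: {prop} only {filler}")
```

This only extracts the stated axiom; it performs no reasoning, which is the job of the reasoners listed later in the deck.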

Manchester OWL syntax
• Restrictions

Manchester OWL syntax
• Boolean Class Constructors

Example

Person and hasChild some (Person and (hasChild only Man) and (hasChild some Person))

Describes the set of people who have at least one child who in turn has at least one child and whose children are all men (i.e. grandparents who have at least one child with only sons)
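
To make the nesting of "some" and "only" concrete, the expression can be evaluated by hand over a small data set. The sketch below is an assumption-laden toy: the individuals are hypothetical, and it checks the expression against a closed list of facts, whereas an OWL reasoner works under the open-world assumption and would not treat missing children as absent.

```python
# Toy data (hypothetical individuals, not from the slides).
persons = {"Ann", "Bob", "Carl", "Dan", "Eve", "Fay"}
men = {"Bob", "Carl", "Dan"}
has_child = {
    "Ann": {"Bob"},           # Ann's child Bob ...
    "Bob": {"Carl", "Dan"},   # ... has only sons
    "Eve": {"Fay"},           # Eve's child Fay has no children
}


def children(x):
    return has_child.get(x, set())


def some(x, cls):
    """hasChild some cls: at least one child satisfies cls."""
    return any(cls(c) for c in children(x))


def only(x, cls):
    """hasChild only cls: every child satisfies cls (true if no children)."""
    return all(cls(c) for c in children(x))


def is_person(x):
    return x in persons


def is_man(x):
    return x in men


def inner(x):
    # Person and (hasChild only Man) and (hasChild some Person)
    return is_person(x) and only(x, is_man) and some(x, is_person)


def matches(x):
    # Person and hasChild some (inner expression)
    return is_person(x) and some(x, inner)


result = sorted(p for p in persons if matches(p))
print(result)
```

Note that "hasChild only Man" alone would also hold for someone with no children, which is why the expression adds "hasChild some Person".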

OWL vs. RDF
• OWL and RDF are much the same thing, but OWL is a stronger language with greater machine interpretability than RDF.
• OWL comes with a larger vocabulary and stronger syntax than RDF.
• OWL supports the identification of inconsistencies (e.g. with disjoint classes Man and Woman, an individual cannot be an instance of both)
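
The disjointness example can be sketched as a simple check over asserted class memberships. This is only an illustration under stated assumptions: the individuals and the data layout are hypothetical, and a real OWL reasoner detects such clashes through logical inference rather than a direct lookup.

```python
# Hypothetical axioms and assertions for illustration.
disjoint_pairs = [("Man", "Woman")]
types = {
    "alex": {"Person", "Man"},
    "sam": {"Man", "Woman"},   # violates the disjointness axiom
}


def find_inconsistencies(types, disjoint_pairs):
    """Return (individual, class_a, class_b) for each violated disjointness."""
    problems = []
    for individual, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                problems.append((individual, a, b))
    return problems


problems = find_inconsistencies(types, disjoint_pairs)
print(problems)
```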

Example of Ontology

How to create an ontology?
• Determine the scope of the ontology
• Consider reusing existing ontologies
• Enumerate important terms in the ontology
• Define the classes and the class hierarchy
• Define the properties of classes (slots)
• Define the constraints of the slots
• Create instances
• In reality, these steps do not follow a fixed order…

Determine the scope
• What is the domain that the ontology will cover?
• For what are we going to use the ontology?
• For what types of questions should the information in the ontology provide answers?

Determine the scope
• What is the domain that the ontology will cover?
  – e.g. Cardiology, food…
• For what are we going to use the ontology?
  – e.g. Question answering, information extraction
• For what types of questions should the information in the ontology provide answers?
  – e.g. What is the treatment for a patient with congestive heart failure?

What to reuse?
• Domain-specific ontologies
  – UMLS Semantic Network
    • Semantic type and Relationship
  – SNOMED CT
  – National Drug File – Reference Terminology (NDF-RT)
• IEEE Upper Ontology
  – Suggested Upper Merged Ontology

Where to get important terms?
• From
  – Your own knowledge, if you are a domain expert
  – Other people
• From
  – Books
  – Guidelines
  – Literature
  – etc.

Classes
• Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)
• A class hierarchy is usually an IS-A hierarchy
  – an instance of a subclass is an instance of a superclass
• If you think of a class as a set of elements, a subclass is a subset
• Multiple inheritance: (A is a B), (A is a C)
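
The IS-A reading of a class hierarchy, including multiple inheritance, can be sketched with a small lookup structure. The class names and the individual below are hypothetical and chosen only for illustration; the point is that membership in a subclass carries over to every direct and indirect superclass.

```python
# Hypothetical hierarchy: each class maps to its direct superclasses.
superclasses = {
    "Cardiologist": {"Physician"},
    "Physician": {"Person", "HealthProfessional"},  # multiple inheritance
    "Person": set(),
    "HealthProfessional": set(),
}

# Hypothetical individual and its asserted (most specific) class.
instance_of = {"dr_lee": "Cardiologist"}


def ancestors(cls):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = list(superclasses.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(superclasses.get(parent, ()))
    return seen


def is_a(sub, sup):
    """True if sub is the same class as sup or a (transitive) subclass of it."""
    return sub == sup or sup in ancestors(sub)


def is_instance(individual, cls):
    """An instance of a subclass is also an instance of each superclass."""
    return is_a(instance_of[individual], cls)


print(sorted(ancestors("Cardiologist")))
print(is_instance("dr_lee", "Person"))
```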

Properties
• rdf:Property
  – owl:ObjectProperty
    • Link individuals to individuals
    • e.g. hasParent
  – owl:DatatypeProperty
    • Link individuals to data values

Go to Protégé, see NDF-RT.owl

Property restrictions
• Value constraints
  – Put constraints on value range for a particular property
• Cardinality constraints
  – Constraints on the number of values for a particular property
• e.g. hasParent
  – Value constraints: allValuesFrom “Human”
  – Cardinality constraints: maxCardinality “2”
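
The two hasParent constraints can be sketched as checks over a small set of facts. This is a closed-world approximation with hypothetical data: a real OWL reasoner would instead infer that an unknown parent is a Human, and without a unique-name assumption it might infer that two of three parents are the same individual rather than report a violation.

```python
# Hypothetical facts for illustration.
humans = {"alice", "bob", "carol", "dave"}
has_parent = {
    "alice": ["bob", "carol"],           # satisfies both constraints
    "dave": ["bob", "carol", "alice"],   # too many parents
    "erin": ["rex"],                     # rex is not known to be Human
}


def check_has_parent(individual, max_cardinality=2):
    """Return the names of the constraints that the individual violates."""
    values = has_parent.get(individual, [])
    violations = []
    # Value constraint: every hasParent value must be a Human.
    if not all(v in humans for v in values):
        violations.append("allValuesFrom Human")
    # Cardinality constraint: at most max_cardinality distinct values.
    if len(set(values)) > max_cardinality:
        violations.append("maxCardinality %d" % max_cardinality)
    return violations


for name in sorted(has_parent):
    print(name, check_has_parent(name))
```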

Ontology Reasoning
• Determine the consistency of ontology
• Identify subsumption relationships between classes
• Reasoners:
  – RacerPro
  – FaCT++
    • C++-based reasoner
  – Pellet
  – HermiT

Evaluation of Ontology
• Assessment by human against a set of criteria
• Natural language evaluation techniques
• Evaluate use of ontology in an application
• Comparison of ontology against a source of domain data
• Using reality as benchmark
• Ontology accreditation, certification, maturity model

Applications of Ontology
• Global Health Monitor
  – BioCaster
• Question answering
  – QALL-ME
• NLP/IE
  – Extended Syndrome Surveillance Ontology

Institutes & People
• W3C web ontology
• ONTOLOG
• Stanford
  – Thomas R. Gruber
  – Mark A. Musen, M.D., Ph.D.
• Buffalo ontology site
  – Barry Smith
• University of Manchester
  – e.g. Alan Rector

Processing Tools & Triple Store
• Ontology Engineering Environments
  – Ontolingua
  – Protégé
  – Altova SemanticWorks (Commercial)
• Parser/Serializer
  – Rapper: Raptor RDF parsing and serializing utility
• SPARQL Query
  – Jena ARQ
• Triple Store
  – Jena SDB
  – http://www.w3.org/wiki/LargeTripleStores

Resources all about OWL
• OWL Web Ontology Language Overview
• OWL Web Ontology Language Guide
• OWL Web Ontology Language Reference
• OWL Web Ontology Language Semantics and Abstract Syntax
• OWL Web Ontology Language Test Cases
• OWL Web Ontology Language Use Cases and Requirements
• OWL Web Ontology Language XML Presentation Syntax
• OWL Web Ontology Language Parsing OWL in RDF/XML

QUESTIONS?

Practice
• Protégé
  – How to work with Protégé
• Jena SDB Triple Store
  – Load the RDF file into triple store

PROTÉGÉ

Protégé - Windows
• Ontology metrics
• Preferences (Renderer)
  – Show the classes in different ways

Inverse Properties
• Each object property may have a corresponding inverse property.
• If some property links individual a to individual b, then its inverse property will link individual b to individual a.
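
The inference that an inverse property licenses can be sketched over plain (subject, property, object) triples. The pairing of hasParent with hasChild and the individuals are assumptions made for illustration.

```python
# Assumed inverse pair for illustration.
inverse_of = {"hasParent": "hasChild", "hasChild": "hasParent"}

# Hypothetical asserted fact.
triples = {("alice", "hasParent", "bob")}


def add_inverses(triples, inverse_of):
    """Return the triples plus one inferred triple per declared inverse."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        if prop in inverse_of:
            # a P b implies b inverse(P) a
            inferred.add((obj, inverse_of[prop], subject))
    return inferred


all_triples = add_inverses(triples, inverse_of)
print(sorted(all_triples))
```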

Functional Properties
• If a property is functional, then for a given individual there can be at most one individual related to it via this property.
  – For a given domain value, the range value must be unique
• Functional properties are also known as single-valued properties.

Inverse Functional Properties
• If a property is inverse functional, then its inverse property is functional.
  – For a given range, domain must be unique.

Functional vs. inverse functional properties
• FunctionalProperty vs InverseFunctionalProperty

Property type             | Domain             | Range             | Example
FunctionalProperty        | For a given domain | Range is unique   | hasFather: A hasFather B, A hasFather C → B=C
InverseFunctionalProperty | Domain is unique   | For a given range | hasID: A hasID B, C hasID B → A=C
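
The two rows of the comparison can be sketched as equality inference over triples: a functional property forces its objects to be the same individual, and an inverse functional property forces its subjects to be the same. The helper names and the individuals P, Q and id1 are hypothetical; A, B and C follow the hasFather example.

```python
from itertools import combinations

# Hypothetical facts; hasFather is treated as functional and hasID as
# inverse functional, as in the comparison above.
facts = [
    ("A", "hasFather", "B"),
    ("A", "hasFather", "C"),
    ("P", "hasID", "id1"),
    ("Q", "hasID", "id1"),
]


def _equal_pairs(groups):
    """Turn groups of co-referring names into sorted pairs."""
    pairs = set()
    for members in groups.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def infer_same_functional(triples, prop):
    """Objects sharing a subject under a functional property are equal."""
    groups = {}
    for subject, p, obj in triples:
        if p == prop:
            groups.setdefault(subject, set()).add(obj)
    return _equal_pairs(groups)


def infer_same_inverse_functional(triples, prop):
    """Subjects sharing an object under an inverse functional property are equal."""
    groups = {}
    for subject, p, obj in triples:
        if p == prop:
            groups.setdefault(obj, set()).add(subject)
    return _equal_pairs(groups)


print(infer_same_functional(facts, "hasFather"))
print(infer_same_inverse_functional(facts, "hasID"))
```

The design point is that OWL does not reject the second value; it concludes that the two names denote the same individual unless they are stated to be different.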

Transitive Properties
• If a property P is transitive, and P relates individual a to individual b and also individual b to individual c, then we can infer that individual a is related to individual c via P.
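
The transitive inference can be sketched as a closure computation over the pairs related by one property. The property name hasAncestor and the individuals are hypothetical, and the loop favors clarity over efficiency.

```python
def transitive_closure(pairs):
    """Return pairs plus every (a, c) implied by (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical assertions for a transitive property such as hasAncestor.
has_ancestor = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(has_ancestor)
print(sorted(closure))
```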

Protégé – DL Query
• Quickly test definitions of classes to see that they subsume the appropriate subclasses.
• Or check for class membership of arbitrary descriptions without having to create named class placeholders.
• Follow Manchester OWL Syntax
• http://protegewiki.stanford.edu/wiki/DLQueryTab
• Example
  – For pizza.owl
  – Pizza and hasTopping only VegetarianTopping

Protégé - OWLViz
• http://protegewiki.stanford.edu/wiki/OWLViz
• Graphviz
  – Installation
  – Set path

TRIPLE STORE

Install
• MySQL server
• Latest MySQL connector jar file
• Jena SDB
• Cygwin

THANK YOU!
