November 1st, 2016
The Human Phenotype Ontology Project (HPO)
History of HPO
Current status
Use cases
HVP/HUGO Variant Detection Training Course Variant Effect Prediction
Dr. Sebastian Köhler
drseb.github.io
Why do phenotypes matter?
❖ Phenotypic abnormality = clinical feature
❖ Constellation/pattern of phenotypes/clinical features defines a disease:
❖ … is a rare developmental disorder defined by the combination of aplasia cutis congenita of the scalp vertex and terminal transverse limb defects. In addition, vascular anomalies such as cutis marmorata telangiectatica … are recurrently seen.
MIM
❖ The single most valuable resource for human genetics
❖ 12 printed versions between 1966 and 1998
❖ Online for almost two decades
Victor McKusick with printed MIM www.hhmi.org/biointeractive/museum/exhibit98/content/h1_7info.html
OMIM
• Contains Clinical Synopsis (CS) section
• Free text phenotypic description
• Very expressive
(Un)Controlled Vocabularies
❖ Non-standardized method for describing phenotypes
❖ Not designed to be easily machine interpretable
❖ Spelling problems
Incomplete: the full text contains phenotype information that is absent in the CS
Inconsistent: no handling of synonyms, e.g. ‘Height: short stature’, ‘Reduced adult height’, ‘Final adult height, 84-128cm’
The CS contains symptoms such as ‘Heart: Prolonged QTc interval’ or ‘T-wave abnormalities’.
Imagine a query for ‘ECG Abnormalities’: how can we ensure that the examples above are found?
E.g.:
hypereflexia - hyperreflexia
congential - congenital
defeciency - deficiency
Homonyms
‘... fibrillation ...’ can mean muscle fibrillation or ventricular fibrillation
fibrillation ≠ fibrillation
Motivation
OMIM Query               Number of Results
large bones              264
large bone               785
enlarged bones           87
enlarged bone            156
big bones                16
huge bones               4
massive bones            28
hyperplastic bones       12
hyperplastic bone        40
bone hyperplasia         134
increased bone growth    612
Washington et al. Linking human diseases to animal models using ontology-based phenotype annotation ; PLoS Biology (2009)
Goal of HPO
❖ We want computer-interpretable clinical features!
❖ Compare diseases based on clinical features
❖ Compare patients based on clinical features
❖ Compare patients with diseases based on clinical features
❖ …
❖ Easy to use and freely available
The Human Phenotype Ontology (HPO)
❖ Description of phenotypic abnormalities (or clinical features) in humans
❖ Synonyms merged into one term
❖ Creation of textual and logical definitions for each term
[Figure: excerpt of the HPO hierarchy with the terms phenotypic abnormality, abnormality of the nervous system, abnormality of the central nervous system, abnormality of movement, incoordination, ataxia, gait disturbance, gait ataxia, cerebral inclusion bodies and neurofibrillary tangles]
The Human Phenotype Ontology (HPO)
id: HP:0002185
name: Neurofibrillary tangles
def: Pathological protein aggregates formed by hyperphosphorylation of a microtubule-associated protein known as tau, causing it to aggregate in an insoluble form. [HPO:sdoelken]
synonym: Neurofibrillary tangles may be present EXACT []
synonym: Paired helical filaments EXACT []
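The term entry above follows a simple `tag: value` layout, so it can be read with a few lines of code. The following is a minimal sketch, not an official HPO parser: the helper names `parse_stanza` and `label_index` are hypothetical, and the stanza is copied as shown on the slide (released OBO files additionally quote the `def` and `synonym` values). It also illustrates how merging synonyms into one term lets different labels resolve to the same identifier.

```python
from collections import defaultdict

# Stanza text as shown on the slide (real OBO files quote def/synonym values).
STANZA = """\
id: HP:0002185
name: Neurofibrillary tangles
def: Pathological protein aggregates formed by hyperphosphorylation of a microtubule-associated protein known as tau, causing it to aggregate in an insoluble form. [HPO:sdoelken]
synonym: Neurofibrillary tangles may be present EXACT []
synonym: Paired helical filaments EXACT []
"""


def parse_stanza(text):
    """Collect the 'tag: value' lines of one term stanza into a dict of lists."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(": ")
        fields[tag].append(value.strip())
    return dict(fields)


def label_index(term):
    """Map the lower-cased name and every synonym to the term id (hypothetical helper)."""
    term_id = term["id"][0]
    labels = list(term["name"])
    for synonym in term.get("synonym", []):
        # Drop the trailing scope and empty cross-reference list, e.g. ' EXACT []'.
        labels.append(synonym.rsplit(" EXACT", 1)[0])
    return {label.lower(): term_id for label in labels}


term = parse_stanza(STANZA)
index = label_index(term)
print(index["paired helical filaments"])  # HP:0002185
```

With such an index, a free-text label and any of its synonyms lead to one identifier, which is the property the uncontrolled OMIM vocabulary lacks.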
The Human Phenotype Ontology (HPO)
❖ Semantic relations (‘subclass of’, ‘is a’)
❖ From top to bottom, terms get more specific
[Figure: excerpt of the HPO hierarchy; every edge between the terms is labelled ‘is a’]
Annotation of diseases
❖ Terms of the HPO are used to annotate (describe) diseases
❖ E.g. neurofibrillary tangles is used to annotate Alzheimer Disease
❖ Note: Annotation with neurofibrillary tangles induces annotation to all ancestor terms (transitive)
❖ We provide annotations to common and rare diseases
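The transitive effect of an annotation can be sketched in a few lines. This is an illustration only: the is-a edges below are assumptions made up for the terms named on the slides, not copied from an HPO release, and `ancestors` and `propagate` are hypothetical helper names.

```python
# Hypothetical is-a edges (child -> parents) over the terms shown on the slides.
# The exact edges are an assumption for illustration, not taken from an HPO release.
IS_A = {
    "neurofibrillary tangles": ["cerebral inclusion bodies"],
    "cerebral inclusion bodies": ["abnormality of the central nervous system"],
    "abnormality of the central nervous system": ["abnormality of the nervous system"],
    "gait ataxia": ["gait disturbance", "ataxia"],
    "gait disturbance": ["abnormality of movement"],
    "ataxia": ["incoordination"],
    "incoordination": ["abnormality of movement"],
    "abnormality of movement": ["abnormality of the nervous system"],
    "abnormality of the nervous system": ["phenotypic abnormality"],
    "phenotypic abnormality": [],
}


def ancestors(term):
    """Return the term together with everything reachable over is-a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A[current])
    return seen


def propagate(direct_annotations):
    """Expand each disease's direct annotations to all ancestor terms."""
    return {
        disease: set().union(*(ancestors(t) for t in terms))
        for disease, terms in direct_annotations.items()
    }


direct = {"Alzheimer Disease": {"neurofibrillary tangles"}}
implied = propagate(direct)
print(sorted(implied["Alzheimer Disease"]))
```

A query for any ancestor, such as abnormality of the nervous system, therefore also finds the disease, even though only the most specific term was annotated directly.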
Köhler et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data ; NAR (2014)
Current Status
❖ 4 root classes:
❖ Phenotypic abnormality, Mode of Inheritance, Clinical modifier, Mortality/Ageing
❖ 11,813 classes/terms in HPO
❖ ~124,000 annotations of 7,700 rare diseases from OMIM, Orphanet, DECIPHER
❖ ~133,000 annotations of 3,145 common diseases
Other Resources?
❖ Do we really need HPO?
Winnenburg and Bodenreider, Coverage of phenotypes in standard terminologies. ; Proceedings of the ISMB (2014)
Recent projects
❖ Over 6,000 layperson synonyms added
❖ Hypoplastic kidney —> Small kidneys
❖ Nephropathy —> Kidney damage
❖ Lobulated tongue —> Bumpy tongue
❖ Important for patient-reported phenotype data
See slides: http://www.slideshare.net/NicoleVasilevsky/enhancing-the-human-phenotype-ontology-for-use-by-the-layperson-64669468
Recent projects
❖ Translations of labels, synonyms and textual definitions (crowd-sourcing)
Adoption of HPO
Köhler et al. The human phenotype ontology in 2017 ; NAR (2016), to appear in a few days
Clinical genetics
❖ Requires: Conversion tables for legacy vocabularies
❖ Done by text-mining followed by manual curation
❖ LDDB (✓)
❖ Orphanet (✓) (they are now using HPO directly)
❖ Possum (?)
❖ MedDRA
❖ UMLS (completely incorporated now)
Applications of HPO
Semantic similarity
❖ Basic idea of ontological search: no exact match is needed; semantically similar diseases still score well.
❖ Imagine a BLAST search for sets of clinical features. (Phenomizer)
❖ More in practical session on Wednesday (Room: Mouses)
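The idea of scoring without exact matches can be sketched with a Resnik-style similarity, in which two terms are as similar as their most informative common ancestor. This is one common formulation and only a sketch: the toy hierarchy, the four diseases and their annotations are invented for illustration, and the function names are hypothetical.

```python
import math

# Toy is-a hierarchy (child -> parents); the edges are illustrative assumptions.
PARENTS = {
    "phenotypic abnormality": [],
    "abnormality of the nervous system": ["phenotypic abnormality"],
    "abnormality of movement": ["abnormality of the nervous system"],
    "incoordination": ["abnormality of movement"],
    "ataxia": ["incoordination"],
    "gait disturbance": ["abnormality of movement"],
    "gait ataxia": ["ataxia", "gait disturbance"],
    "cerebral inclusion bodies": ["abnormality of the nervous system"],
    "neurofibrillary tangles": ["cerebral inclusion bodies"],
}

# Hypothetical diseases with made-up annotations.
DISEASES = {
    "disease A": {"neurofibrillary tangles"},
    "disease B": {"gait ataxia"},
    "disease C": {"ataxia", "cerebral inclusion bodies"},
    "disease D": {"gait disturbance"},
}


def ancestors(term):
    """Return the term and all of its ancestors."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return result


def information_content():
    """IC(t) = -log(fraction of diseases annotated to t or one of its descendants)."""
    counts = {t: 0 for t in PARENTS}
    for terms in DISEASES.values():
        for t in set().union(*(ancestors(x) for x in terms)):
            counts[t] += 1
    return {t: -math.log(n / len(DISEASES)) for t, n in counts.items() if n}


IC = information_content()


def term_similarity(a, b):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC.get(t, 0.0) for t in ancestors(a) & ancestors(b))


def score(query, disease_terms):
    """Average, over the query terms, of the best match among the disease's terms."""
    best = [max(term_similarity(q, d) for d in disease_terms) for q in query]
    return sum(best) / len(best)


query = {"gait disturbance", "incoordination"}
ranking = sorted(DISEASES, key=lambda name: score(query, DISEASES[name]), reverse=True)
print(ranking)
```

In this toy data, disease B ranks first although none of its annotated terms appears in the query, because gait ataxia is a descendant of both query terms; disease A, which only shares very general ancestors with the query, ranks last.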
Clinical genomics
❖ “Standard” clinical exome pipeline
❖ Predicts causative variant based on information in genome of patient and background genomic data
❖ Each human genome harbors about 100 genuine loss-of-function SNVs with ∼20 genes completely inactivated and around 50-100 CNVs. (DG MacArthur et al., Science 2012)
Whole exome
Filter, e.g. common variants
Variant score, e.g. allele frequency, conservation
Clinical genomics
Zemojtel, Köhler et al. Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome ; Science Translational Medicine (2014)
Robinson, Köhler, et al. Improved exome prioritization of disease genes through cross-species phenotype comparison ; Genome Research (2013)
Deep phenotype profile of patient as HPO terms
Phenotypic relevance score based on HPO similarity between the gene of the variant and the patient
Combine variant score with phenotypic relevance score
(PhenIX, Exomiser, ...)
Performance
❖ Combination of variant score and phenotype score is key.
❖ Keywords: Exomiser, PhenIX, …
❖ More about this in the practical session on Wednesday.
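The pipeline described above (filter common variants, score the remaining variants, add a phenotypic relevance score per gene, combine both) can be sketched as follows. This is not how PhenIX or Exomiser compute their scores: the equal-weight average, the 1% frequency cutoff, the gene symbols and all numbers are placeholder assumptions chosen only to show why the combination matters.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                # hypothetical gene symbol
    allele_frequency: float  # population frequency, 0..1
    variant_score: float     # predicted deleteriousness, 0..1


def prioritize(variants, phenotype_score, max_frequency=0.01):
    """Filter common variants, then rank by the mean of variant and phenotype score.

    The cutoff and the equal weighting are assumptions for this sketch.
    """
    rare = [v for v in variants if v.allele_frequency <= max_frequency]
    scored = [
        (0.5 * v.variant_score + 0.5 * phenotype_score.get(v.gene, 0.0), v)
        for v in rare
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


variants = [
    Variant("GENE1", 0.0001, 0.95),
    Variant("GENE2", 0.0002, 0.60),
    Variant("GENE3", 0.2000, 0.99),  # common variant, removed by the filter
]
# Phenotypic relevance of each gene for the patient's HPO profile (made-up values).
phenotype_score = {"GENE1": 0.10, "GENE2": 0.90, "GENE3": 0.80}

result = prioritize(variants, phenotype_score)
for combined, variant in result:
    print(f"{variant.gene}: {combined:.3f}")
```

Ranked by variant score alone, GENE1 would come first; once the phenotype score is included, GENE2 moves to the top, which is the effect the performance slide refers to.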
Mapping phenotypes across species
❖ “We are able to provide a confident interpretation of the clinical relevance for only a … small proportion of variants in human populations.” Lloyd, Robinson, MacRae, STM 2016
❖ Use model organism?
❖ Each model organism has its own phenotype ontology, e.g.
❖ MPO
❖ ZP
Logical definition of phenotypes
❖ We define phenotypes using atomic ontologies for
❖ Anatomy
❖ Chemicals
❖ Cells
❖ Qualities
❖ Gene Ontology
❖ …
❖ Reasoning:
❖ Major premise: All men are mortal.
❖ Minor premise: Socrates is a man.
❖ Conclusion (by reasoner): Socrates is mortal.
anatomy-ontology
quality-ontology
Mungall et al. Integrating phenotype ontologies across multiple species ; Genome Biology (2010)
Reasoner
❖ Human phenotype (HPO)
❖ Mouse phenotype (MP-ontology)
[Figure: both phenotypes are defined using UBERON (midface, part of face) and PATO (decreased length, subclass of quality). Reasoner: subclass of (inferred)]
Köhler et al. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research ; F1000 research (2013)
❖ Quality assurance for HPO
❖ Cross-species phenotype ontology for human, mouse, and zebrafish : Uberpheno
❖ Mechanistic insights (esp. GO)
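How a reasoner can relate a human and a mouse phenotype through their logical definitions can be sketched with a strongly simplified rule. The two relations (midface part of face, decreased length subclass of quality) are the ones named on the slide; the phenotype labels, the entity-quality pairs and the subsumption rule itself are assumptions for illustration and far simpler than what an OWL reasoner does.

```python
# Relations named on the slide.
PART_OF = {"midface": "face"}                   # UBERON: midface part of face
QUALITY_IS_A = {"decreased length": "quality"}  # PATO: decreased length subclass of quality

# Entity-quality definitions; the phenotype labels are hypothetical.
DEFINITIONS = {
    "human: short midface": ("midface", "decreased length"),
    "mouse: abnormal face": ("face", "quality"),
}


def closure(item, relation):
    """Return item plus everything reachable by following the relation upwards."""
    reached = {item}
    while item in relation:
        item = relation[item]
        reached.add(item)
    return reached


def is_subclass(child, parent):
    """Simplified rule (an assumption): the child's entity is the parent's entity
    or a part of it, and the child's quality is the parent's quality or a subclass."""
    child_entity, child_quality = DEFINITIONS[child]
    parent_entity, parent_quality = DEFINITIONS[parent]
    return (
        parent_entity in closure(child_entity, PART_OF)
        and parent_quality in closure(child_quality, QUALITY_IS_A)
    )


inferred = [
    (c, p)
    for c in DEFINITIONS
    for p in DEFINITIONS
    if c != p and is_subclass(c, p)
]
print(inferred)
```

The inferred link crosses the species boundary without anyone having asserted it by hand, which is the basis for a combined ontology such as Uberpheno and also a way to spot missing or inconsistent links during quality assurance.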
Uberpheno: Situation before
[Figure: ontology graph with ‘subclass of’ edges]
Uberpheno
[Figure: the phenotype terms ‘Sleep-wake cycle disturbance’, ‘prolonged circadian period’ and ‘abnormally increased duration circadian sleep/wake cycle, sleep’ together with the genes PER2, PSEN2, hcrtr2 and Fbxl3. Legend: human gene, mouse gene, zebrafish gene. Edge types: annotated to; inferred: is a; inferred: annotated to]
Uberpheno for CNV analysis
❖ E.g. a CNV found by array CGH often contains multiple genes
❖ Goals:
❖ Automatically assign which genes are related to which clinical feature of the patient
❖ Use model organism phenotypes
❖ Create visualisation
Köhler et al. Clinical interpretation of CNVs with cross-species phenotype data ; Journal of Medical Genetics (2014)
Phenogram
[Figure: phenogram. Legend: gene located in CNV; symptom of the patient; shared phenotype between gene and patient. Callouts: match based on Uberpheno (mouse); filter unspecific matches]
Summary
❖ HPO: a controlled vocabulary of phenotypic abnormalities for human genetics
❖ Freely available, open-source
❖ FOR the community, FROM the community (see our papers)
❖ Novel approaches towards:
❖ Differential diagnosis tools (Phenomizer)
❖ Standardized patient description in projects world-wide
❖ Model organism phenotypes
Contribute
Acknowledgements
❖ Berlin:
❖ Peter Robinson (now JAX)
❖ Sandra Dölken
❖ Tomasz Zemojtel
❖ Genomics England:
❖ Damian Smedley
❖ Monarch Initiative:
❖ Chris Mungall
❖ Melissa Haendel
❖ Nicole Vasilevsky
❖ A lot missing! Sorry.
All medical experts supporting HPO!