Deep phenotyping for everyone
Melissa Haendel, PhD
July 9th, 2016
Phenoday!!
@monarchinit
The genome is sequenced, but…
…we still don’t know very much about what it does
3,435 OMIM
Mendelian Diseases with no known genetic basis
66,396 ClinVar
Variants with no known pathogenicity
Why we need all the organisms
Model data can provide up to 80% phenotypic coverage of the human coding genome
The prevailing clinical diagnosis pipelines leverage only a tiny fraction of the available data
PATIENT EXOME/ GENOME
PATIENT PHENOTYPES
PATIENT ENVIRONMENT
PUBLIC GENOMIC DATA
PUBLIC HUMAN & MODEL PHENOTYPE, DISEASE DATA
PUBLIC HUMAN & MODEL ENVIRONMENT, DISEASE DATA
POSSIBLE DISEASES
DIAGNOSIS & TREATMENT
Under-utilized data
monarchinitiative.org
PROBLEM: Diagnosis / treatment / prognosis on gestalt (experience, intuition, and pattern recognition)
• Things are not always what they first seem
• Errors are common, and up to 35% of errors cause harm
• It takes patients about six years from noticing symptoms to being diagnosed, with trips to eight physicians
• 25% of patients have to wait between 5 and 30 years
HYPOTHESIS: Diagnosis, treatment, and prognosis may be informed and complemented by democratized deep phenotyping that is easier to compute, collect, and exchange
Ulcerated paws
Palmoplantar hyperkeratosis
Thick hand skin
Image credits:
"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG
http://www.guinealynx.info/pododermatitis.html
Challenge: Each database uses its own vocabulary/ontology
MP, HP
MGI, HPOA
Challenge: Each database uses its own phenotype vocabulary/ontology
ZFA, MP, DPO, WPO, HP, OMIA, VT, FYPO, APO, SNOMED, …
WB, PB, FB, OMIA, MGI, RGD, ZFIN, SGD, HPOA, EHR, IMPC, OMIM, …, QTLdb
Can we help machines understand phenotype terms?
“Palmoplantar hyperkeratosis”
Human phenotype
I have absolutely no idea what that means ???
The Human Phenotype Ontology for deep phenotyping
Decomposition of complex concepts allows interoperability
Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2
“Palmoplantar hyperkeratosis” (Human phenotype) =
increased (PATO) + keratinization (GO) + Stratum corneum layer of skin (Uberon) + Autopod (Uberon)
Species neutral ontologies, homologous concepts
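The decomposition above can be sketched in code. This is a minimal illustration, not the actual HPO logical definition: the `PhenotypeDefinition` class, its field names, and the `shared_components` helper are hypothetical, and the second term is invented purely to show how two species-specific terms might be compared through shared species-neutral parts. Only the ontology names (PATO, GO, Uberon) and the labels of the human term come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeDefinition:
    """Hypothetical record: a phenotype term split into species-neutral parts."""
    label: str
    quality: str   # PATO quality, e.g. "increased"
    process: str   # GO process, e.g. "keratinization"
    entity: str    # Uberon anatomical entity
    location: str  # Uberon structure where the entity is found

    def components(self) -> frozenset:
        # Each component is an (ontology, label) pair.
        return frozenset({
            ("PATO", self.quality),
            ("GO", self.process),
            ("Uberon", self.entity),
            ("Uberon", self.location),
        })


def shared_components(a: PhenotypeDefinition, b: PhenotypeDefinition) -> frozenset:
    """Return the species-neutral components two definitions have in common."""
    return a.components() & b.components()


human = PhenotypeDefinition(
    label="Palmoplantar hyperkeratosis",
    quality="increased",
    process="keratinization",
    entity="stratum corneum layer of skin",
    location="autopod",
)

# Hypothetical model-organism term, invented for illustration only.
model = PhenotypeDefinition(
    label="thick paw skin (hypothetical)",
    quality="increased",
    process="keratinization",
    entity="epidermis",
    location="autopod",
)

for ontology, label in sorted(shared_components(human, model)):
    print(ontology, label)
```

Because both terms resolve to the same quality, process, and location, a machine can relate them even though their labels share no words.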
Harmonizing diseases, phenotypes, anatomy, and genotypes
We learn different phenotypes from different organisms
Diagnosing an undiagnosed disease
Putting all that data to use to diagnose a rare platelet syndrome
http://bit.ly/stim1paper
Phenotypic profile
Genes
Heterozygous, missense mutation
STIM-1
MGI mouse
N/A
Heterozygous, missense mutation
STIM-1
N/A
Ranked STIM-1 variant maximally pathogenic based on cross-species G2P data, in the absence of traditional data sources
http://bit.ly/exomiser
Stim1Sax/Sax
What about patients? Can they phenotype themselves?
The GenomeConnect survey
E.g., Machado-Joseph disease might present:
With: Ptosis, and abnormalities of eye movement, globe size, vision, optic nerve
Without: Myopia, Hypermetropia, and abnormalities of the retina, the iris, and the lens
Annotation sufficiency determination per disease
Ensure that the survey is maximally diagnostic
[Figure: for each of Disease 1 through Disease 7500, a patient self-reported HPO profile (simulated GenomeConnect survey HPO profiles) is compared with the HPO reference profile (Monarch Initiative reference HPO profiles). Patient vs. expert: compare phenotypic profiles by phenotypic profile overlap.]
For every known disease, fill the survey and ask: Does the profile match the disease best based on the survey mapping?
[Figure: comparisons between GC HPO profiles and HPO reference profiles.]
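The question above (does a survey-derived profile match its own disease best?) can be illustrated with a small ranking sketch. This is a simplification under stated assumptions: it scores profiles with plain Jaccard overlap of HPO term sets, which ignores the ontology hierarchy and is not presented here as the method used in the study. The function names and the disease profiles are invented for illustration; the HPO identifiers are reused from elsewhere in these slides only as opaque example strings.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two term sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_diseases(profile: set, reference_profiles: dict) -> list:
    """Return disease names ordered from best to worst match.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    scored = [(jaccard(profile, terms), name)
              for name, terms in reference_profiles.items()]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Invented reference profiles; not real disease annotations.
reference_profiles = {
    "Disease 1": {"HP:0000522", "HP:0003577"},
    "Disease 2": {"HP:0200055", "HP:0003577"},
    "Disease 3": {"HP:0000522", "HP:0200055"},
}

# Profile as it might come out of filling the survey for Disease 1.
survey_profile = {"HP:0000522", "HP:0003577"}

ranking = rank_diseases(survey_profile, reference_profiles)
print(ranking)
```

Repeating this for every disease and counting how often the intended disease ranks first gives one rough measure of how diagnostic the survey is.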
Assess patient-derived profile generation
[Figure: for each of Disease 1 through Disease 7500, the GC HPO profile is compared with the HPO reference profile. Patient vs. expert: compare phenotypic profiles by phenotypic profile overlap.]
For every diagnosed patient: Can the patients utilize the survey and retrieve the correct disease?
[Figure: comparison of each GC HPO profile with the corresponding clinical evaluation HPO profile.]
Determine the contribution and sufficiency of patient self-phenotyping
[Figure: for each of Patient 1 through Patient 7500, the UDN patient generated GenomeConnect survey HPO profile is compared with the UDN patient clinical evaluation HPO profile. Patient vs. expert: compare phenotypic profiles by phenotypic profile overlap.]
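One simple way to look at the contribution of self-phenotyping is to split the two profiles into shared terms, patient-only terms, and clinician-only terms. The sketch below assumes exact term matching and uses a hypothetical `compare_profiles` helper with made-up example profiles; a real comparison would likely also credit near matches through the ontology hierarchy.

```python
def compare_profiles(patient: set, clinical: set) -> dict:
    """Break two HPO term sets into shared and unique parts.

    'recall' is the fraction of clinician-recorded terms the patient also reported.
    """
    shared = patient & clinical
    return {
        "shared": shared,
        "patient_only": patient - clinical,
        "clinical_only": clinical - patient,
        "recall": len(shared) / len(clinical) if clinical else 0.0,
    }


# Made-up profiles for illustration; identifiers are opaque example strings.
patient_profile = {"HP:0000522", "HP:0200055"}
clinical_profile = {"HP:0000522", "HP:0003577"}

result = compare_profiles(patient_profile, clinical_profile)
print(result)
```

Patient-only terms are the potential contribution of the survey; clinician-only terms indicate where the survey may be insufficient.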
Human Phenotype Ontology, now with 6,200 plain language synonyms for patients, families, and non-experts
www.human-phenotype-ontology.org
@HP_ontology
Almost half of the 14k synonyms are plain language
All synonyms Plain language synonyms
Introducing PhenoPackets
It’s exactly what you think it is: a packet of phenotype data to be used anywhere, written by anyone
Genes + Environment = Phenotypes
Biology central dogma
Standards for encoding and exchanging data must be up to these challenges
@ontowonka
The relationships too must be captured
It is not just the bits…
G-P or D (disease): causes, contributes to, is risk factor for, protects against, correlates with, is marker for, modulates, involved in, increases susceptibility to
G-G (kind of): regulates, negatively regulates (inhibits), positively regulates (activates), directly regulates, interacts with, co-localizes with, co-expressed with
P/D - P/D: part of, results in, co-occurs with, correlates with, hallmark of (P->D)
E-P: contributes to (E->P), influences (E->P), exacerbates (E->P), manifest in (P->E)
G-E (kind of): expressed in, expressed during, contains, inactivated by
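The relation lists above can be treated as a small typed vocabulary. The sketch below is one hypothetical way to encode them so that a statement such as "gene causes disease" can be checked mechanically. The relation labels are copied from the slide; the single-letter type codes, the `ALLOWED` table, and `is_valid` are assumptions made for illustration, not part of any published schema.

```python
# Relation labels are from the slide; the lookup structure is a hypothetical sketch.
GENE_TO_PHENOTYPE_OR_DISEASE = frozenset({
    "causes", "contributes to", "is risk factor for", "protects against",
    "correlates with", "is marker for", "modulates", "involved in",
    "increases susceptibility to",
})
GENE_TO_GENE = frozenset({
    "regulates", "negatively regulates", "positively regulates",
    "directly regulates", "interacts with", "co-localizes with",
    "co-expressed with",
})
PD_TO_PD = frozenset({"part of", "results in", "co-occurs with", "correlates with"})
ENV_TO_PHENOTYPE = frozenset({"contributes to", "influences", "exacerbates"})
GENE_TO_ENV = frozenset({"expressed in", "expressed during", "contains", "inactivated by"})

# Keys are (subject type, object type): G = gene, P = phenotype,
# D = disease, E = environment.
ALLOWED = {
    ("G", "P"): GENE_TO_PHENOTYPE_OR_DISEASE,
    ("G", "D"): GENE_TO_PHENOTYPE_OR_DISEASE,
    ("G", "G"): GENE_TO_GENE,
    ("E", "P"): ENV_TO_PHENOTYPE,
    ("P", "E"): frozenset({"manifest in"}),
    ("G", "E"): GENE_TO_ENV,
}
for subject_type in ("P", "D"):
    for object_type in ("P", "D"):
        ALLOWED[(subject_type, object_type)] = PD_TO_PD
# "hallmark of" is directional on the slide: phenotype -> disease only.
ALLOWED[("P", "D")] = PD_TO_PD | {"hallmark of"}


def is_valid(subject_type: str, predicate: str, object_type: str) -> bool:
    """Check whether a predicate is listed for this subject/object type pair."""
    return predicate in ALLOWED.get((subject_type, object_type), frozenset())


print(is_valid("G", "causes", "D"))
print(is_valid("D", "hallmark of", "P"))
```

Keeping direction in the key is what lets the table reject a reversed statement such as a disease being the hallmark of a phenotype.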
Genes + Environment = Phenotypes
Computable encodings are essential
Base pairs
Variant notation (e.g. HGVS)
Human Phenotype Ontology
Mammalian Phenotype Ontology
Medical procedure coding
Environment Ontology
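Part of what makes these encodings computable is that terms are referenced by compact identifiers rather than free text. The sketch below assumes the `PREFIX:digits` pattern seen in the identifiers on these slides (for example `HP:0003577`, `MP:0014039`, `ECO:0000033`); the regular expression and the `parse_curie` helper are illustrative assumptions, not an official grammar for any of these resources.

```python
import re

# Assumed pattern: an alphanumeric prefix, a colon, then a run of digits.
CURIE_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def parse_curie(identifier: str) -> tuple:
    """Split an identifier like 'HP:0003577' into (prefix, local id).

    Raises ValueError for anything that does not fit the assumed pattern,
    such as a free-text label.
    """
    match = CURIE_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a PREFIX:digits identifier: {identifier!r}")
    return match.group(1), match.group(2)


for identifier in ("HP:0003577", "MP:0014039", "ECO:0000033"):
    print(parse_curie(identifier))
```

A free-text label such as "Small hands" fails this check, which is the point: labels are for people, identifiers are for machines.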
Genes Environment Phenotypes
Standard exchange formats exist for genes (VCF, GFF, BED) …
but for phenotypes? Environment?
NEW: PXF
If it is alive, it can be PhenoPackaged
Some biodiversity images adapted from http://i.vimeocdn.com/video/417366050_1280x720.jpg
Model Organisms
Biodiversity
Crops
Domestic Animals
Disease vectors
Epidemiological Monitoring
Drug discovery & Development
Rare Disease Diagnosis
Personalized Medicine
Environmental Monitoring
Patients & Cohorts
Genetic Engineering
Mechanistic Discovery
Phenopackets for organisms
Biography: This is “Maru”, a 4-year-old, male cat of the Scottish Fold breed
Phenotypes & qualifiers: abnormal sheltering behavior [MP:0014039] (onset at birth)
Measurements: Weighs 6kg
Source: youtube.com/user/mugumogu
title: "age of onset example"
persons:
  - id: "#1"
    label: "Donald Trump"
    sex: "M"
phenotype_profile:
  - entity: "person#1"
    phenotype:
      types:
        - id: "HP:0200055"
          label: "Small hands"
      onset:
        description: "during development"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "PMID:1"
Image credits: upi.com
Phenopackets for humans
Canonical JSON format
Phenopackets for Patients
Image credits: ngly1.org
• Dry eyes
• Developmental delay
• Elevated liver function
phenotype_profile:
  - entity: "patient16"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "https://twitter.com/examplepatient/status/123456789"
• Patient registries
• Social media
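The patient profile above can also be built programmatically and written out as JSON. The sketch below mirrors the field names shown in the slide's example (`phenotype_profile`, `entity`, `phenotype`, `onset`, `evidence`, `source`); the `make_association` helper is hypothetical and is not the API of any phenopackets library, and the exact nesting is an assumption based on the flattened slide text.

```python
import json


def make_association(entity, phenotype, onset, onset_description, evidence, source_id):
    """Build one phenotype association as a plain dictionary.

    phenotype, onset, and evidence are (id, label) pairs.
    """
    def term(pair):
        return {"id": pair[0], "label": pair[1]}

    return {
        "entity": entity,
        "phenotype": {
            "types": [term(phenotype)],
            "onset": {
                "description": onset_description,
                "types": [term(onset)],
            },
        },
        "evidence": [
            {"types": [term(evidence)], "source": [{"id": source_id}]}
        ],
    }


packet = {
    "phenotype_profile": [
        make_association(
            entity="patient16",
            phenotype=("HP:0000522", "Alacrima"),
            onset=("HP:0003577", "Congenital onset"),
            onset_description="at birth",
            evidence=("ECO:0000033", "Traceable Author Statement"),
            source_id="https://twitter.com/examplepatient/status/123456789",
        )
    ]
}

packet_json = json.dumps(packet, indent=2, sort_keys=True)
print(packet_json)
```

Because the packet is an ordinary dictionary, the same structure can come from a patient registry form or from a post on social media, with only the `source` entry changing.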
Phenopackets for journals
Each article can be associated with a phenopacket
Robinson, P. N., Mungall, C. J., & Haendel, M. (2015). Capturing phenotypes for precision medicine. Molecular Case Studies, 1(1), a000372. doi:10.1101/mcs.a000372
Each phenopacket can be shared via DOI in any repository outside paywall (eg. Figshare, Zenodo, etc)
PhenoPacket formats
Export phenopacket to: CSV, JSON, RDF, OWL
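As a rough idea of what a CSV export could look like, the sketch below flattens each phenotype association into one row. The column names and the choice to join multiple values with `|` are assumptions made for this example and do not describe the official PhenoPackets CSV layout; the input structure follows the field names used in the slide examples.

```python
import csv
import io

# Hypothetical column layout chosen for this illustration.
COLUMNS = ["entity", "phenotype_id", "phenotype_label",
           "onset_id", "evidence_id", "source_id"]


def profile_to_csv(packet: dict) -> str:
    """Flatten a phenopacket-like dictionary into CSV text, one row per association."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for assoc in packet.get("phenotype_profile", []):
        phenotype = assoc.get("phenotype", {})
        types = phenotype.get("types", [])
        onset_types = phenotype.get("onset", {}).get("types", [])
        evidence = assoc.get("evidence", [])
        writer.writerow([
            assoc.get("entity", ""),
            "|".join(t["id"] for t in types),
            "|".join(t.get("label", "") for t in types),
            "|".join(t["id"] for t in onset_types),
            "|".join(t["id"] for e in evidence for t in e.get("types", [])),
            "|".join(s["id"] for e in evidence for s in e.get("source", [])),
        ])
    return buffer.getvalue()


example_packet = {
    "phenotype_profile": [{
        "entity": "person#1",
        "phenotype": {
            "types": [{"id": "HP:0200055", "label": "Small hands"}],
            "onset": {"description": "during development",
                      "types": [{"id": "HP:0003577", "label": "Congenital onset"}]},
        },
        "evidence": [{
            "types": [{"id": "ECO:0000033", "label": "Traceable Author Statement"}],
            "source": [{"id": "PMID:1"}],
        }],
    }]
}

csv_text = profile_to_csv(example_packet)
print(csv_text)
```

The nested formats (JSON, RDF, OWL) keep the full structure; a flat table like this trades some of that structure for easy loading into spreadsheets.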
The PhenoPackets ecosystem
www.phenopackets.org
https://github.com/phenopackets/
So, do you expect us to put these together ourselves?
Emerging tool: WebPhenote (based on Phenote)
create.monarchinitiative.org
WebPhenote: Form-based, Graph-based
Noctua / LEGO inside
Thank you!
Deep Phenotype and have a magical day
Community engagement survey: bit.ly/monarchcommunity
Acknowledgements
Lawrence Berkeley: Chris Mungall, Suzanna Lewis, Jeremy Nguyen, Seth Carbon
Charité: Peter Robinson, Sebastian Kohler
RTI: Jim Balhoff
Cyverse: Ramona Walls
U of Pittsburgh: Harry Hochheiser
OHSU: Matt Brush, Kent Shefchek, Julie McMurry, Tom Conlin, Nicole Vasilevsky
Queen Mary College London: Damian Smedley, Jules Jacobson
Garvan: Tudor Groza
Alfred Wegener: Pier Buttigieg
GSoC: Satwik Bhattamishra
FUNDING: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; Phenotype Ontology Research Coordination Network (NSF-DEB-0956049)
With special thanks to Julie McMurry for excellent graphic design