Envisioning a world where everyone helps solve disease
![Page 1: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/1.jpg)
@monarchinit @ontowonka
“Not everyone can become a great artist, but a great artist can come from anywhere”
Anton Ego, Ratatouille, 2007, Disney/Pixar
Envisioning a world where everyone helps solve disease
Melissa Haendel
SWAT4LS 2015
Cambridge, England
![Page 2: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/2.jpg)
Faith-based research
“I believe that my work on some obscure cell type in some obscure organism will matter to mankind one day”
Well, it can, and it does.
![Page 3: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/3.jpg)
![Page 4: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/4.jpg)
Four things it takes to solve an undiagnosed disease
1. Deep phenotyping the human organism
2. Crossing the language barrier
3. A lot of data from a lot of places
4. Very many people (who have faith)
![Page 5: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/5.jpg)
1. DEEP PHENOTYPING THE HUMAN ORGANISM
![Page 6: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/6.jpg)
[Figure: a patient's genome/exome yields genomic data (e.g. ATCTTAGCACGTTAC), which is filtered to candidate variants for diagnosis and treatment]
![Page 7: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/7.jpg)
What do all those variations do?
We only know the phenotypic consequences of mutation of <20% of the human coding genome
![Page 8: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/8.jpg)
[Figure: the same flow (Patient, Genome/Exome, Genomic data, Filter, Diagnosis and treatment), now with Phenotype, Environment, and Gene-Phenotype Data added]
![Page 9: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/9.jpg)
We have a common language for sequence data (ATCTTAGCACGTTAC…), not so much for phenotypes
![Page 10: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/10.jpg)
CC2.0 European Southern Observatory https://www.flickr.com/photos/esoastronomy/6923443595
![Page 11: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/11.jpg)
Can we help machines understand phenotypes?
“Palmoplantar hyperkeratosis”
Human phenotype
I have absolutely no idea what that means
???
Image credits:
"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG
Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
![Page 12: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/12.jpg)
A disease is a collection of phenotypes
Patient
Disease X
Differential diagnosis with similar but non-matching phenotypes is difficult
Flat back of head; Hypotonia
Abnormal skull morphology; Decreased muscle mass
![Page 13: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/13.jpg)
Do we *really* need yet another clinical vocabulary?
Winnenburg and Bodenreider, ISMB PhenoDay, 2014
UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10-C, ICD9-CM, ICD-10, OMIM, MedlinePlus
Existing clinical vocabularies don’t adequately cover phenotype descriptions
![Page 14: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/14.jpg)
Disease-phenotype associations using an ontology
![Page 15: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/15.jpg)
Once OMIM is rendered computable, are we done yet?
Free text -> HPO enables phenotype semantic similarity matching
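To make "phenotype semantic similarity matching" concrete, here is a minimal sketch that scores a patient profile against a disease profile by comparing the ancestors of each term in an ontology. The hierarchy below is a toy built around the labels from the "Disease X" slide and is not the real HPO structure, and Jaccard overlap is swapped in for brevity; production tools commonly use information-content-based measures instead.

```python
# Toy is-a hierarchy (child -> parents). Labels stand in for HPO terms;
# the structure is illustrative only, NOT the real HPO.
PARENTS = {
    "Flat back of head": ["Abnormal skull morphology"],
    "Abnormal skull morphology": ["Phenotypic abnormality"],
    "Decreased muscle mass": ["Abnormal muscle morphology"],
    "Hypotonia": ["Abnormal muscle physiology"],
    "Abnormal muscle morphology": ["Abnormality of the musculature"],
    "Abnormal muscle physiology": ["Abnormality of the musculature"],
    "Abnormality of the musculature": ["Phenotypic abnormality"],
    "Phenotypic abnormality": [],
}


def ancestors(term):
    """Return the term itself plus every term reachable via is-a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def term_similarity(a, b):
    """Jaccard overlap of the two ancestor sets, between 0.0 and 1.0."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def profile_similarity(patient, disease):
    """Average, over patient terms, of the best-matching disease term."""
    best = [max(term_similarity(p, d) for d in disease) for p in patient]
    return sum(best) / len(best)


patient = ["Flat back of head", "Hypotonia"]
disease_x = ["Abnormal skull morphology", "Decreased muscle mass"]
print(profile_similarity(patient, disease_x))
```

The point of the sketch is that "Flat back of head" still scores well against "Abnormal skull morphology" even though the two strings never match exactly, which plain text comparison cannot do.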
![Page 16: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/16.jpg)
Mendelian disease integration
Merges sources together using:
equivalence and subclass axioms derived from xrefs
string matching
manual efforts to fill gaps based on phenotypes and anatomical axioms
Parkinson’s disease subtypes
Different colors = different disease sources
https://github.com/monarch-initiative/monarch-disease-ontology
![Page 17: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/17.jpg)
Why we need all the organisms
Model data can provide up to 80% phenotypic coverage of the human coding genome
![Page 18: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/18.jpg)
We learn different things from different organisms
![Page 19: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/19.jpg)
2. CROSSING THE LANGUAGE BARRIER
![Page 20: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/20.jpg)
Ulcerated paws
Palmoplantar hyperkeratosis
Thick hand skin
Image credits:
"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG
http://www.guinealynx.info/pododermatitis.html
![Page 21: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/21.jpg)
Challenge: Each database uses its own vocabulary/ontology
MP, HP
MGI, HPOA
![Page 22: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/22.jpg)
Challenge: Each database uses its own vocabulary/ontology
Vocabularies/ontologies: ZFA, MP, DPO, WPO, HP, OMIA, VT, FYPO, APO, SNOMED, …
Databases/sources: WB, PB, FB, OMIA, MGI, RGD, ZFIN, SGD, HPOA, IMPC, OMIM, ICD, QTLdb, EHR
![Page 23: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/23.jpg)
Decomposition of complex concepts allows interoperability
Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2
“Palmoplantar hyperkeratosis” (human phenotype) =
increased (PATO)
keratinization (GO)
stratum corneum layer of skin, autopod (Uberon)
Species-neutral ontologies, homologous concepts
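The decomposition above can be sketched as a small data structure: a phenotype is broken into a quality, a process, and anatomical structures, and two species-specific terms become comparable through the components they share. The class name, field names, and the second (non-human) decomposition are hypothetical and only for illustration; ontology identifiers are deliberately omitted rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decomposed:
    quality: str        # PATO-style quality, e.g. "increased"
    process: str        # GO-style process, e.g. "keratinization"
    anatomy: frozenset  # Uberon-style anatomical structures


def shared_components(a, b):
    """Return the species-neutral components two phenotypes have in common."""
    shared = set(a.anatomy & b.anatomy)
    if a.quality == b.quality:
        shared.add(a.quality)
    if a.process == b.process:
        shared.add(a.process)
    return shared


# Decomposition of the human term, using the labels from the slide.
human = Decomposed(
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"stratum corneum layer of skin", "autopod"}),
)

# Hypothetical decomposition of a model-organism phenotype (assumption).
model = Decomposed(
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"autopod"}),
)

print(sorted(shared_components(human, model)))
```

Because "autopod" is species neutral, a hand phenotype and a paw phenotype can meet on the same anatomical concept even though their source vocabularies never mention each other.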
![Page 24: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/24.jpg)
Cross-species ontology integration
![Page 25: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/25.jpg)
3. A LOT OF DATA FROM A LOT OF PLACES
![Page 26: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/26.jpg)
Putting it Together: Data + Ontologies
[Pipeline diagram. Input: diverse G2P/D source data and source ontologies (.ttl). Pipeline: OWL loader. Output: graph views feeding the Monarch app (faceted browsing, phenotype matching).]
https://github.com/SciGraph/SciGraph
![Page 27: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/27.jpg)
Data Integrated in SciGraph
>25 sources
>100 species
51M triples
4M curated associations
2.2M G-P / G-D associations
![Page 28: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/28.jpg)
Genotype-phenotype integration
[Pie chart: one source, 9%; two sources or 3 or more, 91%]
91% of our 2.2 million G2P associations required integrating 2 or more data sources (this number does not even include orthology from Panther)
![Page 30: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/30.jpg)
Combining genotype and phenotype data for variant prioritization
Whole exome
Remove off-target and common variants
Variant score from allele freq and pathogenicity
Phenotype score from phenotypic similarity
PHIVE score to give final candidates
Mendelian filters
https://www.sanger.ac.uk/resources/software/exomiser/
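The steps listed on this slide can be sketched as a filter followed by two scores that are combined into one ranking. This is a simplified illustration, not the Exomiser implementation: the frequency cutoff, the way rarity and pathogenicity are multiplied, the plain average used for the combined score, and the gene names are all assumptions, and the Mendelian (inheritance) filters are left out.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool
    allele_freq: float      # population frequency, 0.0 to 1.0
    pathogenicity: float    # predicted pathogenicity, 0.0 to 1.0
    phenotype_score: float  # phenotypic similarity for the gene, 0.0 to 1.0


MAX_FREQ = 0.01  # assumed cutoff above which a variant counts as "common"


def variant_score(v):
    """Rarer and more damaging variants score higher (assumed formula)."""
    rarity = 1.0 - v.allele_freq / MAX_FREQ
    return v.pathogenicity * rarity


def prioritize(variants):
    """Drop off-target and common variants, then rank by combined score."""
    kept = [v for v in variants if v.on_target and v.allele_freq < MAX_FREQ]
    scored = [(0.5 * (variant_score(v) + v.phenotype_score), v.gene) for v in kept]
    return sorted(scored, reverse=True)


# Hypothetical candidates; gene names are placeholders.
candidates = [
    Variant("GENE_A", True, 0.0, 0.90, 0.80),
    Variant("GENE_B", True, 0.001, 0.95, 0.10),
    Variant("GENE_C", True, 0.2, 0.99, 0.90),    # common, removed
    Variant("GENE_D", False, 0.0, 0.99, 0.90),   # off-target, removed
]

for score, gene in prioritize(candidates):
    print(f"{gene}: {score:.3f}")
```

Note how GENE_B has the more damaging variant but ranks below GENE_A once the phenotype match is taken into account, which is the reason for combining genotype and phenotype evidence.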
![Page 31: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/31.jpg)
York platelet syndrome and STIM1
Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474
Grosse J, J Clin Invest 2007 117: 3540-50
UDP_2542: Impaired platelet aggregation (HP:0003540), Thrombocytopenia (HP:0001873)
Stim1Sax/Sax: Abnormal platelet activation (MP:0006298), Thrombocytopenia (MP:0003179)
http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html
![Page 32: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/32.jpg)
4. VERY MANY PEOPLE (WHO HAVE FAITH)
![Page 33: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/33.jpg)
Who helped solve the STIM1 UDP_2542 case?
![Page 34: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/34.jpg)
Credit extends beyond the publication
Johannes creates stim1 mouse
Melissa annotates patient UDP_2542 with HPO
Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants
Tom writes publication pmid:25577287 about the STIM1 diagnosis
Tom explicitly credits Will as an author but not Melissa.
![Page 35: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/35.jpg)
Credit is connected
Credit to Will is asserted, but credit to Melissa can be inferred
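The difference between asserted and inferred credit can be sketched as a walk over a small provenance graph built from the STIM1 story above. The artifact names and the dictionary layout are illustrative assumptions, not a Monarch data model; in this toy graph Johannes is also inferred, simply because the mouse feeds the same analysis.

```python
# artifact -> (creator, artifacts it was derived from)
ARTIFACTS = {
    "stim1 mouse": ("Johannes", []),
    "HPO annotation of UDP_2542": ("Melissa", []),
    "prioritized variants": (
        "Will",
        ["HPO annotation of UDP_2542", "stim1 mouse"],
    ),
    "pmid:25577287": ("Tom", ["prioritized variants"]),
}

# Credit stated explicitly on the publication (its authors in this story).
ASSERTED = {"pmid:25577287": {"Tom", "Will"}}


def contributors(artifact):
    """Everyone who created the artifact or anything it was derived from."""
    people = set()
    seen = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        creator, inputs = ARTIFACTS[current]
        people.add(creator)
        stack.extend(inputs)
    return people


def inferred_credit(artifact):
    """Contributors reachable through provenance but not explicitly credited."""
    return contributors(artifact) - ASSERTED.get(artifact, set())


print(sorted(inferred_credit("pmid:25577287")))
```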
![Page 36: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/36.jpg)
![Page 37: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/37.jpg)
![Page 38: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/38.jpg)
![Page 39: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/39.jpg)
![Page 40: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/40.jpg)
Who is in the graph?
Melissa Haendel, Peter Robinson, Chris Mungall, Sebastian Kohler, Cindy Smith, Nicole Vasilevsky, Sandra Dolken
Johannes Grosse, Attila Braun, David Varga-Szabo, Niklas Beyersdorf, Boris Schneider, Lutz Zeitlmann, Petra Hanke, Patricia Schropp, Silke Mühlstedt, Carolin Zorn, Michael Huber, Carolin Schmittwolf, Wolfgang Jagla, Philipp Yu, Thomas Kerkau, Harald Schulze, Michael Nehls, Bernhard Nieswandt
Thomas Markello, Dong Chen, Justin Y. Kwan, Iren Horkayne-Szakaly, Alan Morrison, Olga Simakova, Irina Maric, Jay Lozier, Andrew R. Cullinane, Tatjana Kilo, Lynn Meister, Kourosh Pakzad, Sanjay Chainani, Roxanne Fischer, Camilo Toro, James G. White, David Adams, Cornelius Boerkoel, William A. Gahl, Cynthia J. Tifft, Meral Gunay-Aygun
Melissa Haendel, David Adams, David Draper, Bailey Gallinger, Joie Davis, Nicole Vasilevsky, Heather Trang, Rena Godfrey, Gretchen Golas, Catherine Groden, Michele Nehrebecky, Ariane Soldatos, Elise Valkanas, Colleen Wahl, Lynne Wolfe
Elizabeth Lee, Amanda Links, Will Bone, Murat Sincan, Damian Smedley, Jules Jacobson, Nicole Washington, Elise Flynn, Sebastian Kohler, Orion Buske, Marta Girdea, Michael Brudno, Jeremy Band
Hans Goeble, Karen Balbach, Nadine Pfeifer, Sandra Werner, Christian Linden
Roles: Clinical/care, Pathology, Ontologist, CS/informatics, Curator, Basic research
![Page 41: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/41.jpg)
Tracking Evidence and Provenance of G2P Associations
Evidence is a collection of information that is used to support a scientific claim or association
Provenance is a history of what processes led to the claim being made and what entities participated in these processes
Value of Evidence and Provenance Metadata:
context to evaluate credibility/confidence
support filtering and analysis of data
detailed history for attribution
![Page 42: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/42.jpg)
Evidence and Provenance for a Variant-Phenotype Association
![Page 43: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/43.jpg)
Who is missing?
http://haluzz.deviantart.com/art/Waldo-at-the-hipster-party-273602450
![Page 44: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/44.jpg)
What about patients? Can they help too?
HP:0000252
Pref Label: Microcephaly
Synonyms: Decreased Head Circumference; Reduced Head Circumference; Small head circumference
Suggested Synonyms: Small Head; Little Head; Small Skull; Little Skull; Small Cranium…
Small head
Microcephaly
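One way lay-person synonyms could help patients describe themselves is a simple lookup from everyday wording to the ontology term. The sketch below uses only the identifier, label, and synonyms shown on this slide; the dictionary layout and function names are assumptions for illustration, and a real system would need fuzzier matching than exact normalized strings.

```python
# Term record copied from the slide; the field names are assumptions.
MICROCEPHALY = {
    "id": "HP:0000252",
    "label": "Microcephaly",
    "synonyms": [
        "Decreased Head Circumference",
        "Reduced Head Circumference",
        "Small head circumference",
    ],
    "suggested_synonyms": [
        "Small Head",
        "Little Head",
        "Small Skull",
        "Little Skull",
        "Small Cranium",
    ],
}


def normalize(text):
    """Lower-case and collapse whitespace so casing and spacing do not matter."""
    return " ".join(text.lower().split())


def build_index(terms):
    """Map every label and synonym to its term identifier."""
    index = {}
    for term in terms:
        names = [term["label"], *term["synonyms"], *term["suggested_synonyms"]]
        for name in names:
            index[normalize(name)] = term["id"]
    return index


def lookup(text, index):
    """Return the term id for a patient's wording, or None if unknown."""
    return index.get(normalize(text))


index = build_index([MICROCEPHALY])
print(lookup("small  head", index))
```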
https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
![Page 45: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/45.jpg)
Job opening: https://goo.gl/MlcnR5
Focusing on building ontologies and semantic web technologies to represent research, attribution, provenance, and scholarly communication
@ontowonka
![Page 46: Envisioning a world where everyone helps solve disease](https://reader035.fdocuments.us/reader035/viewer/2022062412/58e8cd3f1a28abb3398b4fc7/html5/thumbnails/46.jpg)
Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143; BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)
PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser
www.monarchinitiative.org/page/team