Integrating clinical and model organism G2P data for disease discovery

Integrating clinical and model organism genotype-phenotype data for improved disease discovery Melissa Haendel ClinGen/DECIPHER meeting 2015.05.28 @monarchinit www.monarchinitiative.org @ontowonka

Transcript of Integrating clinical and model organism G2P data for disease discovery

Page 1: Integrating clinical and model organism G2P data for disease discovery

Integrating clinical and model organism genotype-phenotype data for improved disease discovery

Melissa Haendel

ClinGen/DECIPHER meeting

2015.05.28

@monarchinit

www.monarchinitiative.org

@ontowonka

Page 2

There are 47,964 variants of unknown significance in ClinVar

What are we gonna do about that?

Page 3

The Human Phenotype Ontology

Each disease is associated with different phenotype nodes in the graph

Disease or Patient

Page 4

HPO concepts are not well represented in other vocabularies

Winnenburg and Bodenreider, ISMB PhenoDay, 2014

Vocabularies shown: UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10-C, ICD9-CM, ICD-10, OMIM, MedlinePlus

Page 5

Phenotype “Blast”: Which phenotypic profile is graphically most similar?

Disease X

Patient

Disease Y
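
A minimal sketch of the phenotype "Blast" idea: rank candidate diseases by how much of their phenotype graph overlaps the patient's. It assumes a toy is-a hierarchy and uses Jaccard overlap of ancestor closures as the similarity measure. The term names, the profiles, and the choice of Jaccard are illustrative assumptions, not necessarily the measure used by the Monarch tools.

```python
# Toy is-a hierarchy (child -> parents). Term names are hypothetical, not real HPO IDs.
PARENTS = {
    "phenotype": [],
    "eye": ["phenotype"],
    "limb": ["phenotype"],
    "cataract": ["eye"],
    "glaucoma": ["eye"],
    "short_digits": ["limb"],
    "long_digits": ["limb"],
}


def closure(terms):
    """Return the given terms plus all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def jaccard(profile_a, profile_b):
    """Overlap of two phenotype profiles, measured on their ancestor closures."""
    a, b = closure(profile_a), closure(profile_b)
    return len(a & b) / len(a | b)


def rank(patient, diseases):
    """Sort diseases from most to least similar to the patient profile."""
    scored = [(name, jaccard(patient, profile)) for name, profile in diseases.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


patient = ["cataract", "short_digits"]
diseases = {
    "Disease X": ["glaucoma", "short_digits"],
    "Disease Y": ["long_digits"],
}
ranking = rank(patient, diseases)
```

Because the comparison runs on ancestor closures, "cataract" and "glaucoma" still contribute a partial match through their shared parent even though the terms differ.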

Page 6

Finding the phenotype graph in common

Disease X

Patient

Disease Y
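
One way to picture "the phenotype graph in common" is as the set of ontology nodes (and the is-a edges between them) that subsume terms in both profiles. The sketch below assumes a toy hierarchy with hypothetical term names; it extracts the shared subgraph and then picks out its most specific nodes.

```python
# Toy is-a hierarchy (child -> parents). Term names are hypothetical, not real HPO IDs.
IS_A = {
    "phenotype": [],
    "eye": ["phenotype"],
    "limb": ["phenotype"],
    "cataract": ["eye"],
    "glaucoma": ["eye"],
    "short_digits": ["limb"],
}


def with_ancestors(terms):
    """Return the given terms plus every ancestor reachable via is-a."""
    seen, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(IS_A[term])
    return seen


def common_graph(profile_a, profile_b):
    """Nodes and is-a edges shared by the two profiles' ancestor closures."""
    nodes = with_ancestors(profile_a) & with_ancestors(profile_b)
    edges = {(child, parent) for child in nodes for parent in IS_A[child]}
    return nodes, edges


def most_specific(nodes):
    """Shared nodes that are not the parent of any other shared node."""
    parents_of_shared = {parent for child in nodes for parent in IS_A[child]}
    return nodes - parents_of_shared


patient = ["cataract", "short_digits"]
disease_x = ["glaucoma", "short_digits"]
shared_nodes, shared_edges = common_graph(patient, disease_x)
best_matches = most_specific(shared_nodes)
```

Here the exact match ("short_digits") and the nearest shared parent of the two eye terms ("eye") come out as the most specific nodes in common.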

Page 7

The Human Phenotype Ontology

Page 8

Why we need all the organisms

Page 9

Clinicians and researchers speak different languages

Page 10

Diversity of disease and phenotype vocabularies

Page 11

Using semantics to bridge vocabularies

Page 12

Using semantics to bridge vocabularies

Page 13

Standardizing Cross-species G2P Data + Ontologies

SciGraph: A Neo4j-backed ontology store

All species ontologies and G2P data can be stored in a graph together

Advantages: Semantics + Speed + Flexibility

Propagate provenance and evidence

Being used to develop and evaluate GA4GH G2P schemas

https://github.com/SciGraph/SciGraph
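
To illustrate why storing ontologies and G2P data in one graph is convenient, here is a small in-memory sketch in plain Python. It does not use SciGraph's or Neo4j's API; the `Graph` class, the predicate names, and all identifiers are hypothetical. Ontology edges and gene-to-phenotype edges live in the same structure, each edge carries its provenance, and a single traversal answers a cross-species query.

```python
class Graph:
    """Tiny edge store: subject -> list of (predicate, object, provenance)."""

    def __init__(self):
        self.edges = {}

    def add(self, subject, predicate, obj, source=None, evidence=None):
        provenance = {"source": source, "evidence": evidence}
        self.edges.setdefault(subject, []).append((predicate, obj, provenance))

    def superclasses(self, term):
        """The term itself plus everything reachable via subClassOf edges."""
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(
                    obj for pred, obj, _ in self.edges.get(current, [])
                    if pred == "subClassOf"
                )
        return seen

    def genes_with_phenotype(self, phenotype):
        """Genes annotated to the phenotype or any subclass, with provenance."""
        hits = []
        for subject, edge_list in self.edges.items():
            for pred, obj, provenance in edge_list:
                if pred == "has_phenotype" and phenotype in self.superclasses(obj):
                    hits.append((subject, obj, provenance))
        return hits


g = Graph()
# Ontology edges from two species-specific vocabularies into a shared class.
g.add("mouse:abnormal_lens", "subClassOf", "abnormal_eye")
g.add("human:cataract", "subClassOf", "abnormal_eye")
# G2P edges, each with its own provenance and evidence.
g.add("gene:A_mouse", "has_phenotype", "mouse:abnormal_lens",
      source="model organism database", evidence="experimental")
g.add("gene:A_human", "has_phenotype", "human:cataract",
      source="clinical database", evidence="case report")

hits = g.genes_with_phenotype("abnormal_eye")
```

A query on the shared class returns both the mouse and the human gene, and each hit keeps the source and evidence of the edge that produced it.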

Page 14

Combining genotype and phenotype data for variant prioritization

Whole exome

Remove off-target and common variants

Variant score from allele freq and pathogenicity

Phenotype score from phenotypic similarity

PHIVE score to give final candidates

https://www.sanger.ac.uk/resources/databases/exomiser/query/exomiser2

Mendelian filters
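
The steps on this slide can be sketched as a small filter-and-score pipeline. This is an illustration under stated assumptions, not Exomiser's implementation: the frequency cutoff, the linear rarity weighting, and the simple average used to combine the two scores are all assumed here, the `Variant` fields are hypothetical, and the Mendelian (inheritance-mode) filters are omitted.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool         # falls inside the exome capture target
    allele_freq: float      # population allele frequency, 0..1
    pathogenicity: float    # predicted pathogenicity, 0..1
    phenotype_score: float  # phenotypic similarity for this gene, 0..1


MAX_FREQ = 0.01  # assumed cutoff above which a variant counts as common


def variant_score(v):
    """Assumed scoring: pathogenicity down-weighted as frequency nears the cutoff."""
    rarity = 1.0 - v.allele_freq / MAX_FREQ
    return v.pathogenicity * rarity


def prioritize(variants):
    """Drop off-target and common variants, then rank by the combined score."""
    kept = [v for v in variants if v.on_target and v.allele_freq < MAX_FREQ]
    # Assumed combination: plain average of variant score and phenotype score.
    scored = [(v.gene, (variant_score(v) + v.phenotype_score) / 2) for v in kept]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


variants = [
    Variant("GENE_A", True, 0.0, 0.9, 0.8),
    Variant("GENE_B", True, 0.005, 0.9, 0.2),
    Variant("GENE_C", True, 0.2, 1.0, 1.0),   # removed: common
    Variant("GENE_D", False, 0.0, 1.0, 1.0),  # removed: off-target
]
ranked = prioritize(variants)
```

GENE_A and GENE_B have the same predicted pathogenicity, but GENE_A ranks first because it is rarer and its phenotype score is higher, which is the point of combining the two kinds of evidence.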

Page 15

Cross-species phenotypic profile comparison for disease discovery

Page 17

Acknowledgments

OHSU: Nicole Vasilesky, Matt Brush, Bryan Laraway, Shahim Essaid, Kent Shefchek

NIH-UDP: William Bone, Murat Sincan, David Adams, Joie Davis, Neal Boerkoel, Cyndi Tifft, Bill Gahl

UDN: Alexa McCray, Rachel Ramoni

Garvan: Tudor Groza

Lawrence Berkeley: Nicole Washington, Suzanna Lewis, Jeremy Xuan, Chris Mungall

UCSD: Jeff Grethe, Chris Condit, Maryann Martone

U of Pitt: Chuck Borromeo, Vincent Agresti, Harry Hochheiser

Sanger: Anika Oehlrich, Jules Jacobson, Damian Smedley

Charité: Sebastian Kohler, Sandra Doelken, Sebastian Bauer, Peter Robinson

Toronto: Marta Girdea, Sergiu Dumitriu, Heather Trang, Bailey Gallinger, Orion Buske, Mike Brudno

JAX: Cynthia Smith

Current Funding:

NIH Office of Director: 1R24OD011883, HHSN268201300036C, HHSN268201400093P

Page 18

If you use Monarch ontologies or tools, please attribute us!

Please send feedback too; don’t let it be a one-way street.

Page 19

Extra

Page 20

Propagating phenotypes across genotypic levels

Page 21

We learn different things from different organisms

Page 22

Monarch in the GA4GH MatchMaker Exchange