Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter...
![Page 1: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/1.jpg)
Semantic Parsing for
Cancer Panomics
Hoifung Poon
1
![Page 2: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/2.jpg)
Overview
2
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
……
……
Disease Genes
Drug Targets
…… KB
High-Throughput Data
![Page 3: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/3.jpg)
Overview
3
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
……
……
Disease Genes
Drug Targets
……KB
Infer cancer driver
mutations
High-Throughput Data
![Page 4: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/4.jpg)
4
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
……
……
Disease Genes
Drug Targets
…KB
Extract Pathways
from PubMed
Overview
High-Throughput Data
Grounded
Unsupervised
Semantic Parsing
![Page 5: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/5.jpg)
Collaborators
5
David Heckerman
Tony Gitter
Lucy Vanderwende
Kristina Toutanova
Chris Quirk
Ankur Parikh
![Page 6: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/6.jpg)
Precision Medicine
![Page 7: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/7.jpg)
7
Before Treatment 15 Weeks
Vemurafenib on BRAF-V600 Melanoma
![Page 8: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/8.jpg)
Vemurafenib on BRAF-V600 Melanoma
8
Before Treatment 15 Weeks 23 Weeks
![Page 9: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/9.jpg)
9
![Page 10: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/10.jpg)
Traditional Biology
10
Targeted Experiments Discovery
One
hypothesis
![Page 11: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/11.jpg)
Genomics
11
High-Throughput Experiments Discovery
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
Many
hypotheses
?
![Page 12: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/12.jpg)
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC … Healthy
Disease (e.g., Alzheimer's, Cancer)
Genome-Wide Association Studies (GWAS)
2000
2010
“Genetic diagnosis of diseases would be
accomplished in 10 years and that
treatments would start to roll out perhaps
five years after that.”
“A Decade Later, Genetic Maps Yield Few New Cures”
New York Times, June 2010.
12
![Page 13: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/13.jpg)
Key Challenges
Human genome: 3 billion base pairs
Potential variations: > 10 million mutations
Combination: > 10^1000000 (a 1 followed by 1 million zeros)
Machine learning problem
Atomic features: > 10 million
Feature combination: Too many to enumerate
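The order of magnitude on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes each of the 10 million candidate variations is simply present or absent, so the number of combinations is 2^n; that binary simplification and the function name are assumptions made here, not something the slide states.

```python
import math

def log10_num_combinations(num_binary_features: int) -> float:
    """Return log10(2**n), i.e. the base-10 exponent of the number of
    present/absent combinations over n binary features."""
    return num_binary_features * math.log10(2)

# 10 million candidate variations, each assumed to be present or absent.
exponent = log10_num_combinations(10_000_000)
print(f"2^10,000,000 is roughly 10^{exponent:,.0f}")
```

Under that assumption the exponent comfortably exceeds the 1,000,000 quoted on the slide, which is why enumerating feature combinations is out of the question.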
13
![Page 14: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/14.jpg)
Genomics
14
Discovery
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
How to Scale Discovery?
High-Throughput Experiments
![Page 15: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/15.jpg)
Cancer
Hundreds of mutations
Most are “passenger”, not driver
Can we identify likely drivers?
15
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC … Normal cells
Tumor cells
![Page 16: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/16.jpg)
Panomics
16
… ATTCGGATATTTAAGGC …
Genome Transcriptome Epigenome
……
![Page 17: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/17.jpg)
Pathway Knowledge
Genes work synergistically in pathways
17
![Page 18: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/18.jpg)
Why Hard to Identify Drivers?
Complex diseases → Synergistic perturbation of multiple pathways
Cancer: 6 → 8 “hallmarks”
Promote growth
Avoid suicide
Evade immune attack
Induce blood vessels
Invade neighboring tissues
…
18
![Page 19: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/19.jpg)
19
Hanahan & Weinberg [Cell 2011]
![Page 20: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/20.jpg)
Why Cancer Comes Back?
Subtypes with alternative pathway profile
Compensatory pathways can be activated
20
EphA2 EphB2
Ovarian Cancer
![Page 21: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/21.jpg)
Why Cancer Comes Back?
Subtypes with alternative pathway profile
Compensatory pathways can be activated
21
EphA2 EphB2
Ovarian Cancer
X
![Page 22: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/22.jpg)
A Grammar of Cancer?
Cancer → Anti-Apoptosis & ProGrowth & …
Anti-Apoptosis → Deactivate TP53
Anti-Apoptosis → Activate BCL-2
…
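One way to read the rules on this slide is as a small grammar in which "Cancer" is derived when every required hallmark is derived, and each hallmark has alternative realizations. The sketch below is illustrative only: the rule table keeps just the two hallmarks named on the slide (the slide's "…" is left out), "ProGrowth" is treated as a directly observed event because the slide gives no rule for it, and the function name `derives` is hypothetical.

```python
# Hypothetical encoding of the slide's rules: each symbol maps to a list of
# alternatives, and each alternative is a conjunction of symbols.
RULES = {
    "Cancer": [["Anti-Apoptosis", "ProGrowth"]],  # truncated: slide lists more hallmarks
    "Anti-Apoptosis": [["Deactivate TP53"], ["Activate BCL-2"]],
}

def derives(symbol, observed):
    """True if `symbol` is observed directly or some rule for it is fully satisfied."""
    if symbol in observed:
        return True
    return any(
        all(derives(part, observed) for part in alternative)
        for alternative in RULES.get(symbol, [])
    )

# Either anti-apoptosis route works, but only together with a growth signal.
print(derives("Cancer", {"Deactivate TP53", "ProGrowth"}))
print(derives("Cancer", {"Deactivate TP53"}))
```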
22
![Page 23: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/23.jpg)
Infer Cancer Driver Mutations
23
Gene A: DNA → mRNA → Protein → Protein Active
Steps: Transcription, Translation, Activation
… ATTCGGATATTTAAGGC …
What’s the level of activity?
Is change caused by mutation?
![Page 24: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/24.jpg)
24
Gene A: DNA → mRNA → Protein → Protein Active
Gene B: DNA → mRNA → Protein → Protein Active
Gene C: DNA → mRNA → Protein → Protein Active
Transcription Factor
Protein Kinase
Pathway Knowledge
![Page 25: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/25.jpg)
25
Gene A: DNA → mRNA → Protein → Protein Active
Gene B: DNA → mRNA → Protein → Protein Active
Gene C: DNA → mRNA → Protein → Protein Active
Transcription Factor
Protein Kinase
Pathway Knowledge ?
![Page 26: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/26.jpg)
26
Gene A: DNA → mRNA → Protein → Protein Active
Gene B: DNA → mRNA → Protein → Protein Active
Gene C: DNA → mRNA → Protein → Protein Active
Transcription Factor
Protein Kinase
Pathway Knowledge ?
![Page 27: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/27.jpg)
27
Gene A: DNA → mRNA → Protein → Protein Active
Gene B: DNA → mRNA → Protein → Protein Active
Gene C: DNA → mRNA → Protein → Protein Active
Transcription Factor
Protein Kinase
Pathway Knowledge !
![Page 28: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/28.jpg)
Approach: Graph HMM
28
Gene A: DNA → mRNA → Protein → Protein Active
Transcription Factor
Protein Kinase
Gene B: DNA → mRNA → Protein → Protein Active
Gene C: DNA → mRNA → Protein → Protein Active
![Page 29: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/29.jpg)
Extract Pathways from PubMed
29
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
……
……
Disease Genes
Drug Targets
…… KB
High-Throughput Data
![Page 30: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/30.jpg)
PubMed
22 million abstracts
Two new abstracts every minute
Adds 2000-4000 every day
30
![Page 31: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/31.jpg)
… VDR+ binds to SMAD3 to form …
… JUN expression is induced by SMAD3/4 …
PMID: 123
PMID: 456
……
31
Extract Pathways from PubMed
![Page 32: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/32.jpg)
32
Involvement of p70(S6)-kinase activation in IL-10
up-regulation in human monocytes by gp41 envelope
protein of human immunodeficiency virus type 1 ...
Involvement
up-regulation
IL-10
human monocyte
gp41
p70(S6)-kinase
activation
Extract Complex Knowledge
![Page 33: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/33.jpg)
33
Involvement of p70(S6)-kinase activation in IL-10
up-regulation in human monocytes by gp41 envelope
protein of human immunodeficiency virus type 1 ...
Involvement
up-regulation
IL-10
human monocyte
gp41
p70(S6)-kinase
activation
Extract Complex Knowledge
REGULATION
REGULATION REGULATION
PROTEIN PROTEIN PROTEIN CELL
![Page 34: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/34.jpg)
34
Involvement of p70(S6)-kinase activation in IL-10
up-regulation in human monocytes by gp41 envelope
protein of human immunodeficiency virus type 1 ...
Involvement
up-regulation
IL-10
human monocyte
Site Theme Cause
gp41
p70(S6)-kinase
activation
Theme Cause
Theme
Extract Complex Knowledge
REGULATION
REGULATION REGULATION
PROTEIN PROTEIN PROTEIN CELL
![Page 35: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/35.jpg)
35
Involvement of p70(S6)-kinase activation in IL-10
up-regulation in human monocytes by gp41 envelope
protein of human immunodeficiency virus type 1 ...
Involvement
up-regulation
IL-10
human monocyte
Site Theme Cause
gp41
p70(S6)-kinase
activation
Theme Cause
Theme
Extract Complex Knowledge
REGULATION
REGULATION REGULATION
PROTEIN PROTEIN PROTEIN CELL
Semantic Parsing
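The nested structure built up over the preceding slides can be written down as a small recursive data type. This is a sketch of one plausible reading of the diagram: the class names `Entity` and `Event`, and the exact assignment of Theme, Cause and Site roles to each argument, are assumptions inferred from the labels in the transcript rather than something the slides spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

@dataclass
class Entity:
    text: str
    type: str  # e.g. PROTEIN or CELL

@dataclass
class Event:
    trigger: str
    type: str  # e.g. REGULATION
    args: Dict[str, Union["Event", Entity]] = field(default_factory=dict)

def depth(node):
    """Nesting depth: entities are 0, an event is 1 + its deepest argument."""
    if isinstance(node, Entity):
        return 0
    return 1 + max((depth(arg) for arg in node.args.values()), default=0)

def entities(node):
    """Collect the text of every entity mentioned under a node."""
    if isinstance(node, Entity):
        return [node.text]
    found = []
    for arg in node.args.values():
        found.extend(entities(arg))
    return found

# Assumed role assignment for the example sentence.
activation = Event("activation", "REGULATION",
                   {"Theme": Entity("p70(S6)-kinase", "PROTEIN")})
up_regulation = Event("up-regulation", "REGULATION", {
    "Theme": Entity("IL-10", "PROTEIN"),
    "Cause": Entity("gp41", "PROTEIN"),
    "Site": Entity("human monocyte", "CELL"),
})
involvement = Event("Involvement", "REGULATION",
                    {"Theme": up_regulation, "Cause": activation})

print(depth(involvement), entities(involvement))
```

The point of the representation is that arguments may themselves be events, which is what separates this task from flat binary relation extraction.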
![Page 36: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/36.jpg)
Bottleneck: Annotated Examples
GENIA (BioNLP Shared Task 2009-2013)
1999 abstracts
MeSH: human, blood cell, transcription factor
Can we breach the annotation bottleneck?
36
![Page 37: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/37.jpg)
Free Lunch #1:
Distributional Similarity
Similar context → Probably similar meaning
Annotation as latent variables
Textual expression → Recursive clusters
Unsupervised semantic parsing
37
Poon & Domingos, “Unsupervised Semantic Parsing”.
EMNLP-2009 (Best Paper Award).
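The intuition "similar context, probably similar meaning" can be illustrated with context-count vectors and cosine similarity. Note that this is a deliberately simpler stand-in: the cited work learns recursive clusters with a probabilistic model, whereas the sketch below only compares raw context counts. The toy observations and the helper names are hypothetical.

```python
import math
from collections import Counter

def context_vector(expression, observations):
    """Count the contexts in which an expression was observed."""
    return Counter(ctx for expr, ctx in observations if expr == expression)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy (expression, context) pairs, invented for illustration.
OBSERVATIONS = [
    ("induces", "subj=SMAD3"), ("induces", "obj=JUN"),
    ("up-regulates", "subj=SMAD3"), ("up-regulates", "obj=JUN"),
    ("binds", "subj=VDR"), ("binds", "obj=SMAD3"),
]

induces = context_vector("induces", OBSERVATIONS)
up_regulates = context_vector("up-regulates", OBSERVATIONS)
binds = context_vector("binds", OBSERVATIONS)

print(cosine(induces, up_regulates), cosine(induces, binds))
```

Expressions that share argument contexts score high and would be candidates for the same cluster; expressions with disjoint contexts score zero.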
![Page 38: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/38.jpg)
Problem Formulation
Dependency tree → Semantic parse
Probability
Parsing
Learning
38
Prior: Favor fewer parameters
![Page 39: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/39.jpg)
Free Lunch #2:
Existing KBs
Many KBs available
Gene/Protein: GenBank, UniProt, …
Pathways: NCI, Reactome, KEGG, BioCarta, …
Annotation as latent variables
Textual expression → Table, column, join, …
Grounded unsupervised semantic parsing
39
Poon, “Grounded Unsupervised Semantic Parsing”. ACL-13.
![Page 40: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/40.jpg)
Natural-Language Interface
to Database
Get flight from Toronto to San Diego stopping at DTW
SELECT flight.flight_id
FROM flight, city, city city2, flight_stop, airport_service, airport_service as2
WHERE flight.from_airport = airport_service.airport_code
  AND flight.to_airport = as2.airport_code
  AND airport_service.city_code = city.city_code
  AND as2.city_code = city2.city_code
  AND city.city_name = 'toronto'
  AND city2.city_name = 'san diego'
  AND flight_stop.flight_id = flight.flight_id
  AND flight_stop.stop_airport = 'dtw'
Answers
40
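To see what the question-to-SQL mapping on this slide produces, the query can be run against a tiny in-memory database. The sketch below uses Python's standard `sqlite3` module; the table and column names come from the query itself, while the column types, the airport codes and all rows are invented toy data, not the real ATIS database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_code TEXT, city_name TEXT);
CREATE TABLE airport_service (airport_code TEXT, city_code TEXT);
CREATE TABLE flight (flight_id INTEGER, from_airport TEXT, to_airport TEXT);
CREATE TABLE flight_stop (flight_id INTEGER, stop_airport TEXT);
-- Toy rows, hypothetical
INSERT INTO city VALUES ('T', 'toronto'), ('S', 'san diego');
INSERT INTO airport_service VALUES ('YYZ', 'T'), ('SAN', 'S');
INSERT INTO flight VALUES (1, 'YYZ', 'SAN'), (2, 'YYZ', 'SAN'), (3, 'SAN', 'YYZ');
INSERT INTO flight_stop VALUES (1, 'dtw'), (2, 'ord'), (3, 'dtw');
""")

# Same joins as the slide's query, with the second city/airport_service
# instances given explicit aliases.
QUERY = """
SELECT flight.flight_id
FROM flight, city, city AS city2, flight_stop,
     airport_service, airport_service AS as2
WHERE flight.from_airport = airport_service.airport_code
  AND flight.to_airport = as2.airport_code
  AND airport_service.city_code = city.city_code
  AND as2.city_code = city2.city_code
  AND city.city_name = 'toronto'
  AND city2.city_name = 'san diego'
  AND flight_stop.flight_id = flight.flight_id
  AND flight_stop.stop_airport = 'dtw'
"""

rows = [row[0] for row in conn.execute(QUERY)]
print(rows)
```

Only the flight that leaves Toronto, arrives in San Diego and stops at DTW survives the joins, which shows how much relational structure a short question leaves implicit.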
![Page 41: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/41.jpg)
Clusters KB Elements
Entity: Table, Column, Cell
Relation: Relational join
Priors:
Favor lexical similarity
Favor short relational joins
41
![Page 42: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/42.jpg)
GUSP: Key Ideas
Leverage target database
42
Job ID Company System
001 IBM Unix
002 Roche IBM
003 Microsoft Windows
……
Prior: Favor Unix → System
Bootstrap learning
with lexical prior
JOB
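The lexical prior on this slide can be pictured as a lookup from a word to the columns whose name or cell values it matches. The sketch below uses only the three rows shown in the JOB table; the function name `lexical_candidates` and the exact-match rule are assumptions for illustration, not the system's actual matching procedure.

```python
JOB_ROWS = [
    {"Job ID": "001", "Company": "IBM", "System": "Unix"},
    {"Job ID": "002", "Company": "Roche", "System": "IBM"},
    {"Job ID": "003", "Company": "Microsoft", "System": "Windows"},
]

def lexical_candidates(word, rows):
    """Columns whose name or some cell value equals the word (case-insensitive)."""
    word = word.lower()
    columns = set()
    for row in rows:
        for column, value in row.items():
            if word == column.lower() or word == value.lower():
                columns.add(column)
    return columns

print(lexical_candidates("Unix", JOB_ROWS))  # a single candidate column
print(lexical_candidates("IBM", JOB_ROWS))   # ambiguous between two columns
```

"Unix" points to one column, which is the kind of signal the prior can bootstrap from, while "IBM" matches two columns and has to be disambiguated by the rest of the model.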
![Page 43: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/43.jpg)
GUSP: Key Ideas
Leverage target database
43
Flight ID From Airport ……
Flight
Airport Code Airport Name ……
Airport
Foreign Key
![Page 44: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/44.jpg)
GUSP: Key Ideas
Leverage target database
44
Flight Airport
![Page 45: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/45.jpg)
GUSP: Key Ideas
Leverage target database
45
Flight
Days Fare Airline
Airport
![Page 46: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/46.jpg)
GUSP: Key Ideas
Leverage target database
46
Flight Airport
flight BWI
Days Fare Airline
?
Flight
Days Fare Airline
Airport
![Page 47: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/47.jpg)
GUSP: Key Ideas
Leverage target database
47
Prior: Favor shorter join
Leverage schema
to guide learning
Flight
Days Fare Airline
Airport
flight BWI
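The "favor shorter join" prior can be pictured as a shortest-path search over the schema graph, where tables are nodes and foreign keys are edges. The sketch below uses breadth-first search; the edge list is hypothetical (the slides only show that Flight, Airport, Days, Fare and Airline are connected somehow), and the function name is illustrative.

```python
from collections import deque

# Hypothetical foreign-key graph over the tables named on the slides.
SCHEMA_EDGES = {
    "Flight": ["Airport", "Days", "Fare", "Airline"],
    "Airport": ["Flight", "Fare"],
    "Fare": ["Flight", "Airport"],
    "Days": ["Flight"],
    "Airline": ["Flight"],
}

def shortest_join(start, goal, edges):
    """Return the shortest chain of tables joining start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in edges.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(shortest_join("Flight", "Airport", SCHEMA_EDGES))
print(shortest_join("Days", "Airport", SCHEMA_EDGES))
```

Under such a prior, "flight BWI" is linked through the direct Flight to Airport key rather than through a longer detour via Fare.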
![Page 48: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/48.jpg)
Free Lunch #3:
Dependency Parses
Start from syntactic parse
Rich resources and available parsers
Intractable structure learning → Tree HMM
Exact inference is linear-time
Need to handle syntax-semantics mismatch
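The claim that exact inference is linear-time follows from the tree shape: each node is visited once going up and once coming down. Below is a minimal max-product (Viterbi) sketch over a fixed tree. It covers decoding only, not learning or the complex states introduced later; the tree is a simplified fragment of the example sentence, and all scores, state names and function names are invented for illustration.

```python
def tree_viterbi(tree, root, states, emit, trans):
    """Best state assignment on a tree. `emit(node, state)` and
    `trans(parent_state, child_state)` return log-scores."""
    best = {}  # (node, state) -> best log-score of the subtree rooted at node
    back = {}  # (node, state, child) -> best child state

    def upward(node):
        children = tree.get(node, [])
        for child in children:
            upward(child)
        for s in states:
            total = emit(node, s)
            for child in children:
                cs = max(states, key=lambda c: trans(s, c) + best[(child, c)])
                back[(node, s, child)] = cs
                total += trans(s, cs) + best[(child, cs)]
            best[(node, s)] = total

    def downward(node, s, labels):
        labels[node] = s
        for child in tree.get(node, []):
            downward(child, back[(node, s, child)], labels)

    upward(root)
    root_state = max(states, key=lambda s: best[(root, s)])
    labels = {}
    downward(root, root_state, labels)
    return best[(root, root_state)], labels

# Simplified dependency fragment and toy log-scores (all hypothetical).
TREE = {"get": ["flight"], "flight": ["toronto", "diego"]}
STATES = ["NULL", "E:flight", "V:city.name"]
EMIT = {("get", "NULL"): 0.0, ("flight", "E:flight"): 0.0,
        ("toronto", "V:city.name"): 0.0, ("diego", "V:city.name"): 0.0}

def emit(node, state):
    return EMIT.get((node, state), -5.0)

def trans(parent_state, child_state):
    return -1.0 if parent_state == child_state else 0.0

score, labels = tree_viterbi(TREE, "get", STATES, emit, trans)
print(score, labels)
```

The work per node is proportional to the square of the number of states, so the total cost grows linearly with sentence length.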
48
![Page 49: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/49.jpg)
Syntax-Semantics Mismatch
49
get
toronto
flight from to
diego
at
san stopping
dtw
![Page 50: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/50.jpg)
50
get
toronto
flight from to
diego
at
san stopping
dtw
Syntax-Semantics Mismatch
![Page 51: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/51.jpg)
51
get
toronto
flight from to
diego
at
san stopping
dtw
Syntax-Semantics Mismatch
![Page 52: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/52.jpg)
52
get
toronto
flight from to
diego
at
san stopping
dtw
Syntax-Semantics Mismatch
![Page 53: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/53.jpg)
Introduce Complex States
Raising
Sinking
Implicit
53
![Page 54: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/54.jpg)
Raising
54
get
toronto
flight from to
diego
at
san stopping
dtw
E:flight
E:flight:R
![Page 55: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/55.jpg)
Sinking
get
toronto
flight from to
diego
at
san stopping
dtw
55
E:flight:R
V:city.name + E:flight
![Page 56: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/56.jpg)
Implicit
56
Give me the fare (of the flight) from Seattle to Boston
fare
E:fare
fare
E:fare + E:flight
![Page 57: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/57.jpg)
Experiment: Dataset
ATIS
Questions and ATIS database
Dev. / Test: Follow ZC07 [Zettlemoyer & Collins 2007]
Gold SQLs: Use at evaluation only
Gold logical forms in ZC07: Not used
Evaluate on question-answering accuracy
57
![Page 58: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/58.jpg)
Experiment: Systems
LEXICAL: Lexical-trigger prior only
Supervised learning
ZC07: Zettlemoyer & Collins [2007]
FUBL: Kwiatkowski et al. [2011]
GUSPSIMPLE: Simple states only
GUSP++: All states
58
![Page 59: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/59.jpg)
Results
59
System Accuracy
ZC07 84.6
FUBL 82.8
GUSP++ 83.5
![Page 60: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/60.jpg)
Ablation
60
System Variant Accuracy
LEXICAL 33.9
GUSPSIMPLE 66.5
GUSP++ 83.5
GUSP++ w/o Raising 75.7
GUSP++ w/o Sinking 77.5
GUSP++ w/o Implicit 76.2
![Page 61: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/61.jpg)
Pathway Extraction
More to leverage from KB:
Semantic relations in KB likely occur in
semantic parse of some sentence
Priors:
Favor a parse w. relations in KB
Penalize a parse w. relations not in KB
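The two priors above can be read as a score added to each candidate parse: a bonus for every extracted relation found in the KB and a penalty for every one that is not. The sketch below is a minimal illustration; the KB content, the bonus and penalty values, and the function name are all hypothetical, since the slide only states the direction of the preference.

```python
# Hypothetical KB of (regulator, target) pairs.
KB_RELATIONS = {("SMAD3", "JUN")}

def prior_log_score(parse_relations, kb, bonus=1.0, penalty=-1.0):
    """Sum a bonus for each relation present in the KB and a penalty otherwise."""
    return sum(bonus if relation in kb else penalty for relation in parse_relations)

# Two candidate parses of "JUN expression is induced by SMAD3/4".
candidates = {
    "SMAD3 regulates JUN": [("SMAD3", "JUN")],
    "JUN regulates SMAD3": [("JUN", "SMAD3")],
}

scores = {name: prior_log_score(rels, KB_RELATIONS) for name, rels in candidates.items()}
best_parse = max(scores, key=scores.get)
print(scores, best_parse)
```

In a full model this term would be combined with the likelihood of the text, so a KB match nudges the parser toward a reading without forcing it.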
61
![Page 62: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/62.jpg)
Distant-Supervision
Existing work: Binary relation, classification
Mintz et al. [2009]
Riedel et al. [2010]
Hoffmann et al. [2011]
Krishnamurthy & Mitchell [2012]
Etc.
Our approach: Generalize distant supervision
to semantic parsing
62
Parikh, Poon, Toutanova. In progress.
![Page 63: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/63.jpg)
http://literome.azurewebsites.net
63
Literome
Poon et al., “Literome: PubMed-Scale Genomic Knowledge
Base in the Cloud”, Bioinformatics 2014.
![Page 64: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/64.jpg)
PubMed-Scale Extraction
Preliminary pass:
2 million instances
13,000 genes, 870,000 unique interactions
Applications:
UCSC Genome Browser, MSR Interactions Track
Cancer expression profile modeling
Validate de novo pathway prediction
Etc.
64
![Page 65: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/65.jpg)
Big Mechanism
$42-million program for 12 teams
Reading, Assembly, Explanation
Domain: Cancer signaling pathways
We are funded
PI: Andrey Rzhetsky
Co-PI w. James Evans, Ross King
65
![Page 66: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/66.jpg)
We Have Digitized Life
66
![Page 67: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/67.jpg)
Next: Digitize Medicine
67
Knock down genes A, B, C → Cure
![Page 68: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/68.jpg)
Summary
Precision medicine is the future
Infer cancer driver mutations
Graphical model: Pathways + Panomics data
Extract pathways from PubMed
Semantic parsing grounded in KBs
Literome: KB for genomic medicine
68
![Page 69: Semantic Parsing for Cancer Panomics - Yoav Artzi · 2020. 9. 17. · David Heckerman Tony Gitter Lucy Vanderwende Kristina Toutanova Chris Quirk Ankur Parikh. Precision Medicine.](https://reader034.fdocuments.us/reader034/viewer/2022052014/602b8099e9c98a5af142593c/html5/thumbnails/69.jpg)
Summary
69
… ATTCGGATATTTAAGGC …
… ATTCGGGTATTTAAGCC …
……
……
Disease Genes
Drug Targets
…… KB
High-Throughput Data