Machine Learning for Drug Development
Transcript of Machine Learning for Drug Development
![Page 1: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/1.jpg)
Machine Learning for Drug Development
1ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Marinka ZitnikDepartment of Biomedical InformaticsBroad Institute of Harvard and MITHarvard Data Science Initiative
[email protected]://zitniklab.hms.harvard.edu
![Page 2: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/2.jpg)
OutlineOverview and introduction
Part 1: Virtual drug screening and drug repurposing
Part 2: Adverse drug effects, drug-drug interactions
Part 3: Clinical trial site identification, patient recruitment
Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation
Part 5: Molecular property prediction and transformers
Demos, resources, wrap-up & future directions2ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 3: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/3.jpg)
Datasets to facilitate algorithmic innovation
3ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 4: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/4.jpg)
Therapeutics are one of most exciting areas for computational
scientists. However,Retrieving, curating, and processing datasets is time-consuming and
requires extensive domain expertise
Datasets are scattered around the bio repositories and there is no centralized data repository for a variety of therapeutics
Many tasks are under-explored in AI/ML communitybecause of the lack of data access
4ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 5: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/5.jpg)
● Open-Source ML Datasets for Therapeutics:○ Wide range of tasks: target discovery, activity
screening, efficacy, safety, manufacturing○ Wide range of products: small molecules, antibodies,
vaccine, miRNA● Numerous Data Functions:
○ Extensive data functions ○ Model evaluation, data processing and splits,
molecule generation oracles, and much more● 3 Lines of Code:
○ Minimum package dependency, lightweight loaders
5ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 6: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/6.jpg)
Domain scientists
MLscientists
Identify meaningful learning tasks
Design powerful ML models
Advancing algorithms for key therapeutics problems
Our Vision for TDC
6ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 7: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/7.jpg)
Modular Structure of TDC
7
Single-instance
Multi-instance
Generation
Y
Y
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 8: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/8.jpg)
DrugRes
ADME
Tox
HTS
QM
Yields
Paratope
Epitope
Develop
DTI
DDI
PPI
GDA
DrugSyn
PeptideMHC
AntibodyAff
MTI
Catalyst
Reaction
MolGen
PairMolGen
RetroSyn
67 datasets spread over 22 learning tasks
8ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 9: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/9.jpg)
Model performance evaluators
Data processing helpers
A variety of data splits
Data Functions to Support Your Research
9ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 10: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/10.jpg)
GuacaMol
MOSES
Literature
Molecule Generation
GeneratedMolecules
Oracle Score
Optimize
GuacaMol: Benchmarking Models for de Novo Molecular Design, J. Chem. Inf. Model., 2019MOSES: A Benchmarking Platform for Molecular Generation Models, Frontiers in Pharmacology, 2020
Molecule Generation Oracles
10ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 11: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/11.jpg)
Leaderboards: Submit your Models
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 11
zitniklab.hms.harvard.edu/TDC
![Page 12: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/12.jpg)
You Are Invited to Join TDC! TDC is an Open-Source, Community Effort
github.com/mims-harvard/TDC
zitniklab.hms.harvard.edu/TDC
12ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 13: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/13.jpg)
Demos, tools, and implementations
13ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 14: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/14.jpg)
DeepPurpose: Deep Learning Library for Compound and Protein Modeling
DTI, Drug Property, PPI, DDI, Protein Function Prediction
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 14
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020
https://github.com/kexinhuang12345/DeepPurpose
![Page 15: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/15.jpg)
How can domain scientists interact with AI systems?
15
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 16: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/16.jpg)
MolDesigner: Interactive Design of Drugs with Deep Learning
16http://deeppurpose.sunlab.orgML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 17: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/17.jpg)
DEMO: DRUG-TARGET INTERACTION PREDICTION
Drug: Remdesivir Remdesivir is indicated for the treatment of adult and pediatric patients aged 12 years and over weighing at least 40 kg for coronavirus disease 2019 (COVID-19) infection requiring hospitalization.
Target protein: Replicase polyprotein 1ab. Multifunctional protein involved in the transcription and replication of viral RNAs
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 17
Molecular structure of Remdesivir Amino acid sequence of Replicase polyprotein 1ab
![Page 18: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/18.jpg)
How can domain scientists interact with AI systems?
18
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 19: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/19.jpg)
Automating Science
19ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 20: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/20.jpg)
Automating Science
20ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 21: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/21.jpg)
How to explain predictions?Key idea: § Summarize where in the data the model “looks” for
evidence for its prediction§ Find a small subgraph most influential for the prediction
Approach to generate explanations for graph neural networks based on counterfactual reasoning
21GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 22: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/22.jpg)
GNNExplainer: Key Idea§ Input: Given prediction 𝑓(𝑥) for node/link 𝑥§ Output: Explanation, a small subgraph 𝑀! together
with a small subset of node features:§ 𝑀! is most influential for prediction 𝑓(𝑥)
§ Approach: Learn 𝑀! via counterfactual reasoning§ Intuition: If removing 𝑣 from
the graph strongly decreases the probability of prediction ⇒ 𝑣 is a good counterfactual explanation for the prediction
GNN Explainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019 22ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 23: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/23.jpg)
Examples of Explanations”Will rosuvastatin treat hyperlipidemia? Whatis the disease treatment mechanism?”
New Algorithms: GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019New Insights: Discovery of Disease Treatment Mechanisms through the Multiscale Interactome, Nature Communications 2021 (in press)23ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 24: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/24.jpg)
Open challenges and future directions
24ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 25: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/25.jpg)
Learn about Therapeutics ML!
https://www.drugsymposium.org
Videos from the presentations are now publicly available to everyone through the Symposium Video Channel
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 25
![Page 26: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/26.jpg)
Open Challenges§ Disconnected, uncoupled biomedical knowledge:
§ Challenge: Need to combine data in their broadest sense to close the gap between research and patient data
§ Diverse mechanisms of drug action:§ Challenge: Need to consider diverse mechanisms through which a
drug can treat a disease
§ Novel drugs in development, emerging diseases:§ Challenge: Need to learn and reason about never-before-seen
phenomena
§ Datasets for a variety of therapeutics tasks:§ Challenge: Need datasets and benchmarks to accelerate ML
model development, validation and transition into production and clinical implementation
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 26
![Page 27: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/27.jpg)
OutlineOverview and introduction
Part 1: Virtual drug screening and drug repurposing
Part 2: Adverse drug effects, drug-drug interactions
Part 3: Clinical trial site identification, patient recruitment
Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation
Part 5: Molecular property prediction and transformers
Demos, resources, wrap-up & future directions27ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
![Page 28: Machine Learning for Drug Development](https://reader031.fdocuments.us/reader031/viewer/2022012023/6169d07f11a7b741a34ba9f2/html5/thumbnails/28.jpg)
28https://zitniklab.hms.harvard.edu/drugml
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021