Big Data Network Genomics Network Inference and Perturbation to Study Chemical-Mediated Cancer...
![Page 1: Big Data Network Genomics Network Inference and Perturbation to Study Chemical-Mediated Cancer Induction Stefano Monti smonti@bu.edu Section of Computational.](https://reader031.fdocuments.us/reader031/viewer/2022020417/56649e4f5503460f94b46578/html5/thumbnails/1.jpg)
Big Data Network Genomics Network Inference and Perturbation
to Study Chemical-Mediated Cancer Induction
Stefano Monti, smonti@bu.edu
Section of Computational BioMedicine, Boston University School of Medicine
Biostatistics, BUSPH
Bioinformatics Program, BU
Graduate Program in Genetics & Genomics, BU
Broad Institute of MIT & Harvard
Abstract
Development and application of novel methods of network inference and differential analysis from multiple genomic data types toward the elucidation of a chemical's mechanism(s) of cancer induction
Abstract (generalization)
Development and application of novel methods of network inference and differential analysis from high-dimensional data types toward the elucidation of functionally relevant modules
Generalized terms (replacing the domain-specific ones): “high-dimensional data types”, “functionally relevant modules”
The Motivating Problem
Goals: Development of “Carcinogenicity Biomarker(s)”
Chemical → Carcinogenicity Prediction Model → Carcinogen / Non-carcinogen
Understand Why: pathways affected, driver alterations, biomarkers, …
Manuscript under review
Goals: Development of “Carcinogenicity Biomarker(s)”
Chemical → Carcinogenicity Prediction Model → Carcinogen / Non-carcinogen
[Figure: expression ‘matrix’ of genes (gene1, gene2, …, gene7, …) by profiles, with profiles grouped into non-carcinogens and carcinogens]
To generate this ‘matrix’, 100,000s of experiments need to be performed and 1,000s of controls generated
In Progress: high-throughput data generation
384-well plate
100,000s of profiles
Phase I: 24 plates (liver and lung), ~200 compounds, ~10,000 profiles
Future plans …
Phase II: more tissue types (breast, prostate, etc.), more compounds (~1,500), mixtures, 100,000s of profiles
Phase III: iPSC-derived cells & 3D cultures, “personalized exposure” models
Generalization of the Motivating Problem
Comparison of a control state to multiple perturbation states
Standard approaches of gene-based differential analysis might miss salient (aggregate) differences
High-dimensional data (1000s of ‘features’), usually representable as 2D [10K x 1K] matrices
Large sample size for the ‘control state’: ≥1000 observations
Small sample size for each of the ‘perturbation states’: ~10-100 observations/perturbation
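The point about gene-based differential analysis missing aggregate differences can be illustrated with a toy simulation. This is only a sketch under assumed settings (two simulated genes, a correlation of 0.8 in the control state and 0 in the perturbed state, sample sizes chosen to match the ranges above); none of these numbers come from the project data.

```python
import numpy as np

def simulate(n, rho, rng):
    """Draw n profiles of two genes with zero mean, unit variance, correlation rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

rng = np.random.default_rng(0)
control = simulate(1000, 0.8, rng)    # large 'control state' (assumed size)
perturbed = simulate(100, 0.0, rng)   # small 'perturbation state' (assumed size)

# Gene-wise view: per-gene shift in mean expression, the quantity a
# gene-by-gene differential test looks at. Both genes keep the same mean.
mean_shift = perturbed.mean(axis=0) - control.mean(axis=0)

# Aggregate view: the gene-gene correlation, which the perturbation removes.
r_control = np.corrcoef(control, rowvar=False)[0, 1]
r_perturbed = np.corrcoef(perturbed, rowvar=False)[0, 1]

print("per-gene mean shift:", mean_shift)
print("correlation, control vs. perturbed:", r_control, r_perturbed)
```

In this toy setting the per-gene means barely move while the co-expression between the two genes disappears, which is the kind of difference a network-level comparison is meant to capture.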
Generalization of the Motivating Problem: an example
The Connectivity Map/LINCS project: expression profiling of chemical/genetic perturbations
• >10,000 compounds (most FDA-approved drugs)
• ~5,000 genetic perturbations (RNAi, CRISPR)
• 18 cell types, multiple doses, time-points
> 1,000,000 profiles
Main Goal: Drug Discovery
Approach Overview
[Figure: Wild-Type Network; Annotation; matrix of modules (Module 1, Module 2, …, Module p) by compounds (Compound 1, Compound 2, …, Compound n); scale: loss/gain of connectivity]
Approach Overview
[Figure: Wild-Type Network; Annotation; matrix of modules (Module 1, Module 2, …, Module p) by compounds (Compound 1, Compound 2, …, Compound n); scale: loss/gain of connectivity]
Stages: Network construction, Module Identification, Module/Network Comparison
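One way to picture the modules-by-compounds matrix in the schematic is the sketch below. The connectivity score used here (mean absolute pairwise correlation within a module) and the function names `module_connectivity` and `connectivity_change` are illustrative assumptions, not the measure or code used in the project.

```python
import numpy as np

def module_connectivity(expr, genes):
    """Mean absolute pairwise correlation among a module's genes.

    expr: profiles x genes array; genes: column indices of the module
    (at least two genes, none of them constant across profiles).
    """
    corr = np.corrcoef(expr[:, genes], rowvar=False)
    upper = np.triu_indices(len(genes), k=1)  # each gene pair counted once
    return float(np.abs(corr[upper]).mean())

def connectivity_change(control, perturbations, modules):
    """Modules x compounds matrix: negative entries = loss, positive = gain."""
    baseline = np.array([module_connectivity(control, m) for m in modules])
    columns = [
        np.array([module_connectivity(p, m) for m in modules]) - baseline
        for p in perturbations
    ]
    return np.column_stack(columns)

# Toy data: 5 profiles x 4 genes, built so the correlations are exact.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
control = np.column_stack([x, 2 * x + 1, x, x ** 2])   # module A tight, module B loose
compound_1 = np.column_stack([x, x ** 2, x, 3 * x])    # A loses, B gains connectivity
modules = [[0, 1], [2, 3]]

change = connectivity_change(control, [compound_1], modules)
print(change)
```

Each column of the result corresponds to one compound and each row to one module of the wild-type (control) network, mirroring the layout of the figure.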
Approach Details: networks’ construction
Correlation networks: clustering vs. topology-based ‘module’ identification
Gaussian models: inverse covariance matrix, partial correlations
Correlation networks + “scale-free transformations”: mostly for comparison w/ existing methods
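The link between the inverse covariance matrix and partial correlations can be sketched as follows: for a precision matrix Θ, the partial correlation of genes i and j given all others is −Θ_ij / sqrt(Θ_ii Θ_jj). The function name `partial_correlations` and the simulated three-gene chain are assumptions for illustration; the plain matrix inverse only works when there are more profiles than genes, so data on the scale described in these slides would need a regularized estimator (for example the graphical lasso) instead.

```python
import numpy as np

def partial_correlations(expr):
    """Partial correlations from the inverse covariance (precision) matrix.

    expr: profiles x genes array with more profiles than genes, so that the
    sample covariance matrix is invertible (an assumption of this sketch).
    """
    cov = np.cov(expr, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated chain x -> y -> z: x and z are correlated only through y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + rng.normal(size=n)
expr = np.column_stack([x, y, z])

marginal = np.corrcoef(expr, rowvar=False)   # correlation network view
pcor = partial_correlations(expr)            # Gaussian graphical model view

print("marginal corr(x, z):", marginal[0, 2])
print("partial corr(x, z | y):", pcor[0, 2])
```

In the simulated chain the marginal correlation between x and z is substantial, while their partial correlation is close to zero, so a partial-correlation network would drop the indirect x-z edge that a plain correlation network keeps.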
Approach Details: networks’ comparison
Covariance matrices comparison
Probabilistic model selection: Bayes factor
Network topology: Diffusion State Distance (M. Crovella) and related
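As a rough illustration of the topology-based option, the sketch below computes a finite-step Diffusion State Distance. It assumes the usual formulation: for each node, take the vector of expected visit counts to every node by a random walk of k steps started at that node (counting step 0), and define the distance between two nodes as the L1 distance between their vectors. The function name, the choice of k, and the toy path graph are assumptions for illustration only.

```python
import numpy as np

def diffusion_state_distance(adj, k=5):
    """k-step Diffusion State Distance between all pairs of nodes.

    adj: symmetric adjacency matrix of an undirected graph in which every
    node has at least one neighbor (otherwise the row normalization fails).
    """
    adj = np.asarray(adj, dtype=float)
    transition = adj / adj.sum(axis=1, keepdims=True)  # random-walk matrix
    n = adj.shape[0]
    visits = np.zeros((n, n))   # visits[u, v]: expected visits to v from u
    step = np.eye(n)            # walk distribution after 0 steps
    for _ in range(k + 1):      # accumulate steps 0..k
        visits += step
        step = step @ transition
    # L1 distance between every pair of rows of the visit-count matrix.
    return np.abs(visits[:, None, :] - visits[None, :, :]).sum(axis=2)

# Toy network: a path graph 0 - 1 - 2 - 3.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
dsd = diffusion_state_distance(path, k=1)
print(dsd)
```

Comparing such distance matrices between a wild-type network and a perturbed network is one conceivable way to quantify topological change; how the project actually uses the measure is not stated in the slides.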
The Data
Gene expression profiles: networks’ inference
Protein-protein interaction: networks’ priors
“Cell painting” profiles: networks’ annotation
100K samples
10K features (genes)
Deliverables
Computational Toolbox
• Network inference and visualization
• Module (i.e., sub-network) identification/comparison
• Network/module-based clustering/annotation
Analysis and cataloguing of chemical perturbations
• Chemicals’ putative mechanisms of action
• Interpretable carcinogenicity predictor(s)
A sandbox for researchers to develop and test new methods
• Richly annotated multi-type data
• Domain expertise to evaluate relevance/usefulness
Preliminary results for pursuit of further funding
The Team
Stefano Monti, Ph.D. (Assoc. Professor): Computational Biology, Cancer Genomics, Machine Learning (Bayesian Networks)
Paola Sebastiani, Ph.D. (Professor): Biostatistics, Genetics/Genomics, Bayesian Graphical Models
Mark Crovella, Ph.D. (Professor): Computer Science, Network Analysis
Simon Kasif (Professor): Computational Biology, Systems Biology, Machine Learning
Francesca Mulas, Ph.D. (Post-doctoral Fellow): Computational Biology/Bioinformatics, Computer Science
Daniel Gusenleitner, M.S. (Ph.D. student): Bioinformatics, Computer Science, Machine Learning
“Background” Team
BU-SRP: David Ozonoff, Basra Komal, Heather Henry (NIEHS)
Evans Foundation - ARC: Katya Ravid, Robin MacDonald
NTP/NIEHS: Scott Auerbach, Ray Tice
Broad Institute: Aravind Subramanian, Xiaodong Lu, Todd Golub, cMAP team
BU CBM/Bioinformatics/SPH: David Sherr (co-PI), Daniel Gusenleitner, Jessalyn Ubellacker
Tisha Meila, Harold Gomez, Yuxiang Tan, Liye Zhang
Elizabeth Moses, Teresa Wang, Marc Lenburg, Avi Spira
The End