Divining Systems Biology Knowledge from High-throughput Experiments Using EGAN Jesse Paquette ISMB 2010 Biostatistics and Computational Biology Core Helen Diller Family Comprehensive Cancer Center University of California, San Francisco (AKA BCBC HDFCCC UCSF)

Divining Systems Biology Knowledge from High-throughput Experiments Using EGAN

Jesse Paquette

ISMB 2010

Biostatistics and Computational Biology Core

Helen Diller Family Comprehensive Cancer Center

University of California, San Francisco (AKA BCBC HDFCCC UCSF)

High-throughput experiments

• This talk applies to
  – Expression microarrays
  – aCGH
  – SNP/CNV arrays
  – MS/MS proteomics
  – DNA methylation
  – ChIP-Seq
  – RNA-Seq
  – In-silico experiments

• If parts of the output can be mapped to gene IDs
  – You can use EGAN
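
The mapping requirement above can be sketched as follows. This is a minimal illustration, not EGAN's import logic: the probe names, the `PROBE_TO_GENE` table, and the `collapse_to_genes` helper are hypothetical, and gene symbols stand in for gene IDs. Real annotation tables come from the platform vendor or a public annotation resource.

```python
from collections import defaultdict

# Hypothetical probe-to-gene annotation (assumption, not from the talk).
PROBE_TO_GENE = {
    "probe_001": "ANXA1",
    "probe_002": "ANXA1",
    "probe_003": "EGFR",
}

def collapse_to_genes(probe_scores, probe_to_gene):
    """Group probe-level scores by gene; keep the score with the largest magnitude."""
    by_gene = defaultdict(list)
    for probe, score in probe_scores.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # probes without a gene mapping are dropped
            by_gene[gene].append(score)
    return {gene: max(scores, key=abs) for gene, scores in by_gene.items()}

# Toy probe-level output, e.g. log fold changes (made-up numbers).
scores = {"probe_001": 1.2, "probe_002": -2.5, "probe_003": 0.8, "probe_999": 3.0}
gene_scores = collapse_to_genes(scores, PROBE_TO_GENE)
```

Keeping the largest-magnitude score per gene is only one possible collapsing rule; averaging or choosing the most significant probe are equally reasonable choices.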

What do you hope to accomplish?

Collect data

Process data

Differential analysis

Publish!

Clusters and/or gene lists

New testable hypotheses

Produce insight about the underlying biology

New grants!

New papers!

Drug targets!

Leverage organic intelligence

Clusters and/or gene lists

New testable hypotheses

Produce insight about the underlying biology

Summarize

Visualize

Contextualize

Producing insight from clusters and gene lists

• Summarize: find enriched pathways (and other gene sets)
  – Hypergeometric over-representation
    • DAVID
  – Global trends
    • GSEA

• Visualize: gene relationships in a graph
  – Protein-protein interactions
    • Cytoscape
  – Network module discovery
    • Ingenuity IPA
  – Literature co-occurrence
    • PubGene

• Contextualize: pertinent literature
  • PubMed
  • Google
  • iHOP
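
Hypergeometric over-representation, the first "Summarize" method listed above, asks how likely it is to see at least the observed overlap between a gene list and a gene set by chance. The sketch below is a generic textbook version using only the Python standard library; the function names and toy gene names are assumptions, and it is not taken from EGAN or DAVID.

```python
from math import comb

def hypergeom_enrichment_p(universe_size, set_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe in which set_size genes belong to the gene set."""
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

def enrichment(gene_list, gene_set, universe):
    """Restrict both inputs to the universe, then test their overlap."""
    gene_list = set(gene_list) & universe
    gene_set = set(gene_set) & universe
    overlap = len(gene_list & gene_set)
    return hypergeom_enrichment_p(len(universe), len(gene_set), len(gene_list), overlap)

# Toy universe of 20 genes; the list and the set are identical 5-gene groups.
universe = {f"G{i}" for i in range(20)}
p_value = enrichment({"G0", "G1", "G2", "G3", "G4"},
                     {"G0", "G1", "G2", "G3", "G4"}, universe)
```

With many gene sets tested at once, the resulting p-values would normally need a multiple-testing correction before a cutoff is applied.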

EGAN: Exploratory Gene Association Networks

• Methods: state-of-the-art analysis of clusters and gene lists
  – Hypergeometric enrichment of gene sets
  – Global statistical trends of gene sets
  – Hypergraph visualization (via Cytoscape libraries)
  – Literature identification
  – Network module discovery

• User Interface: responds quickly to new queries from the biologist
  – Sandbox-style functionality
  – Dynamic adjustment of p-value cutoffs
  – Point-and-click interface
  – All data in-memory for immediate access
  – Links to external websites

• Modular: integrates as a flexible plug-and-play cog
  – All data is customizable
  – Proprietary data can be restricted to the client location
  – Java runs on almost every OS (PC, Mac, Linux)
  – Can be configured and launched from a different application (e.g. GenePattern)
  – Analyses can be scripted for automation

Gene sets

• A gene set is a set of semantically related genes
  – e.g. Wnt signaling pathway

• EGAN contains a database of gene sets
  – > 100k gene sets by default
    • KEGG, Reactome, NCI-Nature, Gene Ontology, MeSH, Conserved Domain, Cytoband, miRNA targets
  – You can easily add your own
    • Simple file format
    • Download from MSigDB (Broad Institute)
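
The slides do not spell out EGAN's own gene set file format, so the sketch below instead reads the tab-delimited GMT layout that MSigDB distributes: one gene set per line, with the set name, a description, and then the member genes. The `parse_gmt` helper and the two toy gene sets are assumptions for illustration only.

```python
import io

def parse_gmt(handle):
    """Read GMT lines (name, description, gene1, gene2, ...) into a dict of sets."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:  # skip blank or malformed lines
            continue
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets

# Toy in-memory file; a real file would be opened with open(path).
example = io.StringIO(
    "TOY_WNT_SET\ttoy description\tWNT1\tCTNNB1\tAXIN1\n"
    "TOY_EGFR_SET\ttoy description\tEGFR\tGRB2\n"
)
gene_sets = parse_gmt(example)
```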

Gene-gene relationships

• EGAN also contains
  – Protein-protein interactions (PPI)
  – Literature co-occurrence
  – Chromosomal adjacency
  – Kinase-target relationships

• Other possibilities
  – Sequence homology
  – Expression correlation
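
Any of the relationship types above can be treated as edges between genes. As a rough illustration, the sketch below keeps only the edges whose endpoints are both in a gene list and groups the list into connected components. Connected components are a deliberately simple stand-in for module discovery; the slides do not describe EGAN's actual algorithm, and `connected_modules` and the toy data are hypothetical.

```python
from collections import defaultdict

def connected_modules(edges, gene_list):
    """Group genes in gene_list into connected components of the induced graph."""
    genes = set(gene_list)
    neighbors = defaultdict(set)
    for a, b in edges:
        if a in genes and b in genes:  # ignore edges that leave the gene list
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, modules = set(), []
    for start in sorted(genes):  # sorted for a reproducible module order
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from the start gene
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        modules.append(component)
    return modules

# Toy relationships (e.g. PPI edges) and a toy gene list.
edges = [("A", "B"), ("B", "C"), ("D", "E"), ("C", "X")]
modules = connected_modules(edges, ["A", "B", "C", "D", "F"])
```

Genes with no relationship to the rest of the list, such as `D` and `F` here, end up as single-gene modules.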

Example with microarray and aCGH results

• Mirzoeva et al. (2009) Cancer Research
  – UCSF-LBL collaboration
  – Analysis of breast cancer cell lines
    • Basal vs. luminal

• Discoveries in this presentation
  – miRNA regulator of subtype (mir-200)
  – Annexin (ANXA1) as potential regulator of ER, glucocorticoid and EGFR signaling

Gene list - higher expression in basal cell lines

Gene set/pathway enrichment

Importing gene lists from publications

Combining expression with aCGH

Finding network modules

Where to find EGAN

• Website
  – http://akt.ucsf.edu/EGAN/

• 2010 paper in Bioinformatics
  – http://www.ncbi.nlm.nih.gov/pubmed/19933825

Acknowledgements

• BCBC HDFCCC UCSF
  – Taku Tokuyasu
  – Adam Olshen
  – Ritu Roy
  – Ajay Jain

• LBNL
  – Debopriya Das
  – Joe Gray

• Funding
  – UCSF Cancer Center Support Grant

• UCSF
  – Early adopters
    • Ingrid Revet
    • Antoine Snijders
    • Stephan Gysin
    • Sook Wah Yee
    • Joachim Silber
  – Cytoscape gurus
    • David Quigley
    • Scooter Morris
  – OTM
    • David Eramian
    • Ha Nguyen
  – Laura van ’t Veer
  – Donna Albertson
  – Graeme Hodgson