EGAN: Exploratory Gene Association Networks
by Jesse Paquette
Biostatistics and Computational Biology Core
Helen Diller Family Comprehensive Cancer Center
University of California, San Francisco
(AKA BCBC HDFCCC UCSF)
EGAN http://akt.ucsf.edu/EGAN/
• Features
  – Downloadable Java application
    • but could be re-composed as components for a web service architecture
  – Graphics provided by Cytoscape; graph layout algorithms imported from open source
  – Data pre-loaded for analysis. Each data set must include an assay ID, a measure (e.g., correlation coefficient, expression level), and a significance value (e.g., p-value)
  – Currently for the human and rat genomes, but other model species in August (including Arabidopsis)
• Key focus: interactive analysis of sets of genes
  – User identifies the sets interactively
  – Enrichment: uses Fisher's exact test to see whether genes in a pathway are “overrepresented” relative to chance selection. Based on the hypergeometric distribution, an n-choose-k sampling distribution
  – Gene sets graphed based on relationships
    • Counts (simply connect each gene to others in the set; can graph multiple sets)
    • Protein-protein interaction
    • Co-occurrence in literature
  – Access to PubMed literature and external links
• For demos, slides, presentations
  – http://akt.ucsf.edu/EGAN/documentation.php
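The enrichment test named above can be illustrated with a minimal sketch. This is not EGAN's code: the function name, argument names, and the example numbers are assumptions made for illustration. It computes the upper tail of the hypergeometric distribution, which is the p-value of a one-sided Fisher's exact test for over-representation.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, set_size, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    total_genes: number of genes in the background (hypothetical name)
    set_size:    number of background genes that belong to the gene set
    selected:    number of genes the user picked (e.g. a cluster)
    overlap:     number of picked genes that belong to the gene set
    """
    # The overlap can never exceed the smaller of the two groups.
    max_overlap = min(set_size, selected)
    # Number of ways to pick `selected` genes from the background.
    denom = comb(total_genes, selected)
    # Ways to pick exactly k set members and (selected - k) non-members,
    # summed over every k at least as large as the observed overlap.
    numer = sum(
        comb(set_size, k) * comb(total_genes - set_size, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numer / denom


# Made-up numbers, for illustration only.
p = hypergeom_enrichment_p(total_genes=20000, set_size=100, selected=200, overlap=8)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value here means the observed overlap would be unlikely if the selected genes were drawn at random from the background.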
Producing insight from clusters and gene lists
• Summarize: find enriched pathways (and other gene sets)
  – Hypergeometric over-representation
    • DAVID
  – Global trends
    • GSEA
• Visualize: gene relationships in a graph
  – Protein-protein interactions
    • Cytoscape
  – Network module discovery
    • Ingenuity IPA
  – Literature co-occurrence
    • PubGene
• Contextualize: pertinent literature
  – PubMed
  – Google
  – iHOP
High-throughput experiments
• EGAN applies to
  – Expression microarrays
  – aCGH
  – SNP/CNV arrays
  – MS/MS proteomics
  – DNA methylation
  – ChIP-Seq
  – RNA-Seq
  – In-silico experiments
• If parts of the output can be mapped to gene IDs
  – You can use EGAN
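As a rough illustration of what "mapped to gene IDs" involves, the sketch below takes assay results (assay ID, measure, significance value) and keeps the genes that pass a p-value cutoff. The record layout, the mapping dictionary, and all identifiers are hypothetical; EGAN's actual input handling is not described in these slides.

```python
from typing import NamedTuple


class AssayResult(NamedTuple):
    """One row of experiment output (hypothetical layout)."""
    assay_id: str   # e.g. a probe, peptide, or region identifier
    measure: float  # e.g. expression level or correlation coefficient
    p_value: float  # significance value


def genes_passing_cutoff(results, assay_to_gene, p_cutoff=0.05):
    """Return the gene IDs with at least one assay at or below p_cutoff."""
    selected = set()
    for r in results:
        gene_id = assay_to_gene.get(r.assay_id)
        if gene_id is None:
            continue  # assays without a gene mapping cannot be used
        if r.p_value <= p_cutoff:
            selected.add(gene_id)
    return selected


# Made-up example data.
example_results = [
    AssayResult("assay_1", 2.1, 0.003),
    AssayResult("assay_2", 0.2, 0.60),
    AssayResult("assay_3", 1.7, 0.001),  # has no gene mapping below
]
example_mapping = {"assay_1": "GENE_A", "assay_2": "GENE_B"}
print(genes_passing_cutoff(example_results, example_mapping, p_cutoff=0.01))
```

Re-running the function with a different `p_cutoff` gives a different gene list, which is the kind of adjustment the interface lets a user make interactively.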
Gene sets
• EGAN contains a database of gene sets
  – You can also add your own
  – Download from MSigDB (Broad)
• A gene set defines a semantically meaningful subset of genes
  – Signaling or metabolic pathway
  – Gene Ontology (GO) term
  – Previously-reported gene list (“signature”)
  – Cytoband
  – Transcription factor targets
  – miRNA targets
  – Conserved domain
  – Drug targets
  – etc.
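Gene sets downloaded from MSigDB are commonly distributed as GMT files: one tab-delimited line per set, holding the set name, a description, and then the member genes. The sketch below reads that layout into a dictionary of sets. It is an illustration only; the function name is hypothetical, and the slides do not say which file formats EGAN itself accepts.

```python
def parse_gmt(lines):
    """Parse GMT-style lines into {set_name: set_of_gene_ids}.

    Each line is expected to be: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        name, _description, *genes = fields
        # Drop empty strings left behind by trailing tabs.
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


# Made-up example content; with a real file, pass an open file object instead.
example_lines = [
    "EXAMPLE_PATHWAY\tmade-up description\tGENE_A\tGENE_B\tGENE_C\n",
    "EXAMPLE_SIGNATURE\tna\tGENE_B\tGENE_D\n",
]
for name, genes in parse_gmt(example_lines).items():
    print(name, sorted(genes))
```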
Gene-gene relationships
• EGAN contains
  – Protein-protein interactions (PPI)
  – Literature co-occurrence
  – Chromosomal adjacency
  – Kinase-target relationships
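One way to picture how such relationships become a graph: treat each relationship as a labelled edge between two genes, and keep only the edges whose endpoints are both among the genes currently shown. The sketch below does that with plain dictionaries. The edge layout, function name, and identifiers are assumptions for illustration, not EGAN's internal data model.

```python
from collections import defaultdict


def induced_subgraph(edges, genes):
    """Build an adjacency map restricted to the given genes.

    edges: iterable of (gene_a, gene_b, relation) tuples, where relation is a
           label such as "PPI" or "literature" (hypothetical layout)
    genes: set of gene IDs to show on the graph
    """
    adjacency = defaultdict(set)
    for a, b, relation in edges:
        # Keep an edge only if both endpoints are visible; ignore self-loops.
        if a in genes and b in genes and a != b:
            adjacency[a].add((b, relation))
            adjacency[b].add((a, relation))
    return dict(adjacency)


# Made-up relationships.
example_edges = [
    ("GENE_A", "GENE_B", "PPI"),
    ("GENE_B", "GENE_C", "literature"),
    ("GENE_A", "GENE_D", "kinase-target"),
]
graph = induced_subgraph(example_edges, genes={"GENE_A", "GENE_B", "GENE_C"})
for gene, neighbours in sorted(graph.items()):
    print(gene, sorted(neighbours))
```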
The article will be shown in your default web browser.
Finding Counts
EGAN Summary: Exploratory Gene Association Networks
• Methods: state-of-the-art analysis of clusters and gene lists
  – Hypergeometric enrichment of gene sets
  – Global trends of gene sets
  – Graph visualization
  – Literature identification
  – Network module discovery
• User Interface: responds quickly to new queries from the biologist
  – Fluid adjustment of p-value cutoffs
  – Point-and-click interface
  – All data in-memory for immediate access
  – Links to external websites
• Modular: integrates as a flexible plug-and-play cog
  – All data is customizable
  – Proprietary data can be restricted to the client location
  – Java runs on almost every OS (PC, Mac, Linux)
  – Can be configured and launched from a different application (e.g. GenePattern)
  – Analyses can be scripted for automation
Keys to getting the most out of EGAN
• Don’t panic!
• Load as much data as possible
  – Assay results for every gene
  – Multiple experiments
  – Pathways and gene sets
    • MSigDB
  – Previously-published gene lists and clusters
    • Supplementary data
    • Oncomine
• Think about the context of the experiment
  – Show appropriate genes on graph
• Think about the semantic meaning of the enriched gene sets
  – Show appropriate gene sets on graph
• Follow links to literature
• Use appropriate Google/PubMed search queries
• Create high-quality reports
  – Save your custom gene sets
  – Export graph screenshots to PDF
  – Export tables with enrichment scores to Excel
  – Record details in your lab notebook
Where to find EGAN
• Website
  – http://akt.ucsf.edu/EGAN/
• 2010 paper in Bioinformatics
  – http://www.ncbi.nlm.nih.gov/pubmed/19933825