EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon...
![Page 1: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/1.jpg)
Schedule
• Data, preprocessing, grouping (10:15-11:00)
• Hands-on part I (11:00-11:20)
• Group analysis (11:20-11:45)
• Hands-on part II (11:45-12:05)
• Coffee break (12:05-12:20)
• Spike (12:20-12:50)
• Hands-on part III (12:50-13:10)
• Lunch break (13:10-14:10)
• Amadeus (14:10-14:40)
• Hands-on part IV (14:40-15:00)
![Page 2: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/2.jpg)
• EXPANDER – an integrative package for the analysis of gene expression data.
• Built-in support for 15 organisms: human, mouse, rat, chicken, fly, zebrafish, C. elegans, yeast (S. cerevisiae and S. pombe), Arabidopsis, tomato, Listeria, E. coli, Aspergillus* and rice.
• Demonstration: oligonucleotide array data containing expression profiles measured at several time points after serum stimulation of a human cell line.
![Page 3: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/3.jpg)
What can it do?
Low-level analysis:
• Missing data estimation (KNN or manual)
• Data adjustments (merge conditions, divide by base, take log)
• Normalization
• Probe and condition filtering
High-level analysis:
• Detecting patterns/groups in the data (supervised clustering, differential expression, clustering, biclustering, network-based grouping)
• Ascribing biological meaning to patterns (searching for enrichment within groups)
![Page 4: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/4.jpg)
[Overview diagram of EXPANDER components: Input data; Preprocessing; Grouping; Functional enrichment; Promoter signals; Location enrichment; miRNA targets enrichment; KEGG pathway enrichment; Visualization utilities; Links to public annotation databases]
![Page 5: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/5.jpg)
EXPANDER – Data
Input data:
• Expression matrix (probes in rows, conditions in columns):
  – One-channel data (e.g., Affymetrix)
  – Dual-channel data, given as log R/G (e.g., cDNA microarrays)
  – ‘.cel’ files
• ID conversion file: maps probes to genes
• Gene groups data: defines gene groups
![Page 6: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/6.jpg)
EXPANDER – Data (II)
Data definitions:
• Defining condition subsets
• Data type & scale (log)
Data adjustments:
• Missing value estimation (KNN or arbitrary)
• Reorder conditions
• Merging conditions
• Merging probes by gene IDs
• Divide by base
• Log data (base 2)
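The "divide by base" and "log data (base 2)" adjustments can be sketched in a few lines. This is only an illustration, not EXPANDER's code: the function names and the choice of the first condition as the base are assumptions.

```python
import math

def divide_by_base(profile, base_index=0):
    """Divide every value of a probe's profile by its value in the base condition.

    Assumption: the base condition is identified by its column index.
    """
    base = profile[base_index]
    if base == 0:
        raise ValueError("base condition value is zero")
    return [value / base for value in profile]

def log2_profile(profile):
    """Take log base 2 of each value (all values must be positive)."""
    return [math.log2(value) for value in profile]

# One probe measured in four conditions; the first condition is the base.
profile = [100.0, 200.0, 50.0, 400.0]
ratios = divide_by_base(profile)
log_ratios = log2_profile(ratios)
print(log_ratios)  # [0.0, 1.0, -1.0, 2.0]
```

After both steps a value of 1.0 means a two-fold increase relative to the base condition and -1.0 a two-fold decrease.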
![Page 7: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/7.jpg)
EXPANDER – Preprocessing
Normalization: removal of systematic biases from the analyzed chips
• Implemented methods: quantile, lowess
• Visualization: box plots, scatter plots (simple, M vs. A)
Filtering: focus downstream analysis on the set of “responding genes”
• Fold change
• Variation
• Statistical tests: t-test, SAM (Significance Analysis of Microarrays)
• It is possible to define “VIP genes”.
Standardization: mean = 0, STD = 1 (visualization)
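As a rough illustration of quantile normalization, the sketch below forces every chip (column) to share the same value distribution: the mean of the sorted columns. It is a textbook version under simplifying assumptions (no missing values, ties broken by sort order) and not necessarily how EXPANDER implements it; the function name is hypothetical.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a probes-by-conditions matrix (list of rows).

    Assumptions: no missing values; ties are broken by sort order.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_columns = [sorted(col) for col in columns]
    # Reference distribution: mean across chips at each rank.
    reference = [
        sum(col[rank] for col in sorted_columns) / n_cols
        for rank in range(n_rows)
    ]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Indices of the probes in this chip, ordered by increasing value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized

# Three probes on two chips; the second chip is systematically brighter.
matrix = [[2.0, 4.0], [6.0, 12.0], [4.0, 8.0]]
normalized = quantile_normalize(matrix)
print(normalized)  # [[3.0, 3.0], [9.0, 9.0], [6.0, 6.0]]
```

The ranking of probes within each chip is preserved; only the scale difference between chips is removed.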
![Page 8: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/8.jpg)
![Page 9: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/9.jpg)
Cluster Analysis
• Partition the responding genes into distinct groups, each with a particular expression pattern:
  – co-expression → co-function
  – co-expression → co-regulation
• Partition the genes to achieve:
  – High homogeneity within clusters
  – High separation between clusters
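One common way to put numbers on these two goals is to average a similarity measure over gene pairs inside the same cluster (homogeneity) and over pairs in different clusters (separation). The sketch below uses Pearson correlation as the similarity. The measure and the function names are assumptions for illustration, not a description of what EXPANDER reports.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def homogeneity_and_separation(profiles, labels):
    """Return (average within-cluster similarity, average between-cluster similarity).

    A good partition has a high first value and a low second value.
    Assumes at least one within-cluster pair and one between-cluster pair.
    """
    within, between = [], []
    for i, j in combinations(range(len(profiles)), 2):
        similarity = pearson(profiles[i], profiles[j])
        if labels[i] == labels[j]:
            within.append(similarity)
        else:
            between.append(similarity)
    return sum(within) / len(within), sum(between) / len(between)

# Two rising profiles in cluster 0, two falling profiles in cluster 1.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 2]]
labels = [0, 0, 1, 1]
homogeneity, separation = homogeneity_and_separation(profiles, labels)
print(round(homogeneity, 6), round(separation, 6))  # 1.0 -1.0
```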
![Page 10: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/10.jpg)
Cluster Analysis (II)
Implemented algorithms: CLICK, K-means, SOM, hierarchical
Visualization:
• Mean expression patterns
• Heat maps
• Chromosomal positions
• Network sub-graph
• PCA
• Clustered matrix
![Page 11: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/11.jpg)
Biclustering
• Clustering seeks a global partition according to similarity across ALL conditions, which becomes too restrictive on large datasets.
• Relevant knowledge can be revealed by identifying genes with a common pattern across a subset of the conditions.
• A novel algorithmic approach is needed: biclustering.
![Page 12: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/12.jpg)
Biclustering II
• Bicluster (= module): a subset of genes with similar behavior in a subset of conditions
• Computationally challenging: many combinations of condition subsets have to be considered
• SAMBA = Statistical Algorithmic Method for Bicluster Analysis
* A. Tanay, R. Sharan, R. Shamir, RECOMB 02
![Page 13: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/13.jpg)
Drawbacks/limitations:
• Useful only with more than 20 conditions
• Parameters
• How to assess the quality of biclusters
![Page 14: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/14.jpg)
Biclustering Visualization
![Page 15: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/15.jpg)
Network based grouping
• Goal: to identify modules using gene expression data and interaction networks.
• Input: gene expression (GE) data + interactions file (.sif).
• MATISSE (Module Analysis via Topology of Interactions and Similarity SEts).
I. Ulitsky and R. Shamir. BMC Systems Biology (2007)
![Page 16: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/16.jpg)
Motivation
Detect functional modules: groups of interacting proteins, co-expressed genes
Integrative analysis - can identify weaker signals
Identifies a group of genes as well as the connections between them
![Page 17: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/17.jpg)
Front vs Back nodes
• Only variant genes (front nodes) have meaningful similarity values.
• These can be linked by non-regulated genes (back nodes).
• Back nodes correspond to:
  – Post-translational regulation
  – Partially regulated pathways
  – Unmeasured transcripts
![Page 18: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/18.jpg)
Network based clustering visualization
Similar to clustering visualization (gene list, mean patterns, heat maps, etc.).
Interactions map
![Page 19: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/19.jpg)
Supervised Grouping
• Differential expression: t-test, SAM (Significance Analysis of Microarrays)
• Similarity group (correlation to one probe/gene)
• Rule based grouping (define a pattern)
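To make the differential-expression option concrete, the sketch below computes a two-sample t statistic for a single probe measured in two condition groups. It uses Welch's unequal-variance form, which is an assumption; the slides do not say which t-test variant EXPANDER applies, and turning the statistic into a p-value (omitted here) requires the t distribution.

```python
import math
from statistics import mean, variance

def welch_t_statistic(group_a, group_b):
    """Welch's t statistic for one probe measured in two condition groups.

    Assumption: unequal-variance (Welch) form; each group needs >= 2 values.
    """
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical log-expression values of one probe: control vs. treated.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
t_stat = welch_t_statistic(control, treated)
print(round(t_stat, 4))  # -3.6742
```

A large absolute value suggests that the probe behaves differently in the two groups; with thousands of probes, the resulting p-values also need a multiple testing correction.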
![Page 20: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/20.jpg)
Hands-on part I (1-3)
![Page 21: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/21.jpg)
![Page 22: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/22.jpg)
Ascribe functional meaning to gene groups
• Gene Ontology (GO): a database of functional annotations. We extract files for all supported organisms.
• TANGO: applies statistical tests that seek over-represented GO functional categories in the groups.
![Page 23: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/23.jpg)
Functional Enrichment - Visualization
![Page 24: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/24.jpg)
![Page 25: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/25.jpg)
Inferring regulatory mechanisms from gene expression data
Assumption: co-expression → transcriptional co-regulation → common cis-regulatory promoter elements
Computational identification of cis-regulatory elements that are over-represented in the promoters of the co-expressed genes
PRIMA = PRomoter Integration in Microarray Analysis
* Elkon et al., Genome Research (2003)
![Page 26: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/26.jpg)
PRIMA – general description
Input:
• Target set (e.g., co-expressed genes)
• Background set (e.g., all genes on the chip)
Analysis:
• Identify transcription factors whose binding site signatures are enriched in the ‘Target set’ with respect to the ‘Background set’.
• TF binding site models: TRANSFAC DB
• Default promoter region: from -1000 bp to 200 bp relative to the TSS
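The default region of -1000 bp to 200 bp around the TSS depends on the strand of the gene. The helper below is a hypothetical sketch of how such a window could be turned into genomic coordinates. The function name, the coordinate convention and the clamping at position 0 are assumptions and are not taken from PRIMA.

```python
def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Return (start, end) genomic coordinates of the promoter window.

    Assumptions: `tss` is a genomic coordinate, `strand` is '+' or '-',
    and the start is clamped at 0 near the beginning of a chromosome.
    """
    if strand == "+":
        # Upstream means smaller coordinates on the forward strand.
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the reverse strand, upstream means larger coordinates.
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end

print(promoter_window(5000, "+"))  # (4000, 5200)
print(promoter_window(5000, "-"))  # (4800, 6000)
```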
![Page 27: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/27.jpg)
Promoter Analysis - Visualization
![Page 28: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/28.jpg)
[Overview diagram of EXPANDER components, with tool names: Input data; Normalization/Filtering; Grouping (clustering, biclustering, network-based clustering); Functional enrichment (TANGO); Promoter signals (PRIMA); Location enrichment; miRNA targets enrichment (FAME); Visualization utilities; Links to public annotation databases]
![Page 29: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/29.jpg)
miRNA Enrichment Analysis
• Goal: to predict microRNA (miRNA) regulation by detecting miRNAs whose binding sites are over- or under-represented in the 3' UTRs of gene groups.
• FAME = Functional Assignment of MiRNAs via Enrichment
![Page 30: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/30.jpg)
FAME
[Figure: overlap between miRNA targets and genes sharing a common function; significance of overlap]
• Rather than using a simple hyper-geometric test, FAME also considers:
• The uneven distribution of 3’ UTR lengths
• Confidence values assigned to individual miRNA target sites (context scores)
![Page 31: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/31.jpg)
Implementation in EXPANDER
• Usage is very similar to that of TANGO and PRIMA
• Currently uses TargetScan5 predictions
• Main parameters:
  – Number of random iterations (random graphs created)
  – Enrichment direction: over- or under-representation
  – Multiple testing correction
  – Use of the context score weights is optional
![Page 32: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/32.jpg)
![Page 33: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/33.jpg)
Location analysis
• Goal: Detect genes that are located in the same area and are co-expressed.
• Search for over-represented chromosomal areas within gene groups.
• Statistical test.
• Redundancy filter
• Ignoring known gene clusters
![Page 34: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/34.jpg)
Location analysis visualization
• Enrichment analysis visualization
• Positions view with color assignments
![Page 35: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/35.jpg)
KEGG pathway analysis
• Searches for KEGG pathways that are over-represented in gene groups (i.e., in a target set with respect to a background set).
• Uses a hypergeometric test.
• Multiple testing correction (Bonferroni)
• Enrichment results visualization (same as other group analysis results).
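The hypergeometric test and the Bonferroni correction can be sketched with the standard library alone. The function and parameter names below are illustrative assumptions, not EXPANDER's interface: the p-value is the probability of seeing at least the observed number of annotated genes in a target set drawn at random from the background set.

```python
from math import comb

def hypergeom_enrichment_pvalue(n_background, n_annotated, n_target, n_overlap):
    """P(X >= n_overlap) for a target set drawn without replacement.

    n_background: genes in the background set
    n_annotated:  background genes that belong to the pathway
    n_target:     genes in the target set (a subset of the background)
    n_overlap:    target genes that belong to the pathway
    """
    total = comb(n_background, n_target)
    upper = min(n_annotated, n_target)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_target - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# 20 background genes, 5 in the pathway; 3 of the 4 target genes are in it.
p_value = hypergeom_enrichment_pvalue(20, 5, 4, 3)
print(round(p_value, 4))  # 0.032

# Correcting three hypothetical pathway p-values.
corrected = bonferroni([0.01, 0.04, 0.5])
```

Bonferroni is conservative: with many pathways tested, only very small raw p-values stay significant after correction.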
![Page 36: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/36.jpg)
Custom enrichment analysis
• Loads an annotation file supplied by the user (provides genes with custom annotations).
• Searches for annotations (features) that are over-represented in gene groups (i.e., in a target set with respect to a background set).
• Uses a hypergeometric test.
• Multiple testing correction (Bonferroni)
• Enrichment results visualization (same as other group analysis results).
![Page 37: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/37.jpg)
Hands-on part II (3-11)
![Page 38: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/38.jpg)
![Page 39: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/39.jpg)
SPIKE…
![Page 40: EXPression ANalyzer and DisplayER Adi Maron-Katz Igor Ulitsky Chaim Linhart Amos Tanay Rani Elkon Seagull Shavit Dorit Sagir Eyal David Roded Sharan Israel.](https://reader030.fdocuments.us/reader030/viewer/2022032723/56649d165503460f949eb9d9/html5/thumbnails/40.jpg)
Gene Groups File