Visualizing RNA Expression Data John Quackenbush VIZBI 16 March 2011.

Visualizing RNA Expression Data

John Quackenbush
VIZBI

16 March 2011

Northern Blots: Before the dawn of Time

Northern Blots

Northern Blots

Quantitative RT-PCR: The Pre-Modern Era

Quantitative PCR

Quantitative PCR and other Methods

Large-scale Quantitative RT-PCR: The Dawn of the Modern Age

An Aside: The Birth of Clustering

Our World Today: A Microarray Overview

History is written by the victors (or those who produce software):

The Birth of Clustering

This was also the start of tormenting the red-green color-blind.

Truth is determined by the person giving the talk:

MeV is the best clustering tool ever!

http://www.tm4.org

Public Microarray Data

ArrayExpress: 20,423 Experiments (572,682 hybs/arrays)
GEO: 21,320 Experiments (529,108 arrays)
CIBEX: 148 Experiments (2,711 arrays)
SMD: 21,521 Expts (80,319 incl private data)

>1,000,000 arrays x $500 = $500,000,000
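The back-of-the-envelope estimate above can be reproduced as a small sketch. It uses only the array counts that the slide states explicitly (ArrayExpress, GEO, CIBEX) and leaves SMD out, because the slide gives experiment counts for it. The function names are illustrative, and summing repositories ignores any overlap between them, which is an assumption.

```python
# Array counts as listed on the slide (March 2011 snapshot).
ARRAY_COUNTS = {
    "ArrayExpress": 572_682,
    "GEO": 529_108,
    "CIBEX": 2_711,
}
COST_PER_ARRAY_USD = 500  # per-array figure used on the slide


def total_arrays(counts):
    """Sum the per-repository array counts (overlap between repositories is ignored)."""
    return sum(counts.values())


def estimated_cost(n_arrays, cost_per_array=COST_PER_ARRAY_USD):
    """Rough spend in US dollars for a given number of arrays."""
    return n_arrays * cost_per_array


if __name__ == "__main__":
    n = total_arrays(ARRAY_COUNTS)
    print(f"{n:,} arrays listed")
    print(f"${estimated_cost(1_000_000):,} for one million arrays")
    print(f"${estimated_cost(n):,} for all listed arrays")
```

The slide's round figure corresponds to `estimated_cost(1_000_000)`; the listed counts give a somewhat larger total.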

Cancer Studies account for >14% of all studies in databases…

EBI’s Expression Atlas Rocks!

Disease Progression and Personalized Care

[Figure: timeline labeled Birth, Treatment, Death, with phases Natural History of Disease and Clinical Care. Other labels: Environment + Lifestyle, Genetic Risk, Early Detection, Biomarkers, Patient Stratification, Disease Staging, Treatment Options, Outcomes, Quality Of Life.]

Welcome to the post-Modern World: Next-Gen Technologies have Dramatically Expanded our Genomic Universe

Browser-mania rules!

RNA-Seq data of 7 FFPE blocks

Back to Excel, Man’s Best Friend

And more websites are integrating data

Cells Converge to Attractive States

Stuart Kauffman presented the idea of a gene expression landscape with attractors

• ~250 stable cell types each represent attractors

• Cells can be "pushed" or induced to converge to an attractor.

• Once in the attractor, a cell is robust to small perturbations.

Jess Mar
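The attractor idea can be illustrated with a toy dynamical system. This is a minimal sketch, not Kauffman's model: a single variable `x` stands in for a cell's expression state, and the double-well dynamics `dx/dt = x - x**3` are an assumption chosen only because they have two stable states, at -1 and +1. The names `drift` and `settle` are illustrative.

```python
def drift(x):
    """Toy double-well dynamics with stable fixed points (attractors) at -1 and +1."""
    return x - x ** 3


def settle(x0, dt=0.01, steps=5000):
    """Integrate dx/dt = drift(x) with forward Euler and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * drift(x)
    return x


if __name__ == "__main__":
    # Small perturbations around +1 return to +1; starting on the other
    # side of the unstable point at 0 leads to the other attractor.
    for start in (1.2, 0.8, 0.1, -0.1):
        print(f"start {start:+.1f} -> settles near {settle(start):+.3f}")
```

States started near an attractor return to it, which mirrors the robustness bullet above; a push across the unstable point at 0 is needed to switch attractors.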

Differentiation of Promyelocytes into Neutrophil-Like Cells

Promyelocytes (HL-60 Cell Line) → Neutrophil-like Cells, ~6 days

Inducers: Dimethyl Sulfoxide (DMSO), All-Trans Retinoic Acid (ATRA)

Affymetrix GeneChip: Time 0 through Day 7

Collins et al. PNAS 1978

RA used in differentiation therapy for acute promyelocytic leukemia.

Combined with chemotherapy, complete remission rates as high as 90-95% can be achieved.

Huang et al. PRL 2005

Jess Mar

GEDI: Cells Display Divergent Trajectories That Eventually Converge as they Differentiate

Huang et al. PRL 2005

Graphical representation of the results from a Self-Organizing Map clustering.

Expression data from a single sample (time point) clustered according to a grid.
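The grid-per-sample idea can be sketched in plain Python. This is a toy illustration under stated assumptions, not the GEDI software: a small self-organizing map is fitted to gene profiles, each gene is assigned to its closest grid node, and one grid per time point is filled with the mean expression of the genes at each node. The function names, learning-rate schedule, neighbourhood schedule, and synthetic data are all assumptions.

```python
import math
import random


def best_node(weights, profile):
    """Return the grid coordinate whose weight vector is closest to the profile."""
    return min(weights, key=lambda node: sum(
        (a - b) ** 2 for a, b in zip(weights[node], profile)))


def train_som(profiles, rows, cols, epochs=20, seed=0):
    """Fit a rows x cols self-organizing map to gene profiles (lists of floats)."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Initialise each node's weight vector from a randomly chosen gene profile.
    weights = {(r, c): list(rng.choice(profiles))
               for r in range(rows) for c in range(cols)}
    total = epochs * len(profiles)
    step = 0
    for _ in range(epochs):
        order = profiles[:]
        rng.shuffle(order)
        for p in order:
            frac = step / total
            lr = 0.5 * (1.0 - frac)                               # decaying learning rate
            radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5   # shrinking neighbourhood
            br, bc = best_node(weights, p)
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))           # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (p[i] - w[i])
            step += 1
    return weights


def mosaic(weights, profiles, sample, rows, cols):
    """Per-sample grid: mean expression, in that sample, of the genes mapped to each node."""
    sums = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for p in profiles:
        r, c = best_node(weights, p)
        sums[r][c] += p[sample]
        counts[r][c] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else None
             for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data: 60 "genes" x 3 "time points", half rising and half falling.
    genes = [[t * s + rng.gauss(0, 0.1) for t in range(3)]
             for s in (1, -1) for _ in range(30)]
    som = train_som(genes, rows=3, cols=3)
    for t in range(3):
        print(f"time point {t}:", mosaic(som, genes, t, 3, 3))
```

Because gene-to-node assignments are fixed across samples, the same grid position refers to the same genes in every per-sample grid, which is what lets the grids be compared across time points. Empty nodes are reported as `None`.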

DMSO, ATRA

What factors drive this divergent-then-convergent behavior?

Our Hypothesis

[Figure: State A and State B joined by a Core Differentiation Pathway, with a Transient Pathway and an Observed Trajectory drawn for each of Perturbation 1 and Perturbation 2.]

Jess Mar

Observed Trajectory

[Figure: ATRA and DMSO panels at 2 hrs, 4 hrs, 8 hrs, 12 hrs, 18 hrs, 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days.]

Jess Mar

Transient Trajectory

[Figure: ATRA and DMSO panels at 2 hrs, 4 hrs, 8 hrs, 12 hrs, 18 hrs, 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days.]

Jess Mar

Core Trajectory

[Figure: ATRA and DMSO panels at 2 hrs, 4 hrs, 8 hrs, 12 hrs, 18 hrs, 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days.]

Jess Mar

Ultimately, we’d like to get to pathways: Functional Roles Are Associated with Constraint

High-variance genes tend to function as cell surface receptors.

Low-variance genes function as kinases and transferases.

[Figure: high variance and low variance panels, with compartments labeled Extracellular, Membrane, Cytoplasm, Nuclear.]

But the tools are very primitive

Variance Constraints Alter Network Topology

Degree distributions for the MAPK module are significantly different (Kolmogorov-Smirnov test).

[Figure panels: high variance, low variance]

Degree of statistical significance is altered by disease status.
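A two-sample Kolmogorov-Smirnov comparison of degree distributions can be sketched as follows. This is a minimal stdlib-only illustration, not the analysis behind the slide: the degree lists are made up, the function names are illustrative, and the p-value uses the standard asymptotic approximation to the Kolmogorov distribution, which is rough for small samples and for heavily tied data such as node degrees.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples before comparing CDFs.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value for KS statistic d with sample sizes n and m (approximate)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here, and the p-value is essentially 1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical node degrees for the same module in two networks.
    high_variance_degrees = [1, 1, 2, 2, 3, 3, 4, 5]
    low_variance_degrees = [4, 5, 6, 6, 7, 8, 9, 12]
    d = ks_statistic(high_variance_degrees, low_variance_degrees)
    p = ks_pvalue(d, len(high_variance_degrees), len(low_variance_degrees))
    print(f"D = {d:.3f}, approximate p = {p:.4f}")
```

For real data, an established implementation such as `scipy.stats.ks_2samp` is a more reliable choice than this hand-rolled approximation.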

So we’re back to Heat Maps

The transcriptional profiles of ONS XS cells from SZ patients more closely resemble those of healthy fibroblasts than any other stem cell signature.

And of course, we’ve left out the interesting stuff, like where genes are expressed.

LGRC Research Portal

PAGE DETAILS

Search:
- Facets
- Search within results
- Keyword prompts
- Search history

Table:
- Paged results
- Sortable columns

Actions:
- Go to Gene detail page
- Add genes to ‘gene set’

Gene Expression Summary

RNASeq

PAGE DETAILS

Annotation summary & summary view for each assay/data type:

Accordion style sections

- GEXP – expression profile across major Dx categories
- RNASeq – Exon structure of the gene
- SNPs – Table of SNPs in region of gene, highlighting association with major Dx group
- Methylation – Methylation profile in region around gene
- Genomic alterations – table of CNVs & alterations observed w/ freq in region around gene

Actions:
- Click through to assay detail page
- Add gene to set

Annotation Summary

LGRC Research Portal

PAGE DETAILS

- View aggregate statistics
- View cohort details
- Build cohort sets
- Build composite phenotypes

Actions:
- Go to data download for selected cohort
- Go to assay detail for selected cohort
- Go to cohort manager

LGRC Research Portal

Analysis Tools

PAGE DETAILS

- Very minimal parameters and options…here just 2 cohorts of interest, maybe p-value cutoff

Generates comprehensive report

Edit in place results – Don’t set parameters, edit the results

Analysis goes into queue, email notification when finished
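The "two cohorts plus a p-value cutoff" interface described above can be sketched with a permutation test on the difference in mean expression. This is an illustration under stated assumptions, not the portal's method: the slides do not say which test is run, the function names and gene names are hypothetical, and no multiple-testing correction is applied.

```python
import random
from statistics import mean


def permutation_pvalue(cohort1, cohort2, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean expression."""
    rng = random.Random(seed)
    observed = abs(mean(cohort1) - mean(cohort2))
    pooled = list(cohort1) + list(cohort2)
    n1 = len(cohort1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of samples
        if abs(mean(pooled[:n1]) - mean(pooled[n1:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


def differential_genes(expr1, expr2, cutoff=0.05):
    """Return {gene: p} for genes present in both cohorts with p <= cutoff.

    expr1 and expr2 map a gene name to that gene's expression values in each cohort.
    """
    results = {g: permutation_pvalue(expr1[g], expr2[g])
               for g in expr1 if g in expr2}
    return {g: p for g, p in results.items() if p <= cutoff}


if __name__ == "__main__":
    # Hypothetical expression values for two made-up genes.
    cohort_1 = {"GENE_A": [5.1, 5.3, 4.9, 5.2, 5.0, 5.4],
                "GENE_B": [2.0, 2.2, 1.9, 2.1, 2.0, 2.3]}
    cohort_2 = {"GENE_A": [7.0, 7.2, 6.9, 7.1, 7.3, 6.8],
                "GENE_B": [2.1, 2.0, 2.2, 1.9, 2.3, 2.0]}
    print(differential_genes(cohort_1, cohort_2, cutoff=0.05))
```

The only inputs are the two cohorts and the cutoff, which matches the minimal-parameter design described on the slide.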

Cohort 1:

Cohort 2:

Set 1

Set 2

Start Analysis
View analysis parameters

Job Status: Running

Job name: My job 1

Analysis of Differential Expression: My Job 1

Supervised Analysis

Meta analysis

Unsupervised analysis

PAGE DETAILS

-Very minimal parameters and options.

Generates comprehensive report

Edit in place results – Don’t set parameters, edit the results

Accordion style result sections

Generate PDF report of analysis

Analysis goes into queue, email notification when finished

Before I came here I was confused about this subject.

After listening to your lecture, I am still confused but at a higher level.

- Enrico Fermi (1901-1954)

Genomics is here to stay

Acknowledgments

<johnq@jimmy.harvard.edu>

http://compbio.dfci.harvard.edu

The Gene Index Team: Corina Antonescu, Valentin Antonescu, Fenglong Liu, Geo Pertea, Razvan Sultana, John Quackenbush

Microarray Expression Team: Stefan Bentink, Thomas Chittenden, Aedin Culhane, Kristina Holton, Jane Pak, Renee Rubio

Eskitis Institute: Christine Wells, Alan Mackay-Sim

(Former) Stellar Students: Martin Aryee, Kaveh Maghsoudi, Jess Mar

Systems Support: Stas Alekseev, Sys Admin

Array Software Hit Team: Katie Franklin, Eleanor Howe, Sarita Nair, Jerry Papenhausen, John Quackenbush, Dan Schlauch, Raktim Sinha, Joseph White

A: Joan Coraccio, Juliana Coraccio

Center for Cancer Computational Biology

Mick Correll, Howie Goodell, Kristina Holton, Jerry Papenhausen, Patricia Papastamos, John Quackenbush

http://cccb.dfci.harvard.edu

Shameless self-promotion