DNA microarray analysis 27612, January 2nd 2006
Ulrik de Lichtenberg
Center for Biological Sequence Analysis
Technical University of Denmark
Case story:
Analysis of the Cell Cycle
Outline
Introduction
• Cell division and cell cycle regulation
Experimental studies of cell cycle regulation
• DNA microarrays & cell culture synchronization
• Finding periodically expressed genes
My Research Projects
• Project 1: Comparing computational methods
• Project 2: Finding weakly expressed genes
• Project 3: Regulation of protein complexes
Cell division
How does one cell become many?
The importance of research in cell division
Leland Hartwell, Tim Hunt, Sir Paul Nurse
The 2001 Nobel Prize in Physiology or Medicine
“for their discoveries of key regulators of the cell cycle”
Cell division research is important for:
Cancer, cell factories, biomass increase,
cell differentiation, stem cells, signalling,
apoptosis, developmental disorders,
fertility, etc.
Cell division is one of the most
fundamental processes of all
living things and therefore of
enormous importance for our
understanding of life itself.
The cell cycle
A complex biological system
Different levels of control in the cell
The activity of a protein depends on
– regulation of its transcription (by activators/inhibitors)
– its mRNA stability
– the efficiency of translation (from mRNA to protein)
– stability of the mature protein
– the subcellular localization of the protein
– regulation of the protein function by binding of activators and inhibitors
– regulation of the protein function by post-translational modification (e.g. phosphorylation)
– targeted degradation of the protein (e.g. by ubiquitination)
The Just-in-time principle
• Some products you use all the
time, while others are only
needed once in a while
• People tend to buy or order
things right before they need
them…
Just-in-time synthesis in the cell
[Figure: Simon et al., Cell, 2001; axis labels: S phase, S phase]
Periodically expressed genes
Cell cycle regulated genes
DNA microarray technology
But cells grow out of synchrony?!
Arrest-and-release synchronization
• Temperature-sensitive mutants – unable to pass a
certain point in the cell cycle at high temperatures
• Alpha factor – all cells arrest in the same stage
until the factor is removed from the culture
Microarray analysis of cell cycle regulation
Synchronized cell culture → Microarrays → Expression Profile
1. Take samples from a synchronized cell culture
2. Hybridize each sample to a microarray chip
3. Normalize to make the chips comparable
4. String the data-points together to a profile
5. Look for genes whose expression is periodic
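Steps 3 and 4 can be sketched roughly as follows. This is only an illustration: it assumes the data are already log-ratios in a genes × time-points matrix, and it uses median-centering of each chip as one simple normalization choice, which is not necessarily what the studies discussed here used. The function name and the example numbers are hypothetical.

```python
import numpy as np

def build_profiles(log_ratios):
    """Median-center each chip (column); each row is then one gene's profile."""
    x = np.asarray(log_ratios, dtype=float)
    # Step 3: subtract each chip's median so the chips become comparable.
    chip_medians = np.nanmedian(x, axis=0)
    normalized = x - chip_medians
    # Step 4: rows are time-ordered, so each row is an expression profile.
    return normalized

# Hypothetical data: 4 genes (rows) measured at 3 time points (columns).
raw = np.array([[1.0, 2.0, 0.0],
                [0.0, 1.0, -1.0],
                [2.0, 3.0, 1.0],
                [1.0, 2.0, 0.0]])
profiles = build_profiles(raw)
```

After this step, `profiles[i]` is the profile of gene `i`, ready for step 5.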
Time-series versus comparative studies
• Most microarray experiments compare measurements from condition A with condition B to find those genes that are differentially expressed
– the typical analysis is a statistical test, e.g. t-test
• Time-series experiments are usually aimed at studying how gene expression changes over time (i.e. identifying genes that show a specific pattern of expression)
– you can’t make a t-test, instead you have to look for some pattern!
Quiz
“You have 6000 profiles with about 20 time-points in each”
“How do you find out which of them are regulated during the cell cycle and in which phase they are expressed?”
Measuring Periodicity
Fourier scoring: extracting the “signal” at the
interdivision frequency.
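A minimal sketch of a Fourier score, assuming the interdivision time (`period`) is known and the sampling times are given in the same unit. The magnitude measures how much of the profile oscillates at the interdivision frequency, and the phase gives an estimate of when in the cycle the gene peaks. The function name and the normalization by the number of points are illustrative choices, not taken from the slides.

```python
import numpy as np

def fourier_score(profile, times, period):
    """Return (score, peak_time) for one expression profile.

    score: magnitude of the signal at the frequency 1/period.
    peak_time: time within one cycle at which the fitted cosine peaks.
    """
    x = np.asarray(profile, dtype=float)
    t = np.asarray(times, dtype=float)
    x = x - x.mean()                  # remove the constant offset
    omega = 2.0 * np.pi / period      # interdivision (angular) frequency
    a = np.sum(x * np.cos(omega * t))
    b = np.sum(x * np.sin(omega * t))
    score = np.hypot(a, b) / len(x)
    peak_time = (np.arctan2(b, a) / omega) % period
    return score, peak_time

# Hypothetical example: samples every 10 min over two 60-min cycles,
# with a gene that peaks 20 min into each cycle.
times = np.arange(0, 120, 10)
profile = np.cos(2.0 * np.pi * (times - 20) / 60)
score, peak_time = fourier_score(profile, times, period=60)
```

Ranking all profiles by `score` gives a candidate list of periodic genes, and `peak_time` suggests the phase in which each gene is expressed.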
Fourier score distribution
The thresholding problem
[Figure panels: Yeast, Human]
WARNING!
The next slide may be the most important slide of your career…
…so pay attention!
Sensitivity versus Specificity
• High specificity, low sensitivity: find little, of which most is right
• High specificity, high sensitivity: find a lot, of which most is right
• Low specificity, low sensitivity: find little, of which most is wrong
• Low specificity, high sensitivity: find a lot, of which most is wrong
Sensitivity = how many of the edible mushrooms do you accept
Specificity = how many of the mushrooms you accept are edible (this quantity is often called precision)
[Figure: edible and poisonous mushrooms]
The thresholding problem
• Every threshold corresponds to a sensitivity and a specificity
• There is (almost always) a trade-off between the two
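The trade-off can be made concrete with a small sketch that uses the mushroom definitions from the previous slide (sensitivity = fraction of the truly periodic genes that are accepted, specificity = fraction of the accepted genes that are truly periodic). The scores and labels below are made up for illustration, and returning a specificity of 1.0 when nothing is accepted is just a convention chosen here.

```python
def sensitivity_specificity(scores, is_periodic, threshold):
    """Evaluate one threshold against known labels (slide definitions)."""
    accepted = [p for s, p in zip(scores, is_periodic) if s >= threshold]
    true_pos = sum(accepted)            # accepted genes that are periodic
    total_pos = sum(is_periodic)        # all periodic genes
    sensitivity = true_pos / total_pos if total_pos else 0.0
    specificity = true_pos / len(accepted) if accepted else 1.0
    return sensitivity, specificity

# Hypothetical scores (higher = more periodic) and true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
for thr in (0.75, 0.35, 0.0):
    print(thr, sensitivity_specificity(scores, labels, thr))
```

Lowering the threshold in this toy example raises the sensitivity while the specificity drops.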
Microarray analysis of cell cycle regulation
Synchronized cell culture → Microarrays → Expression Profile
Budding yeast
Cho et al., 1998
Spellman et al., 1998
de Lichtenberg et al., 2005
Fission yeast
Rustici et al., 2004
Peng et al., 2004
Oliva et al., 2005
Human cells
Cho et al., 2001
Whitfield et al., 2002
Plant cells
Menges et al., 2003
Periodicity Analysis
Computational methods
Spellman et al., 1998
Zhao et al., 2001
Johansson et al., 2003
Wichert et al., 2004
Luan & Li, 2004
Lu et al., 2004
Liu et al., 2004
de Lichtenberg et al., 2005
Ahdesmaki et al., 2005
Benchmark of computational methods
Expression Profile → Periodicity Analysis
Motivation?
• The overlap between different
computational methods is
remarkably poor when they are
applied to the same data.
• Although most analyses have
concluded that the periodically
expressed subset of the yeast
genome comprises about 300 to
800 genes, nearly 1800 different
genes have been proposed in total.
Benchmarking of computational methods
Computational Methods
• Cho et al., 1998 – visual inspection
• Spellman et al., 1998 – Fourier scoring and correlation
• Zhao et al., 2001 – single pulse model
• Johansson et al., 2003 – partial least squares regression
• Luan & Li, 2004 – cubic spline fitting
• Lu et al., 2004 – Bayesian mixture model
• de Lichtenberg et al., 2004/2005 – permutation-based statistics
Benchmark sets
B1: 113 known periodic genes from small-scale experiments
B2: 352 genes bound by at least one of nine known cell cycle transcription factors, excluding those in set B1
B3: 518 genes annotated as “cell cycle and DNA processing” in MIPS, excluding “meiosis” and those in set B1
How do we benchmark?
We assume that the benchmark sets B1-B3
are enriched in cell cycle regulated genes.
We then benchmark the ability of each
computational method to retrieve these
genes from the data (rank them high)
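One simple way to express "rank them high" is a retrieval curve: walk down a method's ranked gene list and count how many benchmark genes have been found so far. The sketch below is illustrative only; the gene identifiers and the function name are hypothetical, and the actual benchmark plots may have been constructed differently.

```python
def retrieval_curve(ranked_genes, benchmark_set):
    """Cumulative number of benchmark genes among the top-k ranked genes."""
    found = 0
    curve = []
    for gene in ranked_genes:
        if gene in benchmark_set:
            found += 1
        curve.append(found)
    return curve

# Hypothetical ranking from one method (best first) and a tiny benchmark set.
ranked = ["g3", "g1", "g7", "g2", "g5"]
b1 = {"g1", "g2"}
curve = retrieval_curve(ranked, b1)
```

A method whose curve rises faster retrieves more of the benchmark genes near the top of its list; a random ranking gives a roughly straight line.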
Comparison of different methods
[Figure: benchmark curves for each method; legend: random (dashed), Cho et al., Spellman et al., Zhao et al., Johansson et al., Luan & Li, Lu et al., de Lichtenberg et al.]
Words of wisdom…
“Before you criticize someone, you
should walk a mile in their shoes.
That way when you criticize them,
you’ll be a mile away and you’ll
have their shoes.”
-- Anonymous
Why are some methods better?
“Is the gene significantly regulated?”
– p(regulation): compare the actual standard deviation to a randomly sampled background distribution
“Is the expression pattern periodic?”
– p(periodicity): compare the actual Fourier signal to a distribution made by randomly shuffling the data points within each profile
Combined score with penalty function
– Combi score: combination of the p-values for regulation and periodicity, with penalty terms that require both components to be significant.
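The shuffling idea behind p(periodicity) can be sketched as a permutation test. This is an illustration under assumptions, not the published implementation: the number of permutations, the add-one correction, and the function names are choices made here, and the combined score with its penalty terms is not reproduced because its formula is not given on the slide.

```python
import numpy as np

def fourier_magnitude(x, t, period):
    """Magnitude of the mean-centered profile at the frequency 1/period."""
    omega = 2.0 * np.pi / period
    x = x - x.mean()
    return np.hypot(np.sum(x * np.cos(omega * t)),
                    np.sum(x * np.sin(omega * t)))

def periodicity_p_value(profile, times, period, n_perm=1000, seed=0):
    """Fraction of shuffled profiles with a Fourier signal >= the observed one."""
    rng = np.random.default_rng(seed)
    x = np.asarray(profile, dtype=float)
    t = np.asarray(times, dtype=float)
    observed = fourier_magnitude(x, t, period)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(x)   # same values, random time order
        if fourier_magnitude(shuffled, t, period) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical profile: a clean oscillation sampled over two 60-min cycles.
times = np.arange(0, 120, 10)
periodic_profile = np.cos(2.0 * np.pi * (times - 20) / 60)
p_periodic = periodicity_p_value(periodic_profile, times, period=60)
```

Because shuffling keeps the values but destroys their order, a profile with a large amplitude but no periodic pattern does not get a low p-value here; that is what separates this test from the regulation test.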
Regulation versus periodicity
[Figure: benchmark curves; legend: random (dashed), regulation alone, periodicity alone, combined, combined + re-norm, combined + consistency]
Better methods give better overlap
Why not perfect overlap?
• Different experimental conditions, such as cell culture synchronization methods
• Handling of samples, amplification, hybridization, etc.
• DNA microarray data are noisy and replicates rarely overlap completely.
• Signal-to-noise ratio is low for weakly expressed and weakly regulated genes
Predicting new cell cycle regulated genes
[Diagram: periodic genes and non-periodic genes are used to train a model; the model then classifies the grey zone area (??) into periodic proteins and non-periodic proteins]
Prediction of cell cycle regulated proteins
ORF      Gene  ANN score              Description
YIL169C  -     0.98       2.8   C     Protein of unknown function
YNL322C  KRE1  0.98       1.7   -     cell wall beta-glucan assembly
YJL078C  PRY3  0.98       5.5   CS    Pathogen Related in Sc, contains homology to the plant PR-1 class of pathogen related proteins
YDL038C  -     0.98       5.3   -     Protein of unknown function
YOR009W  TIR4  0.98       1.0   -     Protein of unknown function
YJR151C  DAN4  0.97       1.3   -     Delayed Anaerobic Gene
YOL155C  -     0.97       3.0   C     Protein of unknown function
YLR286C  CTS1  0.97       9.3   CSK   Endochitinase
YOL030W  GAS5  0.97       4.1   SZ    Protein of unknown function
YOR220W  -     0.97       2.5   -     Protein of unknown function
YNR044W  AGA1  0.97       6.5   SK    anchorage subunit of a-agglutinin
YGR023W  MTL1  0.97       1.8   -     Mid-Two Like 1
YMR317W  -     0.97       2.1   -     Protein of unknown function
YCR089W  FIG2  0.97       3.4   -     Factor-Induced Gene 2: expression is induced by the mating pheromones, a and alpha factor
YLR194C  -     0.96       5.4   S     Protein of unknown function
YIL011W  TIR3  0.96       2.6   S     TIP1-related
YGR161C  -     0.96       2.4   -     Protein of unknown function
YBR067C  TIP1  0.96       5.9   CSZK  cold- and heat-shock induced protein of the Srp1p/Tip1p family of serine-alanine-rich proteins
YNL327W  EGT2  0.96       8.7   SK    cell-cycle regulation protein, may be involved in the correct timing of cell separation after cytokinesis
YLR332W  MID2  0.96       1.5   -     Protein required for mating
No bias towards strongly expressed genes
[Figure: x-axis: estimated gene expression level]
Experimental validation of predictions
Experimental setup
• Synchronization with a temperature-sensitive
mutant strain (cdc15-2)
• Fermentation in chemostat
• Sampling every 20 min
• Custom made probes (OligoWiz)
• Custom made microarrays (febit ag)
• New computational method
The results
Examples of new discoveries
New genes are weakly expressed
[Figure panels: Predicted genes, Confirmed genes]
de Lichtenberg et al., Yeast, 2005
de Lichtenberg et al., JMB, 2003
Modeling through data integration
Networks – from topology to dynamics
[Diagram labels: Interactions, Dynamics, Dynamic Models]
Jeong, Nature, 2001
Ideker & Lauffenburger, TRENDS in Biotechnology, 2003
Extracting a cell cycle interaction network
Data sources
• cell cycle gene expression data
• protein-protein interaction data
Node selection
• list of periodically expressed proteins with peak timing
• include non-periodic proteins that are strongly connected to periodic proteins
Interactions
• score and filter by network topology and subcellular localization
→ extract cell cycle network
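The node selection step can be sketched as below. Everything specific here is an assumption for illustration: "strongly connected" is approximated as having at least `min_periodic_partners` periodic interaction partners, the protein names are placeholders, and the scoring and filtering by network topology and subcellular localization is not modeled.

```python
from collections import defaultdict

def select_nodes(periodic, interactions, min_periodic_partners=2):
    """Periodic proteins plus non-periodic ones linked to enough periodic ones."""
    periodic = set(periodic)
    partners = defaultdict(set)
    for a, b in interactions:
        partners[a].add(b)
        partners[b].add(a)
    selected = set(periodic)
    for protein, neighbours in partners.items():
        if protein in periodic:
            continue
        # Assumed criterion for "strongly connected to periodic proteins".
        if len(neighbours & periodic) >= min_periodic_partners:
            selected.add(protein)
    return selected

def extract_network(periodic, interactions, min_periodic_partners=2):
    """Keep only the interactions whose two partners were both selected."""
    nodes = select_nodes(periodic, interactions, min_periodic_partners)
    return [(a, b) for a, b in interactions if a in nodes and b in nodes]

# Hypothetical input: A and B are periodic; X, Y, Z are not.
periodic = {"A", "B"}
interactions = [("A", "X"), ("B", "X"), ("A", "Y"), ("Y", "Z")]
network = extract_network(periodic, interactions)
```

In this toy input, X is kept because it interacts with two periodic proteins, while Y and Z are dropped.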
Visualizing the network
Color represents transcriptional timing (G1, S, G2, M)
“Interacting proteins are usually expressed close in time”
Coverage of interaction space
Most stable complexes were
captured, at least partially, in
the network. Still, no interactions
could be found for 2/3 of the
dynamic proteins.
Why?
• Some may be missed subunits
of complexes already in the
network
• Some may not participate in
protein-protein interactions
• But, the majority probably
participate in transient
interactions that are not so
well captured by current
interaction assays
Networks as discovery tools
Observation: The network places 30+ uncharacterized proteins in a temporal interaction context. The network thus generates detailed hypotheses about their function.
Observation: The network contains entire novel modules and complexes.
Static “scaffold proteins” play a major role
Observation: Static (scaffold)
proteins comprise about a third
of the network and participate
in interactions throughout the
entire cycle
Just-in-time complex assembly
• Expression-based time mapping corresponds closely to the timing of complex assembly and activity.
• Hypothesis: We propose a just-in-time assembly mechanism where the expression of the dynamic subunits of each complex controls the timing of final assembly, and thereby the complex function.
Phosphorylation of dynamic proteins
The yeast cyclin-dependent kinase, Cdc28/Cdk1, regulates a large number of cell cycle proteins by phosphorylation.
By detailed analysis of the network we discover that this regulation specifically affects the dynamic proteins, but not the static proteins.
An experimentally determined* set of 332 Cdc28 targets constitutes:
6% of all yeast proteins
8% of the static proteins
27% of the dynamic proteins
This over-representation is significant at P < 10^-4
* Ubersax et al., Nature, 2003
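The significance of such an over-representation can be estimated with a hypergeometric test, sketched below. The slide does not say which test was actually used, and apart from the 332 targets it gives only percentages, so the other counts in the usage example are hypothetical placeholders.

```python
from math import comb

def enrichment_p_value(n_total, n_targets, n_subset, n_overlap):
    """Hypergeometric upper tail: P(X >= n_overlap) when n_subset proteins
    are drawn at random from n_total, of which n_targets are kinase targets."""
    upper = min(n_targets, n_subset)
    favourable = sum(
        comb(n_targets, k) * comb(n_total - n_targets, n_subset - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_total, n_subset)

# Hypothetical counts: 6000 proteins in total, 332 targets, 300 dynamic
# proteins of which 81 (27%) are targets.
p = enrichment_p_value(n_total=6000, n_targets=332, n_subset=300, n_overlap=81)
```

A very small `p` means that drawing that many targets by chance in a random subset of the same size would be highly unlikely.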
Degradation signals in dynamic proteins
PEST degradation signals are
found in the sequence of many
proteins known to be selectively
targeted for degradation.
We discover that PEST signals
are significantly more frequent
among the dynamic proteins
and among the Cdc28-targets.
[Diagram: dynamic proteins are subject to transcriptional regulation, post-translational regulation and targeted degradation]
Are the special properties of dynamic proteins conserved?
“What’s the chance of finding the overrepresentation of PEST regions observed among the dynamic proteins by chance?”
Budding yeast (S. cerevisiae): P < 10^-2
Fission yeast (S. pombe): P < 10^-2
Human (HeLa cells): P < 10^-7
[Diagram: dynamic proteins are subject to transcriptional regulation, post-translational regulation and targeted degradation]
Acknowledgments
Center for Biological Sequence Analysis, Denmark
Søren Brunak
Thomas Skøt Jensen
Anders Fausbøll
Rasmus Wernersson
Henrik Bjørn Nielsen
The rest of CBS!
EMBL Biocomputing, Heidelberg, Germany
Lars Juhl Jensen
Peer Bork
Sean Hooper
Further reading…
All data and supplementary information available at:
http://www.cbs.dtu.dk/cellcycle
Comparison of Computational Methods for the Identification of Cell Cycle Regulated Genes
Ulrik de Lichtenberg, Lars Juhl Jensen, Anders Fausbøll, Thomas Skøt Jensen, Peer Bork and Søren Brunak
Bioinformatics, 21(7), 1164-1171, 2005
New weakly expressed cell cycle regulated genes in yeast
Ulrik de Lichtenberg, Rasmus Wernersson, Thomas Skøt Jensen, Henrik Bjørn Nielsen, Peer Schmidt, Flemming Bryde Hansen, Steen Knudsen and Søren Brunak
Yeast, 22(15), 1191-1201, 2005
Dynamic Complex Formation During the Yeast Cell Cycle
Ulrik de Lichtenberg, Lars Juhl Jensen, Søren Brunak and Peer Bork
Science, 307(5710), 724-727, 2005