Steps involved in microarray analysis after the...
Transcript of Steps involved in microarray analysis after the...
![Page 1: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/1.jpg)
1
Steps involved in microarray analysisafter the experiments
• Scanning slides to create images• Conversion of images to numerical data• Processing of raw numerical data• Further analysis
– Clustering– Integration with genomic data
Steps involved in microarray analysisafter the experiments
• Scanning slides to create images• Conversion of images to numerical data• Processing of raw numerical data• Further analysis
– Clustering– Integration with genomic data
![Page 2: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/2.jpg)
2
Processing raw microarray data
• Main aims: – To identify and reduce the noise found in
microarray data
– To identify differentially expressed genes/fragments in an experiment
Methods used so far…
• Many have been ad hoc
![Page 3: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/3.jpg)
3
Methods used so far…
• Many have been ad hoc– Visual identification (!)
Methods used so far…
• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals
![Page 4: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/4.jpg)
4
Methods used so far…
• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals
Methods used so far…
• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals
• Microarray expts are
![Page 5: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/5.jpg)
5
Methods used so far…
• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals
• Microarray expts are– Easy to do, but very sensitive– Lot of noise, artifacts, errors– ~20,000 spots per slide
Methods used so far…
• Many have been ad hoc– Visual identification (!)– 2-fold change in hybridisation signals– Ranking ratios of hybridisation signals
• Microarray expts are– Easy to do, but very sensitive– Lot of noise, artifacts, errors– ~20,000 spots per slide
• Without good processing the results can be completely wrong
![Page 6: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/6.jpg)
6
This is changing…..
0
200
400
600
800
10001997
1998
1999
2000
2001
2002
Year
PubM
ed h
its
clusteringprocessing
microarray
This is changing…..• Opposing camps
0
200
400
600
800
1000
1997
1998
1999
2000
2001
2002
Year
PubM
ed h
its
clusteringprocessing
microarray
![Page 7: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/7.jpg)
7
This is changing…..• Opposing camps
– Ad hoc camp• Biologists• Too simple and
may be wrong
0
200
400
600
800
10001997
1998
1999
2000
2001
2002
Year
PubM
ed h
its
clusteringprocessing
microarray
This is changing…..• Opposing camps
– Ad hoc camp• Biologists• Too simple and
may be wrong
– Complicated maths
• Biostatisticians• Incomprehensible• Idealised datasets0
200
400
600
800
1000
1997
1998
1999
2000
2001
2002
Year
PubM
ed h
its
clusteringprocessing
microarray
![Page 8: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/8.jpg)
8
This is changing…..• Opposing camps
– Ad hoc camp• Biologists• Too simple and
may be wrong
– Complicated maths
• Biostatisticians• Incomprehensible• Idealised datasets
– Must strike a balance
0
200
400
600
800
10001997
1998
1999
2000
2001
2002
Year
PubM
ed h
its
clusteringprocessing
microarray
Getting the most out of your microarray
• Processing the raw data
• Cleaning and assessing the quality of your data
• Identifying differentially hybridised spots
• How do you get the correct list of differentially expressed genes out of ~20000 data points?
![Page 9: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/9.jpg)
9
Getting the most out of your microarray
• Processing the raw data
• Cleaning and assessing the quality of your data
• Identifying differentially hybridised spots
• How do you get the correct list of differentially expressed genes out of ~20000 data points?
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
![Page 10: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/10.jpg)
10
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
The data – a GenePix file
![Page 11: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/11.jpg)
11
GenePix file – what do we look at?
• Measure red and green intensities separately
• Signal intensity = foreground – background
GenePix file – what do we look at?
• Measure red and green intensities separately
• Signal intensity = foreground – background
• Foreground signal
![Page 12: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/12.jpg)
12
GenePix file – what do we look at?
• Measure red and green intensities separately
• Signal intensity = foreground – background
• Foreground signal
• Background signal
GenePix file – what do we look at?
• Measure red and green intensities separately
• Signal intensity = foreground – background
• Foreground signal
• Background signal
• Ratio = red/green or green/red
![Page 13: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/13.jpg)
13
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Background correction
• Genepix background– Median intensity of
immediate area surrounding each spot
• But– Very variable
between individual spots
– Artifactual background from smudges
• Therefore add to noise
![Page 14: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/14.jpg)
14
Background correction
• Calculate the average background from surrounding area of spot
• Recommendation of 3x3 – 5x5 area
• Repeat for red and green separately
Background correction
• Still have variable distribution of intensity
• Much smoother distribution of background intensity
• Remove artifactual smudges
![Page 15: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/15.jpg)
15
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Red/green normalisation
• Normalise red and green intensities – spots with equal hybridisation should have similar
intensities– ie ratio ~ 1 for similarly expressed genes
• Otherwise, will have wrong list of differentially expressed genes
• Multiply one set of intensities by a scale factor
• But must obtain scale factor
![Page 16: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/16.jpg)
16
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
0 1 2 3 4
red/green ratio
num
ber o
f spo
tsMajority method
• Majority method:– Assume that most spots do
not change expression level
– Find average ratio (red/green intensity)
– Scale factor is the amount by which need to multiply the ratio so it is 1.
Majority method
• But several issues– Hybridisation levels differ according to location on slide
– Scanning properties for different colours differ at different intensities
• Scale factor must take these into account
![Page 17: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/17.jpg)
17
Intensity considerations
• Simple scale factor fits a straight line
0
10000
20000
30000
40000
50000
60000
0 10000 20000 30000 40000 50000 60000
green intensity
red
inte
nsity
Intensity considerations
• Simple scale factor fits a straight line
• Distribution is curved– Difference in ratio can be
10-fold different depending on intensity
0
10000
20000
30000
40000
50000
60000
0 10000 20000 30000 40000 50000 60000
green intensity
red
inte
nsity
![Page 18: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/18.jpg)
18
Intensity considerations
• Simple scale factor fits a straight line
• Distribution is curved– Difference in ratio can be
10-fold different depending on intensity
• Different scale factors should be used for different intensities
0
10000
20000
30000
40000
50000
60000
0 10000 20000 30000 40000 50000 60000
green intensity
red
inte
nsity
Positional considerations
• Different regions of the slide have different levels of hybridization
• Difference in average ratio can be ~10 fold
• Different scale factors needed for each region of slide
![Page 19: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/19.jpg)
19
Positional considerations
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
0 1 2 3 4
red/green ratio
num
ber o
f spo
ts
Scale factor for each spot• Calculate the scale factor using surrounding area of spot
• Recommendation of 12x12 – 20x20 area
Positional considerations
• Raw data has large positional dependence
Before
![Page 20: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/20.jpg)
20
Positional considerations
• Raw data has large positional dependence
• Normalisationwithout pos. shifted ratios towards red intensity, but does not remove artifact
Before No positional data
Positional considerations
• Raw data has large positional dependence
• Normalisationwithout pos. shifted ratios towards red intensity, but does not remove artifact
• Positional normalisationremoves most of the artifact
Before No positional data Positional data
![Page 21: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/21.jpg)
21
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Combining multiple experiments
• Often have replicates of the same experiment
• Do you have to scale between them?• How do you combine the data from them?
![Page 22: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/22.jpg)
22
Replicate scaling• Different slides may have
different spreads in intensity and ratios
• Adjust spread of distributions by measuring standard deviation
• Estimate of variance by quartiles, fitting distribution, or bootstrapping
0
500
1000
1500
2000
2500
0
200
400
600
800
1000
1200
1400
Combining replicate data
• Take medians of ratios?• Take means of intensity values?• Take weighted means?• Treat each experiment individually and see
which spots are consistently differntially expressed?
![Page 23: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/23.jpg)
23
Getting the most out of your microarray
• Processing the raw data
• Cleaning and assessing the quality of your data
• Identifying differentially hybridised spots
• How do you get the correct list of differentially expressed genes out of ~20000 data points?
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
![Page 24: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/24.jpg)
24
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
Filter bad spots
• Small spots• Smudgy spots• Non-round spots
etc…
![Page 25: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/25.jpg)
25
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
Artifactual array regions
Remove regions that have artifactual background after correction
Background artifacts usually vary from slide to slide – ie not consistent
![Page 26: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/26.jpg)
26
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
Measurement of chip quality
• How successful is an experiment?– How consistent are the hybridisations
within experiments
![Page 27: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/27.jpg)
27
Measurement of intrachip quality• Genes are often placed as
neighbouring pairs
• Variation between them provides a measure of variation within an experiment– (x1 – x2)/(x1+x2)
• Mean gives overall variation for expt.
• Remove outliers
• Are the same spots always inconsistent across replicate expts?0
20
40
60
80
100
120
140
-1.2-1.02-0.84-0.66-0.48 -0.
3-0.12 0.0
60.240.42 0.6 0.7
80.961.14
Variation
Num
ber o
f spo
ts
Experimentquality
Outliers
Intrachip variability
0
20
40
60
80
100
120
140
-1.2-1.02-0.84-0.66-0.48 -0.
3-0.12 0.0
60.240.42 0.6 0.7
80.961.14
Variation
Num
ber o
f spo
ts
Mean = 3.7%
![Page 28: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/28.jpg)
28
Intrachip variability – single experiment quality
0
20
40
60
80
100
120
140
-1.2-1.02-0.84-0.66-0.48 -0.
3-0.12 0.0
60.240.42 0.6 0.7
80.961.14
Variation
Num
ber o
f spo
ts
Mean = 3.7% Mean = 10.7%
Filtering poor duplicates
0
1
2
3
4
5
0 1 2 3 4 5
ratio 1
ratio
2
![Page 29: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/29.jpg)
29
Filtering poor duplicates
0
1
2
3
4
5
0 1 2 3 4 5
ratio 1
ratio
2
0
1
2
3
4
5
0 1 2 3 4 5
ratio 1
ratio
2
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
![Page 30: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/30.jpg)
30
Measurement of interchip quality• How consistent are
replicate experiments?
• Use same measure for equivalent spots :(x1 – x2)/(x1+x2)
• Mean gives overall variation for expt. With respect to other replicates
• Remove outlier experiments
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
![Page 31: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/31.jpg)
31
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
![Page 32: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/32.jpg)
32
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000 60000
Green intensity
Red
inte
nsity
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000 60000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
0 2000 4000 6000 8000 10000 12000 14000
Green intensity
Red
inte
nsity
![Page 33: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/33.jpg)
33
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000 60000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
0 2000 4000 6000 8000 10000 12000 14000
Green intensityR
ed in
tens
ity
Mean var. = 8.4% Mean var. = 8.2%
Mean var. = 15.2% Mean var. = 32.3%
Mean var. = 7.3%
0
20
40
60
80
100
120
140
-1.2-1.02-0.84-0.66-0.48 -0.
3-0.12 0.0
60.240.42 0.6 0.7
80.961.14
Variation
Num
ber o
f spo
ts
Measurement of replicate chip quality
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
70000
0 10000 20000 30000 40000 50000 60000
Green intensity
Red
inte
nsity
0
10000
20000
30000
40000
50000
60000
0 2000 4000 6000 8000 10000 12000 14000
Green intensity
Red
inte
nsity
Mean var. = 8.4% Mean var. = 8.2%
Mean var. = 15.2% Mean var. = 32.3%
Mean var. = 7.3%
0
20
40
60
80
100
120
140
-1.2-1.02-0.84-0.66-0.48 -0.
3-0.12 0.0
60.240.42 0.6 0.7
80.961.14
Variation
Num
ber o
f spo
ts
![Page 34: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/34.jpg)
34
Measurement of interchip quality
• Quite easy to see by eye, but provides a systematic and objective method for determining consistency
• Use to measure overall consistency of replicates
• Identify and remove bad replicate expts
• Also identify regions of the slide that are consistently error prone
Getting the most out of your microarray
• Processing the raw data
• Cleaning and assessing the quality of your data
• Identifying differentially hybridised spots
• How do you get the correct list of differentially expressed genes out of ~20000 data points?
![Page 35: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/35.jpg)
35
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
Scoring differentially hybridized probes
• Identify spots that have differential hybridisation on red and green channels
• Many different methods being published
![Page 36: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/36.jpg)
36
Scoring differentially:ratio-based method
• Calculate red/green ratio for each spot
Scoring differentially:ratio-based method
• Calculate red/green ratio for each spot• Plot distribution:
0
500
1000
1500
2000
2500
3000
0 1 2
ratio (red/green)
Num
ber o
f spo
ts
![Page 37: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/37.jpg)
37
Scoring differentially:ratio-based method
• Calculate red/green ratio for each spot• Plot distribution:
• Define cut-off based on normal distribution(or use 2-fold cut-off)
0
500
1000
1500
2000
2500
3000
0 1 2
ratio (red/green)
Num
ber o
f spo
ts
Problems
![Page 38: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/38.jpg)
38
Problems
• Many experiments don’t give normal distribution
0
100
200
300
400
500
600
700
800
900
0 1 2 3 4 5
ratio (red/green)
Num
ber o
f spo
ts
Problems
• Many experiments don’t give normal distribution
• Ratios ignore the signal intensity– More stringent for high
intensity spots
0
100
200
300
400
500
600
700
800
900
0 1 2 3 4 5
ratio (red/green)
Num
ber o
f spo
ts
0
10000
20000
30000
40000
50000
60000
0 10000 20000 30000 40000 50000 60000
green intensity
red
inte
nsity
![Page 39: Steps involved in microarray analysis after the experimentsbioinfo.mbb.yale.edu/mbb452a/2002/expression.pdf · Processing flow chart Data input Background correction Cy5/Cy3 normalisation](https://reader034.fdocuments.us/reader034/viewer/2022052023/6038422b5f8a750f15745fc3/html5/thumbnails/39.jpg)
39
Processing flow chart
Data input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
Processing flow chartData input
Background
correction
Cy5/Cy3
normalisation
Merging
replicate
experiments
Score
differential
hybridisation
Spot
qualityArtifactual
regions
Duplicate spot
variability
Replicate experiment variability
• Raw microarray requires a lot of initial processing before being useful• Very important as can completely change the answers you get• The issues are beginning to emerge
• Different people have different ideas of how to resolve them• There is no standard method yet – each has problems
• Very labour intensive, but can be computed relatively easily