How to do successful gene expression analysis
Jan Hellemans, PhD
Center for Medical Genetics
Biogazelle
qPCR meeting – June 25th 2010 – Sienna, Italy
Introduction
qPCR: reference technology for nucleic acid quantification
sensitivity and specificity
wide dynamic range
speed
relatively low cost
conceptual and practical simplicity
easy to perform ≠ easy to do it right
many steps involved
all need to be right
Choice of chemistry
Choice of RT
RNA quality assessment
Sample selection and handling
Data reporting
Sample extraction
RT and PCR primer design
cDNA synthesis strategy
Assay validation
Data analysis
prepare – cycle – report
Prepare
experiment design
• power analysis
• sample vs gene maximization
• run layout
samples
• preparation
• quality control
• pre amplification
assays
• design
• in silico validation
• empirical validation
reference gene
• selection
• validation
Power analysis
determination of the number of data points needed to reach statistical significance for a given
difference
variability
technical constraints
confidence interval (CI): ≥ 3 data points (~ critical t-value t*)
CI = SEM × t*
[Figure: critical t-value (y-axis, 0 to 14) plotted against the number of data points (x-axis: 2, 3, 4, 5, 10, 20, 100)]
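A minimal sketch of the CI = SEM × t* relation, assuming two-sided 95% intervals. The t* values are copied from a standard t-table for the numbers of data points shown on the slide (2, 3, 4, 5, 10, 20, 100); the function name `ci_half_width` and the example SD of 1 are illustrative assumptions, not part of the original slides.

```python
import math

# Two-sided 95% critical t-values from a standard t-table,
# keyed by number of data points n (degrees of freedom = n - 1).
T_CRIT_95 = {2: 12.706, 3: 4.303, 4: 3.182, 5: 2.776,
             10: 2.262, 20: 2.093, 100: 1.984}

def ci_half_width(sd, n):
    """Half-width of the 95% confidence interval: SEM x t*."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    return sem * T_CRIT_95[n]

# Same spread (SD = 1), increasing number of data points.
for n in sorted(T_CRIT_95):
    print(f"n = {n:3d}  t* = {T_CRIT_95[n]:6.3f}  "
          f"CI half-width = {ci_half_width(1.0, n):.2f}")
```

With two data points the half-width is about 9 times the SD; with three it drops to about 2.5 times, which is why three data points is treated as the practical minimum.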
Mann-Whitney test: nA + nB ≥ 8
Wilcoxon test: ≥ 6 pairs
http://www.cs.uiowa.edu/~rlenth/Power/
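The minimum sample sizes for the rank tests follow from counting arrangements. A small sketch, assuming exact two-sided tests, no ties, and a 0.05 significance threshold; the function names are illustrative.

```python
from math import comb

def min_p_mann_whitney(n_a, n_b):
    """Smallest achievable two-sided exact p-value for two unpaired groups.

    Only 2 of the C(n_a + n_b, n_a) possible rank arrangements are the
    most extreme ones (all of group A below, or all above, group B).
    """
    return min(1.0, 2 / comb(n_a + n_b, n_a))

def min_p_wilcoxon_signed_rank(n_pairs):
    """Smallest achievable two-sided exact p-value for paired data.

    Only 2 of the 2**n_pairs sign patterns are the most extreme ones
    (all differences positive, or all negative).
    """
    return min(1.0, 2 / 2 ** n_pairs)

for n_a, n_b in [(3, 3), (3, 4), (4, 4)]:
    print(f"Mann-Whitney {n_a}+{n_b}: min p = {min_p_mann_whitney(n_a, n_b):.4f}")
for n in (5, 6):
    print(f"Wilcoxon signed-rank, {n} pairs: min p = {min_p_wilcoxon_signed_rank(n):.4f}")
```

With 3 + 4 data points or 5 pairs, even the most extreme outcome cannot reach p < 0.05; with 4 + 4 data points or 6 pairs it can.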
Sample vs gene maximization
how to set up an experiment with
3 genes of interest (GOI) & 3 reference genes (REF)
11 samples (S) & 1 no template control (NTC)
[Figure: plate layouts for the two strategies]
sample maximization: plate columns hold S1 to S11 and the NTC, plate rows hold the assays (REF1, REF2, REF3, GOI1, GOI2, GOI3)
gene maximization: plate columns hold the assays (REF1, REF2, REF3, GOI1, GOI2, GOI3), plate rows hold S1 to S7 and the NTC in one run and S1, S2, S3, S8 to S11 and the NTC in the other
Sample vs gene maximization
sample maximization – to be preferred
no increase in variation due to absence of inter-run variation
suitable for retrospective studies and controlled experiments
gene maximization
introduces (often underestimated) inter-run variation
applicable to prospective studies or large studies in which the samples no longer fit in a single run
inter-run variation can be measured and corrected for using inter-run calibrators (IRC) through a procedure called inter-run calibration
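The difference between the two layouts can be sketched in a few lines. This is an illustration only: `split_runs`, its `per_run` parameter and the helper `runs_with_gene` are hypothetical names, and replicates are left out for brevity.

```python
def split_runs(samples, genes, strategy, per_run):
    """Return a list of runs; each run is a list of (sample, gene) reactions.

    strategy="sample": every run holds ALL samples for up to per_run genes.
    strategy="gene":   every run holds ALL genes for up to per_run samples.
    """
    if strategy == "sample":
        chunks = [genes[i:i + per_run] for i in range(0, len(genes), per_run)]
        return [[(s, g) for g in chunk for s in samples] for chunk in chunks]
    if strategy == "gene":
        chunks = [samples[i:i + per_run] for i in range(0, len(samples), per_run)]
        return [[(s, g) for s in chunk for g in genes] for chunk in chunks]
    raise ValueError("strategy must be 'sample' or 'gene'")

def runs_with_gene(runs, gene):
    """Number of runs in which a given gene is measured."""
    return sum(any(g == gene for _, g in run) for run in runs)

samples = [f"S{i}" for i in range(1, 12)] + ["NTC"]
genes = ["REF1", "REF2", "REF3", "GOI1", "GOI2", "GOI3"]

sample_max = split_runs(samples, genes, "sample", per_run=3)
gene_max = split_runs(samples, genes, "gene", per_run=6)

print(runs_with_gene(sample_max, "GOI1"))  # gene confined to a single run
print(runs_with_gene(gene_max, "GOI1"))    # gene spread over several runs
```

Under sample maximization each gene is confined to one run, so samples can be compared without inter-run variation; under gene maximization the same gene is spread over runs, which is where inter-run calibrators come in.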
Preparation
cDNA synthesis
most variable step in the workflow (> RT replicates)
different performance of the enzymes
linearity and yield are important
DNase treatment
retropseudogenes (15%) and single exon genes (5%)
on column vs. in solution
verify absence of DNA
• qPCR for genomic DNA target on RNA as input
Quality control – RNA integrity value
Evaluate integrity of 18S and 28S rRNA
Agilent Bioanalyzer
Bio-Rad Experion
Caliper GX
Qiagen QIAxcel
Shimadzu MultiNA
Quality control – 5’-3’ ratio
universally expressed low abundant reference
anchored oligo(dT) reverse transcription
increasing delta-Cq values upon artificial RNA degradation
[Figure: 5’ and 3’ amplicons (Cq 5’, Cq 3’) on a polyadenylated transcript; bar chart of the 5’-3’ delta-Ct (y-axis, 0 to 9) after thermal degradation for samples 109, 109*, 109**, 275, 275*, 275**, 539, 539*, 539**]
Quality control – SPUD assay for inhibition
spiking of synthetic sequence lacking homology with any known human sequence into RNA, followed by RT-qPCR
SPUD + H2O: Cq 22
SPUD + heparin: Cq 27
SPUD + RNA1: Cq 22
SPUD + RNA2: Cq 25
SPUD + RNA3: Cq 22
ΔCq > 1: presence of inhibitors
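The ΔCq > 1 rule can be applied directly to the Cq values on this slide. A minimal sketch, assuming the water-spiked reaction serves as the uninhibited control; `inhibited_samples` is an illustrative name.

```python
# Cq of the SPUD spike measured alongside each sample (values from the slide).
spud_cq = {"H2O": 22.0, "heparin": 27.0, "RNA1": 22.0, "RNA2": 25.0, "RNA3": 22.0}

def inhibited_samples(cq_by_sample, control="H2O", threshold=1.0):
    """Return samples whose SPUD Cq is delayed by more than `threshold`
    cycles relative to the control reaction."""
    reference = cq_by_sample[control]
    return [name for name, cq in cq_by_sample.items()
            if name != control and cq - reference > threshold]

print(inhibited_samples(spud_cq))
```

Heparin, the known inhibitor, is flagged as expected, and so is RNA2 (ΔCq = 3), while RNA1 and RNA3 pass.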
Pre amplification
methods
WT-Ovation (NuGEN)
limited cycle PCR (PreAmp - Applied Biosystems)
preservation of differential expression (fold changes) before (B) and after (A) sample pre-amplification
(G1S1)B / (G1S2)B = (G1S1)A / (G1S2)A
G1B / G2B ≠ G1A / G2A
gene G, sample S, before B, after A
http://www.rtprimerdb.org
Assay design guidelines
location
sequence repeats, protein domains
splice variants
intron spanning vs intra exonic
short amplicons: 80-150bp
SNPs
primers
dTm < 2°C
identical Tm for all assays
maximum 2 GC in last 5 nucleotides
use software to design assays
Primer3(Plus), BeaconDesigner, RTprimerDB
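Two of the guidelines above (amplicon length and the 3’ end GC rule) are easy to check programmatically. A minimal sketch; the primer sequences are made up for illustration, primers are assumed to be written 5’ to 3’, and the function names are hypothetical. Dedicated design software remains the better choice for Tm and specificity.

```python
def gc_in_3prime_end(primer, window=5):
    """Count G/C bases in the last `window` nucleotides (3' end) of a primer."""
    tail = primer.upper()[-window:]
    return sum(base in "GC" for base in tail)

def check_assay(forward, reverse, amplicon_length):
    """Return a list of guideline violations (empty list = none found)."""
    problems = []
    if not 80 <= amplicon_length <= 150:
        problems.append(f"amplicon length {amplicon_length} bp outside 80-150 bp")
    for name, primer in (("forward", forward), ("reverse", reverse)):
        count = gc_in_3prime_end(primer)
        if count > 2:
            problems.append(f"{name} primer has {count} G/C in last 5 nucleotides")
    return problems

# Hypothetical primers, for illustration only.
forward = "AGCTGACCTGAAGTTCATCA"   # 3' end CATCA: 2 G/C
reverse = "TTGAAGTCGATGCCCTTCGG"   # 3' end TTCGG: 3 G/C
for problem in check_assay(forward, reverse, amplicon_length=120):
    print(problem)
```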
In silico assay validation
do thorough in silico assay evaluation
BLAST/BiSearch specificity analysis
mfold secondary structure
SNP analysis of primer annealing regions
splice variant specificity
streamline in silico analyses with RTprimerDB pipeline
Empirical assay validation
specificity
size analysis (only once)
• agarose or polyacrylamide gel
• capillary electrophoresis
melting curves (SYBR, repeated)
[sequence / restriction digest]
amplification efficiency
standard curve
• range & number dilution points
• representative sample
[single curve efficiency algorithms]
for absolute quantification
linear range and limit of detection
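Amplification efficiency is commonly derived from the slope of the standard curve (Cq against log10 of the input quantity) as E = 10^(-1/slope), where E = 2 corresponds to perfect doubling per cycle. A minimal sketch using an ordinary least-squares fit; the dilution series below is invented for illustration.

```python
import math

def standard_curve(quantities, cqs):
    """Fit Cq = slope * log10(quantity) + intercept by least squares.

    Returns (slope, intercept, efficiency), with efficiency E = 10 ** (-1 / slope).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1 / slope)
    return slope, intercept, efficiency

# Hypothetical 10-fold dilution series (relative input quantity, measured Cq).
quantities = [1000, 100, 10, 1]
cqs = [20.1, 23.5, 26.9, 30.4]

slope, intercept, efficiency = standard_curve(quantities, cqs)
print(f"slope = {slope:.2f}, E = {efficiency:.2f} ({(efficiency - 1) * 100:.0f}% efficiency)")
```

The range and number of dilution points determine how well the slope, and therefore E, is estimated.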
Single reference gene
quantitative RT-PCR analysis of 10 reference genes (belonging to different functional and abundance classes) on 85 samples from 13 different human tissues
[Figure: ACTB, HMBS, HPRT1, TBP and UBC across categories A to G (y-axis, 0 to 4)]
Single vs multiple reference genes
single reference gene
errors related to the use of a single reference gene:
• > 3 fold in 25% of the cases
• > 6 fold in 10% of the cases
multiple reference genes
developed a robust algorithm for assessment of expression stability of candidate reference genes
proposed the geometric mean of at least 3 reference genes for accurate and reliable normalisation
geNorm analysis in pilot study
Vandesompele et al. Genome Biol. 2002 Jun 18;3(7):RESEARCH0034.
geNorm
validation
insensitive to outliers
reduce most of the variation
statistically more significant results
accurate assessment of small expression differences
de facto standard for reference gene validation
2 400 citations of the geNorm technology
~12 000 geNorm software downloads in 112 countries
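The stability measure M at the heart of geNorm, as described in Vandesompele et al. 2002, is the average pairwise variation of a candidate gene with all other candidates, where pairwise variation is the standard deviation of the log2 expression ratios across samples. The sketch below covers only that M calculation, not the stepwise exclusion of the least stable gene or any other part of the geNorm software; the input data are invented.

```python
import math
import statistics

def genorm_m(rq_by_gene):
    """Return {gene: M}, where M is the mean over all other genes of the
    standard deviation of log2(gene / other) across samples.

    rq_by_gene maps gene name -> list of relative quantities,
    one per sample, in the same sample order for every gene.
    """
    m_values = {}
    for gene_j, rq_j in rq_by_gene.items():
        pairwise = []
        for gene_k, rq_k in rq_by_gene.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(rq_j, rq_k)]
            pairwise.append(statistics.stdev(log_ratios))
        m_values[gene_j] = statistics.mean(pairwise)
    return m_values

# Hypothetical data: REF1 and REF2 keep a constant ratio, REF3 does not.
rq = {
    "REF1": [1.0, 2.0, 4.0, 8.0],
    "REF2": [2.0, 4.0, 8.0, 16.0],
    "REF3": [1.0, 4.0, 2.0, 16.0],
}
for gene, m in genorm_m(rq).items():
    print(f"{gene}: M = {m:.2f}")
```

Lower M means more stable expression; here REF3 gets the highest M because its ratio to the other two genes varies between samples.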
genormPLUS
Cycle
cycle
• instrument
• chemistry
• controls
Instrument
fast PCR
fast ramping ≠ fast qPCR experiment
96-well vs 384-well
384-well system is slightly more expensive
384-well plates harder to pipet (multichannel pipets or pipetting robot)
384-well run gives 4x more data in same time
384-well plates require smaller volumes
plate homogeneity test
Chemistry
choose probes for
multiplexing
genotyping
absolute sensitivity (detection past cycle 40) (e.g. clinical-diagnostic setting, GMO detection)
choose SYBR Green I for
all other applications
low cost
seeing what you do
Controls
melting curve
unique melt peak for all samples?
replicates
delta-Cq < 0.5 cycles?
controls
negative control really blank?
delta-Cq samples/NTC > 5?
positive controls with expected Cq?
amplification plot shape (kinetic outlier detection)
Report
relative quantification
• efficiency correction
• multiple reference gene normalization
• inter-run calibration
• error propagation
bio statistical analysis
• biological replicates
• log transform data
• selection of statistical test
reporting guidelines
• RDML
• MIQE
Calculation methods
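Cq → RQ → NRQ → CNRQ
normalization: RQ → NRQ
inter-run calibration: NRQ → CNRQ
(toi: target of interest, soi: sample of interest, ref: reference gene, irc: inter-run calibrator)
general case, with efficiency E, n reference genes and n inter-run calibrators
$$RQ = E^{\Delta Cq}$$
$$NRQ = \frac{RQ_{toi}}{\sqrt[n]{\prod_{i=1}^{n} RQ_{ref_i}}}$$
$$CNRQ = \frac{NRQ_{soi}}{\sqrt[n]{\prod_{i=1}^{n} NRQ_{irc_i}}}$$
Hellemans et al. Genome Biol. 2007;8(2):R19.
special case, with E = 2, one reference gene and one inter-run calibrator
$$RQ = 2^{\Delta Cq}$$
$$NRQ = \frac{RQ_{toi}}{RQ_{ref}}$$
$$CNRQ = \frac{NRQ_{soi}}{NRQ_{irc}}$$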
Data processing - relative quantification
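The chain from Cq to CNRQ can be sketched as three small functions. This is an illustration under stated assumptions, not the qBase implementation: ΔCq is taken here as a calibrator Cq (for example the assay's mean Cq) minus the sample Cq, error propagation is omitted, and all function names and numbers are hypothetical.

```python
import statistics

def relative_quantity(cq, cq_calibrator, efficiency=2.0):
    """RQ = E ** deltaCq, with deltaCq = calibrator Cq - sample Cq (assumption)."""
    return efficiency ** (cq_calibrator - cq)

def normalized_rq(rq_target, rq_references):
    """NRQ = RQ of the target / geometric mean of the reference gene RQs."""
    return rq_target / statistics.geometric_mean(rq_references)

def calibrated_nrq(nrq_sample, nrq_calibrators):
    """CNRQ = NRQ of the sample / geometric mean of the inter-run calibrator NRQs."""
    return nrq_sample / statistics.geometric_mean(nrq_calibrators)

# Hypothetical numbers for one sample and one target.
rq_target = relative_quantity(cq=24.0, cq_calibrator=26.0, efficiency=2.0)
nrq = normalized_rq(rq_target, rq_references=[1.0, 4.0])
cnrq = calibrated_nrq(nrq, nrq_calibrators=[0.5, 2.0])
print(rq_target, nrq, cnrq)
```

Setting the efficiency to 2 and passing a single reference gene and a single calibrator reduces this to the simplified model.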
Quality controls
PCR replicates
∆Cq < 0.5 cycles
no template control
no signal (no Cq value)
Cq (NTC) > Cq (samples) + 5
reference gene stability
M < 0.5 (M < 1 for heterogeneous samples)
CV < 25% (CV < 50% for heterogeneous samples)
normalization factors
no unexpected high variation
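The replicate and no template control criteria above lend themselves to automated flagging. A minimal sketch: reading "Cq (samples)" as the highest sample Cq is a conservative assumption, `None` stands for "no signal", and the function names are illustrative.

```python
def replicates_ok(cqs, max_spread=0.5):
    """PCR replicates pass when the Cq spread stays below max_spread cycles."""
    return max(cqs) - min(cqs) < max_spread

def ntc_ok(ntc_cq, sample_cqs, min_gap=5.0):
    """NTC passes with no signal (None) or a Cq more than min_gap cycles
    later than the latest sample Cq."""
    if ntc_cq is None:
        return True
    return ntc_cq > max(sample_cqs) + min_gap

# Hypothetical plate data for one assay.
print(replicates_ok([24.1, 24.3]))            # spread 0.2 cycles
print(replicates_ok([24.1, 24.9]))            # spread 0.8 cycles
print(ntc_ok(None, [22.0, 25.5]))             # no signal in the NTC
print(ntc_ok(29.0, [22.0, 25.5]))             # only 3.5 cycles after latest sample
```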
Replicates
technical vs biological replicates
repeated measures vs. replication
PCR replicates (pipetting error & Poisson’s law)
RT replicates
repeated RNA extraction from same sample
repeated cell cultures / patient sampling
true biological replicates (from different subjects)
no statistics on repeated measures
type of replicates dictates conclusions that can be drawn
Statistical tests
relative quantities are not normally distributed
log transformation makes them more symmetrical
relevant tests in the field of relative quantification
comparison of 2 unpaired groups
• t test
• Mann-Whitney
• randomization test
comparison of 2 paired groups
• ratio t test (paired t test on log values)
• Wilcoxon signed-rank test
correlation analysis
• Pearson
• Spearman
linear regression
correct for multiple testing
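The ratio t test is a paired t test on log-transformed values. A minimal sketch that computes the t statistic, the degrees of freedom and the back-transformed mean fold change; the data are invented, the function name is illustrative, and the p-value lookup is left to a statistics package or a t-table.

```python
import math
import statistics

def ratio_t_test(before, after):
    """Paired t test on log2 ratios.

    Returns (t statistic, degrees of freedom, geometric mean fold change).
    Requires at least 2 pairs with non-identical log ratios.
    """
    log_ratios = [math.log2(a / b) for b, a in zip(before, after)]
    n = len(log_ratios)
    mean_log = statistics.mean(log_ratios)
    sem = statistics.stdev(log_ratios) / math.sqrt(n)
    return mean_log / sem, n - 1, 2 ** mean_log

# Hypothetical normalized relative quantities, paired per subject.
before = [1.0, 2.0, 4.0, 1.0]
after = [2.0, 4.0, 8.0, 4.0]

t_stat, df, fold = ratio_t_test(before, after)
print(f"t = {t_stat:.2f}, df = {df}, mean fold change = {fold:.2f}")
```

Working on log values means the test evaluates fold changes rather than absolute differences, which matches how relative quantities behave.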
MIQE
http://www.rdml.org/miqe
Bustin et al. Clin Chem. 2009 Apr;55(4):611-22.
authors
improve quality of qPCR experiments
reliable and unequivocal interpretation of results
reviewers and editors
assess technical merit
full disclosure of reagents and analysis methods
consumers of published research
published results easier to reproduce
MIQE checklist for authors, reviewers and editors
experimental design
sample
nucleic acid extraction
reverse transcription
target information
oligonucleotides
qPCR protocol
qPCR validation
data analysis
E – essential
D – desirable
RDML
http://www.rdml.org
Lefever et al. Nucleic Acids Res. 2009 Apr;37(7):2065-9.