RNA Processing Data Analysis Lisa Bloomer Green April 26, 2010



● Real-time PCR
● Splicing
● The Problem
● Solving the Simplified Problem
● Making it Complicated Again

Real-time PCR

DNA is copied again and again, so the quantity present grows exponentially.

http://pathmicro.med.sc.edu/pcr/realtime-home.htm

Cycle Number   Amount of DNA
0              1
1              2
2              4
3              8
4              16
5              32
6              64
7              128
8              256
9              512
10             1024

Cycle Number   Amount of DNA
24             16777216
25             33554432
26             67108864
27             134217728
28             268435456
29             536870912
30             1073741824
31             1400000000
32             1500000000
33             1550000000
34             1580000000
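The amounts listed for cycles 0 through 30 equal 2 raised to the cycle number, while those for cycles 31 to 34 fall below it. A minimal Python sketch of that comparison follows; the function and variable names are illustrative and do not come from the slides.

```python
def ideal_amount(cycle, initial=1):
    """Amount after `cycle` rounds of perfect doubling from `initial` copies."""
    return initial * 2 ** cycle

# Amounts copied from the table for the last few cycles.
table = {
    28: 268435456,
    29: 536870912,
    30: 1073741824,
    31: 1400000000,
    32: 1500000000,
    33: 1550000000,
    34: 1580000000,
}

for cycle, listed in table.items():
    # A ratio of 1.0 means the listed amount is exactly 2**cycle.
    ratio = listed / ideal_amount(cycle)
    print(f"cycle {cycle}: listed / ideal = {ratio:.3f}")
```

The ratio stays at 1 through cycle 30 and drops afterwards, which is the levelling-off that a purely exponential model cannot describe.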

Real-time PCR

The output is the number of cycles it takes for the amount of DNA to pass a chosen threshold (the threshold cycle, Ct).

http://pathmicro.med.sc.edu/pcr/realtime-home.htm

The Problem

Use real-time PCR to help discover how often alternative splicing occurs in a given region of RNA.

Alternative Splicing: A mechanism by which different forms of mature mRNAs (messenger RNAs) are generated from the same gene.

http://www.medterms.com/script/main/art.asp?articlekey=16831

[Figure: the rpo chloroplast operon (rpoB, rpoC1, rpoC2) and its mRNA forms: rpoB-rpoC1-rpoC2, rpoB-rpoC1, and rpoC1-rpoC2]

Preliminary Data

[Figure: chart of the cycle counts below for the rpo operon]

Region        rpoB    IR1r    rpoC1   IR2r    rpoC2
Cycle count   19.84   26.88   22.52   23.77   15.71

[Figure: the psb chloroplast operon (psbI, psbK) and its mRNA]

Preliminary Data

[Figure: chart of the cycle counts below for the psb operon]

Region        psbI    IRp     psbK
Cycle count   14.86   20.02   14.94

A Simplified Version

Model the curve using an exponential function.

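$$N_t = N_0 \cdot 2^{C_t}$$

Here $N_0$ is the initial amount of DNA, $C_t$ is the threshold cycle, and $N_t$ is the amount present when the threshold is reached.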

A Simplified Version

Because we use the same threshold for each quantity, we will assume that Nt is the same for each quantity.

$$N_t = N_0 \cdot 2^{C_t}$$

$$N_0 = \frac{N_t}{2^{C_t}} = N_t \cdot 2^{-C_t}$$
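Since $N_t$ is assumed to be the same for every quantity, it cancels whenever two initial amounts are compared. The short derivation below spells this out; the superscripts $(a)$, $(b)$, $(k)$ and the share symbol $p_a$ are notation introduced here for illustration, not taken from the slides.

```latex
\frac{N_0^{(a)}}{N_0^{(b)}}
  = \frac{N_t \cdot 2^{-C_t^{(a)}}}{N_t \cdot 2^{-C_t^{(b)}}}
  = 2^{\,C_t^{(b)} - C_t^{(a)}},
\qquad
p_a = \frac{N_0^{(a)}}{\sum_k N_0^{(k)}}
    = \frac{2^{-C_t^{(a)}}}{\sum_k 2^{-C_t^{(k)}}}
```

Under these assumptions a difference of one cycle corresponds to a factor of two in initial amount, and the percentages depend only on the measured cycle counts.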

Example

                                  psbI           IRp            psbK
Cycle count                       14.86          20.02          14.94
Initial count (proportional to)   3.36 × 10^-5   9.41 × 10^-7   3.18 × 10^-5
Percentage                        50.7%          1.4%           47.9%
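The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under the simplified assumptions (perfect doubling, a common Nt); the function name `initial_shares` is hypothetical, and the region labels follow the psb operon figure.

```python
def initial_shares(cycle_counts):
    """Share of the total initial amount for each region.

    Assumes perfect doubling and the same threshold amount for every
    region, so each initial amount is proportional to 2**(-Ct).
    """
    relative = {name: 2.0 ** -ct for name, ct in cycle_counts.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

# Cycle counts from the psb example.
psb = {"psbI": 14.86, "IRp": 20.02, "psbK": 14.94}

for name, share in initial_shares(psb).items():
    print(f"{name}: 2^-Ct = {2.0 ** -psb[name]:.2e}, share = {share:.1%}")
```

Run on the psb cycle counts, this gives shares of about 50.7%, 1.4%, and 47.9%. The same function can be applied to the rpo cycle counts.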

Example

                rpoB           IR1r           rpoC1          IR2r           rpoC2
Cycle count     19.84          26.88          22.52          23.77          15.71
Initial count   1.07 × 10^-6   8.10 × 10^-9   1.66 × 10^-7   6.99 × 10^-8   1.87 × 10^-5
Percentage      5.34%          0.04%          0.83%          0.35%          93.44%

Adding Complexity

● Need to estimate variability for N0

● The process may not be 100% efficient. We probably get less than double the amount with each cycle.

● The shape of the curve shows behavior that does not fit the exponential model.

● Nt might be different for the different quantities.

Variability Estimation

The standard deviation of Ct is estimated to lie between 0.036 and 0.367 cycles, with an average of 0.183 cycles (Rutledge and Côté).

                                                 psbI             IRp              psbK
Cycle count                                      14.86            20.02            14.94
Confidence interval (using 0.183 for st. dev.)   (14.50, 15.22)   (19.66, 20.38)   (14.58, 15.30)
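The listed intervals are consistent with a normal-theory interval of the form Ct ± z·sd. The sketch below assumes z = 1.96 (a 95% interval); the slides do not state this multiplier, so it is an assumption here, and `ct_interval` is a hypothetical helper name.

```python
def ct_interval(ct, sd=0.183, z=1.96):
    """Interval ct +/- z * sd for a single threshold-cycle measurement.

    sd = 0.183 is the average standard deviation quoted above;
    z = 1.96 is an assumed multiplier (95% normal interval).
    """
    half_width = z * sd
    return (ct - half_width, ct + half_width)

for name, ct in [("psbI", 14.86), ("IRp", 20.02), ("psbK", 14.94)]:
    low, high = ct_interval(ct)
    print(f"{name}: ({low:.2f}, {high:.2f})")
```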

Variability Estimation

To get confidence intervals for the percentages, we need to know how these cycle numbers interact.

                                  psbI                         IRp                          psbK
Cycle count confidence interval   (14.50, 15.22)               (19.66, 20.38)               (14.58, 15.30)
2^-Ct confidence interval         (2.6 × 10^-5, 4.3 × 10^-5)   (7.3 × 10^-7, 1.2 × 10^-6)   (2.5 × 10^-5, 4.1 × 10^-5)
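Moving an interval from the cycle scale to the 2^-Ct scale only requires transforming the endpoints, because 2^-Ct is a decreasing function of Ct. A small sketch, with a hypothetical function name:

```python
def amount_interval(ct_low, ct_high):
    """Map an interval for Ct to an interval for 2**(-Ct).

    2**(-Ct) decreases as Ct increases, so the upper Ct endpoint
    gives the lower amount endpoint and vice versa.
    """
    return (2.0 ** -ct_high, 2.0 ** -ct_low)

# Cycle-count intervals from the table.
intervals = {
    "psbI": (14.50, 15.22),
    "IRp": (19.66, 20.38),
    "psbK": (14.58, 15.30),
}

for name, (ct_low, ct_high) in intervals.items():
    low, high = amount_interval(ct_low, ct_high)
    print(f"{name}: ({low:.1e}, {high:.1e})")
```

This handles each region separately. It does not give intervals for the percentages, which depend on how the three cycle counts vary together.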

Efficiency

E is a number between 0 and 1 that quantifies the efficiency of the doubling process.

E can be estimated from the standard curve.
● Fit a line to the standard curve of Ct against ln N0.
● $E_s = e^{-1/\text{slope}} - 1$
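One way to sketch the estimate, assuming the model Nt = N0 (1 + E)^Ct: the standard curve of Ct against ln N0 is then a line with slope -1/ln(1 + E), so E = e^(-1/slope) - 1. The dilution series below is synthetic, generated from a known efficiency purely to check the arithmetic; the threshold, the amounts, and the function name are all hypothetical.

```python
import math

def efficiency_from_standard_curve(initial_amounts, cycle_counts):
    """Estimate E from a least-squares line of Ct against ln(N0).

    Assumes Nt = N0 * (1 + E)**Ct, so the slope is -1 / ln(1 + E).
    """
    xs = [math.log(n0) for n0 in initial_amounts]
    ys = list(cycle_counts)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return math.exp(-1.0 / slope) - 1.0

# Synthetic, noise-free dilution series (hypothetical values).
true_efficiency = 0.9
threshold_amount = 1e10
amounts = [1e2, 1e3, 1e4, 1e5, 1e6]
cycles = [math.log(threshold_amount / n0) / math.log(1 + true_efficiency)
          for n0 in amounts]

estimate = efficiency_from_standard_curve(amounts, cycles)
print(f"estimated E = {estimate:.3f}")
```

With real measurements the points scatter around the line, and the estimate inherits that variability.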

Estimates of E range from 0.85 to 1 (Fronhoffs et al.).

$$N_t = N_0 \cdot 2^{C_t} \quad\text{becomes}\quad N_t = N_0 \cdot (1 + E)^{C_t}$$

Example

                          psbI           IRp            psbK
Cycle count               14.86          20.02          14.94
Initial count, E = 1      3.36 × 10^-5   9.41 × 10^-7   3.18 × 10^-5
Initial count, E = 0.95   4.90 × 10^-5   1.56 × 10^-6   4.64 × 10^-5
Initial count, E = 0.90   7.21 × 10^-5   2.63 × 10^-6   6.85 × 10^-5
Initial count, E = 0.85   1.07 × 10^-4   4.48 × 10^-6   1.02 × 10^-4

                       psbI    IRp     psbK
Cycle count            14.86   20.02   14.94
Percentage, E = 1      50.7%   1.4%    47.9%
Percentage, E = 0.95   50.5%   1.6%    47.9%
Percentage, E = 0.90   50.3%   1.8%    47.8%
Percentage, E = 0.85   50.2%   2.1%    47.7%
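The effect of efficiency on the percentages can be explored with a short sketch. It assumes the same E applies to every region, which is itself an open question in this talk; the function name is hypothetical.

```python
def shares_with_efficiency(cycle_counts, efficiency):
    """Shares of initial amount when each cycle multiplies by (1 + E).

    Assumes a common threshold amount and a common efficiency, so each
    initial amount is proportional to (1 + E)**(-Ct).
    """
    base = 1.0 + efficiency
    relative = {name: base ** -ct for name, ct in cycle_counts.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

psb = {"psbI": 14.86, "IRp": 20.02, "psbK": 14.94}

for efficiency in (1.0, 0.95, 0.90, 0.85):
    shares = shares_with_efficiency(psb, efficiency)
    row = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"E = {efficiency:.2f}: {row}")
```

Lower efficiency raises the estimated share of the least abundant region slightly, while the two larger shares barely move.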

Efficiency

Can we assume that the efficiency is the same for each quantity?

How does efficiency affect variability?

(There is evidence that efficiency and threshold cycle are dependent.)

The Curve is Not Exponential

Logistic growth? Does this change the percentages?

Is Nt the same for the different quantities?

Probably not.

The fluorescence measured is affected by mass as well as number.

Are the masses known? If not, can we assume that the masses are similar?

Summary

A simplified version of the problem has a straightforward solution, which may be enough for general purposes.

Reinserting the complexity into the problem leads to interesting statistical issues.

References

● Fronhoffs et al., “A method for the rapid construction of cRNA standard curves in quantitative real-time reverse transcription polymerase chain reaction,” Molecular and Cellular Probes (2002) 16, 99-110.

● Rutledge and Côté, “Mathematics of quantitative kinetic PCR and the application of standard curves,” Nucleic Acids Research (2003) 31, no. 16.

● Swillens et al., “Instant evaluation of the absolute initial number of cDNA copies from a single real-time PCR curve,” Nucleic Acids Research (2004) 32, no. 6.