Uncertainty in the Measurements
Novel Algorithms for the Quantification Confidence in Quantitative Proteomics with Stable Isotope Labeling*
Chongle Pan¹,²; David L. Tabb¹; Dale Pelletier¹; W. Hayes McDonald¹; Greg Hurst¹; Nagiza F. Samatova¹; Robert L. Hettich¹
¹ Oak Ridge National Laboratory, Oak Ridge, TN
² Genome Science and Technology, UT-ORNL
* Research support provided by the U.S. Department of Energy, Office of Biological and Environmental Research.
Uncertainty in the Measurements
Mass spectrometric measurement of a protein
– Mr = 23,564 Da ±10 Da (95% confidence)
Relative quantification of a protein in quantitative proteomics
– Abundance ratio = 1:1
– 95% confidence interval = [2:1, 1:2]
The principal aim
RelEx¹, ASAPratio², XPRESS³, MSQuan⁴
¹ Anal Chem, 2003. 75: p. 6912-21
² Anal Chem, 2003. 75: p. 6648-57
³ Nat Biotechnol, 2001. 19: p. 946-51
⁴ Nat Biotechnol, 2004. 22: p. 1139-45.
Experimental
1. Metabolic labeling of Rhodopseudomonas palustris with the stable isotope 15N
2. Standard mixtures of natural and 15N-labeled proteomes at the pre-determined mixing ratios
3. Shotgun proteomics analysis
– MS instrument: linear ion trap (LTQ, Finnigan)
– 2D-LC method: 24-hour MudPIT technique⁵
4. Protein identification
– Database searching: DBDigger⁶
– Identification filtering: DTASelect⁷
⁵ Int. J. of Mass Spec. 2002. 219: p. 245-251.
⁶ Anal Chem, 2003. 75: p. 6912-21
⁷ J. Proteome Res. 2002 1: p. 21-26.
Benchmark Data
Peptide I.D. filtering: 95% of true positive rate
Protein I.D. filtering: minimum of 2 peptides

Sample dynamic range    Mixing ratio (14N:15N)
no difference           1:1 (in duplicate)
5-fold difference       1:5, 5:1
10-fold difference      1:10, 10:1

14N:15N    Peptide hits    Protein hits
1:1        16,100          1,962
1:1        15,876          1,898
5:1        16,095          1,872
1:5        16,800          2,082
10:1       16,725          1,917
1:10       17,738          2,163

Data quality
Reproducibility
Block Diagram
mass spectral data (MS1 or mzXML format)
→ SIC reconstruction → selected ion chromatogram
→ peak detection (parallel paired covariance) → chromatographic peak
→ peptide quantification (principal component analysis) → peptide abundance ratio, confidence score
→ protein quantification (maximum likelihood estimation) → protein abundance ratio, confidence interval
Peak Detection
[Figure: light isotopologue SIC (S/N=3) and heavy isotopologue SIC (S/N=13), ion intensity vs. scan number; parallel paired covariance chromatogram (PPC) (S/N=42), covariance vs. scan number, with the peak boundaries marked]
Peak boundaries are defined as the local minima in the PPC that enclose all MS/MS scans matching the peptide
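The boundary rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not give the PPC formula, so the sketch assumes the PPC value at each scan is the product of the light and heavy SIC intensities, smoothed with a short moving average. The function names and the smoothing window are hypothetical.

```python
import numpy as np

def ppc_chromatogram(light, heavy, window=3):
    """Assumed PPC: per-scan product of the two SICs, moving-average smoothed."""
    light = np.asarray(light, dtype=float)
    heavy = np.asarray(heavy, dtype=float)
    product = light * heavy  # assumption: not the authors' stated formula
    kernel = np.ones(window) / window
    return np.convolve(product, kernel, mode="same")

def local_minima(y):
    """Indices of interior local minima, plus both end points as fallbacks."""
    interior = [i for i in range(1, len(y) - 1)
                if y[i] < y[i - 1] and y[i] <= y[i + 1]]
    return [0] + interior + [len(y) - 1]

def peak_boundaries(ppc, msms_scans):
    """Nearest local minima that enclose every MS/MS scan index."""
    minima = local_minima(ppc)
    left = max(i for i in minima if i <= min(msms_scans))
    right = min(i for i in minima if i >= max(msms_scans))
    return left, right

# Two co-eluting peak pairs; the peptide was identified at scans 14 and 16.
x = np.arange(50)
g1 = np.exp(-(x - 15) ** 2 / 18.0)
g2 = np.exp(-(x - 35) ** 2 / 18.0)
light = 100 * g1 + 40 * g2
heavy = 50 * g1 + 20 * g2
ppc = ppc_chromatogram(light, heavy)
print(peak_boundaries(ppc, [14, 16]))
```

Multiplying the two SICs suppresses noise that appears in only one isotopologue, which is consistent with the higher S/N reported for the PPC than for either SIC.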
Peptide Quantification
Peptide abundance ratios can be estimated by
– Peak height ratio
– Peak area ratio (ASAPratio, MSQuan, XPRESS)
– Linear regression (RelEx)
– Principal component analysis (PCA): ratio = tan(θ); signal-to-noise ratio = PCA-SNR
[Figure: light and heavy SICs, ion intensity vs. scan number; scatter plot of light isotopologue ion intensity vs. heavy isotopologue ion intensity with principal components PC1 and PC2 and the angle θ]
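A minimal sketch of the PCA estimate, under assumptions the slides do not confirm: the principal components are taken from the uncentered second-moment matrix of the (heavy, light) intensity pairs so that PC1 passes through the origin, θ is measured from the heavy-intensity axis, and PCA-SNR is taken as the square root of the ratio of the two eigenvalues. The function name `pca_ratio` is hypothetical.

```python
import numpy as np

def pca_ratio(light, heavy):
    """Return (light:heavy ratio, PCA-SNR) for paired SIC intensities in a peak."""
    data = np.column_stack([heavy, light]).astype(float)  # x = heavy, y = light
    moment = data.T @ data / len(data)         # assumption: uncentered PCA
    eigvals, eigvecs = np.linalg.eigh(moment)  # eigenvalues in ascending order
    pc1 = eigvecs[:, 1]                        # direction of largest variation
    ratio = pc1[1] / pc1[0]                    # tan(theta); sign of pc1 cancels
    noise = max(eigvals[0], np.finfo(float).tiny)  # guard against a zero PC2
    snr = np.sqrt(eigvals[1] / noise)          # assumption: sqrt of eigenvalue ratio
    return ratio, snr

# Simulated peak with a true light:heavy ratio of 5:1 plus unit-variance noise.
rng = np.random.default_rng(0)
profile = np.exp(-(np.arange(21) - 10) ** 2 / 18.0)
heavy = 100 * profile + rng.normal(0, 1, 21)
light = 500 * profile + rng.normal(0, 1, 21)
ratio, snr = pca_ratio(light, heavy)
print(f"ratio = {ratio:.2f}, log2(PCA-SNR) = {np.log2(snr):.1f}")
```

Unlike ordinary linear regression, PCA treats noise in both isotopologues symmetrically, and the second eigenvalue gives a noise measure for the same fit.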
Quantification Accuracy
[Figure: histograms of peptide counts vs. log2(ratio) for peak height ratio, peak area ratio, and PCA/linear regression, with the expected log2(ratio) marked; panels for mixing ratios 1:1, 5:1, 1:5, 10:1, and 1:10]
Quantification Confidence
[Figure: 2D histogram of peptide counts over log2(ratio) and log2(PCA-SNR) for the 5:1 mixture]
Bin the peptides by their log2(PCA-SNR) value
Bias: the deviation of the average estimated log2(ratio) from the expected log2(ratio)
Bias increases as PCA-SNR decreases below a threshold
Variance: the variability of the estimated log2(ratio)
Variance increases as PCA-SNR decreases
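The binning step can be illustrated as follows. This is a sketch, not the authors' code: the function name, the bin edges, and the choice of the sample variance (ddof=1) are assumptions made for the example.

```python
import numpy as np

def bias_variance_by_snr(log2_ratio, log2_snr, expected_log2_ratio, bin_edges):
    """Per-bin bias and variance of peptide log2(ratio) estimates.

    Returns a list of (bin_low, bin_high, bias, variance) tuples.
    """
    log2_ratio = np.asarray(log2_ratio, dtype=float)
    log2_snr = np.asarray(log2_snr, dtype=float)
    # digitize gives b such that bin_edges[b-1] <= value < bin_edges[b]
    which = np.digitize(log2_snr, bin_edges)
    rows = []
    for b in range(1, len(bin_edges)):
        selected = log2_ratio[which == b]
        if selected.size < 2:
            continue  # too few peptides to estimate a variance
        bias = selected.mean() - expected_log2_ratio
        rows.append((bin_edges[b - 1], bin_edges[b], bias, selected.var(ddof=1)))
    return rows

# Four made-up peptides from a 5:1 mixture, split into a low and a high SNR bin.
rows = bias_variance_by_snr(
    log2_ratio=[2.0, 2.2, 1.0, 1.4],
    log2_snr=[5.5, 5.2, 1.5, 1.2],
    expected_log2_ratio=np.log2(5),
    bin_edges=[1, 3, 6],
)
for low, high, bias, var in rows:
    print(f"log2(PCA-SNR) in [{low}, {high}): bias = {bias:+.2f}, variance = {var:.2f}")
```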
Quantification Confidence
Comet-like two-dimensional distribution
As log2(SNR) decreases,
– the spread of log2(ratio) estimates increases
– the average of log2(ratio) estimates regresses to zero
[Figure: log2(S/N) vs. log2(ratio) for the 1:5 mixture; log2(PCA-SNR) vs. log2(ratio) panels for 1:1, 1:10, 10:1, 1:1, and 5:1]
[Figure: log2(PCA-SNR) vs. | mean { log2(ratio) } | and log2(PCA-SNR) vs. standard deviation { log2(ratio) }, each for 1:1, 5:1 & 1:5, and 10:1 & 1:10]
The quantification bias and variance for peptides are linear functions of PCA-SNR
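One way to capture such a linear relationship is an ordinary least-squares line through the per-bin statistics. The sketch below assumes the fit is made against log2(PCA-SNR), as on the plot axes, and that predictions are floored at a small positive value; the helper names and the example numbers are hypothetical.

```python
import numpy as np

def fit_line(log2_snr_centers, values):
    """Least-squares line: values ~ slope * log2(PCA-SNR) + intercept."""
    slope, intercept = np.polyfit(log2_snr_centers, values, deg=1)
    return slope, intercept

def predict(slope, intercept, log2_snr, floor=0.05):
    """Evaluate the fitted line, floored so a predicted sd never reaches zero."""
    line = slope * np.asarray(log2_snr, dtype=float) + intercept
    return np.maximum(line, floor)

# Made-up per-bin standard deviations that fall as log2(PCA-SNR) rises.
centers = [1.0, 2.0, 3.0, 4.0]
sd_per_bin = [2.0, 1.5, 1.0, 0.5]
slope, intercept = fit_line(centers, sd_per_bin)
print(slope, intercept, predict(slope, intercept, [2.5, 6.0]))
```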
Protein Quantification
A series of theoretical probability distributions of peptide abundance ratio estimates at each PCA-SNR level
The maximum likelihood point estimate of a protein's abundance ratio is the ratio that best explains its measured peptides' estimated log2(ratio) at the calculated log2(PCA-SNR)
[Figure: log2(PCA-SNR) vs. log2(ratio), showing the mean and 2 sd of the theoretical distributions together with the measured peptides]
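The idea can be sketched with a grid search over candidate protein ratios. Everything numeric here is an assumption made for illustration: each peptide estimate is modeled as Gaussian, the `shrink` and `spread` functions are hypothetical stand-ins for the fitted bias and variance models, and the 95% interval is taken as the standard likelihood-ratio interval (log-likelihood within 1.92 of the maximum), which may differ from the authors' procedure.

```python
import numpy as np

def shrink(log2_snr):
    """Hypothetical bias model: fraction of the true log2(ratio) retained."""
    return np.clip(0.25 * np.asarray(log2_snr, dtype=float), 0.0, 1.0)

def spread(log2_snr):
    """Hypothetical variance model: sd falling linearly with log2(PCA-SNR)."""
    return np.maximum(1.5 - 0.25 * np.asarray(log2_snr, dtype=float), 0.2)

def protein_mle(pep_log2_ratio, pep_log2_snr, grid=None):
    """Return (point estimate, CI low, CI high) of the protein log2(ratio)."""
    if grid is None:
        grid = np.linspace(-7, 7, 1401)           # candidate log2(ratio), step 0.01
    r = np.asarray(pep_log2_ratio, dtype=float)
    s = np.asarray(pep_log2_snr, dtype=float)
    mu = grid[:, None] * shrink(s)[None, :]       # expected peptide estimate
    sd = spread(s)[None, :]
    loglik = np.sum(-0.5 * ((r[None, :] - mu) / sd) ** 2 - np.log(sd), axis=1)
    best = grid[np.argmax(loglik)]
    inside = grid[loglik >= loglik.max() - 1.92]  # half of chi-square(1) 95% quantile
    return best, inside.min(), inside.max()

# Three made-up peptides of one protein from a 5:1 mixture.
best, low, high = protein_mle([2.3, 2.4, 2.2], [5.0, 6.0, 4.5])
print(f"log2(ratio) = {best:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Because low-SNR peptides get a wide, shrunken distribution, they contribute little to the likelihood, so the estimate is dominated by the well-measured peptides without discarding the rest.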
Quantification Accuracy
RelEx filtering:
– > 0.7 correlation at 1
– > 0.4 correlation at 10
– > 3 signal-to-noise
– > 2 peptides
PRATIO filtering:
– > 2 PCA-SNR
– > 2 peptides
– < 4 95% confidence interval width for log2(ratio)

1:5       protein   MSE
RelEx     1090      2.0
PRATIO    1091      0.9
MSE: Mean Square Error
[Figure: histograms of protein counts vs. log2(ratio); panel labels 1:5, 5:1, 1:10]
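The PRATIO filter and the MSE metric can be sketched as below. The interpretation is an assumption: the PCA-SNR threshold is applied per peptide on the raw (not log2) scale, a protein needs more than two such peptides, and the MSE is computed on log2(ratio) against the known mixing ratio. The function and parameter names are hypothetical.

```python
import numpy as np

def passes_filter(peptide_snrs, ci_low, ci_high,
                  min_snr=2.0, min_peptides=2, max_ci_width=4.0):
    """Apply the three thresholds listed on the slide to one protein."""
    n_good = sum(snr > min_snr for snr in peptide_snrs)  # peptides above PCA-SNR 2
    ci_width = ci_high - ci_low                          # in log2(ratio) units
    return n_good > min_peptides and ci_width < max_ci_width

def mean_square_error(log2_estimates, expected_ratio):
    """Mean squared deviation of log2(ratio) estimates from the expected value."""
    estimates = np.asarray(log2_estimates, dtype=float)
    return float(np.mean((estimates - np.log2(expected_ratio)) ** 2))

# Two made-up protein estimates from the 1:5 mixture (expected log2 = -2.32).
print(passes_filter([3.0, 4.0, 5.0], ci_low=-3.0, ci_high=-1.5))
print(round(mean_square_error([-2.0, -3.0], expected_ratio=1 / 5), 4))
```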
Quantification Accuracy
[Figure: histograms of protein counts vs. log2(ratio); panels 1:1, 1:1, 5:1, 1:5, 1:10, 10:1]
protein   MSE
1115      0.4
1262      0.6

protein   MSE
1210      0.5
1242      0.5

protein   MSE
1090      2.0
1091      0.9

protein   MSE
1046      2.2
980       1.8

protein   MSE
1092      4.6
1061      1.6

protein   MSE
1070      4.0
1072      1.9
Confidence Interval Estimation
Point estimates (+) and 95% confidence interval estimates ( ----------- ) of protein abundance ratios
[Figure: proteins plotted against log2(ratio); panels 1:1, 1:1, 1:5, 5:1, 10:1, 1:10]
Conclusions
Three novel algorithms
– Parallel paired covariance for peak detection
– Principal component analysis for peptide quantification
– Maximum likelihood estimation for protein quantification
Improved Protein Quantification Accuracy
Rigorous Confidence Interval Estimation
The fully automated program with a graphical user interface is freely available for testing by contacting C. Pan (email: [email protected])