Lecture 11 03 23 2011 - University of Notre Dame Microsoft PowerPoint - Lecture_11_03_23_2011 Author...

Error Analysis

• The big questions are:

- What value best represents a set of measurements and how reliable is it?

Error Analysis

• An ICP analysis yields LOTS of data: typically 9 individual measurements for each analyte in each sample (3 replicates * 3 runs)

• The report sequence or ASCII data file gives you the mean (average) of the replicates and the standard deviation of the replicates

• What does that REALLY mean?

Error Analysis

• Assuming a normal (Gaussian) distribution of the data:

• 1st question - what is the most probable value for the population?

• 1st approach - take the mean (average)

• 2nd approach - take the median

– useful if there are few measurements and asymmetry is involved

Error Analysis

• Mean: x̄ = Σ(xi) / n, where the xi are the individual measurements and n is the number of measurements

• Median = the value of the middle item, or the mean of the values of the two middle items, when the data are arranged in an increasing or decreasing order of magnitude

Error Analysis

• E.g.: 42, 39, 31, 35, and 38

• Median = 38
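
As a quick check of this example, the mean and median of the five values can be computed with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are assumptions and not part of the lecture.

```python
import statistics

# The five example values from the slide
values = [42, 39, 31, 35, 38]

mean_value = statistics.mean(values)      # sum of the values divided by n
median_value = statistics.median(values)  # middle item of the sorted values

print(sorted(values))  # [31, 35, 38, 39, 42]
print(mean_value)      # 37
print(median_value)    # 38
```

Sorting makes the median visible by eye: 38 is the third of the five ordered values.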

Error Analysis

• Generally speaking, the median of a set of n items, where n is odd, is the value of the ((n + 1) / 2)th largest item

– E.g. the median of 25 numbers is the value of the (25 + 1) / 2 = 13th largest number

Error Analysis

• By definition, the mean is heavily influenced by extreme values while the median is not

• Outliers strongly affect the mean value, but have little to no effect on the median

• E.g.:

• xi = 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 9, 9, 17

• Median = 6

• Mean = 6.867
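
To see how much the single extreme value (17) pulls on each statistic, one can recompute both with and without it. The sketch below uses the slide's data; dropping the 17 is done purely for illustration and is not a recommendation to discard data.

```python
import statistics

# The 15 example values from the slide; 17 is the extreme value
xi = [4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 9, 9, 17]
xi_without_17 = [x for x in xi if x != 17]

median_all = statistics.median(xi)
mean_all = statistics.mean(xi)
median_trimmed = statistics.median(xi_without_17)
mean_trimmed = statistics.mean(xi_without_17)

print(median_all, round(mean_all, 3))          # 6 6.867
print(median_trimmed, round(mean_trimmed, 3))  # 6.0 6.143
```

The median stays at 6 in both cases, while the mean shifts by about 0.7.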

Error Analysis

• Dispersion of a population

– OK, you’ve decided to take the mean. How reliable is it?

• Two measures to judge reliability of the mean:

- Range and Standard Deviation

Error Analysis

• Range: R = xmax – xmin

• strongly influenced by extreme values

• with n being sufficiently large (> 9), you can use quantiles to buffer the range against outliers

• E.g. for n = 10, cast out the highest and lowest values; the range of the remaining n’ = 8 values is called the 10-90% range

• If n is large enough, you can bracket the data in the 17-83% range (which will bound ~2/3 of all data)
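
The full range and the 10-90% range for n = 10 can be sketched as follows. The function names, the `n_cut` parameter, and the readings are all hypothetical, chosen only to show how one outlier inflates the full range.

```python
def value_range(data):
    """Full range R = xmax - xmin."""
    return max(data) - min(data)


def trimmed_range(data, n_cut=1):
    """Range after casting out the n_cut lowest and n_cut highest values."""
    if len(data) <= 2 * n_cut:
        raise ValueError("not enough data points to trim")
    kept = sorted(data)[n_cut:len(data) - n_cut]
    return kept[-1] - kept[0]


# Hypothetical set of n = 10 readings with one high outlier (14.5)
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 14.5]

full = value_range(readings)                # 14.5 - 9.7
trimmed = trimmed_range(readings, n_cut=1)  # 10.3 - 9.8, the 10-90% range

print(round(full, 2))     # 4.8
print(round(trimmed, 2))  # 0.5
```

With n = 10, `n_cut=1` removes one value from each end, leaving the n’ = 8 values described above.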

Error Analysis

• Taking the mean (or median) of the bounded range should provide a good estimate of the true value

• Why the 17th and 83rd quantiles? They encompass ~2/3 of the data points, which is very close to the interval of the mean +/- 1 standard deviation (SD), which holds ~68% of a normal population
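
The correspondence between the 17-83% range and the mean +/- 1 SD can be checked numerically for an ideal normal distribution. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) and assumes the population is exactly normal, which real data only approximate.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Fraction of a normal population inside mean +/- 1 SD
coverage_1sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Positions of the 17th and 83rd percentiles, in units of SD from the mean
z17 = standard_normal.inv_cdf(0.17)
z83 = standard_normal.inv_cdf(0.83)

print(round(coverage_1sd, 3))        # 0.683
print(round(z17, 2), round(z83, 2))  # -0.95 0.95
```

The 17th and 83rd percentiles therefore sit at roughly 0.95 SD on either side of the mean, just inside the +/- 1 SD interval.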

Error Analysis

• The Standard Deviation

– most common measure of dispersion

– robust statistic - will provide meaningful data even if the population does not strictly meet the definition of the normal population

Error Analysis

• Relative standard deviation (or coefficient of variation c.o.v.)

• RSD (%) = 100 * sx / mean, where sx is the standard deviation of the measurements
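
A minimal sketch of the RSD calculation, reusing the five values from the earlier median example. It assumes sx is the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes; the lecture does not say which form the instrument software uses, and the function name is hypothetical.

```python
import statistics


def relative_standard_deviation(data):
    """RSD (%) = 100 * sx / mean, with sx the sample SD (n - 1 denominator)."""
    mean_value = statistics.mean(data)
    if mean_value == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100 * statistics.stdev(data) / mean_value


values = [42, 39, 31, 35, 38]  # mean is 37

sx = statistics.stdev(values)
rsd = relative_standard_deviation(values)

print(round(sx, 2))   # 4.18
print(round(rsd, 1))  # 11.3
```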

Error Analysis

• Predominant sources of error in ICP-MS analysis:

– weighing/volume error

– error in standard concentration

– instrument error

Error Analysis

• How does one evaluate the true ‘error’ or uncertainty associated with an ICP-MS analysis?

• Sample preparation errors are probably greater than instrument error

Error Analysis

• So, strictly speaking, replicates should be conducted by preparing multiple aliquots of the same sample and running them multiple times

• However, this is generally impractical to do for all samples

• A better idea is to do this for maybe one or two samples

Error Analysis

• Reproducibility and Repeatability

• Reproducibility = standard deviation of a method over a long time frame

• Repeatability = standard deviation of a method over a short time frame (with all controllable conditions being the same)
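
The two definitions differ only in which measurements go into the standard deviation. The sketch below uses invented numbers (not real ICP data) for the same sample measured within one short session and across several widely spaced sessions.

```python
import statistics

# Hypothetical results for one sample and one method, arbitrary units
same_session = [10.02, 9.98, 10.05, 10.01, 9.97]          # short time frame
across_months = [10.02, 9.80, 10.25, 10.10, 9.75, 10.30]  # long time frame

# Repeatability: SD over a short time frame, conditions held constant
repeatability_sd = statistics.stdev(same_session)

# Reproducibility: SD of the same method over a long time frame
reproducibility_sd = statistics.stdev(across_months)

print(round(repeatability_sd, 3), round(reproducibility_sd, 3))
```

In this made-up data set the long-term spread is the larger of the two, which is the usual expectation because more sources of variation come into play over time.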

Error Analysis

• Reporting ICP Data

– Due to systematic and random errors, ICP data should rarely, if ever, be reported to more than 3 or 4 significant digits
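
One possible way to apply this when exporting results is to round each value to a fixed number of significant digits. The helper below is a hypothetical sketch, not part of any ICP software.

```python
import math


def round_sig(value, sig_digits=3):
    """Round value to sig_digits significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 3 for 1234.5 and -3 for 0.0012
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_digits - 1 - exponent)


print(round_sig(6.86667, 3))    # 6.87
print(round_sig(12345.678, 4))  # 12350.0
```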