Page 1: CORM 2002: Uncertainty and Sampling

Dr. Richard Young
Optronic Laboratories, Inc.

Page 2: Introduction

Uncertainty budgets are a growing requirement of measurements.

Multiple measurements are generally required for estimates of uncertainty.

Multiple measurements can also decrease uncertainties in results.

How many measurement repeats are enough?

Page 3: Random Data Simulation

[Figure: PDF of Normal Distribution [μ=100, σ=10]. x-axis: Value (60 to 140); y-axis: Probability (0 to 0.04).]

Here is an example probability distribution function of some hypothetical measurements.

We can use a random number generator with this distribution to investigate the effects of sampling.

Page 4: Random Data Simulation

[Figure: Effect of sampling on mean and standard deviation. x-axis: Sample # (0 to 10000); y-axis: Value of data (20 to 160). Series: data.]

Here is a set of 10,000 data points…

Page 5: Random Data Simulation

[Figure: Effect of sampling on mean and standard deviation. x-axis: Sample # (0 to 10000); y-axes: Value of data or mean (20 to 160) and Value of sample standard deviation (0 to 35). Series: data, mean, Standard Deviation.]

Plotting Sample # on a log scale is better to show behaviour at small samples.

Page 6: Random Data Simulation

[Figure: Effect of sampling on mean and standard deviation. x-axis: Sample # (1 to 10000, log scale); y-axes: Value of data or mean (20 to 160) and Value of sample standard deviation (0 to 35). Series: data, mean, Standard Deviation.]

There is a lot of variation, but how is this affected by the data set?
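The running mean and sample standard deviation plotted on these slides can be reproduced with a short simulation. This is only a sketch: the seed, the function name `running_stats` and the use of Welford's update are illustrative choices, not details from the talk.

```python
import math
import random

def running_stats(values):
    """Yield (n, mean, sample_sd) after each value, using Welford's update."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for n, x in enumerate(values, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # The sample standard deviation is undefined for a single reading.
        sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
        yield n, mean, sd

rng = random.Random(2002)  # arbitrary seed so the run is repeatable
data = [rng.gauss(100.0, 10.0) for _ in range(10_000)]
history = list(running_stats(data))

# Report at roughly log-spaced sample counts, as on the log-scale plot.
for n in (2, 10, 100, 1000, 10_000):
    _, mean, sd = history[n - 1]
    print(f"n={n:>5}  mean={mean:7.2f}  sample sd={sd:6.2f}")
```

Changing the seed gives a different data set, which is one way to see how much the early part of the curves depends on the particular data drawn.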

Page 7: Sample Mean

[Figure: Sample means of normal distribution random numbers [μ=100, σ=10] vs number of samples. x-axis: Number of Samples (1 to 100, log scale); y-axis: Sample mean (70 to 130). A curve annotation involving 3 and n did not survive extraction intact.]

Here we have results for 200 data sets.
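A rough way to repeat the 200-data-set experiment is sketched here. The seed and helper name are hypothetical. The comparison against σ/√n is the standard result for independent readings, not something stated on the slide.

```python
import math
import random
import statistics

MU, SIGMA = 100.0, 10.0
N_SETS = 200  # number of independent data sets, as on the slide

def spread_of_means(n, rng):
    """Standard deviation of the sample mean across N_SETS data sets of size n."""
    means = [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(N_SETS)
    ]
    return statistics.stdev(means)

rng = random.Random(7)  # arbitrary seed
spread = {n: spread_of_means(n, rng) for n in (1, 10, 100)}
for n, s in spread.items():
    print(f"n={n:>3}  observed spread={s:5.2f}  sigma/sqrt(n)={SIGMA / math.sqrt(n):5.2f}")
```

With only 200 data sets the observed spread is itself an estimate, so it should agree with σ/√n to within several percent rather than exactly.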

Page 8: Sample Mean

[Figure: PDF for sample mean [μ=100, σ=10] with samples taken. x-axis: Value of calculated mean (80 to 120); y-axis: Probability (0 to 0.4). Curves for 2, 3, 5, 10 and 100 samples.]
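The narrowing of these curves follows the standard result for the mean of n independent normal readings, written out here as a reminder. The notation is an assumption, since the slide gives only the plot.

```latex
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i \sim \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right),
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```

With σ = 10 this gives a standard deviation of the mean of about 7.07, 5.77, 4.47, 3.16 and 1.00 for n = 2, 3, 5, 10 and 100.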

Page 9: Sample Standard Deviation

[Figure: Sample standard deviations of normal distribution [μ=100, σ=10] vs number of samples. x-axis: Number of Samples (1 to 100, log scale); y-axis: Sample standard deviation (0 to 35). Curve annotations did not survive extraction.]

Page 10: Sample Standard Deviation

[Figure: PDF of sample standard deviation [σ=10] with samples taken. x-axis: Value of calculated sample standard deviation (0 to 20); y-axis: Probability (0 to 0.6). Curves for 2, 3, 4, 5, 10 and 100 samples.]

The most probable value for the sample standard deviation of 2 samples is zero! Many samples are needed to make 10 most probable.
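The zero mode for 2 samples can be checked against the textbook distribution of the sample standard deviation for normal data, where s·√(n−1)/σ follows a chi distribution with n−1 degrees of freedom. The derivation here is a sketch added for reference and does not appear in the talk.

```latex
f(s) \propto s^{\,n-2} \exp\!\left(-\frac{(n-1)\,s^2}{2\sigma^2}\right), \quad s \ge 0
\qquad\Longrightarrow\qquad
s_{\text{mode}} = \sigma\sqrt{\frac{n-2}{n-1}}
```

For σ = 10 the mode is 0 at n = 2, about 7.07 at n = 3, about 9.43 at n = 10 and about 9.95 at n = 100, so it approaches 10 only slowly.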

Page 11: Cumulative Distribution

[Figure: CDF of sample standard deviation [σ=10] with samples taken. x-axis: Value of calculated sample standard deviation (0 to 30); y-axis: Cumulative Probability (0 to 1). Curves for 2, 3, 4, 5, 10 and 100 samples, with the 50% level marked.]

Values at the 50% level:

Samples      2     3     4     5     10    100
50% value    6.75  8.29  8.86  9.15  9.60  9.98

Sometimes it is best to look at the CDF.

The 50% level is where lower or higher values are equally likely.
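The 50% values can be estimated by simulation. This sketch assumes independent normal readings with mean 100 and standard deviation 10. The trial count, seed and function names are illustrative, and the results carry Monte Carlo scatter, so they will not match the slide to the last digit.

```python
import math
import random
import statistics

def sample_sd(xs):
    """Sample standard deviation with the n-1 divisor."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def median_sample_sd(n, rng, trials=5000):
    """Median sample standard deviation over `trials` simulated sets of n normal readings."""
    return statistics.median(
        sample_sd([rng.gauss(100.0, 10.0) for _ in range(n)]) for _ in range(trials)
    )

rng = random.Random(11)  # arbitrary seed
medians = {n: median_sample_sd(n, rng) for n in (2, 3, 4, 5, 10, 100)}
for n, m in medians.items():
    print(f"n={n:>3}  median sample sd = {m:5.2f}")
```

For n = 2 the exact median is σ times the median of |z| for a standard normal z, roughly 6.74, which is consistent with the 6.75 read from the plot.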

Page 12: Uniform Distribution

[Figure: PDF of sample standard deviation [σ=10] with samples taken - Uniform Distribution. x-axis: Value of calculated sample standard deviation (0 to 20); y-axis: Probability (0 to 0.9). Curves for 2, 3, 4, 5, 10 and 100 samples.]

What if the distribution was uniform instead of normal?

The most probable value for >2 samples is 10.

Page 13: Uniform Distribution

[Figure: CDF of sample standard deviation [σ=10] with samples taken - Uniform Distribution. x-axis: Value of calculated sample standard deviation (0 to 30); y-axis: Cumulative Probability (0 to 1). Curves for 2, 3, 4, 5, 10 and 100 samples, with the 50% level marked.]

Values at the 50% level:

Samples      2     3     4     5     10    100
50% value    7.28  9.21  9.65  9.83  9.94  9.99

Underestimated values are still more probable because the PDF is asymmetric.
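One way to check the claim about underestimates is to count how often the sample standard deviation of uniform data falls below the true value. In this sketch the uniform range is chosen so that its standard deviation is 10 (half-width σ√3). The seed, trial count and function names are assumptions for illustration.

```python
import math
import random

MU, SIGMA = 100.0, 10.0
HALF_WIDTH = SIGMA * math.sqrt(3.0)  # uniform on [MU - h, MU + h] has sd h / sqrt(3)

def sample_sd(xs):
    """Sample standard deviation with the n-1 divisor."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def fraction_below_sigma(n, rng, trials=5000):
    """Fraction of simulated sets of n uniform readings whose sample sd is below SIGMA."""
    low = 0
    for _ in range(trials):
        xs = [rng.uniform(MU - HALF_WIDTH, MU + HALF_WIDTH) for _ in range(n)]
        if sample_sd(xs) < SIGMA:
            low += 1
    return low / trials

rng = random.Random(13)  # arbitrary seed
below = {n: fraction_below_sigma(n, rng) for n in (2, 3, 5, 10)}
for n, frac in below.items():
    print(f"n={n:>2}  fraction of sets with sample sd < 10: {frac:.3f}")
```

A fraction above 0.5 means an underestimate is more likely than an overestimate for that number of samples.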

Page 14: Uniform Distribution

Throwing a die is an example of a uniform random distribution.

A uniform distribution is not necessarily random, however. It may be cyclic, e.g. temperature variations due to air conditioning.

With computer-controlled acquisition, data collection is often at regular intervals.

This can give interactions between the cycle period and acquisition interval.

Page 15: Cyclic Variations

[Figure: Sinusoidal Variation [μ=100, σ=10]. x-axis: Phase (0 to 1); y-axis: Value (80 to 120).]

For symmetric cycles, any multiple of two data points per cycle will average to the average of the cycle.
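The averaging rule can be tried on a made-up waveform. This sketch takes "symmetric" to mean that the second half of the cycle mirrors the first half about the mean, and it assumes equally spaced readings. The third-harmonic term, the amplitudes and the function names are hypothetical.

```python
import math

MEAN = 100.0

def symmetric_cycle(phase):
    """Hypothetical cycle: fundamental plus third harmonic, so f(phase + 0.5) mirrors f(phase) about MEAN."""
    angle = 2.0 * math.pi * phase
    return MEAN + 10.0 * math.sin(angle) + 3.0 * math.sin(3.0 * angle)

def one_cycle_mean(points_per_cycle, start_phase):
    """Mean of equally spaced readings spanning exactly one cycle."""
    readings = [
        symmetric_cycle(start_phase + k / points_per_cycle)
        for k in range(points_per_cycle)
    ]
    return sum(readings) / points_per_cycle

for points in (2, 3, 4, 10):
    means = [round(one_cycle_mean(points, p), 3) for p in (0.0, 0.1, 0.37)]
    print(f"{points:>2} points per cycle: {means}")
```

With an even number of points each reading is paired with its mirror image half a cycle later, so the mean is recovered at any starting phase. With 3 points per cycle the third harmonic is sampled at the same place every time and biases the mean.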

Page 16: Cyclic Variations

[Figure: Mean values of sinusoidal data with phase at 10 samples per cycle. x-axis: Sample # (1 to 1000, log scale); y-axis: Sample Mean Value (80 to 120). One curve per starting phase, 0 to 0.95 in steps of 0.05.]

Correct averages are obtained when full cycles are sampled, regardless of the phase.

Unless synchronized, data collection may begin at any point (phase) within the cycle.

Page 17: Cyclic Variations

[Figure: Sample standard deviations of sinusoidal data with phase at 10 samples per cycle. x-axis: Sample # (1 to 1000, log scale); y-axis: Sample standard deviation (0 to 14). One curve per starting phase, 0 to 0.45 in steps of 0.05.]

Again, whole cycles are needed to give good values.

The value is not 10 because sample standard deviation has a (n-1)^0.5 term.

Page 18: Cyclic Variations

[Figure: Population standard deviations of sinusoidal data with phase at 10 samples per cycle. x-axis: Sample # (1 to 1000, log scale); y-axis: Population standard deviation (0 to 12). One curve per starting phase, 0 to 0.45 in steps of 0.05.]

The population standard deviation is 10 at each complete cycle.

Each cycle contains all the data of the population.

The standard deviation for full cycle averages = 0.
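The difference between the two standard deviations over whole cycles can be sketched as follows. The amplitude is chosen so that the sinusoid has a population standard deviation of 10. The starting phase and function name are arbitrary choices for illustration.

```python
import math
import statistics

AMPLITUDE = 10.0 * math.sqrt(2.0)  # a sinusoid of amplitude A has population sd A / sqrt(2)

def full_cycles(n_cycles, start_phase, points_per_cycle=10):
    """Equally spaced readings of a sinusoid about 100, covering whole cycles only."""
    return [
        100.0 + AMPLITUDE * math.sin(2.0 * math.pi * (start_phase + k / points_per_cycle))
        for k in range(n_cycles * points_per_cycle)
    ]

for cycles in (1, 2, 10):
    data = full_cycles(cycles, start_phase=0.15)
    print(
        f"{cycles:>2} cycle(s): population sd = {statistics.pstdev(data):.3f}, "
        f"sample sd = {statistics.stdev(data):.3f}"
    )
```

Over n readings covering whole cycles the sample standard deviation is 10·√(n/(n−1)), about 10.54 for one cycle of 10 readings, and it approaches 10 only as more cycles are included.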

Page 19: Smoothing

Smoothing involves combining adjacent data points to create a smoother curve than the original.

A basic assumption is that data contains noise, but the calculation does NOT allow for uncertainty.

Smoothing should be used with caution.

Page 20: Smoothing

What is the difference?

Page 21: Savitzky-Golay Smoothing

[Figure: Effect of Savitzky-Golay smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Series: 0.02s data.]

Here is a spectrum of a white LED.

It is recorded at very short integration time to make it deliberately noisy.

Page 22: Savitzky-Golay Smoothing

[Figure: Effect of Savitzky-Golay smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Series: 0.02s data, 0.02s 25 pt S-G.]

A 25 point Savitzky-Golay smooth gives a line through the center of the noise.
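For readers who want to try a Savitzky-Golay smooth, here is a minimal sketch. The slides do not state the polynomial order, so a quadratic (equivalently cubic) fit is assumed. Leaving the end points unsmoothed is also an assumption, and the function names are hypothetical.

```python
def savgol_weights(half_width):
    """Weights of a quadratic/cubic Savitzky-Golay smooth over 2*half_width + 1 points."""
    m = half_width
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom for i in range(-m, m + 1)]

def savgol_smooth(signal, half_width):
    """Smooth interior points; the first and last half_width points are returned unchanged."""
    weights = savgol_weights(half_width)
    out = list(signal)
    for centre in range(half_width, len(signal) - half_width):
        window = signal[centre - half_width: centre + half_width + 1]
        out[centre] = sum(w * v for w, v in zip(weights, window))
    return out

# The familiar 7-point weights, scaled by their common denominator of 21.
print([round(w * 21) for w in savgol_weights(3)])  # prints [-2, 3, 6, 7, 6, 3, -2]

# A 25 point smooth corresponds to half_width=12.
weights_25 = savgol_weights(12)
print(f"25 point weights sum to {sum(weights_25):.6f}")
```

Because each output is a least-squares polynomial fit over the window, the smooth is symmetric about the centre point and leaves any quadratic trend unchanged.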

Page 23: Savitzky-Golay Smoothing

[Figure: Effect of Savitzky-Golay smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Series: 0.02s data, 0.02s 25 pt S-G, 7s data.]

The result of the smooth is very close to the same device measured with optimum integration time.

Page 24: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data.]

But how does the number of data points affect results?

Here we have 1024 data points.

Page 25: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data.]

Now we have 512 data points.

Page 26: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data.]

Now we have 256 data points.

Page 27: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data.]

Now we have 128 data points.

Page 28: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data, 0.02s 25 pt S-G.]

A 25 point smooth follows the broad peak but not the narrower primary peak.

Page 29: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data, 0.02s 7 pt S-G.]

To follow the primary peak we need to use a 7 point smooth…

But it doesn’t work so well on the broad peak.

Page 30: Spectral Sampling

[Figure: Effect of Savitzky-Golay smoothing. Same axes. Series: 0.02s data, 0.02s 7 pt S-G, 7s data.]

Comparing to the optimum scan, the intensity of the primary peak is underestimated.

This is because some of the higher signal data have been removed.
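The trade-off between window length and peak width can be illustrated on a noiseless synthetic peak. Everything about the peak here is hypothetical (a Gaussian centred on sample 31 with a width of 3 samples, in a 128 point spectrum), and a quadratic Savitzky-Golay fit is assumed since the slides do not give the order.

```python
import math

def savgol_weights(m):
    """Quadratic/cubic Savitzky-Golay smoothing weights for a window of 2*m + 1 points."""
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom for i in range(-m, m + 1)]

def smooth_at(signal, centre, m):
    """Smoothed value at index `centre` (the window must lie inside the signal)."""
    return sum(w * signal[centre + i] for w, i in zip(savgol_weights(m), range(-m, m + 1)))

# Hypothetical narrow peak; position and width are in units of samples.
PEAK_INDEX, PEAK_SIGMA, PEAK_HEIGHT = 31, 3.0, 6000.0
spectrum = [
    PEAK_HEIGHT * math.exp(-0.5 * ((k - PEAK_INDEX) / PEAK_SIGMA) ** 2)
    for k in range(128)
]

ratio_7 = smooth_at(spectrum, PEAK_INDEX, 3) / PEAK_HEIGHT
ratio_25 = smooth_at(spectrum, PEAK_INDEX, 12) / PEAK_HEIGHT
print(f"7 point smooth keeps  {ratio_7:.1%} of the peak height")
print(f"25 point smooth keeps {ratio_25:.1%} of the peak height")
```

In this toy case the 7 point window nearly preserves the peak while the 25 point window flattens it substantially. The real spectrum also lost points when it was reduced to 128 samples, which this sketch does not model.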

Page 31: Spectral Sampling

[Figure, two panels: Effect of Savitzky-Golay smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Panel 1 series: 0.02s data, 0.02s 7 pt S-G, 7s data. Panel 2 series: 0.02s data, 0.02s 25 pt S-G, 7s data.]

Beware of under-sampling peaks – you may underestimate or overestimate intensities.

Page 32: Exponential Smoothing

[Figure: Effect of Exponential smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Series: 0.02s data.]

What about other types of smoothing?

Here is the original data again.

Page 33: Exponential Smoothing

[Figure: Effect of Exponential smoothing. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (-1000 to 7000). Series: 0.02s data, 0.02s 0.8 Exp.]

An exponential smooth shifts the peak.

Beware of asymmetric algorithms!
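The peak shift is easy to reproduce. This sketch assumes the "0.8" in the legend is the weight given to the previous smoothed value, which is one common convention but is not confirmed by the slides. The test peak and function name are hypothetical.

```python
import math

def exponential_smooth(signal, alpha):
    """Causal exponential smooth: each output keeps a fraction `alpha` of the previous output."""
    out = [signal[0]]
    for value in signal[1:]:
        out.append(alpha * out[-1] + (1.0 - alpha) * value)
    return out

# Hypothetical symmetric peak centred on index 50.
peak = [math.exp(-0.5 * ((k - 50) / 5.0) ** 2) for k in range(100)]
smoothed = exponential_smooth(peak, alpha=0.8)

raw_pos = peak.index(max(peak))
smooth_pos = smoothed.index(max(smoothed))
print(f"raw peak at index {raw_pos}, smoothed peak at index {smooth_pos}")
print(f"peak height reduced from {max(peak):.3f} to {max(smoothed):.3f}")
```

Each output depends only on earlier points, so the smoothed curve lags the data in the scan direction. A symmetric window such as Savitzky-Golay does not have this bias.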

Page 34: Sampling Without Noise

[Figure: Effect of Sampling on data without noise. x-axis: Wavelength [nm] (350 to 800); y-axis: Signal [cps] (0 to 7000).]

With lower noise, can we describe curves with fewer points?

This is the optimum integration scan but with 128 points like the noisy example.

Page 35: Sampling Without Noise

[Figure: Effect of Sampling on data without noise. Same axes.]

… 64 points.

Page 36: Sampling Without Noise

[Figure: Effect of Sampling on data without noise. Same axes.]

… 32 points.

Is this enough to describe the peak?
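One way to put numbers on this question is to sample a noiseless peak at each point count and see how much of its true height the highest sample captures. The peak here (a Gaussian at 462 nm with σ = 10 nm) is hypothetical and is not taken from the measured spectrum. Only the 350 to 800 nm range and the point counts come from the slides.

```python
import math

PEAK_CENTRE_NM, PEAK_SIGMA_NM = 462.0, 10.0  # hypothetical narrow peak
START_NM, STOP_NM = 350.0, 800.0             # wavelength range of the plots

def peak_shape(wavelength_nm):
    """Noiseless peak with unit height at its centre."""
    return math.exp(-0.5 * ((wavelength_nm - PEAK_CENTRE_NM) / PEAK_SIGMA_NM) ** 2)

def highest_sample(n_points):
    """Largest value seen when the range is sampled at n_points equally spaced wavelengths."""
    step = (STOP_NM - START_NM) / (n_points - 1)
    return max(peak_shape(START_NM + k * step) for k in range(n_points))

fwhm_nm = 2.0 * math.sqrt(2.0 * math.log(2.0)) * PEAK_SIGMA_NM
highest = {n: highest_sample(n) for n in (128, 64, 32)}
for n, value in highest.items():
    step = (STOP_NM - START_NM) / (n - 1)
    print(f"{n:>3} points: step {step:5.2f} nm, {fwhm_nm / step:4.1f} points per FWHM, "
          f"highest sample = {value:.3f} of true peak")
```

The captured fraction depends on where the grid happens to fall relative to the peak, so a coarse grid makes the reported intensity sensitive to small wavelength offsets.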

Page 37: Interpolation

Interpolation is the process of estimating data between given points.

National Laboratories often provide data that requires interpolation to be useful.

Interpolation algorithms generally estimate a smooth curve.

Page 38: Interpolation

There are many forms of interpolation: Lagrange, B-spline, Bezier, Hermite, Cardinal spline, cubic, etc.

They all have one thing in common: They go through each given point and hence ignore uncertainty completely.

Generally, interpolation algorithms are local in nature and commonly use just 4 points.
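A local 4 point Lagrange interpolation can be sketched as follows. The choice of which four neighbours to use, the function names and the sample readings are all assumptions for illustration. The talk does not specify an implementation.

```python
import bisect

def lagrange(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def local_interpolate(xs, ys, x):
    """Interpolate at x from the 4 tabulated points around it (xs ascending, at least 4 points)."""
    right = bisect.bisect_right(xs, x)
    start = min(max(right - 2, 0), len(xs) - 4)  # clamp the window at both ends
    return lagrange(xs[start:start + 4], ys[start:start + 4], x)

# Hypothetical noisy readings around 100, tabulated every 50 nm.
xs = [400.0, 450.0, 500.0, 550.0, 600.0, 650.0]
ys = [98.2, 111.5, 93.7, 104.1, 89.9, 107.3]

for x in (450.0, 475.0, 525.0):
    print(f"{x:.0f} nm -> {local_interpolate(xs, ys, x):.2f}")
```

At every tabulated wavelength the interpolated value equals the reading itself, noise included, which is the sense in which interpolation ignores uncertainty.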

Page 39: Dr. Richard Young Optronic Laboratories, Inc..  Uncertainty budgets are a growing requirement of measurements.  Multiple measurements are generally.

CORM 2002: UncertaintyCORM 2002: Uncertainty

InterpolationInterpolationInterpolationInterpolationEffect of Interpolation

40

60

80

100

120

140

160

180

200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400

Wavelength [nm]

Ran

do

m N

um

ber

[

=10

0,

=10

]

Random data

LeGrange Interpolated

Excel Smooth curve The interesting thing about The interesting thing about interpolating data containing interpolating data containing random noise is you never know random noise is you never know what you will get.what you will get.

Let’s zoom this portion…Let’s zoom this portion…Let’s zoom this portion…Let’s zoom this portion…

Page 40: Dr. Richard Young Optronic Laboratories, Inc..  Uncertainty budgets are a growing requirement of measurements.  Multiple measurements are generally.

CORM 2002: UncertaintyCORM 2002: Uncertainty

InterpolationInterpolationInterpolationInterpolationEffect of Interpolation

40

60

80

100

120

140

160

180

500 520 540 560 580 600 620 640 660 680 700

Wavelength [nm]

Ran

do

m N

um

ber

[

=10

0,

=10

]

Random data

LeGrange Interpolated

Excel Smooth curve

The Excel curve can The Excel curve can even double back.even double back.

The Excel curve can The Excel curve can even double back.even double back.

Uneven sampling can Uneven sampling can cause overshoots.cause overshoots.

Uneven sampling can Uneven sampling can cause overshoots.cause overshoots.

Page 41: Dr. Richard Young Optronic Laboratories, Inc..  Uncertainty budgets are a growing requirement of measurements.  Multiple measurements are generally.

CORM 2002: UncertaintyCORM 2002: Uncertainty

Combining a SmoothCombining a Smoothand Interpolationand Interpolation

Combining a SmoothCombining a Smoothand Interpolationand Interpolation

If a spectrum can be represented by a If a spectrum can be represented by a function, e.g. polynomial, the closest “fit” function, e.g. polynomial, the closest “fit” to the data can provide smoothing and to the data can provide smoothing and give the values between points.give the values between points.

The “fit” is achieved by changing the The “fit” is achieved by changing the coefficients of the function until it is coefficients of the function until it is closest to the data.closest to the data. A least-squares fit.A least-squares fit.

If a spectrum can be represented by a If a spectrum can be represented by a function, e.g. polynomial, the closest “fit” function, e.g. polynomial, the closest “fit” to the data can provide smoothing and to the data can provide smoothing and give the values between points.give the values between points.

The “fit” is achieved by changing the The “fit” is achieved by changing the coefficients of the function until it is coefficients of the function until it is closest to the data.closest to the data. A least-squares fit.A least-squares fit.

Page 42

Combining a Smooth and Interpolation

The squares of the differences between the values predicted by the function and those given by the data are added to give a “goodness of fit” measure.

Coefficients are changed until the “goodness of fit” is minimized.

Excel has a regression facility that performs this calculation.
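
The least-squares calculation described above can be sketched outside Excel. The following is a minimal illustration in Python with NumPy, not the procedure used in the talk; the sample data, the polynomial degree, and the helper name `sum_of_squares` are assumptions made for the example.

```python
import numpy as np

def sum_of_squares(coeffs, x, y):
    """'Goodness of fit': sum of squared differences between fit and data."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

# Hypothetical noisy samples of a smooth curve (illustrative values only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 21)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.05, x.size)

# np.polyfit returns the coefficients that minimize the sum of squares.
best = np.polyfit(x, y, deg=2)
ss_best = sum_of_squares(best, x, y)

# Changing any coefficient away from the solution makes the measure larger.
ss_perturbed = sum_of_squares(best + np.array([0.0, 0.01, 0.0]), x, y)

# The fitted polynomial also gives values between the sampled points.
y_between = float(np.polyval(best, 0.525))
```

Because the polynomial has far fewer coefficients than there are data points, the fit smooths the noise as well as interpolating.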

Page 43

Combining a Smooth and Interpolation

Theoretically, any simple, smoothly varying curve can be fitted by a polynomial.

Sometimes it is better to “extract” the data you want to fit by some reversible calculation.

This means you can use, say, 9th-order polynomials instead of 123rd-order to make the calculations easier.

Page 44

Polynomial Fitting

[Figure: Lamp 525, E (µW cm⁻² nm⁻¹) vs. wavelength (nm), 0 to 2500 nm; series: Data]

NIST provide data at uneven intervals.

To use the data, we have to interpolate to the intervals required by our measurements.

Page 45

Method 1

[Figure: 9th power polynomial fit, E·λ⁵/exp(a + b/λ) vs. wavelength (nm), 0 to 2500 nm; series: data, fit]

NIST recommend fitting a high-order polynomial to data values multiplied by λ⁵/exp(a + b/λ) for interpolation.

The result looks good, but…

Page 46

Method 1

[Figure: 9th power polynomial fit, E·λ⁵/exp(a + b/λ) on a log scale vs. wavelength (nm), 250 to 450 nm; series: data, fit]

...on a log scale, the match is very poor at lower values.

Page 47

Method 1

[Figure: Lamp 525, E (µW cm⁻² nm⁻¹) on a log scale vs. wavelength (nm), 250 to 450 nm; series: Data, fit]

When converted back to the original scale, lower values bear no relation to the data.

Page 48

What went wrong?

The “goodness of fit” parameter is a measure of absolute differences, not relative differences. NIST use a weighting of 1/E² to give relative differences, and hence closer matching, but that is not easy in Excel.

Large values tend to dominate smaller ones in the calculation.

A large dynamic range of values should be avoided. We are trying to match data over 4 decades!
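
The effect of a 1/E² weighting can be illustrated with a short sketch. This is not NIST's program; the data are hypothetical values spanning about four decades, and the degree is chosen only for the example. Note that `np.polyfit` applies its `w` argument to the unsquared residual, so `w = 1/E` corresponds to a 1/E² weighting of the squared differences.

```python
import numpy as np

# Hypothetical data spanning about four decades (illustrative only).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 30)
E = 10.0 ** (4.0 * x - 3.0) * (1.0 + rng.normal(0.0, 0.01, x.size))

deg = 5
# Unweighted fit: minimizes sum((fit - E)^2), i.e. absolute differences,
# so the largest values dominate.
plain = np.polyfit(x, E, deg)
# Weighted fit: w multiplies the unsquared residual, so w = 1/E minimizes
# sum(((fit - E) / E)^2), i.e. relative differences.
weighted = np.polyfit(x, E, deg, w=1.0 / E)

def relative_error(coeffs):
    """Relative difference between fit and data at each point."""
    return np.abs(np.polyval(coeffs, x) - E) / E

rel_plain = relative_error(plain)
rel_weighted = relative_error(weighted)
```

Comparing `rel_plain` and `rel_weighted` point by point shows where the unweighted fit gives up accuracy at the small values.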

Page 49

How do NIST deal with it?

Although NIST’s 1/E² weighting gives closer matches than those shown here, to get the best results they split the data into 2 regions and calculate separate polynomials for each.

This is a reasonable thing to do, but it can lead to local data effects and arbitrary splits that do not fit all examples.

Is there an alternative?

Page 50

Alternative Method 1

[Figure: Alternative 9th power polynomial fit, Log(E·λ⁵) vs. 1/λ (µm⁻¹), 0 to 4 µm⁻¹; series: data, fit]

A plot of the log of E·λ⁵ values vs. λ⁻¹ is a gentle curve, almost a straight line.

We can calculate a polynomial without splitting the data.

The fact that we are fitting a log scale means we are effectively using relative differences in the least-squares calculation.
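
A minimal sketch of this transform-fit-invert sequence follows. The wavelengths, the Wien-type test spectrum, the assumed temperature of 3200 K, the polynomial degree, and the helper name `interpolate` are all assumptions for illustration; they are not the Lamp 525 data or the settings used in the talk.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical lamp-like data: a Wien-type curve times a mild, smooth
# factor. These are NOT the Lamp 525 values.
wl_um = np.array([0.25, 0.3, 0.35, 0.4, 0.5, 0.6, 0.8, 1.0, 1.3, 1.6, 2.0, 2.4])
c2_over_T = 14388.0 / 3200.0  # second radiation constant [um K] / assumed T [K]
E = 1e4 * wl_um**-5 * np.exp(-c2_over_T / wl_um) * (1.0 - 0.05 * wl_um)

# Reversible transform: x = 1/wavelength, y = log(E * wavelength^5).
x = 1.0 / wl_um
y = np.log(E * wl_um**5)

# Polynomial.fit rescales x internally, which keeps the fit well conditioned.
# Degree 5 is used here only because the example has just 12 points.
fit = Polynomial.fit(x, y, deg=5)

def interpolate(wavelength_um):
    """Evaluate the fit at new wavelengths and undo the transform."""
    wl = np.asarray(wavelength_um, dtype=float)
    return np.exp(fit(1.0 / wl)) / wl**5

# Residuals on the original scale, as a fraction of the data value.
relative_residuals = interpolate(wl_um) / E - 1.0
```

Because the fit is made to the logarithm, a residual of a given size in `y` corresponds to roughly the same percentage error in E at every wavelength.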

Page 51

Method 2

[Figure: Lamp 525, E (µW cm⁻² nm⁻¹) vs. wavelength (nm), 0 to 2500 nm; series: Scaled Blackbody @ 3207.9 K, Data]

Incandescent lamp emission is close to that of a blackbody.

Page 52

Method 2

[Figure: Lamp 525, E (µW cm⁻² nm⁻¹) vs. wavelength (nm), 0 to 2500 nm; series: Scaled Blackbody @ 3207.9 K, Data]

If we calculate a scaled blackbody curve, as we would to get the distribution temperature…

…and then divide the data by the blackbody...

Page 53

Method 2

[Figure: 9th power polynomial fit, E/E_BB vs. wavelength (nm), 0 to 2500 nm; series: Data, fit]

...we get a smooth curve with very little dynamic range.

The “fit” is not good because of the high initial slope and the almost linear falling slope.

Page 54

Method 2

[Figure: 9th power polynomial fit, E/E_BB vs. 1/λ (µm⁻¹), 0 to 4 µm⁻¹; series: Data, fit]

Plotting vs. λ⁻¹, as in Alternative Method 1, allows close fitting of the polynomial.
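
The Method 2 steps (scale a blackbody, divide, fit the ratio against λ⁻¹, multiply back) might be sketched as below. The data are hypothetical, the temperature is simply taken from the slide's blackbody label rather than fitted, and the least-squares scale factor, the degree, and the function names are assumptions for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial

C2_UM_K = 14388.0  # second radiation constant, approximate, in um K

def planck_shape(wavelength_um, temperature_k):
    """Relative blackbody spectral shape (arbitrary scale)."""
    wl = np.asarray(wavelength_um, dtype=float)
    return wl**-5 / np.expm1(C2_UM_K / (wl * temperature_k))

# Hypothetical lamp-like data, NOT the Lamp 525 values: a blackbody shape
# times a slowly varying factor standing in for the lamp's departure from it.
wl_um = np.array([0.25, 0.3, 0.35, 0.4, 0.5, 0.6, 0.8, 1.0, 1.3, 1.6, 2.0, 2.4])
T = 3207.9  # temperature quoted on the slide; assumed known here, not fitted
E = 50.0 * planck_shape(wl_um, T) * (1.0 - 0.1 * wl_um)

# Scale the blackbody to the data (assumption: least-squares scale factor).
bb = planck_shape(wl_um, T)
scale = np.sum(E * bb) / np.sum(bb * bb)
ratio = E / (scale * bb)  # E / E_BB: very little dynamic range

# Fit the ratio against 1/wavelength, then multiply back to interpolate.
fit = Polynomial.fit(1.0 / wl_um, ratio, deg=5)

def interpolate(wavelength_um):
    """Fitted ratio times the scaled blackbody at new wavelengths."""
    wl = np.asarray(wavelength_um, dtype=float)
    return fit(1.0 / wl) * scale * planck_shape(wl, T)
```

In this example E itself spans several hundred to one, while `ratio` varies by well under a factor of two, which is the property that makes the polynomial easy to fit.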

Page 55

Comparing results

[Figure: Residuals for Lamp 525, residuals (%) vs. wavelength (nm), 0 to 2500 nm; series: NIST Program Region 1, NIST Program Region 2, Alternative Method 1, Method 2]

Method 2 shows lower residuals, but there is not much difference.

Page 56

Comparing results

[Figure: Lamp 525, E (µW cm⁻² nm⁻¹) vs. wavelength (nm), 0 to 2500 nm; series: Data, fit]

All methods discussed give essentially the same result when converted back to the original scale.

Page 57

Algorithms and Uncertainty

None of the algorithms mentioned allow for uncertainty; they ignore it or assume it is constant.

If we replaced the least-squares “goodness of fit” parameter with “most probable,” this would use the uncertainty we know is there to determine the best fit.

Why is this not done? Difficult in Excel. Easy with custom programs.

Page 58

Algorithms and Uncertainty

[Figure: PDF of Normal Distribution (µ = 100, σ = 10), probability vs. value, 60 to 140]

From the data value (mean) and the standard deviation, we can calculate the PDF.

The value from the fit has a probability that we can use.

Page 59

Algorithms and Uncertainty

Multiply the probabilities at each point to give the “goodness of fit” parameter.

Use this parameter instead of the least-squares in the fit calculations.

MAXIMIZE the “goodness of fit” parameter to obtain the best fit.

The fit will be closest where uncertainties are lowest.
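
A “most probable” fit of this kind can be sketched as follows, assuming a normal PDF at every point. The data, the uncertainties, and the helper name `log_goodness` are hypothetical. The sketch works with the logarithm of the product of probabilities, because the product itself quickly underflows; maximizing either one gives the same coefficients.

```python
import numpy as np

def log_goodness(coeffs, x, y, sigma):
    """Log of the product of the normal PDF values of the fit at each point."""
    z = (np.polyval(coeffs, x) - y) / sigma
    return float(np.sum(-0.5 * z**2 - np.log(sigma * np.sqrt(2.0 * np.pi))))

# Hypothetical data: a straight line measured with uncertainties that
# grow strongly with x (illustrative values only).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 25)
sigma = 0.05 + 0.2 * x
y = 3.0 + 0.5 * x + rng.normal(0.0, sigma)

# For normal PDFs, maximizing the product of probabilities is equivalent to
# minimizing sum(((fit - y) / sigma)^2), which np.polyfit does with w = 1/sigma.
most_probable = np.polyfit(x, y, deg=1, w=1.0 / sigma)
least_squares = np.polyfit(x, y, deg=1)

lg_mp = log_goodness(most_probable, x, y, sigma)
lg_ls = log_goodness(least_squares, x, y, sigma)
```

The weighted solution can never score lower than the plain least-squares one on this measure, and its residuals are smallest where `sigma` is smallest.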

Page 60

Conclusions

Standard deviations may be underestimated with small samples.

Cyclic variations should be integrated for complete cycle periods.

Smoothing and interpolation should be used with caution: do not assume results are valid; check.

Page 61

Conclusions

Polynomial fits can give good results, but:

Avoid large dynamic range.

Avoid complex curvatures.

Avoid high initial slopes.

All these manipulations ignore uncertainty (or assume it is constant). But least-squares fits can be replaced by maximum probability to take uncertainty into consideration.