Hypothesis Testing
"Parametric" tests -- we will have to assume Normal distributions (usually)
in ways detailed below
These standard tests are useful to know, and for communication, but during your analysis you should be doing more robust eyeball checking of significance: scramble the data, split it in halves/thirds, make synthetic data, etc. etc.
purpose of the lecture
to introduce
Hypothesis Testing
the process of determining the statistical significance of results
Part 1
motivation
random variation as a spurious source of patterns
[Figure: scatter of data d (−5 to 5) against x = 1 to 8, points connected by lines]
looks pretty linear
actually, it's just a bunch of random numbers!

figure(1);
for i = [1:100]
  clf;
  axis( [1, 8, -5, 5] );
  hold on;
  t = [2:7]';
  d = random('normal',0,1,6,1);
  plot( t, d, 'k-', 'LineWidth', 2 );
  plot( t, d, 'ko', 'LineWidth', 2 );
  [x,y] = ginput(1);
  if( x<1 ) break; end
end
the script makes plot after plot, and lets you stop
when you see one you like
the linearity was due to random variation
Beware:
5% of random results will be
"significant at the 95% confidence level"!
The following are "a priori" significance tests.
You have to have an a priori reason to be looking for a particular relationship to use these tests properly
For a data "fishing expedition" the significance threshold is higher, and depends on
how long you've been fishing!
Four Important Distributions
used in hypothesis testing
#1: the Z distribution
(standardized Normal distribution) ("Z scores")
p(Z) is the Normal distribution for a quantity Z with zero mean and unit variance
if d is Normally distributed with mean d̄ and variance σd², then Z = (d − d̄)/σd is Normally distributed with zero mean and unit variance
The "Z score" of a result is just "how many sigma it is from the
mean"
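As a quick illustration (a minimal Python sketch, not from the slides; the measurement values are hypothetical):

```python
def z_score(d, mean, sigma):
    # "how many sigma it is from the mean"
    return (d - mean) / sigma

# hypothetical measurement: 100.5 nm, true mean 100 nm, sigma = 1 nm
z = z_score(100.5, 100.0, 1.0)   # 0.5 sigma above the mean
```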
#2: t-scores
the distribution of a finite sample (N) of values e that are Z-distributed in reality
this is a new distribution, called the "t-distribution"
[Figure: the t-distribution p(tN) for N = 1 and N = 5, over tN from −5 to 5]
heavier tails than a Normal p.d.f. for small N*
becomes the Normal p.d.f. for large N
*because you mis-estimate the mean with too few samples, so that values far from the mis-estimated mean are far more likely than the rapid exp(−x²) falloff would suggest
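The heavy tails are easy to check numerically; a Python sketch using scipy.stats (the lecture's own code is MatLab, so this is just an equivalent check):

```python
from scipy import stats

# tail probability beyond 3 "sigma": much larger for t with 1 DOF
p_normal = stats.norm.sf(3.0)      # P(Z > 3) under the Normal
p_t1 = stats.t.sf(3.0, df=1)       # P(t > 3) with 1 degree of freedom
p_t100 = stats.t.sf(3.0, df=100)   # large N: t is nearly Normal again
```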
#3 the chi-squared distribution
The Normal or Z distribution comes from the limit of the sum of any large number of i.i.d. variables.
The chi-squared distribution comes from the sum of the squares of N Normally distributed variables.
Its limit is therefore Normal, but for N < ∞ it differs... for one thing, it is positive definite!
Chi-squared distribution
total error E = χN² = Σi=1..N ei²
Chi-squared
total error E = χN² = Σi=1..N ei²
p(E) is called the "chi-squared" p.d.f. when ei is Normally distributed with zero mean and unit variance
[Figure: the chi-squared p.d.f. p(χN²) for N = 1, 2, 3, 4, 5]
the p.d.f. of the sum of squared Normal variables; N is called "the degrees of freedom"
mean N, variance 2N
asymptotes to Normal (Gaussian) for large N
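These moments can be confirmed directly (a Python check via scipy.stats; the lecture uses MatLab):

```python
from scipy import stats

N = 25                    # degrees of freedom
m = stats.chi2.mean(N)    # should equal N
v = stats.chi2.var(N)     # should equal 2N
```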
In MatLab
#4 Distribution of the ratio of two variances from finite samples (M,N)
(each of which is Chi-squared distributed)
it's another new distribution, called the "F-distribution"
[Figure: the F p.d.f. p(FN,M) for M = 2, 5, 25, 50, each with N from 2 to 50]
F-distribution: the ratio of two imperfect (undersampled) estimates of unit variance. For N, M → ∞ it becomes a spike at 1, as both estimates become right.
skewed at low N and M
starts to look Normal, and gets narrower around 1, for large N and M
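The narrowing around 1 can be checked numerically (a Python sketch with scipy.stats, not from the slides):

```python
from scipy import stats

# variance of the F-distribution shrinks as the sample sizes grow,
# i.e. the ratio of two good variance estimates concentrates near 1
v_small = stats.f.var(5, 5)       # undersampled: wide and skewed
v_large = stats.f.var(500, 500)   # well sampled: nearly a spike at 1
```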
Part 4
Hypothesis Testing
Step 1. State a Null Hypothesis
some version of
the result is due to random or meaningless data variations
(too few samples to see the truth)
Step 1. State a Null Hypothesis
some variation of
the result is due to random variation
e.g.
the means of Sample A and Sample B are different only because of random variation
Step 2. Define a standardized quantity that is
unlikely to be large
when the Null Hypothesis is true
called a "statistic"
e.g.
the difference in the means, Δm = (meanA − meanB), is unlikely to be large (compared to the standard deviation) if the Null Hypothesis is true
Step 3.
Calculate the probability that a value of the statistic as large as (or larger than) your observed value would occur if the Null Hypothesis were true
Step 4.
Reject the Null Hypothesis if such large values would occur less than 5% of the time
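The four steps reduce to a threshold comparison at the end; a minimal sketch (function name hypothetical):

```python
def reject_null(p_value, alpha=0.05):
    # Step 4: reject when values at least as extreme as the observed
    # statistic would occur less than 5% of the time under the Null
    return p_value < alpha
```

e.g. a p-value of 0.78 keeps the Null Hypothesis, while 0.01 rejects it.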
An example
test of a particle size measuring device
manufacturer's specs:
* machine is perfectly calibrated so
particle diameters scatter about true value
* random measurement error is σd = 1 nm
your test of the machine
purchase a batch of 25 test particles, each exactly 100 nm in diameter
measure and tabulate their diameters
repeat with another batch a few weeks later
Results of Test 1
Results of Test 2
Question 1: Is the Calibration Correct?
Null Hypothesis:
The observed deviation of the average particle size from its true value of 100 nm is due to random variation (as contrasted to a bias in the calibration).
in our case
the key question is: Are these unusually large values for Z?
Zest = 0.278 and 0.243
example for a Normal (Z) distributed statistic: P(Z') is the cumulative probability of p(Z) from −∞ to Z', here written erf(Z') (the Normal CDF; in terms of the standard error function, Φ(Z') = ½[1 + erf(Z'/√2)])
[Figure: shaded area under p(Z) from −∞ to Z']
The probability that a difference of either sign between sample means A and B is due to chance is P( |Z| > Zest ). This is called a two-sided test,
[Figure: p(Z) with both tails beyond −Zest and +Zest shaded]
which is 1 − [erf(Zest) − erf(−Zest)]
in our case
the key question is: Are these unusually large values for Z? Zest = 0.278 and 0.243
P(|Z| > Zest) = 0.780 and 0.807
So values of |Z| greater than Zest are very common.
The Null Hypotheses cannot be rejected: there is no reason to think the machine is biased.
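The slide's probabilities can be reproduced from the Normal CDF; a Python sketch using only the standard library's erf:

```python
from math import erf, sqrt

def phi(z):
    # Normal CDF from the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_sided_p(z_est):
    # P(|Z| > Zest) = 1 - [phi(Zest) - phi(-Zest)]
    return 1.0 - (phi(z_est) - phi(-z_est))

p1 = two_sided_p(0.278)   # ~0.78, Test 1
p2 = two_sided_p(0.243)   # ~0.81, Test 2
```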
suppose the manufacturer had not specified that random measurement
error is σd = 1 nm
then you would have to estimate it from the data
= 0.876 and 0.894
but then you couldn't form Z, since you need the true variance
we examined a quantity t, defined as the ratio of a Normally-distributed variable e and something
that has the form of an estimated standard deviation instead of the true sd:
so we will test t instead of Z
in our case
Are these unusually large values for t? test = 0.297 and 0.247
in our case
Are these unusually large values for t? test = 0.297 and 0.247
P(|t| > test) = 0.768 and 0.806
So values of |t| > test are very common (and verrry close to the Z test for 25 samples, which gave 0.780 and 0.807).
The Null Hypotheses cannot be rejected: there is no reason to think the machine is biased.
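The same computation with the t-distribution (a scipy.stats sketch; assuming N − 1 = 24 degrees of freedom once the mean is estimated from the data):

```python
from scipy import stats

# two-sided t test with the slide's t_est values and 24 DOFs
p1 = 2.0 * stats.t.sf(0.297, df=24)
p2 = 2.0 * stats.t.sf(0.247, df=24)
```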
Question 2: Is the variance in spec?
Null Hypothesis:
The observed deviation of the variance from its true value of 1 nm² is due to random variation (as contrasted to the machine being noisier than the specs).
the key question is: Are these unusually large values for χ², based on 25 independent samples?
= ?
Results of the two tests
Are values ~20 to 25 unusual for a chi-squared statistic with N = 25?
No: the median almost follows N
In MatLab
P = 0.640 and 0.499. So values of χ² greater than χ²est are very common.
The Null Hypotheses cannot be rejected: there is no reason to think the machine is noisier than advertised.
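The tail probabilities come from the chi-squared survival function (a scipy.stats sketch; the χ²est values below are hypothetical stand-ins in the ~20-25 range seen in the tests):

```python
from scipy import stats

p1 = stats.chi2.sf(22.0, df=25)   # chi2_est ~ 22 (hypothetical)
p2 = stats.chi2.sf(24.4, df=25)   # chi2_est near the median (~24.3)
```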
Question 3: Has the calibration changed between the two tests?
Null Hypothesis:
The difference between the means is due to random variation (as contrasted to a change in the calibration).
means = 100.055 and 99.951
since the data are Normal
their means (a linear function of the data) are Normal
and the difference between the means (also a linear function) is Normal
if c = a − b, then σc² = σa² + σb²
so use a Z test
in our case
Zest = 0.368
P(|Z| > Zest) = 0.712
Values of |Z| greater than Zest are very common,
so the Null Hypotheses cannot be rejected: there is no reason to think the bias of the machine has changed.
(using MatLab: Zest = 0.368)
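The slide's Zest and P follow directly from the rule that variances of a difference add (standard-library Python):

```python
from math import erf, sqrt

dm = 100.055 - 99.951              # difference of the two sample means
sigma_dm = sqrt(0.2**2 + 0.2**2)   # each mean has sigma/sqrt(25) = 0.2
z = dm / sigma_dm                  # ~0.368
p = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))   # two-sided, ~0.712
```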
Question 4: Has the variance changed between the two tests?
Null Hypothesis:
The difference between the variances is due to random variation (as contrasted to a change in the machine's precision).
estimated variances = 0.896 and 0.974
recall the distribution of a quantity F, the ratio of variances
so use an F test
in our case: Fest = 1.110, N1 = N2 = 25
[Figure: the F p.d.f. with both tails, below 1/Fest and above Fest, shaded]
whether the top or bottom χ² in the ratio is the bigger is irrelevant, since our Null Hypothesis only concerns their being different. Hence we need to evaluate the "two-sided" test: P(F > Fest) + P(F < 1/Fest)
P = 0.794
Values of F so close to 1 are very common, even with N = M = 25,
so the Null Hypotheses cannot be rejected: there is no reason to think the noisiness of the machine has changed.
(using MatLab: Fest = 1.11)
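The two-sided F probability counts both tails (a scipy.stats sketch; the degrees-of-freedom choice of 25 and 25 follows the slide's N1 = N2 = 25):

```python
from scipy import stats

F_est = 1.110
N = 25
# count both tails: the ratio could have come out inverted
p = stats.f.sf(F_est, N, N) + stats.f.cdf(1.0 / F_est, N, N)
```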
Another use of the F-test
we often develop two
alternative models
to describe a phenomenon
and want to know
which is better?
A "better" model?
look for difference in total error (unexplained variance) between the
two models
Null Hyp: the difference is just due to random variations
in the data
Example: Linear Fit vs. Cubic Fit?
[Figure: data d(i) vs. time t (hours), fit with a line and with a cubic]
Example: Linear Fit vs. Cubic Fit?
[Figure: A) linear fit, B) cubic fit of d(i) vs. time t (hours)]
the cubic fit has 14% smaller total error, E
The cubic fits 14% better, but …
The cubic has 4 coefficients, the line only 2, so the error of the cubic will tend to be smaller
anyway
and furthermore
the difference could just be dueto random variation
Use an F-test
degrees of freedom of the linear fit: νL = 50 data − 2 coefficients = 48
degrees of freedom of the cubic fit: νC = 50 data − 4 coefficients = 46
F = (EL/νL) / (EC/νC) = 1.14
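In code (a scipy.stats sketch, not the lecture's MatLab):

```python
from scipy import stats

F_est = 1.14   # (E_L/48) / (E_C/46), from the fits above
# two-sided: either model's residual variance could be on top
p = stats.f.sf(F_est, 48, 46) + stats.f.cdf(1.0 / F_est, 48, 46)
# p is large: such F ratios are common under the Null Hypothesis
```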
so use an F test
in our case: Fest = 1.14, (N1, N2) = (48, 46)
in our case
P = 0.794
Values of F greater than Fest, or less than 1/Fest, are very common,
so the Null Hypothesis cannot be rejected: there is no reason to think one model is 'really' better than the other.
Degrees of freedom
• All the finite-sample tests depend on how many degrees of freedom (DOFs) you assume.
• In some applications, every sample is independent, so #DOFs = #samples.
• In a lot of our work this isn't true! e.g. time series have "serial correlation":
• one value is correlated with the next one
• real DOFs are more like ~ length / (autocorrelation decay time)
• another way to think of it: 2 DOFs per Fourier component
• Parametric significance hinges on DOFs. Hazard! This is why you should kick your data around a lot before falling back on these canned tests.
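One common rule of thumb for serially correlated series (not from the slides) is Neff = N·(1 − r1)/(1 + r1), with r1 the lag-1 autocorrelation; a Python sketch:

```python
import numpy as np

def effective_dof(x):
    # crude effective sample size for a serially correlated series,
    # using the lag-1 autocorrelation r1: N_eff = N * (1 - r1)/(1 + r1)
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
    return len(x) * (1.0 - r1) / (1.0 + r1)

rng = np.random.default_rng(0)
white = rng.standard_normal(2000)   # independent samples: N_eff ~ N
red = np.cumsum(white)              # strong serial correlation: tiny N_eff
```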
A cautionary tale
• Unnamed young assistant professor (and several coauthors)
• Studying year-to-year changes in the western edge of the Atlantic summer subtropical high. Important for climate impacts (moisture flux into the SE US, tropical storm steering)
• Watch carefully for the null hypothesis...
[Figure: −Z850' at FL panhandle & 9-y smooth; −PDO 9-y smooth; −PDO + ¼ AMO 9-y smooth; global T]
“We thoroughly investigated possible natural causes, including the Atlantic Multidecadal Oscillation (AMO) and Pacific Decadal Oscillation (PDO), but found no links...Our analysis strongly suggests that the changes in the NASH [Z850'] are mainly due to anthropogenic warming.”
This claim fails the eyeball test, in my view
The evidence (mis)used:"Are the observed changes of the NASH caused by natural climate variability or anthropogenic forcing? We have examined the relationship between the changes of NASH and other natural decadal variability modes, such as the AMO and the PDO (Fig. 2). The correlation between the AMO (PDO) index and longitude of the western ridge is only 0.19 (0.18) and does not pass significance tests. Thus, natural decadal modes do not appear to explain the changes of NASH. We therefore examine the potential of anthropogenic forcing..."
unsmoothed indices, yet the word "decadal" is in the name
The evidence (mis)used:The correlation between the AMO (PDO) index and longitude of the western ridge is only 0.19 (0.18) and does not pass significance tests. Thus, natural decadal modes do not appear to explain the changes of NASH.
This is factually correct (table): correlation would have to be 0.25 to be significantly (at 95%) different from zero, with 60 degrees of freedom (independent samples).
Logical flaw: Null hypothesis misuse
• Hypothesis: that the PDO explains the Z850' signal
• Null hypothesis: that the PDO-Z850' correlation is really zero, and just happens to be 0.18 or 0.19 due to random sampling fluctuations
• t-test result: we cannot reject the null hypothesis with 95% confidence
• Fallacious leap: the authors concluded that the null hypothesis is therefore true, i.e. that "no links" to the PDO are "strongly suggest[ed]" by the evidence (as stated in their popular-press quote).
Flaw in the spirit of "null"
• Hypothesis: that a trend is in the data, ready to extrapolate into the future (which would be a splashy, newsworthy result)
• Null Hyp: that previously described natural oscillations suffice to explain the low-frequency component of the data (oatmeal)
• The first test: eyeball
[Figure: −Z850' at FL panhandle & 9-y smooth; −PDO 9-y smooth; −PDO + ¼ AMO 9-y smooth]
The correlation of these smoothed curves would be much higher than 0.19, but with only ~2 DOFs. Beware very small N like that! Trust your eyes at that point, not a canned test.
"The correlation between the AMultidecadalO (PDecadalO) index and longitude of the western ridge is only 0.19 (0.18) and does not pass significance tests. Thus, natural decadal modes do not appear to explain the changes..."
Subtler point: a spectral view of DOFs in time series. Use smoothing to isolate the "decadal" part of noisy "indices" (pattern correlations, defined every day).
Went wrong from step 0 (the choice of variable to study)
[Figures: Z850' and psi'; v850' (the real interest)]