Hypothesis testing and confidence intervals by resampling
by J. Kárász
Contents
• The bootstrap method
• The bootstrap analysis of the Kolmogorov-Smirnov test
• The bootstrap analysis of the GEV parameter investigation
• The bootstrap analysis of the moving window method
Testing of homogeneity
• Kolmogorov-Smirnov test
• Investigating the time dependence of the GEV parameter
Bootstrap method - conditions of use
$X_1, X_2, \ldots, X_n \overset{iid}{\sim} F$
is a random sample from the unknown probability distribution function (F) with finite variance.
$\theta = \theta(F)$ is the unknown parameter, a function of F.
$\hat\theta = \hat\theta(X_1, X_2, \ldots, X_n)$
is the non-parametric estimate of the parameter, a function of the random sample.
Bootstrap method - bootstrap estimate of the standard error
$\sigma(F) = \left[ \mathrm{Var}_F\left( \hat\theta(X_1, X_2, \ldots, X_n) \right) \right]^{1/2}$
is the standard error of the estimate.
Then the bootstrap estimate is $\hat\sigma = \sigma(\hat F)$.
Unfortunately, in most cases it is impossible to express it as a simple function of $\hat F$ or the random sample, so we have to use a numerical approximation.
Bootstrap method - bootstrap sample and bootstrap replicate
To approximate the bootstrap estimate of the standard error, the bootstrap algorithm draws random samples from the empirical distribution function $\hat F$:
$X_1^*, X_2^*, \ldots, X_n^* \overset{iid}{\sim} \hat F$
where $P(X_i^* = x_j) = \frac{1}{n}$, $i, j = 1, 2, \ldots, n$.
This is the bootstrap sample. It is nothing but a random sample drawn with replacement from $x_1, x_2, \ldots, x_n$. By evaluating the statistic of interest on it we get a bootstrap replicate:
$\hat\theta^* = \hat\theta(X_1^*, X_2^*, \ldots, X_n^*) = \hat\theta(X^*)$
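A minimal sketch of one bootstrap sample and one bootstrap replicate, using only the Python standard library. The data values, the choice of the median as the statistic, and the function names are assumptions made for illustration; they do not come from the slides.

```python
import random
import statistics

def bootstrap_sample(data, rng):
    """Draw n values from data with replacement; each value has probability 1/n per draw."""
    return rng.choices(data, k=len(data))

def bootstrap_replicate(data, statistic, rng):
    """Evaluate the statistic of interest on one bootstrap sample."""
    return statistic(bootstrap_sample(data, rng))

rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
levels = [612, 580, 701, 655, 590, 634, 720, 668]  # hypothetical data, not from the slides
x_star = bootstrap_sample(levels, rng)
theta_star = bootstrap_replicate(levels, statistics.median, rng)
```

Because the draw is with replacement, some original values typically appear several times in `x_star` and others not at all.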
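Bootstrap method - approximation with Monte Carlo method
1. Independently draw a large number (B) of bootstrap samples:
$X^*(1), X^*(2), \ldots, X^*(B)$
2. Evaluate the statistic of interest, so we get B bootstrap replicates:
$\hat\theta^*(b) = \hat\theta(X^*(b)), \quad b = 1, 2, \ldots, B$
3. Calculate the sample mean and sample standard deviation of the replicates:
$\hat\theta^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat\theta^*(b)$
$\hat\sigma_B = \left[ \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat\theta^*(b) - \hat\theta^*(\cdot) \right)^2 \right]^{1/2}$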
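The three steps can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function name `bootstrap_se`, the default B, the seed, and the data are all hypothetical.

```python
import random
import statistics

def bootstrap_se(data, statistic, B=1000, seed=0):
    """Monte Carlo approximation of the bootstrap standard error."""
    rng = random.Random(seed)
    n = len(data)
    # Steps 1 and 2: draw B bootstrap samples and evaluate the statistic on each.
    replicates = [statistic(rng.choices(data, k=n)) for _ in range(B)]
    # Step 3: sample mean and sample standard deviation (divisor B - 1).
    mean_rep = statistics.fmean(replicates)
    se = statistics.stdev(replicates)
    return mean_rep, se, replicates

levels = [612, 580, 701, 655, 590, 634, 720, 668]  # hypothetical data
mean_rep, se, replicates = bootstrap_se(levels, statistics.mean, B=500)
```

`statistics.stdev` uses the B - 1 divisor, which matches the sample standard deviation in step 3.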
Bootstrap method - confidence intervals
The histogram of the bootstrap replicates is an empirical density function for $\hat\theta$, so the $100\alpha$-th and $100(1-\alpha)$-th histogram percentiles are suitable limit estimates for the $100(1-2\alpha)$ percent confidence interval.
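A small sketch of reading the interval limits off the sorted replicates. The indexing rule (floor for the lower limit, ceiling for the upper) is one simple convention among several and is an assumption here, as are the function name and the `alpha` parameter.

```python
import math

def percentile_interval(replicates, alpha=0.05):
    """Return the empirical alpha and (1 - alpha) quantiles of the replicates."""
    ordered = sorted(replicates)
    B = len(ordered)
    lower = ordered[math.floor(alpha * (B - 1))]       # lower limit
    upper = ordered[math.ceil((1 - alpha) * (B - 1))]  # upper limit
    return lower, upper

# Hypothetical replicates 0, 1, ..., 100, chosen so the result is easy to inspect.
lower, upper = percentile_interval(list(range(101)), alpha=0.05)
```

With more replicates the limits become less sensitive to the exact indexing rule.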
Bootstrap method - hypothesis testing
H0: $X_1, X_2, \ldots, X_n$ are iid random variables.
H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.
The bootstrap samples are drawn independently from the same distribution ($\hat F$), so if H0 holds, then $\hat\theta$ and the replicates come from quite similar distributions, because F and $\hat F$ are similar for large sample sizes.
If $\hat\theta$ falls outside the empirical confidence interval, we accept H1; otherwise we accept H0.
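The decision rule can be sketched as below. Resampling destroys the time order, so the statistic has to be sensitive to order for the comparison to be informative; the half-mean difference used here is a hypothetical stand-in, not the statistic used in the slides, and the function names, B, `alpha`, and the series are likewise assumptions.

```python
import math
import random
import statistics

def half_mean_difference(series):
    """Hypothetical order-sensitive statistic: mean of second half minus mean of first half."""
    mid = len(series) // 2
    return statistics.fmean(series[mid:]) - statistics.fmean(series[:mid])

def bootstrap_iid_test(series, statistic, B=2000, alpha=0.025, seed=0):
    """Compare the observed statistic with the empirical interval of its bootstrap replicates."""
    rng = random.Random(seed)
    n = len(series)
    observed = statistic(series)
    reps = sorted(statistic(rng.choices(series, k=n)) for _ in range(B))
    lower = reps[math.floor(alpha * (B - 1))]
    upper = reps[math.ceil((1 - alpha) * (B - 1))]
    reject = observed < lower or observed > upper  # True means H1 is accepted
    return observed, lower, upper, reject

trend = [100 + 5 * i for i in range(40)]  # hypothetical series with a strong trend
observed, lower, upper, reject = bootstrap_iid_test(trend, half_mean_difference)
```

For a series with a pronounced trend like this one, the observed value lies far outside the interval of the replicates, so H0 is rejected.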
Kolmogorov-Smirnov test
H0: $X_1, X_2, \ldots, X_n$ are iid random variables.
H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.
Our assumption was that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This can even indicate a trend.
Results
Columns: t=1, t=2, t=3
namh
namq
polh 1
polq
szegh
szolh 1 1
tbh
tivh
tivq
zahh
zahq
1 means H0 was rejected.
t = 1, 2, 3: parameter for the cut point in the K-S test.
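One way such a test could be set up is to split the series at a cut point and compare the two parts with the two-sample K-S distance. This is only a sketch under assumptions: the slides do not say how t maps to a cut position, nor whether the reported statistic is the distance D or a p-value, so the `cut` index, the use of D, and all function names are hypothetical.

```python
import random

def ks_two_sample(a, b):
    """Two-sample K-S statistic: largest distance between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:  # advance past all values <= x (handles ties)
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_split_statistic(series, cut):
    """K-S distance between the part before and the part after the cut index."""
    return ks_two_sample(series[:cut], series[cut:])

def bootstrap_ks_split(series, cut, B=1000, seed=0):
    """Observed split statistic plus B bootstrap replicates of it."""
    rng = random.Random(seed)
    n = len(series)
    observed = ks_split_statistic(series, cut)
    reps = [ks_split_statistic(rng.choices(series, k=n), cut) for _ in range(B)]
    return observed, reps

observed, reps = bootstrap_ks_split(list(range(20)), cut=10, B=200)  # hypothetical series
```

The observed value would then be compared with the empirical interval of `reps`, as in the general bootstrap test.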
Results - example
Annual maximum water level at Szolnok, t=1
akonf (lower confidence limit) 0.08170000
stat 0.40400000
mean 0.58684509
fkonf (upper confidence limit) 0.98890000
Results - example
Annual maximum water level at Szolnok, t=2
akonf (lower confidence limit) 0.07790000
stat 0.00960000
mean 0.58174930
fkonf (upper confidence limit) 0.98710000
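The decision for the two Szolnok examples can be checked directly from the reported numbers. The sketch assumes that akonf and fkonf are the lower and upper limits of the empirical confidence interval; the function and variable names are made up for illustration.

```python
def outside_interval(stat, lower, upper):
    """True if the statistic falls outside the empirical confidence interval."""
    return stat < lower or stat > upper

# Values copied from the two Szolnok examples (akonf, stat, fkonf).
szolnok = {
    1: {"akonf": 0.0817, "stat": 0.4040, "fkonf": 0.9889},
    2: {"akonf": 0.0779, "stat": 0.0096, "fkonf": 0.9871},
}
decisions = {
    t: outside_interval(v["stat"], v["akonf"], v["fkonf"])
    for t, v in szolnok.items()
}
print(decisions)  # {1: False, 2: True}
```

Under that reading, H0 is kept for t=1 and rejected for t=2, because 0.0096 lies below the lower limit 0.0779.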
Examining the GEV shape parameter
H0: $X_1, X_2, \ldots, X_n$ are iid random variables.
H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.
Our assumption is that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This can even indicate a trend.
Results
namh
namq
polh
polq
szegh
szolh
tbh
tivh
tivq 1
zahh
zahq
1 means H0 was rejected.
No dataset rejected H0 in both the K-S test and the GEV parameter test.
Results – data table
        mean            stat            lower quantile          upper quantile
namh    0.001840892     0.001009485     0.0009822134            0.002242123
namq    0.003384007     0.002375165     0.002281889             0.00474283
polh    0.002357022     0.002284407     0.002236172             0.002665589
polq    -0.0008121447   0.0002126318    -0.005037174            0.0002659811
szeg    0.0005446647    0.000524554     0.0005190951            0.000650231
szol    0.002080454     0.002240719     0.001939278             0.002299348
tbh     -0.001831871    -0.002128239    -0.002172273            0.001530627
tbq     0.000863550     0.0003681706    0.000368170646947404    0.00456312439698707
tivh    0.001255741     0.0006564334    0.0006503345            0.002444052
tivq    0.002279426     0.008236962     0.00144411              0.008093537
zah     0.0009532596    0.0009549216    0.0008029558            0.001073905
zah     0.002086978     0.002254521     0.00196992              0.002347529
Results - example
        mean            stat            lower quantile          upper quantile
polh    0.002357022     0.002284407     0.002236172             0.002665589
tivq    0.002279426     0.008236962     0.00144411              0.008093537
Further questions – two-peaked bootstrap empirical distributions
Permutation testing - similar to the bootstrap method
The same as the bootstrap algorithm, except that the permutation sample is drawn without replacement.
The hypothesis testing is similar too: we examine the estimate and the empirical confidence interval.
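The only change from the bootstrap draw is sampling without replacement, which amounts to a random reordering of the data. A minimal sketch, with hypothetical data and function names:

```python
import random

def permutation_sample(data, rng):
    """Draw all n values without replacement, i.e. a random reordering of data."""
    return rng.sample(data, k=len(data))

def permutation_replicates(data, statistic, B=1000, seed=0):
    """Evaluate the statistic on B independent permutation samples."""
    rng = random.Random(seed)
    return [statistic(permutation_sample(data, rng)) for _ in range(B)]

rng = random.Random(0)  # fixed seed, only for reproducibility
levels = [612, 580, 701, 655, 590, 634, 720, 668]  # hypothetical data
perm = permutation_sample(levels, rng)
# Order-sensitive example statistic: last value minus first value.
reps = permutation_replicates(levels, lambda s: s[-1] - s[0], B=200)
```

Every permutation sample contains exactly the original values, so a statistic that ignores order (such as the mean) takes the same value on every replicate; only order-sensitive statistics are informative here.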
Moving window method forecast - analysis by permutation method
Our aims were:
• Simulating the original dataset by permutation.
• Checking the quality of the forecast.
Results - example