Hypothesis testing and confidence intervals by resampling

by J. Kárász

Contents

• The bootstrap method

• The bootstrap analysis of the Kolmogorov-Smirnov test

• The bootstrap analysis of the GEV parameter investigation

• The bootstrap analysis of the moving window method

Testing of homogeneity

• Kolmogorov-Smirnov test

• Investigating the time dependence of the GEV parameter

Bootstrap method: conditions of use

$X_1, X_2, \ldots, X_n \overset{iid}{\sim} F$

is a random sample from the unknown probability distribution function $F$ with finite variance.

$\theta = \theta(F)$ is the unknown parameter, a function of $F$.

$\hat\theta = \hat\theta(X_1, X_2, \ldots, X_n)$

is the non-parametric estimate of the parameter, a function of the random sample.

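Bootstrap method: bootstrap estimate of the standard error

$\sigma(F) = \left[\operatorname{Var}_F\, \hat\theta(X_1, X_2, \ldots, X_n)\right]^{1/2}$

is the standard error of the estimate.

Then the bootstrap estimate is $\hat\sigma = \sigma(\hat F)$, where $\hat F$ is the empirical distribution function of the sample.

Unfortunately, in most cases it is impossible to express it as a simple function of $\hat F$ or the random sample, so we have to use a numerical approximation.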
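One of the few cases where a closed form does exist is the sample mean, chosen here only as an illustration and not taken from the slides. Writing $x_1, \ldots, x_n$ for the observed values and $\bar x$ for their mean, the variance of the mean of $n$ independent draws from the empirical distribution gives:

$$\sigma(\hat F) = \left[\frac{1}{n^2}\sum_{i=1}^{n}(x_i - \bar x)^2\right]^{1/2}, \qquad \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$$

For statistics such as a median or a fitted distribution parameter no comparably simple expression is available, which is why a numerical approximation is needed.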

Bootstrap method: bootstrap sample and bootstrap replicate

To approximate the bootstrap estimate numerically, the bootstrap algorithm takes random samples from the empirical distribution function:

$X_1^*, X_2^*, \ldots, X_n^* \overset{iid}{\sim} \hat F$

where

$P(X_i^* = x_j) = \frac{1}{n}, \quad i, j = 1, 2, \ldots, n$

This is the bootstrap sample. It is nothing else but a random sample drawn with replacement from $x_1, x_2, \ldots, x_n$. By evaluating the statistic of interest we get a bootstrap replicate:

$\hat\theta^* = \hat\theta(X_1^*, X_2^*, \ldots, X_n^*) = \hat\theta(X^*)$
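The sampling step can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names `bootstrap_sample` and `bootstrap_replicate`, the choice of the median as the statistic, and the numbers in `levels` are all assumptions made for the example.

```python
import random
from statistics import median


def bootstrap_sample(data, rng):
    """Draw len(data) values from data with replacement (probability 1/n each)."""
    return rng.choices(data, k=len(data))


def bootstrap_replicate(data, statistic, rng):
    """Evaluate the statistic of interest on one bootstrap sample."""
    return statistic(bootstrap_sample(data, rng))


rng = random.Random(0)  # fixed seed, only for reproducibility
levels = [612, 580, 701, 655, 590, 634, 720, 598]  # made-up values, not the author's data

star_sample = bootstrap_sample(levels, rng)
star_replicate = bootstrap_replicate(levels, median, rng)
print(star_sample, star_replicate)
```

Because the draw is with replacement, some observed values typically appear several times in `star_sample` and others not at all.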

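Bootstrap method: approximation with Monte Carlo method

1. Independently draw a large number $B$ of bootstrap samples:

$X^*(1), X^*(2), \ldots, X^*(B)$

2. Evaluate the statistic of interest on each, so we get $B$ bootstrap replicates:

$\hat\theta^*(b) = \hat\theta(X^*(b)), \quad b = 1, 2, \ldots, B$

3. Calculate the sample mean and sample standard deviation of the replicates:

$\hat\theta^*(\cdot) = \frac{1}{B}\sum_{b=1}^{B}\hat\theta^*(b)$

$\hat\sigma_B = \left[\frac{\sum_{b=1}^{B}\left(\hat\theta^*(b) - \hat\theta^*(\cdot)\right)^2}{B-1}\right]^{1/2}$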
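The three steps can be sketched as follows. This is a hedged example under stated assumptions: the data are invented, the value B = 2000 is arbitrary, and the sample mean is used as the statistic only because its plug-in standard error has a closed form (population standard deviation divided by the square root of n) that the Monte Carlo value can be compared with.

```python
import math
import random
from statistics import mean, pstdev, stdev


def bootstrap_replicates(data, statistic, B, rng):
    """Steps 1 and 2: draw B bootstrap samples and evaluate the statistic on each."""
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(B)]


def bootstrap_standard_error(replicates):
    """Step 3: sample standard deviation of the replicates (divisor B - 1)."""
    return stdev(replicates)


rng = random.Random(1)  # fixed seed, only for reproducibility
data = [612, 580, 701, 655, 590, 634, 720, 598]  # made-up values

reps = bootstrap_replicates(data, mean, 2000, rng)
replicate_mean = mean(reps)                 # mean of the replicates
se_boot = bootstrap_standard_error(reps)    # Monte Carlo standard error

# Closed-form plug-in value, available here only because the statistic is the mean.
se_plugin = pstdev(data) / math.sqrt(len(data))
print(replicate_mean, se_boot, se_plugin)
```

The two standard errors should agree up to Monte Carlo noise, and the gap shrinks as B grows. For other statistics only `se_boot` is available.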

Bootstrap method: confidence intervals

The histogram of the bootstrap replicates is an empirical density function for $\hat\theta$, so the $100\alpha$ and $100(1-\alpha)$ histogram percentiles are suitable limit estimates for the $100(1-2\alpha)$ percent confidence interval.
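A percentile interval of this kind can be read directly off the sorted replicates. The sketch below is illustrative only: the helper name `percentile_interval`, the level `alpha = 0.05`, and the data are assumptions. The index rule used to pick the percentiles is one simple convention among several.

```python
import math
import random
from statistics import mean


def percentile_interval(replicates, alpha):
    """Return the alpha and (1 - alpha) empirical percentiles of the replicates."""
    ordered = sorted(replicates)
    last = len(ordered) - 1
    lower = ordered[math.floor(alpha * last)]
    upper = ordered[math.ceil((1 - alpha) * last)]
    return lower, upper


rng = random.Random(2)  # fixed seed, only for reproducibility
data = [612, 580, 701, 655, 590, 634, 720, 598]  # made-up values

# 2000 bootstrap replicates of the sample mean.
reps = [mean(rng.choices(data, k=len(data))) for _ in range(2000)]
lower, upper = percentile_interval(reps, alpha=0.05)
print(lower, upper)
```

With `alpha = 0.05` the two limits bound a nominal 90 percent interval.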

Bootstrap method: hypothesis testing

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

The bootstrap samples are drawn independently from the same distribution ($\hat F$), so if H0 holds, then $\hat\theta$ and the replicates come from quite similar distributions, because $F$ and $\hat F$ are similar for large sample sizes.

If $\hat\theta$ falls outside the empirical confidence interval, we reject H0 and accept H1; otherwise we retain H0.
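The decision rule can be sketched as below. The statistic must depend on the order of the observations, because resampling destroys that order. The least-squares slope against the time index (`trend_slope`) is a hypothetical choice made for this example, not necessarily the statistic used in the slides. The series, B, and alpha are also assumptions.

```python
import math
import random


def trend_slope(values):
    """Least-squares slope of the values against their index 0, 1, ..., n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def bootstrap_iid_test(values, statistic, B, alpha, rng):
    """Compare the observed statistic with the percentile interval of its replicates."""
    observed = statistic(values)
    n = len(values)
    reps = sorted(statistic(rng.choices(values, k=n)) for _ in range(B))
    lower = reps[math.floor(alpha * (B - 1))]
    upper = reps[math.ceil((1 - alpha) * (B - 1))]
    reject = not (lower <= observed <= upper)
    return observed, (lower, upper), reject


rng = random.Random(3)  # fixed seed, only for reproducibility
# Synthetic series with a clear upward trend, so H0 should be rejected.
trending = [10.0 * t + rng.uniform(-5, 5) for t in range(30)]

observed, interval, reject = bootstrap_iid_test(trending, trend_slope, 1000, 0.025, rng)
print(observed, interval, reject)
```

For this synthetic series the observed slope lies far above the upper limit, because the replicates are computed on reshuffled values and scatter around zero.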

Kolmogorov-Smirnov test

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

Our assumption was that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This can even mean a trend.
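One way to turn the Kolmogorov-Smirnov idea into an order-sensitive statistic is to split the series at a cut point and compare the two parts. The sketch below is an assumption about the setup, not a reconstruction of the author's procedure. The slides do not say how the cut point parameter t enters, so `cut` here is simply a hypothetical index, and the series is synthetic.

```python
import random
from bisect import bisect_right


def ks_two_sample(a, b):
    """Two-sample K-S distance: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The supremum is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def ks_split(values, cut):
    """K-S distance between the part before and the part after the cut index."""
    return ks_two_sample(values[:cut], values[cut:])


rng = random.Random(4)  # fixed seed, only for reproducibility
# Synthetic series with a level shift halfway through (not real water levels).
series = [rng.gauss(600, 40) for _ in range(20)] + [rng.gauss(700, 40) for _ in range(20)]
cut = 20  # hypothetical cut index

observed = ks_split(series, cut)
# Bootstrap replicates: the same statistic on resampled, order-free series.
reps = [ks_split(rng.choices(series, k=len(series)), cut) for _ in range(500)]
print(observed, min(reps), max(reps))
```

The observed value would then be compared with the percentiles of `reps`, following the decision rule of the bootstrap hypothesis test.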

Results

Dataset   H0 rejected (columns t=1, t=2, t=3)
namh
namq
polh      1
polq
szegh
szolh     1 1
tbh
tivh
tivq
zahh
zahq

1 means H0 was rejected.

t = 1, 2, 3: parameter for the cut point in the K-S test.

Results - example
Annual maximum water level at Szolnok, t=1

akonf (lower confidence limit)  0.08170000
stat                            0.40400000
mean                            0.58684509
fkonf (upper confidence limit)  0.98890000

Results - example
Annual maximum water level at Szolnok, t=2

akonf (lower confidence limit)  0.07790000
stat                            0.00960000
mean                            0.58174930
fkonf (upper confidence limit)  0.98710000

Examining the GEV shape parameter

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

Our assumption is that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This can even mean a trend.

Results

Dataset   H0 rejected
namh
namq
polh
polq
szegh
szolh
tbh
tivh
tivq      1
zahh
zahq

1 means H0 was rejected.

No dataset led to rejection of H0 in both the K-S test and the GEV parameter test.

Results – data table

Dataset  mean           stat           lower quantile (a. kvant)  upper quantile (f. kvant)
namh     0.001840892    0.001009485    0.0009822134               0.002242123
namq     0.003384007    0.002375165    0.002281889                0.00474283
polh     0.002357022    0.002284407    0.002236172                0.002665589
polq     -0.0008121447  0.0002126318   -0.005037174               0.0002659811
szeg     0.0005446647   0.000524554    0.0005190951               0.000650231
szol     0.002080454    0.002240719    0.001939278                0.002299348
tbh      -0.001831871   -0.002128239   -0.002172273               0.001530627
tbq      0.000863550    0.0003681706   0.000368170646947404       0.00456312439698707
tivh     0.001255741    0.0006564334   0.0006503345               0.002444052
tivq     0.002279426    0.008236962    0.00144411                 0.008093537
zah      0.0009532596   0.0009549216   0.0008029558               0.001073905
zah      0.002086978    0.002254521    0.00196992                 0.002347529

Results - example

Dataset  mean         stat         lower quantile (a. kvant)  upper quantile (f. kvant)
polh     0.002357022  0.002284407  0.002236172                0.002665589
tivq     0.002279426  0.008236962  0.00144411                 0.008093537

Further questions – two-peaked bootstrap empirical distributions

Permutation testing: similar to the bootstrap method

Same as the bootstrap algorithm, except that the permutation sample is drawn without replacement.

The hypothesis testing is similar too: we examine the estimate and the empirical confidence interval.
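The only change relative to the bootstrap draw is sampling without replacement, which amounts to shuffling the observations. The sketch below is a minimal illustration under assumptions: the data are invented, and `half_difference` is a hypothetical order-sensitive statistic chosen for the example.

```python
import random


def permutation_sample(data, rng):
    """Return the data in a random order: a draw of size n without replacement."""
    return rng.sample(data, k=len(data))


def half_difference(values):
    """Mean of the second half minus mean of the first half (order-sensitive)."""
    mid = len(values) // 2
    first, second = values[:mid], values[mid:]
    return sum(second) / len(second) - sum(first) / len(first)


rng = random.Random(5)  # fixed seed, only for reproducibility
levels = [612, 580, 701, 655, 590, 634, 720, 598]  # made-up values

perm = permutation_sample(levels, rng)
perm_reps = [half_difference(permutation_sample(levels, rng)) for _ in range(1000)]
perm_mean = sum(perm_reps) / len(perm_reps)
print(perm, perm_mean)
```

Every permutation sample contains exactly the original values, so a statistic that ignores order, such as the plain mean, gives the same replicate every time. Only order-sensitive statistics are informative here.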

Moving window method forecast: analysis by permutation method

Our aims were:

• Simulating the original dataset by permutation.

• Checking the quality of the forecast.

Results - example

Results - example
