Permutation Procedures, Bootstrap Methods and the Jackknife Bob Livezey Climate Services Division/OCWWS/NWS AMS Short Course on Significance Testing, Model Evaluation and Alternatives Seattle, January 11, 2004

Page 1:

Permutation Procedures, Bootstrap Methods and the Jackknife

Bob Livezey

Climate Services Division/OCWWS/NWS

AMS Short Course on Significance Testing, Model Evaluation and Alternatives

Seattle, January 11, 2004

Page 2:

Outline

• Introduction
  – Problems addressed
  – What is being done, why, and how
• Resampling/rerandomization primer
• Bootstrap/correlation example
  – Histograms, standard error, bias, confidence intervals
  – Significance test
• Multivariate applications
  – Discussion examples
  – Livezey and Chen example
• Serial correlation
  – Impact
  – Solutions
• Summary

Page 3:

Introduction

• Problems: A statistic has been estimated from a sample, and we want to
  – know how confident we can be in the estimator and what its standard error and bias are, and
  – gauge the estimator against a null distribution we want to discount.
• What, why, and how.
  – Rather than using classical and/or analytical statistics, we use brute-force (Monte Carlo) computations to generate huge numbers of synthetic or fake samples. These samples form the basis for constructing sampling distributions of either the estimator itself or its null distribution, addressing the two problems respectively.

Page 4:

Introduction

• What, why, and how.
  – It is not clear that the assumptions for the usual approaches are satisfied.
  – Sample sizes are too small for satisfactory application of the usual approaches.
  – It is not easy or possible to derive analytical descriptions of distributions for the estimator.
  – The inference problem is complicated.

Page 5:

Introduction

• What, why, and how.
  – Resampling/rerandomization: Using the available sample to generate additional samples.
  – Statistical modeling: Fitting a model to the available sample and using the model to generate additional samples. This is another meaning of "Monte Carlo method"; an example is time series modeling.

Page 6:

Introduction

• Take away knowledge:
  – Clear intuitive understanding of the basic problems, and the whys and hows of computer-intensive solutions to the problems.
  – Basic algorithms for permutation, bootstrap, and jackknife procedures and when to use them.
  – The necessity to preserve spatial-temporal interdependence in applying the methods.
  – Reference sources to build understanding and study more examples.

Page 7:

Comparison of Resampling Techniques

Resampling      | Procedure                                                       | Applications
Permutation     | Samples are drawn at random from original pool without replacement | Tests of hypotheses
Bootstrap (1)   | Samples are drawn with replacement                              | Tests of hypotheses AND standard error, bias, and confidence intervals of estimator
Jackknife (2)   | Samples consist of original pool with one at a time withheld    | Standard error, bias, and confidence intervals of estimator

(1) Most versatile.
(2) Generally outperformed by others.

Page 8:

Resampling Examples

• Mean DJF temperature in Eastern North Dakota for 10 moderate to strong El Nino years from a 60-year record.

• Null hypothesis is that moderate to strong El Ninos do not impact DJF temperature in Eastern North Dakota.

• Null distribution is for average of 10 DJFs chosen randomly.

Page 9:

Resampling Examples

• Null distributions from permutation and bootstrap procedures:

– Permutation: Shuffle the 60 years, relabel them, pull out the 10 relabeled El Nino years, and average them (equivalent to a random draw of 10 from 60 without replacement). Repeat a huge number (1000?) of times.

– Bootstrap: Shuffle a huge deck in which the 60 years are replicated many, many times, take the first 60, relabel them, and average the 10 relabeled El Nino years (equivalent to a random draw of 10 from 60 with replacement). Repeat a huge number (1000?) of times.
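The two null-distribution recipes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 60-winter record is synthetic (it is not the Eastern North Dakota data), and the names `null_means` and `lower_tail_fraction` are hypothetical helpers introduced here.

```python
import random
import statistics


def null_means(pool, k, n_rep=1000, replace=False, seed=0):
    """Return n_rep means of k values drawn at random from pool.

    replace=False mimics the permutation procedure (k of N without
    replacement); replace=True mimics the bootstrap (with replacement).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_rep):
        if replace:
            draw = rng.choices(pool, k=k)  # every value has probability 1/N
        else:
            draw = rng.sample(pool, k)     # no year can be drawn twice
        means.append(statistics.fmean(draw))
    return means


def lower_tail_fraction(null, observed):
    """Fraction of null values at or below the observed statistic."""
    return sum(m <= observed for m in null) / len(null)


# Hypothetical 60-winter DJF record (deg F), for illustration only.
data_rng = random.Random(42)
djf = [data_rng.gauss(12.0, 4.0) for _ in range(60)]

perm_null = null_means(djf, k=10, replace=False)
boot_null = null_means(djf, k=10, replace=True)
```

An observed 10-year El Nino mean could then be compared with either list, for example through `lower_tail_fraction(perm_null, observed_mean)`.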

Page 10:

[Figure: Null resampling distributions (1000 samples) of 10-year means of Eastern North Dakota DJF temperature (1941-2000), comparing bootstrap and permutation. x-axis: 0.5 °F bins (upper limits), 4 to 20; y-axis: relative frequency (%), 0 to 16.]

Page 11:

Resampling Examples

• Distribution of the 10 El Nino-year mean from bootstrap and jackknife procedures:

– Bootstrap: Shuffle a huge deck in which the 10 El Nino years are replicated many, many times and average the first 10 (equivalent to a random draw of 10 from 10 with replacement). Repeat a huge number (1000?) of times.

– Jackknife: Delete one of the 10 El Nino years from the sample and average the rest. Repeat for each of the 10 years to produce ten 9-year means.
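A minimal sketch of both procedures follows. The ten temperatures are made-up values, not the slide data, and the function names are hypothetical. The standard-error expression is the usual jackknife formula, which is not spelled out on the slide.

```python
import math
import random
import statistics


def bootstrap_means(sample, n_rep=1000, seed=0):
    """Means of n_rep resamples of size n drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_rep)]


def jackknife_means(sample):
    """Leave-one-out means: n values, each the mean of the other n - 1."""
    total = sum(sample)
    n = len(sample)
    return [(total - x) / (n - 1) for x in sample]


def jackknife_standard_error(leave_one_out):
    """Usual jackknife standard error: sqrt((n-1)/n * sum of squared deviations)."""
    n = len(leave_one_out)
    center = statistics.fmean(leave_one_out)
    return math.sqrt((n - 1) / n * sum((v - center) ** 2 for v in leave_one_out))


# Hypothetical El Nino-year DJF temperatures (deg F), for illustration only.
el_nino = [14.1, 9.8, 16.3, 12.0, 11.5, 15.2, 13.4, 10.9, 17.0, 12.6]

boot = bootstrap_means(el_nino)   # 1000 bootstrap 10-year means
jack = jackknife_means(el_nino)   # ten 9-year means
```

For a sample mean, the jackknife standard error reduces to the familiar s/sqrt(n), which gives a quick sanity check on the implementation.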

Page 12:

[Figure: Resampling distributions of 10-year means of Eastern North Dakota DJF temperature (1941-2000), comparing bootstrap (1000 samples) and jackknife (10 samples). x-axis: 0.5 °F bins (upper limits), 4 to 20; y-axis: relative frequency (%), 0 to 40.]

Page 13:

[Figure: Bootstrap distributions (1000 samples) of 10-year means of Eastern North Dakota DJF temperature (1941-2000), null versus El Nino. x-axis: 0.5 °F bins (upper limits), 4 to 20; y-axis: relative frequency (%), 0 to 16.]

Page 14:

Resampling Examples

• Notes for permutation and bootstrap:

– Random selection uses a uniform distribution, assigning probability 1/N (N is the sample size) to each member of the sample being drawn from.

– The number of replications depends on the distribution attribute and the precision desired (e.g., information about the tails).
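One rough way to see why tail information needs more replications: if a tail probability p is estimated as a fraction of B independent replications, its Monte Carlo standard error is the binomial value sqrt(p(1-p)/B). The sketch below only evaluates that expression; the function name is hypothetical and the rule is offered as a guide, not as the criterion used in the course.

```python
import math


def tail_standard_error(p, n_rep):
    """Binomial standard error of a tail fraction estimated from n_rep replications."""
    return math.sqrt(p * (1.0 - p) / n_rep)


# Precision of an estimated 5% tail for two replication counts.
for n_rep in (1000, 10000):
    print(n_rep, tail_standard_error(0.05, n_rep))
```

Because the error shrinks only as 1/sqrt(B), a tenfold increase in replications tightens the tail estimate by roughly a factor of three.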

Page 15:

Bootstrap Correlation Examples

• Correlations of JFM temperature at CD93 (San Diego) with JFM temperature at CD76 (Olympic Peninsula) and at CD67 (Central Florida) are 0.72 and -0.3, respectively.

• Computed
  – 10,000-sample bootstrap histograms for both. Paired data were resampled with replacement.
  – 10,000-sample bootstrap null histogram for corr(CD93,CD67). Each series was separately resampled with replacement to form pairs.
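The difference between the two resampling schemes can be sketched as follows. The series are synthetic stand-ins for the climate-division data, and `pearson` and `bootstrap_corr` are hypothetical helper names. The example uses 2000 replications to keep the run short, whereas the slides used 10,000.

```python
import math
import random


def pearson(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr(x, y, n_rep=10000, paired=True, seed=0):
    """Bootstrap correlations.

    paired=True resamples (x, y) pairs together, giving the sampling
    distribution of the correlation. paired=False resamples each series
    separately, which breaks the pairing and gives a null distribution.
    """
    rng = random.Random(seed)
    n = len(x)
    reps = []
    while len(reps) < n_rep:
        ix = rng.choices(range(n), k=n)
        iy = ix if paired else rng.choices(range(n), k=n)
        xs = [x[i] for i in ix]
        ys = [y[i] for i in iy]
        if len(set(xs)) > 1 and len(set(ys)) > 1:  # skip constant resamples
            reps.append(pearson(xs, ys))
    return reps


# Synthetic 50-year series with a weak negative relationship (illustration only).
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
y = [-0.3 * a + data_rng.gauss(0.0, 1.0) for a in x]

r_obs = pearson(x, y)
boot = bootstrap_corr(x, y, n_rep=2000, paired=True)
null = bootstrap_corr(x, y, n_rep=2000, paired=False)
# One-sided tail: share of null correlations at or below the observed value.
p_lower = sum(r <= r_obs for r in null) / len(null)
```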

Page 16:

[Figure: Bootstrap distributions (10000 samples), two panels: correlation (1950-1999) between JFM temperatures at CD93 and CD67, and the corresponding null correlation. x-axis: correlation; y-axis: relative frequency (%), 0 to 4. Annotation: .002 tail for corr -0.297.]

Page 17:

Bootstrap Correlation Examples

• Computed (continued)
  – For corr(CD93,CD76), from the B bootstrap replicates $\hat{\theta}^*_b$ of the sample correlation $\hat{\theta}$:
    • Standard error: $\hat{\sigma}_B = \left[ \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*_b - \bar{\theta}^* \right)^2 \right]^{1/2}$, where $\bar{\theta}^* = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*_b$
    • Bias: $\frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*_b - \hat{\theta}$
    • 68% confidence intervals (plus/minus one standard deviation in a standard normal distribution)
      – Percentile method
      – Bias-corrected percentile method (see Efron and Gong)
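The quantities listed on this slide can be computed from a list of bootstrap replicates roughly as below. This is a sketch under assumptions: `bootstrap_summary` is a hypothetical name, the quantile rule is a simple order-statistic choice, and the bias-corrected interval follows the basic recipe described by Efron and Gong (1983) rather than any code from the course.

```python
import math
from statistics import NormalDist, fmean


def bootstrap_summary(replicates, estimate, coverage=0.68):
    """Return (standard error, bias, percentile CI, bias-corrected CI)."""
    b = len(replicates)
    center = fmean(replicates)
    std_error = math.sqrt(sum((r - center) ** 2 for r in replicates) / (b - 1))
    bias = center - estimate
    ordered = sorted(replicates)

    def quantile(p):
        # Simple order-statistic quantile; other interpolation rules exist.
        k = min(b - 1, max(0, math.ceil(p * b) - 1))
        return ordered[k]

    alpha = (1.0 - coverage) / 2.0
    percentile = (quantile(alpha), quantile(1.0 - alpha))

    # Bias correction: shift the percentiles by z0, the normal score of the
    # fraction of replicates falling below the original estimate.
    normal = NormalDist()
    frac_below = sum(r < estimate for r in replicates) / b
    frac_below = min(max(frac_below, 0.5 / b), 1.0 - 0.5 / b)  # keep inv_cdf finite
    z0 = normal.inv_cdf(frac_below)
    z_alpha = normal.inv_cdf(alpha)
    corrected = (quantile(normal.cdf(2.0 * z0 + z_alpha)),
                 quantile(normal.cdf(2.0 * z0 - z_alpha)))
    return std_error, bias, percentile, corrected


# Deterministic toy replicates 0.01, 0.02, ..., 1.00 (not real bootstrap output).
toy = [i / 100 for i in range(1, 101)]
se, bias, pct, bc = bootstrap_summary(toy, estimate=0.505)
```

When half of the replicates fall below the estimate, z0 is zero and the bias-corrected limits coincide with the percentile limits, up to the granularity of the quantile rule.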

Page 18:

[Figure: Bootstrap distribution (1000 samples) for correlation (1950-1999) between JFM temperatures at CD93 and CD76. x-axis: correlation, 0.4 to 0.8; y-axis: relative frequency (%), 0 to 10. Annotation: Correlation 0.717.]

Page 19:

[Figure: Bootstrap distribution (10000 samples) for correlation (1950-1999) between JFM temperatures at CD93 and CD76. x-axis: correlation, 0.4 to 0.8; y-axis: relative frequency (%), 0 to 10. Annotations: Correlation 0.717; Bias 0.001; St. error 0.051; Conf. limits by the percentile method and the bias-corrected method.]

Page 20:

Multivariate Applications

• Sampling error for an estimator generally decreases as independent sample size increases. Ex. Florida January mean temperature.

[Figure: Florida January temperature (°F). Labels: Average; Start year.]

Page 21:

Multivariate Applications

• Samples drawn from different locations and/or times may not be independent of each other, i.e., they may be spatially and/or serially correlated.

– Bootstrap and permutation resampling under the null hypothesis among such locations and/or times reduces or destroys this interdependence.

– This leads to null distributions that are too narrow.

Page 22:

Multivariate Applications

• Interdependencies must be preserved when resampling.
  – Ex. DJF skill score for CPC temperature forecasts at 100 locations over 10 winters.
  – Both forecasts and observations have considerable spatial correlation.
  – Incorrect strategy for null distribution is to form forecast/observation pairs by separately resampling with replacement 1000 pooled forecasts and 1000 pooled observations.
  – Correct strategy is to form pairs by separately resampling with replacement 10 pooled forecast maps and 10 pooled observation maps.
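The map-wise strategy can be sketched as below. Everything here is illustrative: the three-category maps are synthetic, `fraction_correct` is a simple stand-in score rather than the CPC skill score, and the function names are hypothetical.

```python
import random


def resample_map_pairs(forecast_maps, observed_maps, rng):
    """Pair separately resampled forecast and observation maps (null case).

    Each argument holds one map per winter, and each map holds one value per
    location. Whole maps are drawn with replacement, so the spatial pattern
    inside every map stays intact.
    """
    n = len(forecast_maps)
    fcst = rng.choices(forecast_maps, k=n)
    obs = rng.choices(observed_maps, k=n)
    return list(zip(fcst, obs))


def fraction_correct(pairs):
    """Share of locations where forecast and observed categories agree."""
    hits = total = 0
    for fcst, obs in pairs:
        hits += sum(f == o for f, o in zip(fcst, obs))
        total += len(fcst)
    return hits / total


rng = random.Random(0)
n_winters, n_locations = 10, 100
# Hypothetical category maps (-1 below, 0 near, +1 above normal). Copying one
# draw to every location is a crude stand-in for strong spatial correlation.
forecasts = [[rng.choice((-1, 0, 1))] * n_locations for _ in range(n_winters)]
observations = [[rng.choice((-1, 0, 1))] * n_locations for _ in range(n_winters)]

null_scores = [fraction_correct(resample_map_pairs(forecasts, observations, rng))
               for _ in range(1000)]
```

Pooling all 1000 location values and resampling them individually would instead treat each location as independent, which is the incorrect strategy named above.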

Page 23:

Multivariate Applications

• In climate studies a defining problem is the Livezey and Chen (1983) example: determine the statistical significance of the correlation of the SOI time series with the full field of NH seasonal mean 700 mb heights.

It will be used to illustrate:

– The effects of spatial correlation on the spread of a false signal distribution;

– Field significance.

Page 24:

Multivariate Applications

Livezey and Chen (1983) estimated the probability that a map with a similar number of locally significant correlations could have been obtained by chance.

They coined the term field significance for this probability.

Page 25:

Multivariate Applications

Sampling distributions are developed by repeatedly computing correlations with random series instead of the SOI; the statistic is the count of passed local significance tests.

The distribution becomes narrower as the ratio of the domain size to the signal scale increases (from C to A to B).
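A toy version of this Monte Carlo count can be sketched as follows. It is not the Livezey and Chen calculation: the fields are synthetic, spatial correlation is imitated by mixing one common series into every grid point, and the local threshold `r_crit = 0.279` is an assumed approximate two-sided 5% critical correlation for 50 time steps. All function names are hypothetical.

```python
import math
import random
import statistics


def pearson(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def make_field(n_time, n_points, shared, rng):
    """Synthetic field: each grid point mixes a common series with local noise.

    shared=0 gives independent points; shared near 1 gives strong spatial
    correlation (few spatial degrees of freedom).
    """
    common = [rng.gauss(0.0, 1.0) for _ in range(n_time)]
    w = math.sqrt(1.0 - shared ** 2)
    return [[shared * c + w * rng.gauss(0.0, 1.0) for c in common]
            for _ in range(n_points)]


def null_counts(field, r_crit, n_trials, rng):
    """Counts of locally 'significant' correlations with random series."""
    n_time = len(field[0])
    counts = []
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_time)]
        counts.append(sum(abs(pearson(series, noise)) > r_crit for series in field))
    return counts


rng = random.Random(1)
independent = make_field(50, 30, shared=0.0, rng=rng)
correlated = make_field(50, 30, shared=0.95, rng=rng)

counts_indep = null_counts(independent, r_crit=0.279, n_trials=200, rng=rng)
counts_corr = null_counts(correlated, r_crit=0.279, n_trials=200, rng=rng)
spread_indep = statistics.pvariance(counts_indep)
spread_corr = statistics.pvariance(counts_corr)
```

Field significance for an observed map would then be estimated as the fraction of null counts at or above the observed count. In this toy setup the correlated field, which has few spatial degrees of freedom, gives the broader count distribution.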

Page 26:

Serial Correlation

• Zwiers (1990) example of impact.
  – Generated a multivariate statistic (dimension m, sample size 10) from a known null distribution. Each m-variable is uncorrelated with the others but all have the same serial correlation.
  – Used a permutation procedure to develop the null distribution from the sample.
  – Tested the statistic against the constructed distribution at the 5% level.
  – Repeated the experiment many, many times.
  – Noted the percent of times the null hypothesis is rejected (should be near 5%).

Page 27:

Serial Correlation

• Zwiers (1990) example continued.
  – Percent rejections, by dimension m (rows) and serial correlation ρ (columns):

      m   | ρ = 0.0 | ρ = 0.3 | ρ = 0.75
      2   |    4    |   12    |    55
      4   |    5    |   28    |    81
      8   |    8    |   32    |    91
      12  |    7    |   40    |    98
      24  |    5    |   72    |   100

  – Serial correlation makes almost all of the tests worthless.

Page 28:

Serial Correlation

• Remedies
  – Model the time series with an autoregressive model and use the model to generate samples.
    • Livezey and Chen could have done this with their SOI series.
    • Many meteorological time series with the climatological seasonal cycle removed are well represented by a red noise (AR(1), damped persistence) model:

      $x'(t+1) = \rho_1 x'(t) + \epsilon(t+1)$, with $x'(t) = [x(t) - \bar{x}] / s_x$,

      where $\rho_1$ is the lag-1 autocorrelation and $\epsilon$ is white noise.
    • AR(1) model is not appropriate for quasi-cyclical series, like MJO, QBO, etc.
    • See references in Livezey (1999) for more guidance.
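Generating surrogate series from a fitted red-noise (AR(1)) model might look like the sketch below. The "observed" series is itself simulated for illustration, the function names are hypothetical, and the noise variance 1 - rho^2 is the standard choice that keeps a standardized series near unit variance, stated here as an assumption.

```python
import math
import random


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation estimate of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def ar1_series(rho, n, rng):
    """Standardized red-noise series: x[t+1] = rho * x[t] + eps[t+1]."""
    noise_sd = math.sqrt(1.0 - rho ** 2)  # keeps the series variance near 1
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, noise_sd))
    return x


rng = random.Random(0)
# Stand-in for an observed, standardized anomaly series (illustration only).
observed = ar1_series(0.6, 2000, rng)

rho_hat = lag1_autocorrelation(observed)
# Surrogate samples that share the fitted persistence of the observed series.
surrogates = [ar1_series(rho_hat, len(observed), rng) for _ in range(20)]
```

Each surrogate can replace the random series in a Monte Carlo test so that the null distribution reflects the persistence of the data.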

Page 29:

Serial Correlation

• Remedies continued
  – Use Moving-Blocks bootstrap
    • Idea is to preserve much of the serial correlation by resampling blocks of data of length L with replacement to build up the full series from N/L blocks.
    • There are N-L+1 blocks to choose from.
    • See Livezey (1999) for information (including references) for choosing L.
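A minimal moving-blocks resampler, assuming a plain Python list as the series, could be written as below. The function name is hypothetical, and trimming the last block when L does not divide N is one common convention, not something specified on the slide.

```python
import math
import random


def moving_blocks_bootstrap(series, block_length, rng):
    """Rebuild a series of the original length from randomly chosen blocks."""
    n = len(series)
    n_starts = n - block_length + 1         # N - L + 1 candidate blocks
    n_blocks = math.ceil(n / block_length)  # about N / L blocks are needed
    out = []
    for _ in range(n_blocks):
        start = rng.randrange(n_starts)     # blocks are drawn with replacement
        out.extend(series[start:start + block_length])
    return out[:n]                          # trim if L does not divide N


rng = random.Random(0)
series = list(range(100))                   # toy series; values mark positions
resampled = moving_blocks_bootstrap(series, block_length=5, rng=rng)
```

Within every block the original ordering survives, so correlation at lags shorter than L is largely retained, while the joins between blocks are random.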

Page 30:

References

• Basic sources

– Diaconis, P., and B. Efron, 1983: Computer-intensive methods in statistics. Sci. Am., 248, 116-130. (Popular description.)

– Efron, B., and G. Gong, 1983: A leisurely look at the bootstrap, the jackknife, and cross-validation. Am. Stat., 37, 36-48. (Basic strategies and algorithms.)

– Efron, B., and R. Tibshirani, 1997: Improvements on cross-validation: the .632+ bootstrap method. J. Amer. Stat. Assoc., 92, 548-560.

• Texts

– Livezey, R. E., 1999: Chapter 9, Field intercomparison. Analysis of Climate Variability: Applications of Statistical Techniques, Second Updated and Extended Edition, Eds. H. von Storch and A. Navarra, Springer-Verlag, Berlin, 161-178. (Contains unlisted references.)

– von Storch, H., and F. W. Zwiers, 1999: Statistical Analysis in Climate Research, Cambridge University Press, 484pp.

– Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467pp.