Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU...

70
Stochastic Simulation The Bootstrap method Bo Friis Nielsen Institute of Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby – Denmark Email: [email protected]

Transcript of Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU...

Page 1: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

Stochastic SimulationThe Bootstrap methodBo Friis Nielsen

Institute of Mathematical Modelling

Technical University of Denmark

2800 Kgs. Lyngby – Denmark

Email: [email protected]

Page 2: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 2DTU

The Bootstrap methodThe Bootstrap method

Page 3: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 2DTU

The Bootstrap methodThe Bootstrap method

• A technique for estimating the variance (etc) of an estimator.

Page 4: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 2DTU

The Bootstrap methodThe Bootstrap method

• A technique for estimating the variance (etc) of an estimator.

• Based on sampling from the empirical distribution.

Page 5: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 2DTU

The Bootstrap methodThe Bootstrap method

• A technique for estimating the variance (etc) of an estimator.

• Based on sampling from the empirical distribution.

• Non-parametric technique

Page 6: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

Page 7: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations

Page 8: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

Page 9: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution,

Page 10: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution, we (typically) just use the estimator

Page 11: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution, we (typically) just use the estimator x̄ =∑

xi/n.

Page 12: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution, we (typically) just use the estimator x̄ =∑

xi/n.

• This estimator has the variance

Page 13: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution, we (typically) just use the estimator x̄ =∑

xi/n.

• This estimator has the variance 1

nVar(X).

Page 14: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 3DTU

Recall the simple situationRecall the simple situation

• We have n observations xi, i = 1, . . . , n.

• If we want to estimate the mean value of the underlying

distribution, we (typically) just use the estimator x̄ =∑

xi/n.

• This estimator has the variance 1

nVar(X). To estimate this, we

(typically) just use the sample variance.

Page 15: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 4DTU

A not-so-simple-situationA not-so-simple-situation

Page 16: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 4DTU

A not-so-simple-situationA not-so-simple-situation

• Assume we want to estimate the median, rather than the mean.

Page 17: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 4DTU

A not-so-simple-situationA not-so-simple-situation

• Assume we want to estimate the median, rather than the mean.

• (This makes much sense w.r.t. robustness)

Page 18: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 4DTU

A not-so-simple-situationA not-so-simple-situation

• Assume we want to estimate the median, rather than the mean.

• (This makes much sense w.r.t. robustness)

• The natural estimator for the median is the sample median.

Page 19: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 4DTU

A not-so-simple-situationA not-so-simple-situation

• Assume we want to estimate the median, rather than the mean.

• (This makes much sense w.r.t. robustness)

• The natural estimator for the median is the sample median.

• But what is the variance of the estimator?

Page 20: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution,

Page 21: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

Page 22: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

1. Simulate a number of data sets like the one we had.

Page 23: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

1. Simulate a number of data sets like the one we had.

2. For each simulated data set, compute the median.

Page 24: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

1. Simulate a number of data sets like the one we had.

2. For each simulated data set, compute the median.

3. Finally report the variance among these medians.

Page 25: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

1. Simulate a number of data sets like the one we had.

2. For each simulated data set, compute the median.

3. Finally report the variance among these medians.

We don’t have the true distribution.

Page 26: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 5DTU

The variance of the sample medianThe variance of the sample median

If we had access to the “true” underlying distribution, we could

1. Simulate a number of data sets like the one we had.

2. For each simulated data set, compute the median.

3. Finally report the variance among these medians.

We don’t have the true distribution. But we have the empirical

distribution!

Page 27: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 6DTU

Empirical distributionEmpirical distribution

Page 28: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 6DTU

Empirical distributionEmpirical distribution20 N(0, 1) variates (sorted):

Page 29: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 6DTU

Empirical distributionEmpirical distribution20 N(0, 1) variates (sorted): -2.20, -1.68, -1.43, -0.77, -0.76, -0.12, 0.30,

0.39, 0.41, 0.44, 0.44, 0.71, 0.85, 0.87, 1.15, 1.37, 1.41, 1.81, 2.65,

3.69

Page 30: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 6DTU

Empirical distributionEmpirical distribution20 N(0, 1) variates (sorted): -2.20, -1.68, -1.43, -0.77, -0.76, -0.12, 0.30,

0.39, 0.41, 0.44, 0.44, 0.71, 0.85, 0.87, 1.15, 1.37, 1.41, 1.81, 2.65,

3.69

Page 31: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 6DTU

Empirical distributionEmpirical distribution20 N(0, 1) variates (sorted): -2.20, -1.68, -1.43, -0.77, -0.76, -0.12, 0.30,

0.39, 0.41, 0.44, 0.44, 0.71, 0.85, 0.87, 1.15, 1.37, 1.41, 1.81, 2.65,

3.69

Page 32: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Page 33: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Page 34: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

Page 35: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x))

Page 36: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

Page 37: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

Page 38: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample

Page 39: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi,

Page 40: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n

Page 41: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

Page 42: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x)

Page 43: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Page 44: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Fe,i(x)

Page 45: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Fe,i(x) =1

n

n∑

i=1

Page 46: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Fe,i(x) =1

n

n∑

i=1

δ{xi≤x}

Page 47: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Fe,i(x) =1

n

n∑

i=1

δ{xi≤x}

where δ

Page 48: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 7DTU

Empirical distributionEmpirical distribution

Xi iid random variables with F (x) = P(X ≤ x)

Each leads to a (simple) random function Fe,i(x) = 1{Xi≤x}

leading to Fe(x) =1

n

∑n

i=1Fe,i(x) =

1

n

∑n

i=11{Xi≤x}

E (Fe(x)) = E(

1

n

∑n

i=11{Xi≤x}

)

= 1

n

∑n

i=1E(

1{Xi≤x}

)

= F (x)

Once we have sample xi, i = 1, 2, . . . , n we have a realised version

of the empirical distribution function

Fe(x) =1

n

n∑

i=1

Fe,i(x) =1

n

n∑

i=1

δ{xi≤x}

where δ is Kroneckers delta-function

Page 49: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator

Page 50: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

Page 51: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

Page 52: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

Page 53: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

Page 54: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

Page 55: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

Page 56: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set,

Page 57: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from

Page 58: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set

Page 59: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

Page 60: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

• For each simulated data set,

Page 61: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

• For each simulated data set, estimate the parameter of interest

Page 62: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

• For each simulated data set, estimate the parameter of interest

(e.g., the median).

Page 63: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

• For each simulated data set, estimate the parameter of interest

(e.g., the median). This is a bootstrap replicate of the estimate.

Page 64: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

The Bootstrap Algorithm for the variance of a

parameter estimator

The Bootstrap Algorithm for the variance of a

parameter estimator• Given a data set with n observations.

• Simulate r

• (e.g., r = 100)

• data sets,

• each with n “observations”

• sampled form the empirical distribution Fe.

• (To simulate such one data set, simply take n samples from the

original data set with replacement)

• For each simulated data set, estimate the parameter of interest

(e.g., the median). This is a bootstrap replicate of the estimate.

• Finally report the variance among the bootstrap replicates.

Page 65: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 9DTU

Advantages of the Bootstrap methodAdvantages of the Bootstrap method

Page 66: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 9DTU

Advantages of the Bootstrap methodAdvantages of the Bootstrap method

• Does not require the distribution in parametric form.

Page 67: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 9DTU

Advantages of the Bootstrap methodAdvantages of the Bootstrap method

• Does not require the distribution in parametric form.

• Easily implemented.

Page 68: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 9DTU

Advantages of the Bootstrap methodAdvantages of the Bootstrap method

• Does not require the distribution in parametric form.

• Easily implemented.

• Applies also to estimators which cannot easily be analysed.

Page 69: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

02443 – lecture 10 9DTU

Advantages of the Bootstrap methodAdvantages of the Bootstrap method

• Does not require the distribution in parametric form.

• Easily implemented.

• Applies also to estimators which cannot easily be analysed.

• Generalizes e.g. to confidence intervals.

Page 70: Sample of slides for Sandsynlighedsregning · 02443–lecture10 2 DTU TheBootstrapmethodTheBootstrapmethod • A technique for estimating the variance (etc) of an estimator.

Exercise 8Exercise 8

1. Exercise 13 in Chapter 8 of Ross (P.152).

2. Exercise 15 in Chapter 8 of Ross (P.152).

3. Write a subroutine that takes as input a “data” vector of

observed values, and which outputs the median as well as the

bootstrap estimate of the variance of the median, based on

r = 100 bootstrap replicates. Simulate N = 200 Pareto

distributed random variates with β = 1 and k = 1.05.

(a) Compute the mean and the median (of the sample)

(b) Make the bootstrap estimate of the variance of the sample

mean.

(c) Make the bootstrap estimate of the variance of the sample

median.

(d) Compare the precision of the estimated median with the

precision of the estimated mean.