Chapter 7. Exercise 1 Mean X*:7.6,8.1,9.6,10.2,10.7,12.3,13.4,13.9,14.6,15.2 Here the percentile BS...

Chapter 7

Exercise 1Mean X*:7.6,8.1,9.6,10.2,10.7,12.3,13.4,13.9,14.6,15.2

Here the percentile BS method is applied, testing

P is given by where is the proportion of bootstrap sample means greater than μo, in this case 8.5. 80% of the boostrap samples are greater than 8.5, thus P is estimated at 0.4. Calculation is given by (2✕(1-0.8))

The middle 80% of observed bootstraped sample means range from 8.1 to 14.6, so this would be the 0.8 CI for the population mean.

Exercise 2

x=c(5,12,23,24,6,58,9,18,11,66,15,8)

out(x)$out.val[1] 58 66

trimci(x)[1] 7.157829 22.842171

trimci(x, tr=0)[1] 8.504757 33.995243

Distribution is skewed with outliers Inflating variance for means but not trimmed means

Exercise 3x=c(5,12,23,24,6,58,9,18,11,66,15,8)> trimcibt(x,tr=0,side=F)[1] "Taking bootstrap samples. Please wait."[1] "NOTE: p.value is computed only when side=T"$estimate[1] 21.25

$ci[1] 12.40284 52.46772

$test.stat[1] 3.669678

trimci(x)[1] 7.157829 22.842171

trimci(x, tr=0)[1] 8.504757 33.995243

Exercise 4

The trimmed mean will be more accurate because skewness and outliers shift the the T distribution in a manner that changes its cumulative probabilities (and quantiles). Trimmed means, based in bootstrap methods (trimcibt), perform much better under skewness, and they remove outliers. The boostrap method estimates well the sampling distribution and actual T.

Exercise 5

y=c(7.6,8.1,9.6,10.2,10.7,12.3,13.4,13.9,14.6,15.2)trimcibt(y, side=F)

$estimate[1] 11.68333

$ci[1] 8.171324 15.754730

Exercise 6

x=c(5,12,23,24,6,58,9,18,11,66,15,8)> trimpb(x)[1] "The p-value returned by the this function is based on the"[1] "null value specified by the argument null.value, which defaults to 0"[1] "Taking bootstrap samples. Please wait."$ci[1] 9.75 31.50

$p.value[1] 0

Exercise 7

Percentile t bootstrap will be more accurate in this case, based on simulations studies:

Exercise 8

a=c(2,4,6,7,8,9,7,10,12,15,8,9,13,19,5,2,100,200,300,400)

onesampb(a)$ci[1] 7.425758 19.806385

Exercise 9a=c(2,4,6,7,8,9,7,10,12,15,8,9,13,19,5,2,100,200,300,400)

onesampb(a)$ci[1] 7.425758 19.806385

trimpb(a)$ci[1] 7.25000 63.33333

The M estimators remove the outliers by design. In contrast, when bootstrapping trimmed means with 4 outliers, some bootstrap samples will include more than 4 outliers (sampling with replacement) in a manner that exceeds the 20% trimming. For this reason, the CI will be longer for trimmed means.

Exercise 10

trimpb(a,tr=.30)$ci[1] 7.125 35.500

CI is shorter relative to 20% trimming.The trimming is higher so it can withstand higher number of outliers that are sampled in the bootstrap procedure.

Exercise 11

trimpb(a,tr=.40)[1] 7.0 14.5

CI is now even shorter because the trimming is high enough to handle almost all outliers that are sampled in the bootstrap procedure.

Exercise 12

The bootstrap method for estimating standard errors can be highly inaccurate with respect to probability coverage in the case of the median when there are tied values in the data set.

Tied values can significantly alter the estimate of the median across different bootstrap samples.

Exercise 13

sint(a)[1] 7.00000 14.52953> onesampb(a)$ci[1] 7.425758 19.806385

There are situation where CI for the median is shorter than CI for M estimators even though they remove outliers.

Exercise 14Data from table 6.7, not 6.6. x=c(34,49,49,44,66,48,49,39,54,57,39,65,43,43,44,42,71,40,41,38,42,77,40,38,43,42,36,55,57,57,41,66,69,38,49,51,45,141,133,76,44,40,56,50,75,44,181,45,61,15,23,42,61,146,144,89,71,83,49,43,68,57,60,56,63,136,49,57,64,43,71,38,74,84,75,64,48)

y=c(129,107,91,110,104,101,105,125,82,92,104,134,105,95,101,104,105,122,98,104,95,93,105,132,98,112,95,102,72,103,102,102,80,125,93,105,79,125,102,91,58,104,58,129,58,90,108,95,85,84,77,85,82,82,111,58,99,77,102,82,95,95,82,72,93,114,108,95,72,95,68,119,84,75,75,122,127) lsfitci(x,y)$intercept.ci[1] 89.80599 110.16031

$slope.ci [,1] [,2][1,] -0.3293914 0.1169401

Exercise 15

corb(x,y)$r[1] -0.03474317

$ci[1] -0.3326596 0.2138606

CI contains 0, so do not reject

Exercise 16

indt(x,y)[1] "Taking bootstrap sample, please wait.”$dstat[1] 26.20945$p.value.d[1] 0.016

Test for independence is rejected, so variable are dependent.

Exercise 17

0 correlation does not mean independence. There are many situations for which there is an association between variables while r=0, some pertain to curvature, heteroscedasticity, and bad leverage points.

Exercises 18 and 19lsfitci(x,y,xout=T)$intercept.ci[1] 104.4706 140.0141

$slope.ci [,1] [,2][1,] -0.8733226 -0.07469801

After manual removallsfitci(x,y)[$intercept.ci[1] 104.3009 137.8797

$slope.ci [,1] [,2][1,] -0.7979083 -0.1467203

pcorb(x,y)$r[1] -0.3868361

$ci[1] -0.6403710 -0.1176394

Now we reject, whereas previously we did not

Data from table 6.7, not 6.6.

Exercise 20

c=c(300,280,305,340,348,357,380,397,453,456,510,535,275,270,335,342,354,394,383,450,446,513,520,520)d=c(32.75,28,30.75,29,27,31.20,27,27,23.50,21,21.5,22.8,30.75,27.25,31,26.50,23.50,22.70,25.80,27.80,21.50,22.50,20.60,21)

lsfitci(c,d)$intercept.ci[1] 36.11869 44.19982

$slope.ci [,1] [,2][1,] -0.04761471 -0.02518437

Conventional methods can have accurate probability coverage in certain situations.There isn’t always a difference.

Exercise 21x=c(0.032,0.034,0.214,0.263,0.275,0.275,0.450,0.500,0.500,0.630,0.800,0.900,0.900,0.900,0.900,1.000,1.100,1.100,1.400,1.700,2.000,2.000,2.000,2.000)y=c(170,290,-130,-70,-185,-220,200,290,270,200,300,-30,650,150,500,920,450,500,500,960,500,850,800,1090)

lsfitci(x,y)$intercept.ci[1] -191.9351 108.0508

$slope.ci [,1] [,2][1,] 305.8974 642.884

Chapter 7. Exercise 1 Mean X*:7.6,8.1,9.6,10.2,10.7,12.3,13.4,13.9,14.6,15.2 Here the percentile BS...

Documents

Transcript of Chapter 7. Exercise 1 Mean X*:7.6,8.1,9.6,10.2,10.7,12.3,13.4,13.9,14.6,15.2 Here the percentile BS...