Anh Nguyen 6
-
Upload
jason-nguyen -
Category
Documents
-
view
214 -
download
0
Transcript of Anh Nguyen 6
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 1/9
MATH 325 HW6
Anh Nguyen
November 4, 2015
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 2/9
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 3/9
3
Problem 5.10 p.154
Description . Estimate the total number of acres of trees on farms in the state, and place
a bound on the error of estimation. Graph the data on an appropriate plot and comment
on the variation as we move from I to IV.
(a) Solving without using macros from library(survey):
Denote:
N i : ith stratum size
ni : sample size from ith stratum
Y i : mean of the sample taken from ith stratum
si : standard deviation of sample taken from ith stratum
Entering data into R:
> source("http://educ.jmu.edu/~garrenst/math325.dir/Rmacros")
> source("http://educ.jmu.edu/~garrenst/math100.dir/Rmacros")
> myframe=read.table2("EXER5_10.DAT")
> y1=myframe[1:14,2]
> y2=myframe[15:26,2]> y3=myframe[27:35,2]
> y4=myframe[36:40,2]
> N=c(86,72,52,30)
> n=c(length(y1),length(y2),length(y3),length(y4))
Estimate τ , the population total number of acres of trees on farms in the state, for
stratified sampling.
bτ st = N Y st =4
Xi=1 N iY i (5)
> sample.means=c(mean(y1),mean(y2),mean(y3),mean(y4))
> tau.hat.st=sum(N*sample.means)
> tau.hat.st
[1] 50505.6
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 4/9
4
The estimated total number of acres of trees on farms in the state is 50, 505.6. Next,
the estimate of the variance of the τ is:
bV (τ st) =
bV (N Y st) =
3
Xi=1N 2
i 1 −
ni
N is2i
ni
(6)
> sample.sd=c(sd(y1),sd(y2),sd(y3),sd(y4))
> var.tau.hat=sum(N^2*(1-(n/N))*(sample.sd^2/n))
> var.tau.hat
[1] 18762431
Place a bound on the error of the estimation:
B = 2q V (τ st) (7)
> 2*sqrt(var.tau.hat)
[1] 8663.124
So the bound on error of estimation is 8663.124.
(b) Solving with using macros from library(survey)
> library(survey)
> mydesign=svydesign(~1,strata=~V1,fpc=rep(N,n),data=myframe)
> tau=svytotal(~V2,mydesign)
> tau.hat.st=coef(tau)
> tau.hat.st
V2
50505.6
> bound=2*SE(tau)
> bound
V2
V2 8663.124
So the estimated total number of acres of trees on farms in the state is 50, 505.6 and
the bound on error of estimation is 8663.124
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 5/9
5
(c) The graph of the sample data from all strata is shown below
> ggplot(data=myframe,aes(x=V1,y=V2,fill=V1))+geom_boxplot()+labs(title="
Plot of the Stratums",x="Stratum",y="Farm Sizes") + scale_fill_brewer(
palette="Blues") + theme_classic()
Figure 1: Boxplot of the sample data from all strata
0
200
400
600
800
I II III IV
Stratum
F a r m S
i z e s
V1
I
II
III
IV
Plot of the Stratums
From the above boxplot, the variation in the sample taken from the fourth stratum
is very high compared to all other samples. In addition, observe that the variation is
steadily increasing as we move from stratum I to stratum IV.
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 6/9
6
Problem 5.8 p.154
Denote:
N i : ith stratum size
ni : sample size from ith stratum
ai : proportional allocation of the sample taken from ith stratum
si : standard deviation of sample taken from ith stratum
Entering data into R:
> myframe=read.table2("EXER5_6.DAT")
> N=c(55,80,65)
> y1=myframe[1:14,2]
> y2=myframe[15:34,2]
> y3=myframe[35:50,2]
> n=c(length(y1),length(y2),length(y3))
> a=n/sum(n)
The stratum standard deviations are approximated by the sample variances obtained using
the given data, that is σ2i ≈ s2i . The approximate sample sized required to estimate average
score with a bound B = 4 on the error of estimation is:
n =
P3
i=1 N 2
i σ2
i/ai
N 2D +P
3
i=1 N iσ2i
(8)
where D = B2/4. In R:
> y.vars=c(var(y1),var(y2),var(y3))
> B=4
> D=B^2/4
> (sum(N^2*y.vars/a))/(sum(N)^2*D+sum(N*y.vars))
[1] 32.14369
So, the sample size required to estimate the average score is 32
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 7/9
7
Problem 5.31 p.159
(a) Estimate the population proportion of those who think they have worked beyond a
safe limit. Calculate a bound on the error of estimation.
The population of those practicing anesthesiology was stratified intro 3 groups: anes-
thesiologists (50%), anesthesiology resident (10%), nurse anesthetist (40%). The anes-
thesiologists were asked whether or not they had worked without a break beyond a safe
limit. We want to estimate p, the population proportion of those who would answer
”yes”.
Given the sample proportion of anesthesiologist, ˆ p1 = 0.687
Given the sample proportion of anesthesiology resident, ˆ p2 = 0.824
Given the sample proportion of nurse anesthetist, ˆ p3 = 0.782
The population proportion, p, is estimated by the stratified sample proportion ˆ pst,
which is the weighted average of the sample proportions, ˆ pi of the strata:
ˆ pst =3X
i=1
N iN
ˆ pi
= 0.5 · 0.687 + 0.1 · 0.824 + 0.4 · 0.782
> N.frac=c(0.5,0.10,0.40)
> p.hat=c(0.687,0.824,0.782)
> p.hat.st=sum(N.frac*p.hat)> p.hat.st
[1] 0.7387
As a result, the estimated population proportion of practicing anesthesiology who
would answer ”yes” is 0.7387. Also, the estimated population proportion of those who
would answer ”no” is q = 1 − ˆ p. Therefore, the estimated variance of ˆ pst is calculated
by:
V (ˆ pst) =3
Xi=1N 2
i
N 2 1 −
ni
N i ˆ piq i
ni − 1 (9)
However, since we assume the population size is large,
1 − ni
N i
= 1. Equation (9)
will become:
V (ˆ pst) =3X
i=1
N iN
2
ˆ piq ini − 1
(10)
> n=c(417+913,29+136,240+860)
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 8/9
8
> v.hat.p.st=sum(N.frac^2*p.hat*(1-p.hat)/(n-1))
> B=2*sqrt(v.hat.p.st)
> B
[1] 0.01721764
So the bound on error of estimation is 0.0172. In addition, there is another way to
solve this problem by using macros from library(survey). First, we need to build a
dataframe, which cotains two columns: ”job” and ”safelimit”. The values of ”job”
column is 1, 2 and 3 which correspond to the numberth of stratum. The values of
”safelimit” column is 0 and 1, which 0 corresponds to ”No” and 1 corresponds to
”Yes”.
> y1=c(rep(1,417+913),rep(2,29+136),rep(3,240+860))
> y2=c(rep(0,417),rep(1,913),rep(0,29),rep(1,136),rep(0,240),rep(1,860))
> y3=cbind(y1,y2)
> colnames(y3)=c("job","safelimit")
> myframe=data.frame(y3)
and then we can use macros from library(survey) to help us solve this problem:
> N.frac=c(0.5,0.10,0.40)
> N=1e10*N.frac
> n=c(417+913,29+136,240+860)
> mydesign=svydesign(~1,strata=~V1,fpc=rep(N,n),data=myframe)
> z=svymean(~safelimit,mydesign)
> coef(z)
safelimit
0.7383846
> 2*SE(z)
safelimit
safelimit 0.01722261
From the ouput, the estimated population proportion of practicing anesthesiology who
would answer ”yes” is 0.7384 and the bound on error of estimation is 0.0172
7/23/2019 Anh Nguyen 6
http://slidepdf.com/reader/full/anh-nguyen-6 9/9
9
(b) Do anesthesiologists diff er significantly from residents in this matter?
To answer this question, we need to compute the confidence interval on the diff erence
between those two’s corresponding population proportions and see whether or not it
contains zero.
b p1 − b p2 ± 2
s b p1(1 − b p1)
n1+ b p2(1 − b p2)
n2
> (p.hat[1]-p.hat[2])+c(-1,1)*2*sqrt(p.hat[1]*(1-p.hat[1])/n[1]+p.hat[2]*(1-
p.hat[2])/n[2] )
[1] -0.201517 -0.072483
The 95% confidence interval on p1 − p2 is (−0.2015, −0.0724). Since zero is not
included in this interval, there is enough evidence to suggest that anesthesiologistsdiff er significantly from residents in this matter.
(c) Do anesthesiologists diff er significantly from nurse anesthetists in this matter?
Again, we need to compute the confidence interval on the diff erence between those
two’s corresponding population proportions and see whether or not it contains zero.
b p1 −
b p3 ± 2
s b p1(1 −
b p1)
n1+
b p3(1 −
b p3)
n3
> (p.hat[1]-p.hat[3])+c(-1,1)*2*sqrt(p.hat[1]*(1-p.hat[1])/n[1] + p.hat
[3]*(1-p.hat[3])/n[3] )
[1] -0.13058964 -0.05941036
The 95% confidence interval on p1 − p3 is (−0.1306, −0.0594). Since zero is not
included in this interval, there is enough evidence to suggest that anesthesiologists
diff er significantly from nurse anesthesiologists in this matter.