Ch9. Inferences Concerning Proportions
description
Transcript of Ch9. Inferences Concerning Proportions
![Page 1: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/1.jpg)
Ch9. Inferences Concerning Proportions
![Page 2: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/2.jpg)
Outline
Estimation of ProportionsHypothesis concerning one ProportionHypothesis concerning several proportionsAnalysis of r*c tablesGoodness of fit
![Page 3: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/3.jpg)
Estimation
In acceptance sampling we are concerned with the proportion of defectives in a lot, and in life testing we are concerned with the percentage of certain components which will perform satisfactorily during a stated period of time.
It should be clear from these examples that problems concerning proportions, percentages, or probabilities are really equivalent.
![Page 4: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/4.jpg)
Estimation
The point estimator of the population proportion, itself, is usually the sample proportion X/n. If the n trials satisfy the assumptions underlying the binomial distribution(P105), we know the mean and the standard deviation of the number of success is given by and (1 )np pnp
![Page 5: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/5.jpg)
Estimator
The mean and the standard deviation of the proportion of success (namely, of the sample proportion) are given by
np pn
and (1 ) (1 )np p p pn n
The first of these results shows that the sample proportion is an unbiased estimator of the binomial parameter p.
![Page 6: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/6.jpg)
Confidence interval
Construction of confidence interval for the binomial parameter p (estimator). We first define 0x and 1x
0
0
( ; , ) / 2x
k
b k n p
and
such that
1
( ; , ) / 2n
k x
b k n p
Thus, we assert with a probability of approximate , and at least , that the inequality
0 1( ) ( )x p x x p
1 1
![Page 7: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/7.jpg)
EX
Suppose we want to find approximate 95% confidence interval for p for samples of size n=20.
0( ; , ) 0.025B x n p 0x and 1x can be determined by
11 ( 1; , ) 0.025B x n p
p 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9- 0 1 3 5 7 9 11 146 9 11 13 15 17 19 20 -
0x
1x
![Page 8: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/8.jpg)
Confidence interval
![Page 9: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/9.jpg)
For n is largeWe construct approximate confidence intervals for the binomial parameter p by using the normal approximation to the binomial distribution. With the probability 1
/ 2 / 2(1 )X npz znp p
/ 2 / 2
(1 ) (1 )x x x xx xn n n nz p zn n n n
This yields
![Page 10: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/10.jpg)
EX.
If x=36 of n=100 persons interviewed are familiar with the tax incentives for installing certain energy-saving devices, construct a 95% confidence interval for the corresponding true proportion. Solution: x/n =36/100=0.36
hence 0.36(1 0.36) 0.36(1 0.36)0.36 1.96 0.36 1.96100 100
p
0.266 0.454p
![Page 11: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/11.jpg)
Maximum error
The error when we use X/n as estimator of p is given by |X/n -p|Again using the normal distribution, we can assert with probability that the inequality
1
/ 2(1 )| |X p pp z
n n
Maximum error of estimate
/ 2(1 )p pE zn
![Page 12: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/12.jpg)
EX
In a sample survey conducted in a large city, 136 of 400 persons answered yes to the question of whether their city’s public transportation is adequate. With 99% confidence, what can we say about the maximum error if x/n=0.36 is used as an estimate of the corresponding true proportion?
/ 2(1 ) 0.34 0.662.575 0.061
400p pE zn
![Page 13: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/13.jpg)
Sample size determine
2/ 2(1 )( )zn p pE
2/ 21 ( )4znE
If p is unknown
If p is known
![Page 14: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/14.jpg)
9.2 Hypothesis
The test of null hypothesis that a proportion equals some specified constant is widely used in sampling inspection, quality control, and reliability verification.
Statistic for large sample test concerning p
0
0 0(1 )X np
Znp p
Null hypothesis 0p p
![Page 15: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/15.jpg)
Criterion Region for testing (Large sample)
Alternative hypothesis
Reject null hypothesis if
0p p
0p p
0p p
0p p
Z z
Z z
/ 2 / 2Z z or Z z
![Page 16: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/16.jpg)
EX
In a study designed to investigate whether certain detonator used with explosives in coal mining meet the requirement that at least 90% will ignite the explosive when charged, it is found that 174 of 200 detonators function properly. Test the null hypothesis p=0.9 again the alternative p<0.9 at the 0.05 level of significance.
![Page 17: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/17.jpg)
Solution
1. Null hypothesis: Alternative hypothesis
2. Level of significance: 0.05
3. Criterion: Reject the null hypothesis if Z<-1.645
4. Calculation:
5. The null hypothesis cannot be rejected. 6. P-value: 0.079 > level of significance 0.05
0.9p 0.9p
0
0 0
174 200(0.9) 1.41(1 ) 200(0.9)(0.1)
X npZ
np p
![Page 18: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/18.jpg)
Hypothesis concerning several proportions
We compare the consumer response to two different products, when we decide whether the proportion of defectives of a given process remains constant from day to day.Testing
1 2 kp p p p
![Page 19: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/19.jpg)
Large-sample testWe require independent random samples of size if the corresponding number of successes are
the test we should use is based on the fact that
1) Large samples the sampling distribution of is approximately the standard normal distribution
2) The square of random variable having the standard normal distribution with 1 degree of freedom
3) The sum of k independent random variables having chi-square distribution with 1 degree of freedom is a random variable having the chi-square distribution with k degrees of freedom. (Proves are not required)
1 2, , , kn n n
1 2, , , kX X X
(1 )i i
ii i
X npZ
np p
![Page 20: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/20.jpg)
Cont.2
2
1
( )(1 )
ki i i
i i i i
x n pn p p
Is a value of random variable having approximately the chi-square distribution with k degrees of freedom.
In practice, we substitute for the pi, which under the null hypothesis are all equal, the pooled estimate
1 2
1 2
ˆ k
k
x x xpn n n
![Page 21: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/21.jpg)
The null hypothesis should be rejected if the difference between the and are large, the critical region is
where the number of degrees of freedom is k-1.
ix ˆin p2 2
Another approach
Sample1 ... Sample k Total
successes ...
Failures ...
Total ..
1x kx
1 1n x k kn x
1n kn
n xx
n
![Page 22: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/22.jpg)
: 1, 2, 1, ,ijo i j k Define the observed cell frequency
The expected number of successes and failures for the j-th sample are estimated by
1 ˆj je n p and 2 ˆ(1 )j je n p
thus22
2
1 1
( )kij ij
i j ij
o ee
![Page 23: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/23.jpg)
Samples of three kind of materials, subjected to extreme temperature changes, produced the results should in the following table
A B C Total
successes 41 27 22 90
Failures 79 53 78 210
Total 120 80 100 300
Use the 0.05 level of significance to test whether, the probability of crumbling is the same of the three kinds of materials.
![Page 24: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/24.jpg)
Solution
1. Null hypothesis: Alternative hypothesis: are not all equal
2. Level of significance: 0.05
3. Criterion: Reject the null hypothesis if , degree 2
4. Calculation:
5. The null hypothesis cannot be rejected.
1 2 3p p p
2 5.991
1190 120 36300
e 12
90 80 24300
e
90 3ˆ300 10
p 575.42
3013 e
![Page 25: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/25.jpg)
Statistics for test concerning difference between two proportions
1 2
1 2
1 2
1 1ˆ ˆ(1 )( )
X Xn nZ
p pn n
For large samples, is a random variable having approximately the standard normal distribution.
1 1 2 2
1 2 1 1 2 2/ 2
1 2 1 2
(1 ) (1 )x x x xx x n n n nzn n n n
Confidence interval for the values of 1 2p p
![Page 26: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/26.jpg)
9.4 Analysis of r*c tables
The key random variable
22
1 1
( )r cij ij
i j ij
o ee
is chi-square distribution with (r-1)(c-1) degrees of freedom
![Page 27: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/27.jpg)
9.5 Goodness of fit
Goodness of fit: try to compare an observed frequency distribution with the corresponding values of an expected, or theoretical, distribution.
22
1
( )ki i
i i
o ee
is a random variable has the chi-square distribution with k-m degrees of freedom, where k is the number of terms in the formula and m is the number of quantities, obtained from the observed data that are needed to calculate the expected frequencies.
![Page 28: Ch9. Inferences Concerning Proportions](https://reader036.fdocuments.us/reader036/viewer/2022062310/5681683f550346895dde0e2f/html5/thumbnails/28.jpg)
EX
Page 312~313.