Examples for Ch 8

8/10/2019 Examples for Ch 8


Confidence Intervals for the Sample Mean with \(\sigma\) Known

Example 1: An auditor takes a random sample of size 36 from a population of 1000 accounts receivable. The mean value of the accounts receivable for the population is known (from a previous large survey) to be $2600 with a standard deviation of $450. Find the 95% confidence interval for the sample mean.

We are not told whether the population is normal or not. But the sample size is large enough to use the normal approximation. So we will use Z-values to construct the confidence interval. The Z-value for a 95% confidence interval is 1.96. The standard error of the sample mean is \(\sigma/\sqrt{n} = 450/\sqrt{36} = \$75\). Therefore the margin of error is \(1.96 \times 75 = \$147\), and the confidence interval is:

\(2600 \pm 147\), or between $2453 and $2747.

You can similarly find the 90% and 99% confidence intervals using the corresponding Z-values.
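If you want to check Example 1 without a Z table, a few lines of Python will do. This is only a sketch: the function name z_interval is made up for this illustration, and the Z-value is looked up with NormalDist from Python's standard statistics module rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a Z-based interval when sigma is known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sigma / sqrt(n)      # 450 / 6 = 75 in Example 1
    margin = z * std_error
    return mean - margin, mean + margin, margin

lower, upper, margin = z_interval(2600, 450, 36)
print(f"95%: margin = {margin:.3f}, interval = ({lower:.3f}, {upper:.3f})")

# The 90% and 99% intervals only change the Z-value.
for level in (0.90, 0.99):
    low, high, _ = z_interval(2600, 450, 36, level)
    print(f"{level:.0%}: ({low:.1f}, {high:.1f})")
```

The 95% line should agree with the hand calculation above (a margin of about $147).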

Now suppose I asked you to find the probability that the sample mean will fall within $150 of the mean. In this case we are given the margin of error in the units of the X values and asked to find the probability of the resulting confidence interval. This is the reverse of the process above, where we were given the confidence level (or probability) and asked to find the interval.

You can simply convert this value to a Z-value by dividing by the standard error of $75 for the sample mean, which gives \(\pm 150/75 = \pm 2\).

The probability that Z will be between \(\pm 2\), from the Z tables, is \(0.9772 - 0.0228 = 0.9544\), or a 95.44% chance. Thus the confidence level increased slightly from 95% as the range of the estimated interval (or tolerable margin of error) became wider ($150 compared to $147).
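The reverse calculation can be sketched in Python as well. The variable names are my own, and NormalDist from the standard statistics module stands in for the Z table.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 450, 36
tolerance = 150                      # dollars on either side of the mean

std_error = sigma / sqrt(n)          # 75
z = tolerance / std_error            # 2.0

standard_normal = NormalDist()       # mean 0, standard deviation 1
prob = standard_normal.cdf(z) - standard_normal.cdf(-z)
print(f"z = {z:.2f}, P(-z < Z < z) = {prob:.4f}")
```

The computed probability is about 0.9545. The table answer of 0.9544 differs in the last digit only because the table entries are themselves rounded to four decimals.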

Similarly, I could ask you to find the probability that the sample mean will be less than some

value or greater than some value. But I should not ask the probability that the sample mean will

be exactly equal to some value. Why?

Required Sample size for a specified level of error

The (minimum) required sample size for a random sample is \(n = Z_{\alpha/2}^2\,\sigma^2 / E^2\), where \(Z_{\alpha/2}\) is already defined above in the context of confidence intervals (it is the Z value corresponding to the required confidence level), \(\sigma\) is the population standard deviation (or the standard deviation of the population from which the sample is taken), and E is the margin of error around the mean expressed in the units of the X variable. Note that we don't need to know the mean for determining the sample size. We need only the standard deviation.



Example 2: A personnel department analyst wishes to estimate the mean number of training hours needed annually for supervisors in a division of the company within a margin of 3 hours (that is, plus or minus 3 hours) with a 95% confidence level. Based on a large data set from other similar companies, the analyst estimates the standard deviation of required training hours to be equal to 20 hours. Find the minimum sample size which will give the required estimate with the specified margin of error and level of confidence.

Answer: Here \(\sigma = 20\) hours, \(Z_{\alpha/2} = Z_{0.025} = 1.96\) (for the 95% confidence level), and the margin of error is E = 3 hours. Therefore, \(n = (1.96 \times 20/3)^2 = 170.7\), or 171 observations (always rounded up).

The required sample size increases as the tolerable margin of error is reduced. Find the required sample size for a margin of error of only 1 hour (I bet it will be 9 times the sample size we just obtained). Similarly, we can find the required sample size for other confidence levels, such as 90% (with \(Z_{\alpha/2} = Z_{0.05} = 1.645\)) and 99% (with \(Z_{\alpha/2} = Z_{0.005} = 2.576\)). Clearly the sample size increases as the desired confidence level increases, and conversely. Similarly, you see that the required sample size increases as the standard deviation of the population increases. This makes sense, because you need a larger sample size to have the same level of confidence if the parent population involves larger variability, other things remaining the same.
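You can experiment with these relationships using a small Python sketch of the sample size formula. The function name required_sample_size is hypothetical, and the Z-value comes from NormalDist in the standard statistics module instead of a table.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, error, confidence=0.95):
    """Minimum n so that the margin of error is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / E)^2, always rounded up to a whole observation.
    return ceil((z * sigma / error) ** 2)

n_3_hours = required_sample_size(20, 3)   # Example 2
n_1_hour = required_sample_size(20, 1)    # tighter margin of error
print(n_3_hours, n_1_hour)

# Effect of the confidence level, other things remaining the same.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: n = {required_sample_size(20, 3, level)}")
```

For a margin of 1 hour the unrounded sample size is exactly 9 times the unrounded value for 3 hours (about 1536.6 against 170.7). After rounding up, the two answers are 1537 and 171.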

Example 3: A small town has 1000 families who make contributions to the only local church. A

poll of 144 randomly selected contributing families reveals that the mean annual family

contribution is $500 with a standard deviation of $72. Construct a 95% confidence interval for

the mean annual family contribution for this population of families who contribute to this

particular church.

Do you see a problem with this question? We are not given the population mean or standard deviation as in the previous example. This may be because no previous survey was done for this population. In such cases the sample results are used as surrogates for the unknown population parameters in the formulas given above, provided the sample size is adequate. In this case the sample size of 144 is quite large. So we will use $500 and $72 as the surrogates for the unknown population mean and standard deviation, respectively. But do we need to use the formula for a large population or a small population? At first sight the population of 1000 families seems to be large. But remember the rule given above. The population is much smaller than 20 times the sample size (20 × 144 = 2880). So we will use property #3 to find the standard error. We will use the formula given for a finite population on page 4 of Instructions for Chapter 7.

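Therefore, \(\sigma_{\bar{x}} = \dfrac{s}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}} = \dfrac{72}{\sqrt{144}} \sqrt{\dfrac{1000-144}{1000-1}} = 6 \times 0.9257 = 5.554\)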

Therefore, a 95% confidence interval would be between \(500 - 1.96 \times 5.554\) and \(500 + 1.96 \times 5.554\), or between 489 and 511 dollars per year (rounded to whole numbers, ignoring cents). We could also use the t distribution (discussed below) to build the confidence interval in this case since the population standard deviation is not given. But the result would be very close to what we



obtained using Z distribution because the sample size is very large. I will discuss this issue in the

following section.
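The finite population calculation in Example 3 can be sketched in Python as follows. The function name fpc_standard_error is made up for this illustration, and the Z-value of 1.96 is simply the one used in the text.

```python
from math import sqrt

def fpc_standard_error(s, n, population_size):
    """Standard error of the mean with the finite population correction."""
    correction = sqrt((population_size - n) / (population_size - 1))
    return (s / sqrt(n)) * correction

sample_mean, s, n, N = 500, 72, 144, 1000
z = 1.96                                  # Z-value for 95% confidence

std_error = fpc_standard_error(s, n, N)   # 6 * 0.9257
margin = z * std_error
lower, upper = sample_mean - margin, sample_mean + margin
print(f"standard error = {std_error:.3f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

Without the correction factor the standard error would be the full 72/12 = 6, so the correction makes the interval a little narrower.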

Confidence Intervals for the Sample Mean with \(\sigma\) Not Known

The pdf (probability density function discussed in my previous instructions) for the t-distribution looks like the curve given below. Note that the t-distribution approaches the Z-distribution more and more closely as the df gets larger (or equivalently, the sample size gets larger). In most practical applications the t-distribution is considered to be close enough to the Z-distribution for \(df \geq 30\). Therefore, many authors suggest as a practical rule of thumb that for df of at least 30 you just use the Z-distribution (whose values for the three popular confidence levels are well known and can be easily memorized) instead of the theoretically required t-distribution, for which we have to look at the table for the corresponding degrees of freedom. Also note that for infinite df the t-distribution exactly coincides with the Z-distribution.

Thus we will use \(t_{\alpha/2}\) in place of \(Z_{\alpha/2}\) in our calculation of the margin of error and the confidence interval whenever df is less than 30 and \(\sigma\) is unknown (assuming, however, that the parent population is normal). We will follow exactly the same steps (shown above) as in the case when \(\sigma\) is known, except that we replace \(\sigma\) by s and Z by t. For df greater than or equal to 30 it is a matter of the researcher's choice. Theoretically t would be more accurate than Z, but that would involve reading from the t-table instead of using the popularly known Z-values. So it is up to you which one to use.
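To see how quickly t approaches Z, you can compute the 95% critical values for a few degrees of freedom. The sketch below uses only Python's standard library. Please note the assumptions: the helper names (t_pdf, t_cdf, t_critical) are my own, the t probabilities are obtained by numerical integration (Simpson's rule) and bisection as a stand-in for the printed t-table, and the search range of 0 to 100 is adequate for the usual confidence levels with df of 2 or more.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, integrating the density over [0, x] (Simpson)."""
    if x == 0:
        return 0.5
    h = x / steps                    # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t_{alpha/2}, found by bisection."""
    target = 1 - (1 - confidence) / 2
    low, high = 0.0, 100.0           # assumed search range (see lead-in)
    for _ in range(50):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

z = NormalDist().inv_cdf(0.975)
for df in (5, 15, 30, 100):
    print(f"df = {df:3d}: t = {t_critical(0.95, df):.3f}   (Z = {z:.3f})")
```

The printed t values shrink toward the Z-value as df grows, which is the reason behind the rule of thumb of 30.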

Example 4: The sample mean operating life for a random sample of 16 light bulbs of a particular

brand is calculated to be 4000 hours with the sample standard deviation of 200 hours. The

operating life of bulbs is generally assumed to be approximately normal. Estimate the mean



operating life for the population of bulbs from which the sample is taken using a 95% confidence

interval.

Here n = 16, df = 15, the population is normal (approximately) and the population standard deviation is not given. Therefore, we will use the t-distribution instead of the Z-distribution to construct the confidence interval. We are given \(\bar{x} = 4000\) hours and s = 200 hours. Therefore, the standard deviation (or standard error) of the sample mean, denoted by \(\sigma_{\bar{x}}\) (from my previous Instructions), is given by \(\sigma_{\bar{x}} = s/\sqrt{n}\), where we have replaced \(\sigma\) by s.

Or \(\sigma_{\bar{x}} = 200/\sqrt{16} = 50\) hours. Now the confidence level required is 95%. So \(\alpha = .05\) and \(\alpha/2 = .025\). Therefore, we need to find \(t_{.025}\) from the table for df = 15. This value is 2.131. Next,

The margin of error = \(t_{\alpha/2} \times \sigma_{\bar{x}} = 2.131 \times 50 = 106.55\) hours. Therefore,

The 95% confidence interval for the mean is \(\bar{x} \pm t_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm t_{\alpha/2}\,s/\sqrt{n} = 4000 \pm 106.55\), or between 3893.45 and 4106.55 hours, or between 3893 and 4107 hours rounded (because the numbers are very large we can ignore the decimals and round to the nearest whole number). If we had neglected the fact that the population standard deviation is not known and the sample size is quite small (consequently the df is small), then we would be estimating a narrower interval, which would be questionable because it would be claiming more precision than warranted by the nature of the sample.
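Here is a short Python sketch of the Example 4 arithmetic. The variable names are my own. The critical value 2.131 is the table value quoted above, and the Z-based margin is included only to show how much narrower the interval would be if we wrongly used Z.

```python
from math import sqrt

x_bar, s, n = 4000, 200, 16
t_crit = 2.131     # t_.025 for df = 15, table value quoted in the text
z_crit = 1.96      # Z-value for 95%, for comparison only

std_error = s / sqrt(n)            # 200 / 4 = 50 hours
t_margin = t_crit * std_error      # margin of error using t
z_margin = z_crit * std_error      # margin we would get with Z

print(f"t interval: {x_bar - t_margin:.2f} to {x_bar + t_margin:.2f}")
print(f"Z interval: {x_bar - z_margin:.2f} to {x_bar + z_margin:.2f}")
```

The Z-based interval comes out narrower than the t-based one, which is exactly the unwarranted extra precision described above.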

Now can you build 90% and 99% confidence interval estimates of the mean life of bulbs for this sample? (Hint: look for \(t_{.050}\) and \(t_{.005}\), respectively.)

Use of Computer

For the example of the auditor's sample of accounts receivable (Example 1 above):

95% confidence level

2600 mean

450 std. dev.

36 n

1.960 z

146.997 half-width

2746.997 upper confidence limit

2453.003 lower confidence limit

You can also find the required sample size for a given level of confidence and specified tolerable margin of error. Let us work on the second example of this instruction using MegaStat.



Example 2 from above: A personnel department analyst wishes to estimate the mean number of training hours needed annually for supervisors in a division of the company within a margin of 3 hours (that is, plus or minus 3 hours) with a 95% confidence level. Based on a large data set from other similar companies, the analyst estimates the standard deviation of required training hours to be equal to 20 hours. Find the minimum sample size which will give the required estimate with the specified margin of error and level of confidence.

Go to MegaStat, select Confidence interval/Sample size, then select Sample size-mean in the dialogue box, and fill in 3 for E and 20 for std deviation. Then the sample size for the 95% confidence level is given by MegaStat as:

Sample size - mean

3 E, error tolerance

20 standard deviation


1.960 z

170.732 sample size

171 rounded up

This is exactly the same answer we derived above using the formula. I want you to learn

everything using formula as well as computer. Learning only one way will be half knowledge.

Similarly, you can use MegaStat to find confidence intervals using the t-distribution. Let us solve Example 4 of this instruction using MegaStat.

Example 4 from above: The sample mean operating life for a random sample of 16 light bulbs

of a particular brand is calculated to be 4000 hours with the sample standard deviation of 200

hours. The operating life of bulbs is generally assumed to be approximately normal. Estimate the

mean operating life for the population of bulbs from which the sample is taken using a 95%

confidence interval.

In this case we will select t instead of Z in the dialogue box and get:

Confidence interval - mean


4000 mean

200 std. dev.

16 n

2.131 t (df = 15)

106.572 half-width

4106.572 upper confidence limit

3893.428 lower confidence limit



Section B: Confidence Intervals for the Proportions

Example 5: A large population of older homes is known to have defective wiring in 30 percent

of such homes (from previous survey by the company responsible for servicing). A fresh random

sample of 250 homes from this population is collected to study this problem. Find the 95%

confidence interval estimate for the proportions.

The population proportion \(\pi\) is given in this case. So we will use it to find the standard error and the confidence interval:

\(\sigma_p = \sqrt{\pi(1-\pi)/n} = \sqrt{0.3 \times 0.7/250} = 0.029\)

Therefore, the 95% confidence interval is \(0.3 \pm 0.029 \times 1.96 = 0.3 \pm 0.057\) (rounded to three decimals), or between 0.243 and 0.357 (also found in previous instructions). We could have used MegaStat to find this (select proportion in the dialogue box instead of sample mean) as follows:

Confidence interval - proportion

95% confidence level

0.3 proportion

250 n

1.960 z

0.057 half-width

0.357 upper confidence limit

0.243 lower confidence limit

Sometimes the population proportion is not given. Then we have to work with the estimated

sample proportion only as in the following example.

Example 6: A sample of 75 retail in-store purchases showed that 24 paid in cash. Construct a

95% confidence interval for the proportion of all retail in-store purchases that are paid in cash.

Here the population proportion is not given. The sample p = 24/75 = 0.32 and n = 75. Thus np = 24 and n(1-p) = 51. Therefore, the normal approximation can be satisfactorily applied. Do we need a continuity correction? We have \(np(1-p) = 16.3 > 10\). So we don't need a continuity correction. We get \(\sigma_p = \sqrt{(0.32)(0.68)/75} = 0.0539\). Now replacing \(\pi\) by p, we get the 95% confidence interval as: \(p \pm Z_{\alpha/2}\,\sigma_p = 0.32 \pm 1.96 \times 0.0539 = 0.32 \pm 0.1056\), or between 0.2144 and 0.4256.
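The same steps can be sketched in Python. The function name proportion_interval is hypothetical, and the cutoff of 10 used for the np and n(1-p) check is an assumption on my part (a common rule of thumb), since the text only says that 24 and 51 are large enough.

```python
from math import sqrt

def proportion_interval(p, n, z=1.96):
    """Return (lower, upper, std_error) for a Z-based proportion interval."""
    # Assumed rule of thumb for the normal approximation (not from the text).
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample too small for the normal approximation")
    std_error = sqrt(p * (1 - p) / n)
    margin = z * std_error
    return p - margin, p + margin, std_error

# Example 6: 24 cash purchases out of 75.
lower, upper, std_error = proportion_interval(24 / 75, 75)
print(f"std error = {std_error:.4f}, interval = ({lower:.4f}, {upper:.4f})")

# Example 5: known proportion 0.3 with n = 250.
lower5, upper5, std_error5 = proportion_interval(0.3, 250)
print(f"std error = {std_error5:.3f}, interval = ({lower5:.3f}, {upper5:.3f})")
```

Both lines should match the hand calculations for Examples 6 and 5 up to rounding.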

Using MegaStat

Confidence interval - proportion


0.32 proportion

75 n

1.960 z

0.106 half-width

0.426 upper confidence limit

0.214 lower confidence limit

You can easily find other confidence intervals.



Required Sample Size in the case of Proportions for given Accuracy

Example 8: An opinion poll last month suggested that 40% of people would vote for candidate A. This month a new poll is being run. How many people must be interviewed for the poll to be within 2 percentage points of actual voting intentions with a 95% level of confidence?

Since we are not given any other information we will assume that the previous poll estimates are

the surrogates for the population parameters (assuming that the previous sample was sufficiently

large).

The formula for the required sample size is \(n = Z_{\alpha/2}^2\,\pi(1-\pi)/E^2\).

Note that the symbol E denotes the desired margin of error, or tolerable deviation from the true value (expressed as a decimal). The value of Z is 1.96 for 95% confidence, 1.645 for 90% and 2.576 for 99% confidence levels. The value \(\pi\) is the proportion in the population. You see that the required sample size increases with the inverse square of the margin of error: halving the margin of error (that is, asking for more accuracy) quadruples the required sample size.

For the above example: \(n = (1.96)^2 \times 0.24/(0.02)^2 = 2304.96\), or 2305 rounded up. Thus for the desired accuracy the new poll has to interview 2305 voters. This should give you an idea why the polls generally don't strive for such accuracy but settle for a margin of 4 or 5 percentage points.

If in the above example the desired margin of error is 4 percentage points, then the sample size would reduce by a factor of 4 (less than 580 interviews required). If a 1 percentage point margin is set, the sample size would jump to more than 9000. The next significant factor which affects the sample size is the confidence level (the higher the confidence level, the larger the required sample size, other things remaining the same). But this does not have as dramatic an impact as the margin of error. The third factor is \(\pi\). The farther it is from 0.5, the lower is the required sample size. Does this make sense? Of course! If a population is almost equally divided on some issue or candidate you need a larger sample to find which one is going to win. If the population is highly biased on some issue or candidate you can easily find out the result even with a smaller sample.
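All three factors can be explored with a small Python sketch of the sample size formula for proportions. The function name sample_size_for_proportion is made up for this illustration, and the Z-value is taken from NormalDist in the standard statistics module rather than from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(proportion, error, confidence=0.95):
    """Minimum n so a proportion estimate is within `error` of the truth."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * pi * (1 - pi) / E^2, rounded up to a whole interview.
    return ceil(z ** 2 * proportion * (1 - proportion) / error ** 2)

# Factor 1: the margin of error (poll example, proportion 0.40).
for error in (0.04, 0.02, 0.01):
    print(f"E = {error:.2f}: n = {sample_size_for_proportion(0.40, error)}")

# Factor 2: the confidence level.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: n = {sample_size_for_proportion(0.40, 0.02, level)}")

# Factor 3: how far the proportion is from 0.5.
for proportion in (0.5, 0.4, 0.1):
    print(f"pi = {proportion}: n = {sample_size_for_proportion(proportion, 0.02)}")
```

For the poll example with a 2 percentage point margin this gives 2305 interviews, the same as the hand calculation.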

You can easily get the required sample size for proportions using MegaStat by selecting Sample size-p in the dialogue box and specifying the other parameters in the dialogue box.
