Examples for Ch 8


  • 8/10/2019 Examples for Ch 8


    Examples for Ch 8

    Confidence Intervals for the Sample Mean with σ Known

    Example 1: An auditor takes a random sample of size 36 from a population of 1000 accounts receivable. The mean value of the accounts receivable for the population is known (from a previous large survey) to be $2600 with a standard deviation of $450. Find the 95% confidence interval for the sample mean.

    We are not told whether the population is normal or not. But the sample size (n = 36) is large enough to use the normal approximation. So we will use Z-values to construct the confidence interval. The Z-value for a 95% confidence interval is 1.96. The standard error of the sample mean is 450/√36 = $75. Therefore the margin of error is 1.96*75 = $147, and the confidence interval is:

    2600 ± 147, or between $2453 and $2747
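The arithmetic of Example 1 can be checked with a short Python sketch. It uses only the standard library; the function name `z_interval` is a hypothetical helper introduced here for illustration, not something from the course materials.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper, half_width) of a Z-based interval for the sample mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sigma / sqrt(n)      # 450 / sqrt(36) = 75 in Example 1
    half_width = z * std_error       # margin of error
    return mean - half_width, mean + half_width, half_width

lower, upper, half_width = z_interval(2600, 450, 36)
print(f"{lower:.3f} to {upper:.3f} (half-width {half_width:.3f})")
```

Passing `confidence=0.90` or `confidence=0.99` gives the other two commonly used intervals.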

    You can similarly find the 90% and 99% confidence intervals using the corresponding Z-values.

    Now suppose I asked you to find the probability that the sample mean will fall in the interval

    which is within $150 around the mean? In this case we are given the margin of error in the units

    of the X values and asked to find the probability of the resulting confidence interval. This is a

    reverse process relative to above where we are given the confidence level (or probability) and

    asked to find the interval.

    You can simply convert this value to a Z-value by dividing by the standard error of the sample mean ($75), which gives 150/75 = ±2.

    The probability that Z will be between -2 and +2, from the Z tables, is 0.9772 - 0.0228 = 0.9544, or a 95.44% chance. Thus the confidence level increased slightly from 95% because the estimated interval (the tolerable margin of error) became wider ($150 compared to $147).
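The reverse calculation (margin of error given, probability wanted) can be sketched the same way. The helper name `prob_within` is an assumption made for this illustration. Because it does not round the table entries, the result is about 0.9545 rather than the 0.9544 obtained from the four-decimal Z table.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(margin, sigma, n):
    """Probability that the sample mean falls within +/- margin of the population mean."""
    std_error = sigma / sqrt(n)   # 75 for Example 1
    z = margin / std_error        # 150 / 75 = 2
    std_normal = NormalDist()     # standard normal: mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)

p = prob_within(150, 450, 36)
print(round(p, 4))
```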

    Similarly, I could ask you to find the probability that the sample mean will be less than some

    value or greater than some value. But I should not ask the probability that the sample mean will

    be exactly equal to some value. Why?

    Required Sample size for a specified level of error

    The (minimum) required sample size for a random sample is n = (Z_{α/2})² σ² / E², where Z_{α/2} is already defined above in the context of confidence intervals (it is the Z value corresponding to the required confidence level), σ is the population standard deviation (the standard deviation of the population from which the sample is taken), and E is the margin of error around the mean expressed in the units of the X variable. Note that we don't need to know the mean for determining the sample size. We need only the standard deviation.


    Example 2: A personnel department analyst wishes to estimate the mean number of training hours needed annually for supervisors in a division of the company within a margin of ±3 hours with a 95% confidence level. Based on a large data set from other similar companies, the analyst estimates the standard deviation of required training hours to be 20 hours. Find the minimum sample size which will give the required estimate with the specified margin of error and level of confidence.

    Answer: Here σ = 20 hours, Z_{α/2} = Z_{0.025} = 1.96 (for a 95% confidence level), and the margin of error E = 3 hours. Therefore, n = (1.96*20/3)² = 170.7, or 171 observations (always rounded up).

    The required sample size increases as the tolerable margin of error is reduced. Find the required sample size for a margin of error of only 1 hour (I bet it will be 9 times the sample size we just obtained, because E enters the formula squared). Similarly, we can find the required sample size for other confidence levels, such as 90% (with Z_{α/2} = Z_{0.05} = 1.645) and 99% (with Z_{α/2} = Z_{0.005} = 2.576). Clearly the sample size increases as the desired confidence level increases, and conversely. Similarly, you see that the required sample size increases as the standard deviation of the population increases. This makes sense, because you need a larger sample size to have the same level of confidence if the parent population involves larger variability, other things remaining the same.
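The sample-size formula and the comparisons suggested above can be tried out with a minimal Python sketch. The function name `sample_size_mean` is hypothetical; it uses the unrounded Z-value from the standard library rather than the table value, which does not change the rounded-up answers here.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, margin, confidence=0.95):
    """Minimum n so that the Z-based margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / E)^2, always rounded up to the next whole observation.
    return ceil((z * sigma / margin) ** 2)

n_3_hours = sample_size_mean(20, 3)         # Example 2
n_1_hour = sample_size_mean(20, 1)          # tighter margin of error
n_90 = sample_size_mean(20, 3, 0.90)        # lower confidence level
n_99 = sample_size_mean(20, 3, 0.99)        # higher confidence level
print(n_3_hours, n_1_hour, n_90, n_99)
```

Before rounding, the size for E = 1 is exactly nine times the size for E = 3; the rounding up is applied afterwards, so the rounded figures differ slightly from a strict multiple of 171.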

    Example 3: A small town has 1000 families who make contributions to the only local church. A

    poll of 144 randomly selected contributing families reveals that the mean annual family

    contribution is $500 with a standard deviation of $72. Construct a 95% confidence interval for

    the mean annual family contribution for this population of families who contribute to this

    particular church.

    Do you see a problem with this question? We are not given the population mean or standard

    deviation as in the previous examples. This may be because no previous survey was done for this population. In such cases the sample results are used as surrogates for the unknown

    population parameters in the formulas given above if the sample size is adequate. In this case the

    sample size 144 is quite large. So we will use $500 and $72 as the surrogates for the unknown

    population mean and the standard deviation, respectively. But do we need to use the formula for

    large population or small population? At first sight the population of 1000 families seems to be

    large. But remember the rule given above. The population is much smaller than 20 times the

    sample. So we will use property #3 to find the standard error. We will use the formula given for

    finite population on page 4 of Instructions for Chapter 7.

    Therefore, σ_x̄ = (s/√n)*√((N-n)/(N-1)) = (72/√144)*√((1000-144)/(1000-1)) = 6*0.9257 = 5.554

    Therefore, a 95% confidence interval would be between 500 - 1.96*5.554 and 500 + 1.96*5.554, or between 489 and 511 dollars per year (rounded to whole numbers, ignoring cents). We could also use the t-distribution (discussed below) to build the confidence interval in this case, since the population standard deviation is not given. But the result would be very close to what we


    obtained using Z distribution because the sample size is very large. I will discuss this issue in the

    following section.
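The finite-population calculation of Example 3 can be sketched in Python as follows. The function name is hypothetical, and the correction factor √((N-n)/(N-1)) is the standard one, assumed here to be the formula the Chapter 7 instructions refer to (it reproduces the 0.9257 factor used above).

```python
from math import sqrt
from statistics import NormalDist

def finite_population_interval(mean, s, n, N, confidence=0.95):
    """Z-based interval with the finite population correction applied."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    fpc = sqrt((N - n) / (N - 1))     # about 0.9257 for N = 1000, n = 144
    std_error = s / sqrt(n) * fpc     # 6 * 0.9257
    half_width = z * std_error
    return std_error, mean - half_width, mean + half_width

std_error, lower, upper = finite_population_interval(500, 72, 144, 1000)
print(round(std_error, 3), round(lower), round(upper))
```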

    Confidence Intervals for the Sample Mean with σ Not Known

    The pdf (probability density function, discussed in my previous instructions) for the t-distribution looks like the curve given below. Note that the t-distribution approaches the Z-distribution more and more closely as the df gets larger (or equivalently, as the sample size gets larger). In most practical applications the t-distribution is considered to be close enough to the Z-distribution for df ≥ 30. Therefore, many authors suggest as a practical rule of thumb that for df of at least 30 you just use the Z-distribution (whose values for the three popular confidence levels are well known and can be easily memorized) instead of the theoretically required t-distribution, for which we have to look at the table for the corresponding degrees of freedom. Also note that for infinite df the t-distribution exactly coincides with the Z-distribution.

    Thus we will use t_{α/2} in place of Z_{α/2} in our calculation of the margin of error and the confidence interval whenever df is less than 30 and σ is unknown (assuming, however, that the parent population is normal). We will follow exactly the same steps (shown above) as in the case when σ is known, except that we replace σ by s and Z by t. For df greater than or equal to 30 it is a matter of the researcher's choice. Theoretically t would be more accurate than Z, but that would involve reading from the t-table instead of using the popularly known Z-values. So it is up to you which one to use.

    Example 4: The sample mean operating life for a random sample of 16 light bulbs of a particular

    brand is calculated to be 4000 hours with the sample standard deviation of 200 hours. The

    operating life of bulbs is generally assumed to be approximately normal. Estimate the mean


    operating life for the population of bulbs from which the sample is taken using a 95% confidence

    interval.

    Here n = 16, df = 15, the population is (approximately) normal, and the population standard deviation is not given. Therefore, we will use the t-distribution instead of the Z-distribution to construct the confidence interval. We are given x̄ = 4000 hours and s = 200 hours. Therefore, the standard deviation (or standard error) of the sample mean, denoted by σ_x̄ (from my previous Instructions), is given by σ_x̄ = s/√n, where we have replaced σ by s.

    Or σ_x̄ = 200/√16 = 50 hours. Now the confidence level required is 95%. So α = .05 and α/2 = .025. Therefore, we need to find t_{.025} from the table for df = 15. This value is 2.131. Next,

    The margin of error = t_{α/2}*σ_x̄ = 2.131*50 = 106.55 hours. Therefore,

    The 95% confidence interval for the mean is x̄ ± t_{α/2}*σ_x̄ = x̄ ± t_{α/2}*s/√n = 4000 ± 106.55, or between 3893.45 and 4106.55 hours, or between 3893 and 4107 hours rounded

    (because the numbers are very large we can ignore the decimals and round to the nearest whole

    number). If we had neglected the fact that the population standard deviation is not known and the

    sample size is quite small (consequently the df is small), then we would be estimating a narrower

    interval which would be questionable because it would be claiming more precision than

    warranted by the nature of the sample.
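To see how much narrower the interval would be if Z were (wrongly) used for Example 4, here is a small Python sketch. The t critical value 2.131 is simply copied from the table value quoted above, since the standard library has no t-distribution; the names `T_025_DF15` and `interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# t critical value for alpha/2 = .025 and df = 15, as read from the t-table above.
T_025_DF15 = 2.131

def interval(mean, s, n, critical):
    """Interval mean +/- critical * s / sqrt(n)."""
    half_width = critical * s / sqrt(n)   # s / sqrt(n) = 50 for Example 4
    return mean - half_width, mean + half_width

t_low, t_high = interval(4000, 200, 16, T_025_DF15)
z_low, z_high = interval(4000, 200, 16, NormalDist().inv_cdf(0.975))

print(f"t interval: {t_low:.2f} to {t_high:.2f}")
print(f"Z interval: {z_low:.2f} to {z_high:.2f}")
```

The Z-based interval sits strictly inside the t-based one, which is the extra precision the text warns is not warranted for a sample of 16.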

    Now can you build 90% and 99% confidence interval estimates of the mean life of bulbs for this

    sample? (Hint: look for t_{.050} and t_{.005}, respectively).

    Use of Computer

    For the example of the auditor's sample of accounts receivable (Example 1 above):

    95% confidence level
    2600 mean
    450 std. dev.
    36 n
    1.960 z
    146.997 half-width
    2746.997 upper confidence limit
    2453.003 lower confidence limit

    You can also find the required sample size for a given level of confidence and a specified tolerable margin of error. Let us work on the second example of this instruction using MegaStat.


    Example 2 from above: A personnel department analyst wishes to estimate the mean number of training hours needed annually for supervisors in a division of the company within a margin of ±3 hours with a 95% confidence level. Based on a large data set from other similar companies, the analyst estimates the standard deviation of required training hours to be 20 hours. Find the minimum sample size which will give the required estimate with the specified margin of error and level of confidence.

    Go to MegaStat, select Confidence interval/Sample size, then select Sample size - mean in the dialogue box, and fill in 3 for E and 20 for the standard deviation. Then the sample size for the 95% confidence level is given by MegaStat as:

    Sample size - mean

    3 E, error tolerance

    20 standard deviation

    95% confidence level

    1.960 z

    170.732 sample size

    171 rounded up

    This is exactly the same answer we derived above using the formula. I want you to learn everything using the formula as well as the computer. Learning only one way will be half knowledge.

    Similarly, you can use MegaStat to find confidence intervals using the t-distribution. Let us solve Example 4 of this instruction using MegaStat.

    Example 4 from above: The sample mean operating life for a random sample of 16 light bulbs

    of a particular brand is calculated to be 4000 hours with the sample standard deviation of 200

    hours. The operating life of bulbs is generally assumed to be approximately normal. Estimate the

    mean operating life for the population of bulbs from which the sample is taken using a 95%

    confidence interval.

    In this case we will select t instead of Z in the dialogue box and get:

    Confidence interval - mean

    95% confidence level

    4000 mean

    200 std. dev.

    16 n

    2.131 t (df = 15)

    106.572 half-width

    4106.572 upper confidence limit

    3893.428 lower confidence limit


    Section B: Confidence Intervals for the Proportions

    Example 5: A large population of older homes is known to have defective wiring in 30 percent

    of such homes (from previous survey by the company responsible for servicing). A fresh random

    sample of 250 homes from this population is collected to study this problem. Find the 95%

    confidence interval estimate for the proportions.

    The population proportion π is given in this case. So we will use it to find the standard error and the confidence interval:

    σ_p = √(π(1-π)/n) = √(0.3*0.7/250) = 0.029

    Therefore, the 95% confidence interval is 0.3 ± 0.029*1.96 = 0.3 ± 0.057 (rounded to three decimals), or between 0.243 and 0.357 (also found in previous instructions). We could have used MegaStat to find this (select proportion in the dialogue box instead of sample mean) as follows:

    Confidence interval - proportion

    95% confidence level
    0.3 proportion
    250 n
    1.960 z
    0.057 half-width
    0.357 upper confidence limit
    0.243 lower confidence limit

    Sometimes the population proportion is not given. Then we have to work with the estimated

    sample proportion only as in the following example.

    Example 6: A sample of 75 retail in-store purchases showed that 24 paid in cash. Construct a

    95% confidence interval for the proportion of all retail in-store purchases that are paid in cash.

    Here the population proportion is not given. The sample proportion is p = 24/75 = 0.32 and n = 75. Thus np = 24 and n(1-p) = 51. Therefore, the normal approximation can be satisfactorily applied. Do we need a continuity correction? We have np(1-p) = 16.3 > 10. So we don't need a continuity correction. We get σ_p = √{(.32)(.68)/75} = 0.0539. Now replacing π by p we get the 95% confidence interval as: p ± Z_{α/2}*σ_p = 0.32 ± 1.96*0.0539 = 0.32 ± 0.1056, or between 0.2144 and 0.4256.
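The proportion interval can be reproduced with a short Python sketch that covers both Example 5 (population proportion given) and Example 6 (sample proportion used as a surrogate). The function name `proportion_interval` is hypothetical, and no continuity correction is applied, in line with the reasoning above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p, n, confidence=0.95):
    """Return (std_error, lower, upper) for a Z-based interval around a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p * (1 - p) / n)   # standard error of the proportion
    half_width = z * std_error
    return std_error, p - half_width, p + half_width

# Example 6: 24 cash payments out of 75 purchases.
se6, low6, high6 = proportion_interval(24 / 75, 75)
# Example 5: known proportion 0.30 with a sample of 250 homes.
se5, low5, high5 = proportion_interval(0.30, 250)

print(round(se6, 4), round(low6, 3), round(high6, 3))
print(round(se5, 3), round(low5, 3), round(high5, 3))
```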

    Using MegaStat

    Confidence interval - proportion

    95% confidence level

    0.32 proportion

    75 n

    1.960 z

    0.106 half-width

    0.426 upper confidence limit

    0.214 lower confidence limit

    You can easily find other confidence intervals.


    Required Sample Size in the case of Proportions for given Accuracy

    Example 8: An opinion poll last month suggested that 40% of people would vote for candidate A. This month a new poll is being run. How many people must be interviewed for the poll to be within 2 percentage points of actual voting intentions with a 95% level of confidence?

    Since we are not given any other information we will assume that the previous poll estimates are

    the surrogates for the population parameters (assuming that the previous sample was sufficiently

    large).

    The formula for the required sample size is n = {(Z_{α/2})² * π(1-π)}/E².

    Note that the symbol E denotes the desired margin of error, that is, the tolerable deviation from the population value (expressed as a decimal). The value of Z_{α/2} is 1.96 for 95% confidence, 1.645 for 90% and 2.576 for 99% confidence levels. The value π is the proportion in the population. You see that the required sample size increases with the inverse square of the margin of error: halving E (asking for more accuracy) quadruples n.

    For the above example, π(1-π) = 0.4*0.6 = 0.24, so n = ((1.96)²*0.24)/(0.02)² = 2304.96, or 2305 rounded up. Thus for the desired accuracy the new poll has to interview 2305 voters. This should give you an idea why polls generally don't strive for such accuracy but settle for a margin of 4 or 5 percentage points.

    If in the above example the desired margin of error is 4 percentage points, then the sample size would reduce by a factor of 4 (fewer than 580 interviews required). If a 1 percentage point margin is set, the sample size would jump to more than 9000. The next significant factor which affects the sample size is the confidence level (the higher the confidence level, the larger the required sample size, other things remaining the same). But this does not have as dramatic an impact as the margin of error. The third factor is π. The farther it is from 0.5, the lower is the required sample size. Does this make sense? Of course! If a population is almost equally divided on some issue or candidate, you need a larger sample to find which one is going to win. If the population is highly biased on some issue or candidate, you can easily find out the result even with a smaller sample.
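The three effects just described (margin of error, confidence level, and how far π is from 0.5) can be explored with a minimal Python sketch. The function name `sample_size_proportion` is hypothetical, and the unrounded Z-value from the standard library is used instead of the table value, which gives the same rounded-up sample size for Example 8.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(pi, margin, confidence=0.95):
    """Minimum n so that the margin of error for a proportion is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * pi * (1 - pi) / E^2, rounded up to a whole interview.
    return ceil(z ** 2 * pi * (1 - pi) / margin ** 2)

n_2pts = sample_size_proportion(0.40, 0.02)   # Example 8
n_4pts = sample_size_proportion(0.40, 0.04)   # looser margin
n_1pt = sample_size_proportion(0.40, 0.01)    # tighter margin
n_even = sample_size_proportion(0.50, 0.02)   # evenly divided population

print(n_2pts, n_4pts, n_1pt, n_even)
```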

    You can easily get the required sample size for proportions using MegaStat by selecting Sample size - p in the dialogue box and specifying the other parameters there.