Chapter 9


Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.


Estimation and Confidence Intervals


Chapter Goals

When you have completed this chapter, you will be able to:

Define a point estimator, a point estimate, and desirable properties of a point estimator such as unbiasedness, efficiency, and consistency.

Define an interval estimator and an interval estimate

Define a confidence interval, confidence level, margin of error, and a confidence interval estimate

Construct a confidence interval for the population mean when the population standard deviation is known


Construct a confidence interval for the population mean when the population is normally distributed and the population standard deviation is unknown

Construct a confidence interval for a population proportion

Determine the sample size for attribute and variable sampling

Construct a confidence interval for the population variance when the population is normally distributed

Terminology

Point Estimate …is a single value (statistic) used to estimate a population value (parameter)

Confidence Interval …is a range of values within which the population parameter is expected to occur

Interval Estimate …states the range within which a population parameter probably lies


Desirable properties of a point estimator

• unbiased …the expected value of the estimator equals the value of the population parameter being estimated, so its possible values are centred on the value of the parameter. Otherwise, it is biased!

• efficient …possible values are concentrated close to the value of the parameter

• consistent …values tend to move closer to the value of the parameter as the sample size increases


Terminology

Standard error of the sample mean …is the standard deviation of the sampling distribution of the sample means

It is computed by

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

$\sigma_{\bar{x}}$ …is the symbol for the standard error of the sample mean

σ …is the standard deviation of the population

n …is the size of the sample

Standard Error of the Means

If σ is not known and n ≥ 30, the standard deviation of the sample(s) is used to approximate the population standard deviation

Computed by…

$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$


The factors that determine the width of a confidence interval are:

1. The sample size, n

2. The variability in the population, σ, usually estimated by s

3. The desired level of confidence


Constructing Confidence Intervals

IN GENERAL, a confidence interval for a mean is computed by:

$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

Interpreting…

Interpreting Confidence Intervals

Suppose that you read in The Globe that

“…the average selling price of a family home in York Region is $200 000 +/- $15 000 at 95% confidence!”

This means…what?


In statistical terms, this means:

…that we are 95% sure that the interval estimate obtained contains the value of the population mean.

Also…

Lower confidence limit is $185 000 ($200 000 - $15 000)

Upper confidence limit is $215 000 ($200 000 + $15 000)


Your newspaper also reports that…

“…the mean time to sell a family home in York Region is 40 days.”

You select a random sample of 36 homes sold during the past year, and determine a 90% confidence interval estimate for the population mean to be (31-39) days.

Do your sample results support the paper’s claim?


Lower confidence limit is 31 days

Upper confidence limit is 39 days

With a 90% interval estimate there is a 10% chance (100%-90%) that an interval constructed this way does not contain the value of the population mean!

The claimed mean of 40 days lies outside the interval (31-39) days, so our evidence does not support the statement made by the newspaper, i.e., that the population mean is 40 days, when using a 90% interval estimate


…or, focus on tail areas…

90% Confidence Interval: 31 to 39 days

α is the probability of a value falling outside the confidence interval, i.e. α = 0.10

…a 10% chance of falling outside this interval, with .05 in each tail


Find the appropriate value of z:

$$P\left(\bar{X} - z\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z\,\frac{\sigma}{\sqrt{n}}\right) = 0.92$$

This is a 92% confidence interval

1. Locate the area on the normal curve: the central area of 0.92 lies between -z and +z, i.e. 0.46 on each side of 0

2. Look up an area of 0.46 in the table to get the corresponding z-score (search in the centre of the table for the area of 0.46)

Z = +/- 1.75
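The table lookup can be cross-checked numerically. What follows is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_for_confidence` is illustrative and does not come from the slides.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Return the positive z-score that leaves (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    # The area to the left of +z is 1 - alpha/2 (0.96 for a 92% interval).
    return NormalDist().inv_cdf(1 - alpha / 2)

z_92 = z_for_confidence(0.92)   # should agree with the table value of about 1.75
z_95 = z_for_confidence(0.95)   # should agree with the familiar 1.96
print(round(z_92, 2), round(z_95, 2))
```

The printed table works with the area between 0 and z (0.46), while `inv_cdf` works with the whole area to the left of z (0.50 + 0.46 = 0.96), so the two lookups describe the same point on the curve.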

Constructing Confidence Intervals

Common Confidence Intervals

95% C.I. for the mean:

$$\bar{X} \pm 1.96\,\frac{s}{\sqrt{n}}$$

99% C.I. for the mean:

$$\bar{X} \pm 2.58\,\frac{s}{\sqrt{n}}$$

About 95% of the constructed intervals will contain the parameter being estimated.

Also, 95% of the sample means for a specified sample size will lie within 1.96 standard errors of the hypothesized population mean.
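The statement that about 95% of constructed intervals contain the parameter can be illustrated by simulation. The sketch below is only an illustration: the population values (mean 40, standard deviation 12), the number of trials, and the seed are hypothetical choices, and σ is treated as known so that the z interval applies.

```python
import random
from statistics import NormalDist

def coverage(mu=40.0, sigma=12.0, n=36, confidence=0.95, trials=2000, seed=9):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

share = coverage()
print(f"{share:.1%} of the simulated intervals contain the population mean")
```

Each individual interval either contains the population mean or it does not; the 95% describes the long-run share of intervals that do.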

Interval Estimates

If the population standard deviation is known or n ≥ 30, use the z table…

$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

If the population standard deviation is unknown and n < 30, use the t-table…

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

More on this later…


The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours.

What is the population mean?

Our best estimate is 24 hours. This is a point estimate.

Find the 95 percent confidence interval for the population mean.

Mean = 24 SD = 4 N = 49

95% Confidence Z = +/- 1.96

95 percent confidence is commonly denoted as 1-α

Substitute values:

$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} = 24 \pm 1.96\,\frac{4}{\sqrt{49}} = 24 \pm 1.12$$

The Confidence Limits range from 22.88 to 25.12
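The substitution can be checked with a few lines of Python. This is a small sketch rather than anything from the slides; the function name `z_interval` is made up for the illustration, and the z value of 1.96 is the table value quoted for 95% confidence.

```python
from math import sqrt

def z_interval(sample_mean, sd, n, z):
    """Confidence limits for a mean: sample_mean +/- z * sd / sqrt(n)."""
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Dean's sample: mean 24 hours, standard deviation 4 hours, n = 49, z = 1.96.
lower, upper = z_interval(24, 4, 49, 1.96)
print(f"{lower:.2f} to {upper:.2f}")
```

Passing 2.58 instead of 1.96 gives the wider 99% limits for the same sample.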


Interval Estimates

90% confidence level: 1-α = 0.90 or α = 0.10

99% confidence level: 1-α = 0.99 or α = 0.01

Student’s t-distribution

…used for small sample sizes when the population standard deviation is unknown

Characteristics

…like z, the t-distribution is continuous

…in theory it has no upper or lower limit, although nearly all of its area lies between –4 and +4 unless the degrees of freedom are very small

…it is bell-shaped and symmetric about zero

…it is more spread out and flatter at the centre than the z-distribution

…for larger and larger values of degrees of freedom, the t-distribution becomes closer and closer to the standard normal distribution


Chart 9-1 Comparison of The Standard Normal Distribution and the Student’s t Distribution

[Chart: the z distribution and the t distribution drawn on the same axis]

The t distribution is flatter and more spread out than the z distribution


T-table

…with df = 9 and 0.10 area in the upper tail… t = 1.383


Confidence Intervals                        80%     90%     95%     98%     99%
Level of Significance, One-Tailed Test      0.100   0.050   0.025   0.010   0.005
Level of Significance, Two-Tailed Test      0.20    0.10    0.05    0.02    0.01
df = 9                                      1.383
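The table values can also be reproduced numerically. The sketch below does not use the textbook's method: it substitutes numerical integration of the t density (Simpson's rule) plus bisection for the printed table, using only the standard library. All function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (t >= 0): 0.5 minus the Simpson integral over [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Find t such that the upper-tail area equals tail_area, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid      # tail still too large, so t must be bigger
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.10, 9), 3))     # df = 9, 0.10 in the upper tail
print(round(t_critical(0.025, 11), 3))   # df = 11, 95% two-tailed
print(round(t_critical(0.025, 120), 3))  # large df: close to the z value 1.96
```

The last line illustrates the point made about degrees of freedom: as df grows, the t value for a given tail area moves toward the corresponding z value.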


When? …to use the z Distribution or the t Distribution

Population Normal?

NO: n 30 or more?
    NO: Use a nonparametric test (see Ch16)
    YES: Use the z distribution

YES: Population standard deviation known?
    NO: Use the t distribution
    YES: Use the z distribution
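One reading of this decision chart, written as a function: a non-normal population leads to z when n is 30 or more and to a nonparametric test otherwise, while a normal population leads to z when the population standard deviation is known and to t when it is not. The function and argument names are hypothetical, and the branch order is an interpretation of the chart's layout.

```python
def choose_distribution(population_normal, sigma_known, n):
    """Follow the decision chart and return 'z', 't', or 'nonparametric'."""
    if not population_normal:
        # Non-normal population: rely on a large sample, otherwise go nonparametric.
        return "z" if n >= 30 else "nonparametric"
    # Normal population: the choice depends on whether sigma is known.
    return "z" if sigma_known else "t"

print(choose_distribution(population_normal=True, sigma_known=False, n=12))
```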

Student’s t-distribution

The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of only 12 students showed a mean of 24 hours with a standard deviation of 4 hours.

Find the 95 percent confidence interval for the population mean.

n is small, so use the t-Distribution

Data

X̄ = 24 n = 12 s = 4 df = 12-1 = 11 α = 1 – 95% = .05

Formula

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

Looking up 5% level of significance for a two-tailed test with 11df, we find…


Confidence Intervals                        80%     90%     95%     98%     99%
Level of Significance, One-Tailed Test      0.100   0.050   0.025   0.010   0.005
Level of Significance, Two-Tailed Test      0.20    0.10    0.05    0.02    0.01
df = 11                                                     2.201


With 11 df, we find… t0.025 = 2.201

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 24 \pm 2.201\,\frac{4}{\sqrt{12}} = 24 \pm 2.54$$

The confidence limits range from 21.46 to 26.54

Compare these with earlier limits of 22.88 to 25.12
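The small-sample interval is wider for two reasons: the t value exceeds the z value, and the smaller n inflates the standard error. The sketch below separates the two effects; the helper name `margin_of_error` is illustrative, and the critical values 2.201 and 1.96 are taken from the tables quoted in the slides.

```python
from math import sqrt

def margin_of_error(critical_value, sd, n):
    """Half-width of the interval: critical value times the standard error."""
    return critical_value * sd / sqrt(n)

t_margin = margin_of_error(2.201, 4, 12)   # t-table, df = 11, 95% confidence
z_margin = margin_of_error(1.96, 4, 12)    # what z would give for the same 12 students
t_limits = (24 - t_margin, 24 + t_margin)

print(f"t margin {t_margin:.2f}, z margin {z_margin:.2f}")
print(f"limits {t_limits[0]:.2f} to {t_limits[1]:.2f}")
```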


Student’s t-distribution

The manager of the college cafeteria wants to estimate the mean amount spent per customer per purchase. A sample of 10 customers revealed the following amounts spent:

$4.45 $4.05 $4.95 $3.25 $4.68

$5.75 $6.01 $3.99 $5.25 $2.95

Determine the 99% confidence interval for the mean amount spent.


Step 1: Determine the sample mean and standard deviation.

X̄ = $4.53 s = $1.00

Step 2: Enter the key data into the appropriate formula.

n = 10 df = 10 – 1 = 9 α = 1 – 99% = .01 t0.005 = 3.25

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 4.53 \pm 3.25\,\frac{1.00}{\sqrt{10}} = 4.53 \pm 1.03$$

We are 99% confident that the mean amount spent per customer is between $3.50 and $5.56
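Both steps can be reproduced from the raw amounts. This is a minimal sketch using Python's standard-library `statistics` module; the t value of 3.25 is typed in from the t-table (df = 9, 99% confidence) rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

amounts = [4.45, 4.05, 4.95, 3.25, 4.68, 5.75, 6.01, 3.99, 5.25, 2.95]

# Step 1: sample mean and sample standard deviation (n - 1 in the denominator).
n = len(amounts)
x_bar = mean(amounts)
s = stdev(amounts)

# Step 2: substitute into x_bar +/- t * s / sqrt(n).
t = 3.25  # assumed table value for df = n - 1 = 9 at 99% confidence
margin = t * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(f"99% interval: {lower:.2f} to {upper:.2f}")
```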

Constructing Confidence Intervals for Population Proportions

A confidence interval for a population proportion is estimated by:

Formula

$$p \pm z\sqrt{\frac{p(1-p)}{n}}$$

p …is the symbol for the sample proportion


A sample of 500 executives who own their own home revealed 175 planned to sell their homes and retire to Victoria.

Develop a 98% confidence interval for the proportion of executives that plan to sell and move to Victoria.


n = 500 p = 175/500 = .35 z = 2.33

$$98\%\ \text{CL} = .35 \pm 2.33\sqrt{\frac{.35(1-.35)}{500}} = .35 \pm .0497$$

The confidence limits range from about .30 to .40
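The same substitution in code, as a small sketch; `proportion_interval` is a made-up helper name, and z = 2.33 is the table value used in the slide for 98% confidence.

```python
from math import sqrt

def proportion_interval(successes, n, z):
    """Return the sample proportion and the margin z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin

p, margin = proportion_interval(175, 500, 2.33)
print(f"{p:.2f} +/- {margin:.4f}")
print(f"limits {p - margin:.4f} to {p + margin:.4f}")
```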

Finite-Population Correction Factor

Used when n/N is 0.05 or more

Formula

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$

The second square root is the correction factor.

The attendance at the college hockey game last night was 2700. A random sample of 250 of those in attendance revealed that the average number of drinks consumed per person was 1.8 with a standard deviation of 0.40.

Develop a 90% confidence interval estimate for the mean number of drinks consumed per person.


N = 2700 n = 250 x̄ = 1.8 s = 0.40 α/2 = 0.05

Since 250/2700 > .05, use the correction factor

Formula

$$\bar{X} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$

$$90\%\ \text{CL} = 1.8 \pm 1.645\left(\frac{.4}{\sqrt{250}}\right)\sqrt{\frac{2700-250}{2700-1}} = 1.8 \pm .04$$
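A short sketch of the corrected margin, with the 0.05 rule built in. The function name `fpc_margin` is illustrative; z = 1.645 is the table value used in the slide for 90% confidence.

```python
from math import sqrt

def fpc_margin(sd, n, N, z):
    """Margin of error for a mean, applying the finite-population
    correction factor when the sample is 5% or more of the population."""
    standard_error = sd / sqrt(n)
    if n / N >= 0.05:
        standard_error *= sqrt((N - n) / (N - 1))
    return z * standard_error

margin = fpc_margin(0.40, 250, 2700, 1.645)
print(f"1.8 +/- {margin:.2f}")
```

With these figures the correction factor is roughly 0.95, so it narrows the interval only slightly; its effect grows as n/N increases.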


Selecting the Sample Size


Factors

…that determine the sample size are:

1. The degree of confidence selected

2. The maximum allowable error

3. The variation in the population


Formula

$$n = \left(\frac{z_{\alpha/2}\,s}{E}\right)^2$$

E …is the allowable error

z …is the z-score for the chosen level of confidence

s …is the sample standard deviation from the pilot survey


A consumer group would like to estimate the mean monthly electricity charge for a single family house in July (within $5) using a 99 percent level of confidence. Based on similar studies the standard deviation is estimated to be $20.00.

How large a sample is required?


$$n = \left(\frac{z_{\alpha/2}\,s}{E}\right)^2 = \left(\frac{2.58 \times 20}{5.00}\right)^2 = (10.32)^2 = 106.5$$

A minimum of 107 homes must be sampled.
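The calculation, including the rounding up to a whole home, can be sketched as follows. The function name is illustrative; 2.58 is the z value the slides use for 99% confidence.

```python
from math import ceil

def sample_size_for_mean(z, sd, allowable_error):
    """n = (z * s / E)^2, rounded up because a fraction of a unit cannot be sampled."""
    return ceil((z * sd / allowable_error) ** 2)

homes = sample_size_for_mean(2.58, 20.00, 5.00)
print(homes)
```

The result is always rounded up, never to the nearest integer, so that the allowable error is not exceeded.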


The Kennel Club wants to estimate the proportion of children that have a dog as a pet. Assume a 95% level of confidence and that the club estimates that 30% of the children have a dog as a pet.

If the club wants the estimate to be within 3% of the population proportion, how many children would they need to contact?


New Formula

$$n = p(1-p)\left(\frac{z}{E}\right)^2$$

$$n = (.3)(1-.3)\left(\frac{1.96}{.03}\right)^2 = (.21)(65.33)^2 = 896.4$$

A minimum of 897 children must be sampled.
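A sketch of the same calculation in code. The function name is illustrative. The second call, with p = 0.50, is not part of the slide's example; it is included only to show the largest sample this formula can ask for when no prior estimate of the proportion is available.

```python
from math import ceil

def sample_size_for_proportion(p, z, allowable_error):
    """n = p(1 - p)(z / E)^2, rounded up to the next whole unit."""
    return ceil(p * (1 - p) * (z / allowable_error) ** 2)

children = sample_size_for_proportion(0.30, 1.96, 0.03)
worst_case = sample_size_for_proportion(0.50, 1.96, 0.03)  # p(1 - p) is largest at p = 0.5
print(children, worst_case)
```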


Test your learning…

www.mcgrawhill.ca/college/lind

Click on…

Online Learning Centre

for quizzes

extra content

data sets

searchable glossary

access to Statistics Canada’s E-Stat data

…and much more!


This completes Chapter 9