ESTIMATION Estimation: process of using sample values to estimate population values Point Estimates:...


ESTIMATION

• Estimation: process of using sample values to estimate population values

• Point Estimates: the parameter is estimated as a single point
  – Examples: x̄, s, p

• Careful statisticians dislike point estimates

Interval Estimates

• Example: there is a 90% probability that somewhere between 58% and 68% of Americans oppose same-sex marriage

• Draws explicit attention to the fact of variability in the sample results; avoids putting too much weight on one number

Think about the interval 42 – 1.64 * 1 to 42 + 1.64 * 1: this interval contains 90% of the sample means that could ever be drawn from this population

[Figure: distribution curve with the horizontal axis marked at 37, 42, and 47]

• The interval 40.36 to 43.64 contains 90% of the sample means possible from this population

• No sample mean in this interval differs from μ by more than 1.64 grams

• Hence, there is a 90% probability that any arbitrary x̄ differs from μ by no more than 1.64

• Thus, there is a 90% probability that μ is in the interval x̄ ± 1.64
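The 90% figure can be checked numerically. The sketch below assumes the sample means are normally distributed around 42 with a standard error of 1 (the numbers used in the interval above); the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed sampling distribution of the sample mean:
# centered at 42 with a standard error of 1.
sampling_dist = NormalDist(mu=42, sigma=1)

lower = 42 - 1.64 * 1
upper = 42 + 1.64 * 1

# Probability that a randomly drawn sample mean lands inside the interval.
coverage = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)
print(f"P({lower:.2f} <= x-bar <= {upper:.2f}) = {coverage:.3f}")
```

The result is very close to 0.90 rather than exactly 0.90, because 1.64 is a rounded z value.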

• Look at this interval: x̄ ± 1.64 * 1

• 1.64 is a z value, chosen to correspond to 90%, the confidence level

• 1 is the standard error of the mean

• So the width of the interval is set by the confidence level (which determines number of standard errors in interval) and the standard error, the measure of variation in sample means

A C% Confidence Interval for the Population Mean When σ Is Known

$$\bar{x} \pm z_C \, \sigma_{\bar{x}}$$

Examples:

• A population of Christmas trees has unknown mean with σ = 4. For a sample of 25 trees, the sample mean = 16.6 ft. Calculate a 95% confidence interval for the population mean.
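One way to work the Christmas tree example is sketched below. The helper name `z_interval` is hypothetical, and the z value is taken from Python's standard-library normal distribution instead of a printed table, so the last digits may differ slightly from a hand calculation with z = 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    # z_C leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sigma / sqrt(n)      # standard error of the mean
    margin = z_c * std_error         # maximum error in the estimate
    return x_bar - margin, x_bar + margin


low, high = z_interval(x_bar=16.6, sigma=4, n=25, confidence=0.95)
print(f"95% CI: {low:.2f} ft to {high:.2f} ft")
```

Here the standard error is 4 / 5 = 0.8 ft, so the interval is roughly 16.6 ± 1.57 ft.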

• Same data: calculate a 90% confidence interval

• Same data: suppose that we increase sample size to 81

Width of Confidence Interval Depends On

• Confidence level: as C increases, width increases

• Sample size: as n increases, width decreases

• Variability in population: as σ increases, standard error increases and width of interval increases
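The direction of each effect can be checked numerically. This sketch reuses the Christmas tree numbers (σ = 4, n = 25, 95%) as a baseline and changes one input at a time; the alternative values (99%, n = 81, σ = 6) are chosen only for illustration, and `ci_width` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, confidence):
    """Full width of the z-based confidence interval: 2 * z_C * sigma / sqrt(n)."""
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_c * sigma / sqrt(n)


baseline = ci_width(sigma=4, n=25, confidence=0.95)
higher_confidence = ci_width(sigma=4, n=25, confidence=0.99)
larger_sample = ci_width(sigma=4, n=81, confidence=0.95)
larger_sigma = ci_width(sigma=6, n=25, confidence=0.95)

print(f"baseline (sigma=4, n=25, 95%): {baseline:.3f}")
print(f"confidence raised to 99%:      {higher_confidence:.3f}")
print(f"sample size raised to 81:      {larger_sample:.3f}")
print(f"sigma raised to 6:             {larger_sigma:.3f}")
```

A higher confidence level or a larger σ widens the interval, while a larger sample narrows it.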

• The quantity $z_C \sigma_{\bar{x}}$ is called the maximum error in the estimate

• The quantity $2 z_C \sigma_{\bar{x}}$ is called the precision in the estimate
  – this quantity is the width of the confidence interval

FINDING THE RIGHT SAMPLE SIZE

• Sometimes we wish to hold the error in the estimate within some limit

• Define $e = z_C \sigma_{\bar{x}}$ or, substituting $\sigma_{\bar{x}} = \sigma / \sqrt{n}$,

$$e = z_C \frac{\sigma}{\sqrt{n}}$$

Solve this expression for n, yielding

$$n = \left( \frac{z_C \, \sigma}{e} \right)^2$$

Example: With σ = 4 and 95% confidence level, we require that the maximum error in the estimate be no more than 0.5 ft. What sample size is necessary?
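A minimal sketch of this calculation, using n = (z_C σ / e)² and rounding up, because a fractional sample size must be rounded to the next whole observation. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n that holds the maximum error in the estimate to max_error."""
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z_C * sigma / e)^2, rounded up to a whole observation.
    return ceil((z_c * sigma / max_error) ** 2)


n_required = sample_size_for_mean(sigma=4, max_error=0.5, confidence=0.95)
print(n_required)
```

With z ≈ 1.96 the raw value is about 245.9, so 246 trees are needed.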

Examples:

• Expectations of inflation are known to be normally distributed with standard deviation = 1.2%. A survey of sixty households found a sample average expectation of 4% inflation for the coming year. Calculate a 98% confidence interval for the population’s expectation of inflation in the coming year.

• If we require a maximum error in the estimate of 0.1%, how large a sample must we take?

• Cigarette filters have a “process” standard deviation of 0.3 mm with normal distribution. The current mean is unknown, but a sample of 25 filters has a mean of 20 mm.
  – Calculate a 90% confidence interval for the population mean
  – Find the sample size necessary to hold the error in the estimate to 0.04 mm
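The inflation and cigarette filter examples follow the same two steps, so they can be sketched together. The helper names are hypothetical, and the z values come from the standard-library normal distribution; using rounded table values such as 2.33 can change the required sample size by a unit or two.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """z value leaving (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def sample_size(sigma, max_error, confidence):
    """Smallest n that keeps the maximum error within max_error."""
    return ceil((z_critical(confidence) * sigma / max_error) ** 2)


# Inflation expectations (in percent): sigma = 1.2, n = 60, x-bar = 4, 98% level.
infl_low, infl_high = z_interval(4.0, 1.2, 60, 0.98)
infl_n = sample_size(1.2, 0.1, 0.98)

# Cigarette filters (in mm): sigma = 0.3, n = 25, x-bar = 20, 90% level.
filt_low, filt_high = z_interval(20.0, 0.3, 25, 0.90)
filt_n = sample_size(0.3, 0.04, 0.90)

print(f"inflation 98% CI: {infl_low:.2f}% to {infl_high:.2f}%, n for e = 0.1%: {infl_n}")
print(f"filters 90% CI: {filt_low:.3f} mm to {filt_high:.3f} mm, n for e = 0.04 mm: {filt_n}")
```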

Student’s t distribution

• Suppose σ is NOT known; then we are not entitled to use a z value in calculating confidence intervals

• If, however,
  – the population is known to be normally distributed, OR
  – the sample size is large enough to invoke the Central Limit Theorem,

  then we use a value drawn from the t distribution

Hey, Prof, what’s a t distribution?

• Characteristics
  – Symmetric about its mean of zero
  – Values tend to cluster in the center, producing a bell-shaped curve

• Differences from z:
  – Fatter tails and less mass in the center
  – There is a family of t distributions, based on “degrees of freedom”

• Degrees of freedom: the sample size minus number of parameters to be estimated before estimating a variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Before estimating the variance, we must first calculate x-bar, an estimate of the population mean: we lose one degree of freedom, leaving us with n – 1 degrees of freedom

Confidence Intervals with the t distribution:

$$\bar{x} \pm t \, s_{\bar{x}}, \qquad s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

Where t is chosen for the desired confidence level and has n – 1 degrees of freedom

Examples:

• Seven male students are allowed to imbibe their favorite beverage until they are visibly inebriated. The amounts consumed in ounces are: 3.7, 2.9, 3.2, 4.1, 4.6, 2.3, 2.5. Calculate a 95% confidence interval for the amount of the drink it would take to get the average member of the population drunk.

• Calculate x-bar and s

• Then calculate the sample standard error

• Find t for 6 degrees of freedom and α/2 = 0.025

• Finally, calculate the confidence interval
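The four steps can be sketched as follows for the seven observations. Python's standard library has no t distribution, so the critical value 2.447 (6 degrees of freedom, 0.025 in the upper tail) is typed in from a standard t table; that constant is an assumption of this sketch, not something computed here.

```python
from math import sqrt
from statistics import mean, stdev

ounces = [3.7, 2.9, 3.2, 4.1, 4.6, 2.3, 2.5]
n = len(ounces)

# Step 1: x-bar and s (stdev divides by n - 1, matching the formula for s^2).
x_bar = mean(ounces)
s = stdev(ounces)

# Step 2: sample standard error of the mean.
std_error = s / sqrt(n)

# Step 3: t for n - 1 = 6 degrees of freedom, taken from a t table.
T_CRIT_6DF_95 = 2.447

# Step 4: the confidence interval.
margin = T_CRIT_6DF_95 * std_error
low, high = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.3f}, s = {s:.3f}, standard error = {std_error:.3f}")
print(f"95% CI: {low:.2f} oz to {high:.2f} oz")
```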

• In a sample of 41 students who work, the sample mean is 16.561 hours and s = 5.7128 hours. The distribution appears to be somewhat skewed upwards. Find a 90% confidence interval for the average hours worked by all ASU students who work.
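When only summary statistics are given, the interval can be computed directly from x̄, s, and n. In this sketch the critical value 1.684 (40 degrees of freedom, 0.05 in the upper tail) is again an assumed table lookup, and `t_interval` is a hypothetical helper.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Confidence interval x-bar +/- t * s / sqrt(n); t_crit must have n - 1 df."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# t for 40 degrees of freedom and a 90% confidence level, from a t table.
T_CRIT_40DF_90 = 1.684

low, high = t_interval(x_bar=16.561, s=5.7128, n=41, t_crit=T_CRIT_40DF_90)
print(f"90% CI: {low:.2f} to {high:.2f} hours")
```

The sample is skewed, but n = 41 is large enough to lean on the Central Limit Theorem.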

USE OF THE t DISTRIBUTION

• Footnote: Who was “Student”? A pseudonym for William Gosset

• The t is often thought of as a small-sample technique

• But, STRICTLY SPEAKING, the t should be used whenever the population standard deviation σ is NOT KNOWN

• Some practitioners use z whenever the sample is large
  – Central Limit Theorem
  – There isn’t much difference between t and z

Population standard deviation known?

• Yes: Population normal?
  – Yes: z value
  – No: Sample size
    • n >= 30: z or t (see note)
    • n < 30: ERROR

• No: Population normal?
  – Yes: t value
  – No: Sample size
    • n >= 30: z or t (see note)
    • n < 30: ERROR

Notes:

• For large samples with σ unknown, different practitioners may proceed differently. Some argue for using a z, appealing to the CLT. Others use a t since it gives a wider, more conservative interval. For this course: use a t whenever the population standard deviation is not known.

• Small samples from non-normal populations are beyond the scope of this course

Confidence intervals for the population proportion

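• Sample proportion p = x/n

• E(p) = π, the population proportion, and

$$\sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}}$$

In general π is not known, so it must be estimated with p, and we use

$$s_p = \sqrt{\frac{p (1 - p)}{n}}$$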

• Then the confidence interval is

• $p \pm z_C \, s_p$

• Note that proportion problems always use a z value
  – Normal approximates binomial

• EXAMPLE: Of 112 students in a sample, 70 have paying jobs. Calculate a 95% confidence interval for the proportion in the population with paying jobs.

• p = 70/112 = 0.625

• $s_p = \sqrt{\frac{0.625 \times 0.375}{112}} = 0.045745315$

• 0.625 ± 1.96 * 0.045 etc.

• 0.625 ± 0.089660819 or 0.625 ± 0.09

• We are 95% confident that 0.54 ≤ π ≤ 0.71

• EXAMPLE:

• In a sample of 320 professional economists, 251 agreed that “offshoring” jobs is good for the American economy. Calculate a 90% confidence interval for the proportion in the population of professional economists who hold this view.
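A sketch of the proportion interval, checked first against the paying-jobs example (70 of 112) and then applied to the economists. The function name is hypothetical, and z comes from the standard-library normal distribution instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Return (p, low, high) for the interval p +/- z_C * s_p."""
    p = successes / n
    s_p = sqrt(p * (1 - p) / n)   # estimated standard error of p
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_c * s_p
    return p, p - margin, p + margin


# Paying-jobs example: 70 of 112 students, 95% level.
p_jobs, jobs_low, jobs_high = proportion_interval(70, 112, 0.95)
print(f"jobs: p = {p_jobs:.3f}, 95% CI {jobs_low:.3f} to {jobs_high:.3f}")

# Economists example: 251 of 320, 90% level.
p_econ, econ_low, econ_high = proportion_interval(251, 320, 0.90)
print(f"economists: p = {p_econ:.3f}, 90% CI {econ_low:.3f} to {econ_high:.3f}")
```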

Finding the Right Sample Size

• The error in the estimate is given by $z_C \sigma_p$ or, substituting

$$e = z_C \sqrt{\frac{\pi (1 - \pi)}{n}}$$

Solving for n yields:

$$n = \frac{z_C^2 \, \pi (1 - \pi)}{e^2}$$

• In general the population proportion π is not known

• Two solutions:
  – Assume π = 0.5
    • Result is the largest sample that would ever be needed
  – Conduct a pilot study and use the resulting p as an estimate of π
    • May give a somewhat smaller sample size if p is much different from 0.5
    • Saves sampling cost

Example:

• Above we had a 95% confidence interval with n = 112 of 0.625 ± 0.09 or a 9% error. Suppose we require a maximum error of 3%.

• Approach 1: let π = 0.5

$$n = \frac{1.96^2 \times 0.5 \times 0.5}{0.03^2} = 1067.11, \text{ rounded up to } 1068$$

• Approach 2: assume π = 0.625

$$n = \frac{1.96^2 \times 0.625 \times 0.375}{0.03^2} = 1000.42, \text{ rounded up to } 1001$$

The difference is more dramatic if p is much different from 0.5. In a random sample of 300 students in NC, 30 have experienced “study” abroad. A 95% confidence interval for the population proportion is 10% ± 3.4%. Suppose we require a maximum error of 2%. Approach 1 gives _______ and approach 2 gives _________.
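The two approaches from the 3% example can be sketched with one small function. The name `sample_size_for_proportion` is hypothetical, and the default z of 1.96 is the table value for a 95% confidence level.

```python
from math import ceil


def sample_size_for_proportion(pi_guess, max_error, z_c=1.96):
    """n = z_C^2 * pi * (1 - pi) / e^2, rounded up to a whole observation."""
    return ceil(z_c ** 2 * pi_guess * (1 - pi_guess) / max_error ** 2)


# Approach 1: assume pi = 0.5 (the most conservative guess).
n_conservative = sample_size_for_proportion(pi_guess=0.5, max_error=0.03)

# Approach 2: use the earlier sample proportion, p = 0.625, as the guess.
n_pilot = sample_size_for_proportion(pi_guess=0.625, max_error=0.03)

print(n_conservative, n_pilot)
```

The same function can be called with the numbers from the study-abroad exercise to fill in the two blanks.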