Chapter 12 Confidence Intervals and Hypothesis Tests for Means © 2010 Pearson Education 1.

12.1 The Sampling Distribution for the Mean

We found confidence intervals for proportions to be

$$\hat{p} \pm ME$$

where the ME was equal to a critical value, z*, times SE($\hat{p}$).

Our confidence intervals for means will be

$$\bar{y} \pm ME$$

where the ME will be a critical value times SE($\bar{y}$).

The standard deviation of the sample mean is given below.

$$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$$

So we would need to know the true value of the population standard deviation σ, which we usually don’t.

Instead of σ, we will use s, the sample standard deviation from the data. We get the following formula for standard error.

$$SE(\bar{y}) = \frac{s}{\sqrt{n}}$$
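
As a small illustration of the standard error formula, the sketch below computes s and SE for a made-up sample. The data values and variable names are assumptions chosen only for this example; nothing here comes from the chapter's data sets.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5]

n = len(sample)
y_bar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)             # SE(y-bar) = s / sqrt(n)

print(f"n = {n}, mean = {y_bar:.3f}, s = {s:.3f}, SE = {se:.3f}")
```

Because the standard error divides s by the square root of n, quadrupling the sample size only halves the standard error.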

Gosset’s t

William S. Gosset discovered that when he used the standard error, $s/\sqrt{n}$, the shape of the curve was no longer Normal.

He called the new model the Student’s t, which is a model that is always bell-shaped, but the details change with the sample size.

The Student’s t-models form a family of related distributions depending on a parameter known as degrees of freedom.

12.2 A Confidence Interval for Means
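
Combining the pieces from 12.1, the interval can be sketched as follows. The form is assumed from the margin-of-error formula used in 12.6: a critical value $t^{*}_{n-1}$ from the Student's t-model with n – 1 degrees of freedom, times the standard error. Treat it as a summary under those assumptions.

```latex
\bar{y} \pm t^{*}_{n-1} \times SE(\bar{y}),
\qquad SE(\bar{y}) = \frac{s}{\sqrt{n}}
```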

Student’s t-models are unimodal, symmetric, and bell-shaped, just like the Normal model.

But t-models (solid curve below) with only a few degrees of freedom have a narrower peak than the Normal model (dashed curve below) and have much fatter tails.

As the degrees of freedom increase, the t-models look more and more like the Normal model.
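
The convergence toward the Normal model can be checked numerically. The sketch below evaluates the Student's t density from its standard formula and compares it with the standard Normal density on a grid. The helper names `t_pdf` and `max_gap`, the grid, and the chosen degrees of freedom are assumptions made for this illustration.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


normal = NormalDist()                      # standard Normal: mean 0, sd 1
grid = [i / 10 for i in range(-50, 51)]    # x from -5.0 to 5.0


def max_gap(df):
    """Largest absolute difference between the t and Normal densities on the grid."""
    return max(abs(t_pdf(x, df) - normal.pdf(x)) for x in grid)


for df in (2, 5, 30, 100):
    print(f"df={df:>3}: density at x=3 is {t_pdf(3, df):.4f}, "
          f"max gap from Normal is {max_gap(df):.4f}")
print(f"Normal: density at x=3 is {normal.pdf(3):.4f}")
```

For small degrees of freedom the density far from the center (here x = 3) is noticeably larger than the Normal density, which is the "fatter tails" behavior, and the largest gap between the two curves shrinks as the degrees of freedom grow.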

12.3 Assumptions and Conditions

Independence Assumption

There is no way to check independence of the data, but we should think about whether the assumption is reasonable.

Randomization Condition: The data arise from a random sample or suitably randomized experiment.

10% Condition: The sample size should be no more than 10% of the population. For means our samples generally are, so this condition will only be a problem if our population is small.

Normal Population Assumption

Student’s t-models won’t work for data that are badly skewed. We assume the data come from a population that follows a Normal model. Data being Normal is idealized, so we have a “nearly normal” condition we can check.

Nearly Normal Condition: The data come from a distribution that is unimodal and symmetric. This can be checked by making a histogram.

• For very small samples (n < 15), the data should follow a Normal model very closely. If there are outliers or strong skewness, t methods shouldn’t be used.

• For moderate sample sizes (n between 15 and 40), t methods will work well as long as the data are unimodal and reasonably symmetric.

• For sample sizes larger than 40, t methods are safe to use unless the data are extremely skewed. If outliers are present, analyses can be performed twice, with the outliers and without.

In business, the mean is often the value of consequence.

Even when we must sample from a very skewed distribution, the Central Limit Theorem tells us that the sampling distribution of our sample mean will be close to Normal.

We can use Student’s t methods without much worry as long as the sample size is large enough.

Example: The histogram below displays the compensation of 500 CEOs. We see an extremely skewed distribution.

Example (continued): Taking many samples of 100 CEOs and computing the mean of each, we obtain the nearly Normal plot below for the sample means.
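
The same effect can be reproduced with simulated data. The sketch below is not the CEO data set: it builds a hypothetical, strongly right-skewed population of 500 values from a lognormal distribution, draws repeated samples of 100, and compares the skewness of the population with the skewness of the sample means. The seed, the lognormal parameters, the number of repetitions, and the `skewness` helper are all assumptions for illustration.

```python
import random
import statistics

random.seed(12)  # fixed seed so the run is repeatable

# Hypothetical right-skewed "population" of 500 values (not real compensation data).
population = [random.lognormvariate(0, 1) for _ in range(500)]


def skewness(values):
    """Moment-based skewness: the average cubed z-score."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


# Draw many samples of size 100 (without replacement) and record each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 100)) for _ in range(2000)
]

print(f"skewness of population:   {skewness(population):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

A skewness near zero for the sample means, alongside a large skewness for the population, is the numerical counterpart of the two histograms in the example.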

12.4 Cautions About Interpreting Confidence Intervals

• If the confidence interval is for the mean, then do not interpret the results in terms of individuals.

• Don’t forget that the true mean does not vary, but the confidence interval will vary based on the sample.

• Don’t suggest that a particular confidence interval somehow sets the standard for every other interval.

12.5 One-Sample t-Test
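
A common way to write the test statistic is sketched below, under the assumption that the null hypothesis is $H_0: \mu = \mu_0$. The symbol $\mu_0$ (the hypothesized mean) is introduced here for illustration. The statistic uses the same standard error ruler as the confidence interval.

```latex
t_{n-1} = \frac{\bar{y} - \mu_0}{SE(\bar{y})},
\qquad SE(\bar{y}) = \frac{s}{\sqrt{n}}
```

When the assumptions and conditions of 12.3 hold and the null hypothesis is true, this statistic follows a Student's t-model with n – 1 degrees of freedom.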

Finding t-Values by Hand

The Student’s t-model is different for each value of degrees of freedom.

Typically we limit ourselves to 80%, 90%, 95%, and 99% confidence levels.

We can use technology to give critical values for any number of degrees of freedom and for any confidence levels we need. More precision won’t necessarily help make good business decisions.

A typical t-table is shown here. The table shows the critical values for varying degrees of freedom, df, and for varying confidence levels.

Since the t-models get closer to the normal as df increases, the final row has critical values from the Normal model and is labeled “∞”.

For example, suppose we’ve performed a one-sample t-test with 19 df and obtained a t-statistic of 1.639, and we want the upper-tail P-value.

From the table, we see that 1.639 falls between the critical values 1.328 and 1.729. All we can say is that the P-value lies between the P-values of these two critical values, so 0.05 < P < 0.10.
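
Technology can narrow the bracket to a single number. The sketch below uses only the Python standard library: it integrates the Student's t density numerically with Simpson's rule to get the upper-tail probability. The helper names `t_pdf` and `t_upper_tail` and the step count are assumptions for this illustration; a statistics package would normally provide this directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t   # the t-model is symmetric, so P(T > 0) = 0.5


df = 19
p_value = t_upper_tail(1.639, df)
print(f"upper-tail P-value for t = 1.639 with {df} df: {p_value:.4f}")
print(f"table endpoints: P(T > 1.328) = {t_upper_tail(1.328, df):.4f}, "
      f"P(T > 1.729) = {t_upper_tail(1.729, df):.4f}")
```

The computed P-value falls inside the interval read from the table, and the two table endpoints come out close to 0.10 and 0.05.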

12.6 Sample Size

We know that a larger sample will almost always give better results, but more data costs money, effort, and time.

We know how to find the margin of error for the mean.

$$ME = t^{*}_{n-1} \times SE(\bar{y})$$

We also know how to find the standard error for the mean.

$$SE(\bar{y}) = \frac{s}{\sqrt{n}}$$

From these equations we obtain an equation for the sample size n.

$$ME = t^{*}_{n-1} \times \frac{s}{\sqrt{n}}$$
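
Solving the margin-of-error relation for n makes the dependence on s and ME explicit. The algebra below is a sketch of that rearrangement, using the same symbols as the surrounding formulas.

```latex
\begin{aligned}
ME &= t^{*}_{n-1} \times \frac{s}{\sqrt{n}} \\
\sqrt{n} &= \frac{t^{*}_{n-1}\, s}{ME} \\
n &= \left( \frac{t^{*}_{n-1}\, s}{ME} \right)^{2}
\end{aligned}
```

Since n appears squared on the right-hand side, halving the desired margin of error requires roughly four times as many observations.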

The equation has several values that we don’t know.

We need to know s, but we won’t know s until we collect some data, and we want to calculate the sample size before we collect the data.

Often a “good guess” for s is sufficient.

If we have no idea what the value for s is, we could run a small pilot study to get some feeling for the size of the standard deviation.

Without knowing n, we don’t know the degrees of freedom, and we can’t find the critical value, $t^{*}_{n-1}$.

One common approach is to use the corresponding z* value from the Normal model.

For example, if you’ve chosen a 95% confidence interval, then use 1.96 (or 2).

If your estimated sample size is 60 or more, your z* was probably a good guess. If it’s smaller, use z* at first to find n, then replace z* with the corresponding $t^{*}_{n-1}$ and calculate the sample size once more.
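
The two-step procedure can be sketched in code. The example below starts with z*, computes a first n, then recomputes with the t critical value for n – 1 degrees of freedom. The guess for s, the target margin of error, and all helper names (`t_pdf`, `t_upper_tail`, `t_critical`, `sample_size`) are assumptions for illustration; the t critical value is found by numerical integration and bisection using only the standard library.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(conf, df):
    """Two-sided critical value t* by bisection (assumes df >= 2, so t* < 50)."""
    target = (1 - conf) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid      # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


def sample_size(critical, s, me):
    """Smallest whole n with critical * s / sqrt(n) <= me."""
    return math.ceil((critical * s / me) ** 2)


# Hypothetical planning values: guessed s and desired margin of error.
conf, s_guess, me_target = 0.95, 10.0, 4.0

z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)      # about 1.96
n_first = sample_size(z_star, s_guess, me_target)      # first pass with z*

t_star = t_critical(conf, n_first - 1)                 # t* for n_first - 1 df
n_second = sample_size(t_star, s_guess, me_target)     # second pass with t*

print(f"z* = {z_star:.3f} gives n = {n_first}")
print(f"t* = {t_star:.3f} (df = {n_first - 1}) gives n = {n_second}")
```

Because t* is larger than z* for small degrees of freedom, the second pass asks for a slightly larger sample than the first.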

Sample size calculations are never exact.

The margin of error you find after collecting the data won’t match exactly the one you used to find n.

Before you collect data, it’s always a good idea to know whether the sample size is large enough to give you a good chance of being able to tell you what you want to know.

*12.7 Degrees of Freedom – Why n – 1?

If we know the true population mean, μ, we can find the standard deviation using n instead of n – 1.

$$s = \sqrt{\frac{\sum (y - \mu)^2}{n}}$$

We use $\bar{y}$ instead of μ. For any sample, $\bar{y}$ will be as close to the data values as possible, and the population mean μ will be farther away.

If we use $\sum (y - \bar{y})^2$ instead of $\sum (y - \mu)^2$ in the equation to calculate s, our standard deviation will be too small.

We compensate for this by dividing by n – 1 instead of by n.
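
A short simulation can make the argument concrete. The sketch below draws many small samples from a Normal population with a known mean and variance, and averages three variance estimates: deviations from the sample mean divided by n, deviations from the sample mean divided by n – 1, and deviations from the true mean divided by n. It works with variances (s squared) rather than standard deviations to keep the comparison simple. The population parameters, sample size, seed, and number of trials are assumptions for illustration.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical population: Normal with mean 50 and sd 10 (variance 100).
mu, sigma, n, trials = 50.0, 10.0, 5, 20000

sum_div_n = sum_div_n1 = sum_mu_div_n = 0.0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    y_bar = statistics.fmean(sample)
    ss_ybar = sum((y - y_bar) ** 2 for y in sample)  # squared deviations from y-bar
    ss_mu = sum((y - mu) ** 2 for y in sample)       # squared deviations from mu
    sum_div_n += ss_ybar / n
    sum_div_n1 += ss_ybar / (n - 1)
    sum_mu_div_n += ss_mu / n

avg_div_n = sum_div_n / trials
avg_div_n1 = sum_div_n1 / trials
avg_mu_div_n = sum_mu_div_n / trials

print(f"true variance:                    {sigma ** 2:.1f}")
print(f"about y-bar, divided by n:        {avg_div_n:.1f}")
print(f"about y-bar, divided by n - 1:    {avg_div_n1:.1f}")
print(f"about mu, divided by n:           {avg_mu_div_n:.1f}")
```

Dividing the deviations from the sample mean by n underestimates the variance on average, while dividing by n – 1 (or using the true mean with n) lands close to the true value.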

What Can Go Wrong?

First, you must decide when to use Student’s t methods.

• Don’t confuse proportions and means. Use Normal models with proportions. Use Student’s t methods with means.

• Be careful of interpretation when confidence intervals overlap. Don’t assume that the means of overlapping confidence intervals are equal.

Student’s t methods work only when the Normal Population Assumption is true.

• Beware of multimodality. If you see this, try to separate the data into groups.

• Beware of skewed data. If the data are skewed, try re-expressing them.

• Investigate outliers. If they are clearly in error, remove them. If they can’t be removed, you might run the analysis with and without the outlier.

There are other risks when doing inferences about means.

• Watch out for bias. Measurements can be biased.

• Make sure data are independent. Consider whether there are likely violations of independence in the data collection methods.

• Make sure that data are from an appropriately randomized sample.

What Have We Learned?

• What we can say about a population mean is inferred from the data using the mean and standard deviation of a representative random sample.

• To describe the sampling distribution of sample means using a new model we select from the Student’s t family based on our degrees of freedom.

• Our ruler for measuring the variability in sample means is the standard error.

$$SE(\bar{y}) = \frac{s}{\sqrt{n}}$$

• To find the margin of error for a confidence interval using that standard error ruler and a critical value based on a Student’s t-model.

• To use that standard error ruler to test hypotheses about the population mean.

• The reasoning of inference, the need to verify that the appropriate assumptions are met, and the proper interpretation of confidence intervals and P-values all remain the same regardless of whether we are investigating means or proportions.