Chapter 10 Inferences from Two Samples

49
1 Chapter 10 Inferences from Two Samples Inferences about Two Means: Independent and Large Samples Inferences about Two Means: Independent and Small Samples Inferences about Two Means: Matched Pairs Inferences about Two Proportions

description

Inferences about Two Means: Independent and Large Samples Inferences about Two Means: Independent  and Small Samples Inferences about Two Means: Matched Pairs Inferences about Two Proportions. Chapter 10 Inferences from Two Samples. - PowerPoint PPT Presentation

Transcript of Chapter 10 Inferences from Two Samples

Page 1: Chapter 10     Inferences from Two Samples

1

Chapter 10 Inferences from Two Samples

• Inferences about Two Means: Independent and Large Samples

• Inferences about Two Means: Independent  and Small Samples

• Inferences about Two Means: Matched Pairs

• Inferences about Two Proportions

Page 2: Chapter 10     Inferences from Two Samples

2

Overview

There are many important and meaningful situations in which it becomes necessary

to compare two sets of sample data.

Page 3: Chapter 10     Inferences from Two Samples

3

10-2

Inferences about Two Means:Independent and

Large Samples

Page 4: Chapter 10     Inferences from Two Samples

4

Definitions

Two Samples: IndependentThe sample values selected from one population are not related or somehow paired with the sample values selected from the other population.

If the values in one sample are related to the values in the other sample, the samples are dependent. Such samples are often referred to as matched pairs or paired samples.

Page 5: Chapter 10     Inferences from Two Samples

5

Assumptions

1. The two samples are independent.

2. The two sample sizes are large. That is, n1 > 30 and n2 > 30.

3. Both samples are simple random samples.

Page 6: Chapter 10     Inferences from Two Samples

6

Characteristics of theSampling Distribution XA-XB

• The mean of the sampling distribution of all possible XA-XB is µA- µB.

• The standard deviation of the sampling distribution of all the possible values of

BA XX

Page 7: Chapter 10     Inferences from Two Samples

7

Hypothesis Tests

Test Statistic for Two Means: Independent and Large Samples

Page 8: Chapter 10     Inferences from Two Samples

8

Hypothesis Tests

Test Statistic for Two Means: Independent and Large Samples

(x1 - x2) - (µ1 - µ2)z =n1 n2

+1

. 2

22

Page 9: Chapter 10     Inferences from Two Samples

9

Hypothesis Tests

Test Statistic for Two Means: Independent and Large Samples

and If and are not known, use s1 and s2 in their places. provided that both samples are large.

P-value: Use the computed value of the test

statistic z, and find the P-value .

Critical values: Based on the significance level , find critical values .

Page 10: Chapter 10     Inferences from Two Samples

10

Coke Versus PepsiSample statistics are shown. Use the 0.01 significance level to test the claim that the mean weight of regular Coke is different from the mean weight of regular Pepsi.

Page 11: Chapter 10     Inferences from Two Samples

11

Coke Versus PepsiSample statistics are shown. Use the 0.01 significance level to test the claim that the mean weight of regular Coke is different from the mean weight of regular Pepsi.

Regular Coke Regular Pepsi

n 36 36

x 0.81682 0.82410

s 0.007507 0.005701

Page 12: Chapter 10     Inferences from Two Samples

12

Coke Versus Pepsi

Page 13: Chapter 10     Inferences from Two Samples

13

Claim: 1 2

Ho : 1 = 2

H1 : 1 2

= 0.01

Coke Versus Pepsi

Fail to reject H0Reject H0Reject H0

Z = - 2.575 Z = 2.5751 - = 0

or Z = 0

Page 14: Chapter 10     Inferences from Two Samples

14

Test Statistic for Two Means: Independent and Large Samples

2

(x1 - x2) - (µ1 - µ2)z =n1 n2

+1

. 2

2

Coke Versus Pepsi

Page 15: Chapter 10     Inferences from Two Samples

15

Test Statistic for Two Means: Independent and Large Samples

(0.81682 - 0.82410) - 0z =36

+

Coke Versus Pepsi

0.0075707 2 0.005701 2

36

= - 4.63

Page 16: Chapter 10     Inferences from Two Samples

16

Claim: 1 2

Ho : 1 = 2

H1 : 1 2

= 0.01

Coke Versus Pepsi

Fail to reject H0Reject H0 Reject H0

Z = - 2.575 Z = 2.5751 - = 0

or Z = 0

sample data:z = - 4.63

Page 17: Chapter 10     Inferences from Two Samples

17

Claim: 1 2

Ho : 1 = 2

H1 : 1 2

= 0.01

Coke Versus Pepsi

Fail to reject H0Reject H0 Reject H0

Z = - 2.575 Z = 2.5751 - = 0

or Z = 0

sample data:z = - 4.63

Reject Null

There is significant evidence to support the claim that there is a difference

between the mean weight of Coke and the mean weight of Pepsi.

Page 18: Chapter 10     Inferences from Two Samples

18

Confidence Intervals

Page 19: Chapter 10     Inferences from Two Samples

19

Confidence Intervals

(x1 - x2) - E < (µ1 - µ2) < (x1 - x2) + E

Page 20: Chapter 10     Inferences from Two Samples

20

Confidence Intervals

(x1 - x2) - E < (µ1 - µ2) < (x1 - x2) + E

n1 n2

+1 2where E = z

22

Page 21: Chapter 10     Inferences from Two Samples

21

Confidence Intervals

(x1 - x2) - E < (µ1 - µ2) < (x1 - x2) + E

where E = z n1 n2

+1 2

22

For Coke versus Pepsi, x1 - x2 = .00728, and z

Find an 80% confidence interval for the difference for Cokeand Pepsi.

.00728 + 1.28(.00157) = (.00527, .00929)

Page 22: Chapter 10     Inferences from Two Samples

22

t-Distribution Model

•The degrees of freedom are nA+ nB – 2•A pooled variance is the weighted mean of the sample variances. and is used if the the data is not normally distributed.

2nn

s1)(ns1)(ns

BA

2BB

2AA2

p

2

(x1 - x2) - (µ1 - µ2)t =n1 n2

+1

. 2

2

Page 23: Chapter 10     Inferences from Two Samples

23

Two groups were tested to see whether calcium reduces bloodpressure. The following data was collected. Is there evidence atthe .1 level that calcium reduces blood pressure?

Group 1 (calcium) 7 -4 18 17 -3 -5 1 10 11 –2Group 2 (placebo) -1 12 -1 -3 3 -5 5 2 -11 -1 -3

Group Treatment n x s 1 Calcium 10 5.00 8.743 2 Placebo 11 -.273 5.901

1. HO: µ1 - µ2 > 0 HA: µ1 - µ2 < 0

2. One tail t test, n < 30 11 + 10 – 2 = 19 d.f.

3. tcritical = -1.328

1.604

115.901

108.743

.273)(5.00

ns

ns

xxt 4.

22

2

22

1

21

21

Page 24: Chapter 10     Inferences from Two Samples

24

5. There is not enough evidence at the .1 level that calcium reduces blood pressure.

-1.328 1.604

Page 25: Chapter 10     Inferences from Two Samples

25

Inferences about Two Proportions

Assumptions

1. We have proportions from two     independent simple random samples.

2. For both samples, the conditions np 5 and nq 5 are satisfied.

Page 26: Chapter 10     Inferences from Two Samples

26

Notation for Two ProportionsFor population 1, we let:

p1 = population proportion

n1 = size of the sample

x1 = number of successes in the sample

Page 27: Chapter 10     Inferences from Two Samples

27

Notation for Two ProportionsFor population 1, we let:

p1 = population proportion

n1 = size of the sample

x1 = number of successes in the sample

p1 = x1/n1 (the sample proportion)^

Page 28: Chapter 10     Inferences from Two Samples

28

Notation for Two ProportionsFor population 1, we let:

p1 = population proportion

n1 = size of the sample

x1 = number of successes in the sample

p1 = x1/n1 (the sample proportion)

q1 = 1 - p

1

^^

^

Page 29: Chapter 10     Inferences from Two Samples

29

Notation for Two ProportionsFor population 1, we let:

π1 = population proportion

n1 = size of the sample

x1 = number of successes in the sample

p1 = x1/n1 (the sample proportion)

q1 = 1 - p

1

to π2, n

2 , x

2 , p

2. and q

2 , which come from

population 2.

The corresponding meanings are attached

Page 30: Chapter 10     Inferences from Two Samples

30

Test Statistic for Two Proportions

For H0: π1 = π

2 , H0: π1

π2 , H0: π1

π2

HA:π

1 π

2 , HA: π

1 < π

2 , HA: π

1> π

2

Page 31: Chapter 10     Inferences from Two Samples

31

where π1 - π

2 = 0 (assumed in the null hypothesis)

Test Statistic for Two Proportions

For H0: p1 = p

2 , H0: p1

p2 , H0: p1

p2

H1: p1

p2 , H1: p1

< p2 , H1: p

1> p

2

Page 32: Chapter 10     Inferences from Two Samples

32

Test Statistic for Two Proportions

x1

n1

x2

n2

p1

p2

where p1 - p

2 = 0 (assumed in the null hypothesis)

= =and

For H0: p1 = p

2 , H0: p1

p2 , H0: p1

p2

H1: p1

p2 , H1: p1

< p2 , H1: p

1> p

2

1n

)P(1P

1n

)P(1PSE

B

BB

A

AA

Page 33: Chapter 10     Inferences from Two Samples

33

(p1 - p

2 ) - E < (π

1 - π

2 ) < (p

1 - p

2 ) + E

Confidence Interval Estimate of π1 - π

2

If 0 is not in the interval, one may beC% confident that the two population proportions are different.

Page 34: Chapter 10     Inferences from Two Samples

34

A sample of households in urban and rural homes displayed thefollowing data for preference of artificial or natural Christmas trees:

Population n X p = X/n 1(urban) 261 89 .341 2(rural 160 64 .400

Is there a difference in preference between urban and rural homes?with a confidence interval of 90%?

1n

)P(1P

1n

)P(1PSE

B

BB

A

AA

.048159

.400(.600)

260

.341(.659)SE

Page 35: Chapter 10     Inferences from Two Samples

35

SEZ)( critical21

.04851.645.400)(.341

(-.139, .021)

We are 90% confident that the difference in proportions is between-.14 and .02. Because the interval contains 0, we are not confidentthat either group has a stronger preference for natural trees thanthe other group.

Page 36: Chapter 10     Inferences from Two Samples

36

Assumptions

1. The sample data consist of matched pairs.

2. The samples are simple random samples.

3. If the number of pairs of sample data is small (n 30), then the population of differences in the paired values must be approximately normally distributed.

Page 37: Chapter 10     Inferences from Two Samples

37

Notation for Matched Pairs

µd = mean value of the differences d for the

population of paired data

Page 38: Chapter 10     Inferences from Two Samples

38

sd = standard deviation of the differences d for

the paired sample data

n = number of pairs of data.

µd = mean value of the differences d for the

population of paired data

d = mean value of the differences d for the

paired sample data (equal to the mean of the x - y values)

Notation for Matched Pairs

Page 39: Chapter 10     Inferences from Two Samples

39

Test Statistic for Matched Pairs of Sample Data

Page 40: Chapter 10     Inferences from Two Samples

40

t = d - µdsd

n

Test Statistic for Matched Pairs of Sample Data

where degrees of freedom = n - 1

Page 41: Chapter 10     Inferences from Two Samples

41

Critical Values

If n 30, critical values are found in Table A-4 (t-distribution).

If n > 30, critical values are found in Table A- 2 (normal distribution).

Page 42: Chapter 10     Inferences from Two Samples

42

Confidence Intervals

Page 43: Chapter 10     Inferences from Two Samples

43

Confidence Intervals

d - ME < µd < d + ME

Page 44: Chapter 10     Inferences from Two Samples

44

Confidence Intervals

degrees of freedom = n -1

d - ME < µd < d + ME

where ME = t sd

n

Page 45: Chapter 10     Inferences from Two Samples

45

How Much Do Male Statistics Students Exaggerate Their Heights?

Using the sample data from Table 8-1 with the outlier excluded, construct a 95% confidence interval estimate of d, which is the mean of the differences between reported heights and measured heights of male statistics students.

Page 46: Chapter 10     Inferences from Two Samples

46

Reported and Measured Heights (in inches) of Male Statistics Students

Student A B C D E F G H I J K L

Reported 68 74 82.25 66.5 69 68 71 70 70 67 68 70 Height

Measured 66.8 73.9 74.3 66.1 67.2 67.9 69.4 69.9 68.6 67.9 67.6 68.8Height

Difference 1.2 0.1 7.95 0.4 1.8 0.1 1.6 0.1 1.4 -0.9 0.4 1.2

Page 47: Chapter 10     Inferences from Two Samples

47

d = 1.279

s = 2.243

n = 12

t = 2.201 (found from Table A-3 with 11 degrees of freedom and 0.05 in two tails)

How Much Do Male Statistics Students Exaggerate Their Heights?

Page 48: Chapter 10     Inferences from Two Samples

48

E = t sd

n

How Much Do Male Statistics Students Exaggerate Their Heights?

E = (2.201)( )2.244

12

= 1.426

Page 49: Chapter 10     Inferences from Two Samples

49

How Much Do Male Statistics Students Exaggerate Their Heights?

1.279 - 1.464 < µd < 1.279 + 1.464

In the long run, 95% o f such samples will lead to confidence intervals that actually do contain the true population mean of the differences. Since

the interval does contain 0, the true value of µd is

not significantly different from 0. There is not sufficient evidence to support the claim that there is a difference between the reported heights and the measured heights of male statistics students.