Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Transcript of Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Page 1: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Comparing Two Population Means: The Two-Sample T-Test and T-Interval

Page 2: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Example

Population of all female college students

Sample of n2 = 21 females report average of 85.7 mph

Population of all male college students

Sample of n1 = 17 males report average of 102.1 mph

Do male and female college students differ with respect to their fastest reported driving speed?

Page 3: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Comparative Observational Study

• A research study in which two or more groups are compared with respect to some measurement or response.

• The groups, determined by their natural characteristics, are merely “observed.”

Page 4: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Graphical summary of sample data

[Figure: plot of Fastest Driving Speed (mph), axis from 75 to 145, shown separately by Gender (female, male).]

Page 5: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Numerical summary of sample data

Gender    N    Mean     Median   TrMean   StDev
female    21   85.71    85.00    85.26    9.39
male      17   102.06   100.00   101.00   17.05

Gender    SE Mean   Minimum   Maximum   Q1      Q3
female    2.05      75.00     105.00    77.50   92.50
male      4.14      75.00     145.00    90.00   115.00

The difference in the sample means is 102.06 - 85.71 = 16.35 mph

Page 6: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

The Question in Statistical Notation

Let $\mu_M$ = the average fastest speed of all male students, and $\mu_F$ = the average fastest speed of all female students.

Then we want to know whether $\mu_M \neq \mu_F$.

This is equivalent to knowing whether $\mu_M - \mu_F \neq 0$.

Page 7: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

All possible questions in statistical notation

In general, we can always compare two averages by seeing how their difference compares to 0:

This comparison…          is equivalent to …
$\mu_1 \neq \mu_2$        $\mu_1 - \mu_2 \neq 0$
$\mu_1 > \mu_2$           $\mu_1 - \mu_2 > 0$
$\mu_1 < \mu_2$           $\mu_1 - \mu_2 < 0$

Page 8: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Set up hypotheses

• Null hypothesis:
  – H0: $\mu_M = \mu_F$ [equivalent to $\mu_M - \mu_F = 0$]

• Alternative hypothesis:
  – Ha: $\mu_M \neq \mu_F$ [equivalent to $\mu_M - \mu_F \neq 0$]

Page 9: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Make initial assumption

• Assume null hypothesis is true.

• That is, assume $\mu_M = \mu_F$

• Or, equivalently, assume $\mu_M - \mu_F = 0$

Page 10: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Determine the P-value

• P-value = “How likely is it that our sample means would differ by at least as much as 16.35 mph if the difference in population means really is 0?”

• The P-value, 0.001, is small. Our sample result is not likely if the null hypothesis is true.

• Reject the null hypothesis.

Page 11: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Make a decision

• There is sufficient evidence, at the 0.05 level of significance, to conclude that the average reported fastest driving speed of all male college students differs from the average reported fastest driving speed of all female students.

Page 12: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

How the P-value is calculated

The P-value is determined by standardizing, that is, by calculating the two-sample test statistic...

$$t = \frac{\text{difference in sample means} - \text{hypothesized difference}}{\text{standard error of the difference}}$$

…and comparing the value of the test statistic to the appropriate sampling distribution.

The sampling distribution depends on how you estimate the standard error of the differences.

Page 13: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

If variances of the measurements of the two groups are not equal...

Estimate the standard error of the difference as:

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Then the sampling distribution is an approximate t distribution with a complicated formula for d.f.
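The unpooled calculation can be sketched from the summary statistics on Page 5. The slide does not give the "complicated formula for d.f."; the sketch below assumes it is the usual Welch-Satterthwaite approximation, and the function name `welch_summary` is purely illustrative. The difference is taken as female minus male with a hypothesized difference of 0.

```python
import math

def welch_summary(mean1, s1, n1, mean2, s2, n2):
    """Unpooled standard error, t statistic, and approximate d.f.

    Assumes the Welch-Satterthwaite approximation for the degrees of
    freedom (not spelled out on the slide) and a hypothesized
    difference of 0.
    """
    v1 = s1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2          # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)    # standard error of the difference
    t = (mean1 - mean2) / se   # two-sample test statistic
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df

# Summary statistics from Page 5: female first, male second.
se, t, df = welch_summary(85.71, 9.39, 21, 102.06, 17.05, 17)
print(f"SE = {se:.3f}, t = {t:.2f}, d.f. = {df:.2f}")
```

With these inputs the d.f. comes out a little under 24; software that reports a whole number typically rounds this value down.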

Page 14: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

If variances of the measurements of the two groups are equal...

Estimate the standard error of the difference using the common pooled variance:

$$s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Then the sampling distribution is a t distribution with n1+n2-2 degrees of freedom.

Assume variances are equal only if neither sample standard deviation is more than twice that of the other sample standard deviation.
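As a rough illustration, the pooled calculation and the "no more than twice" rule of thumb can be applied to the summary statistics on Page 5. The function names `pooled_summary` and `sds_comparable` are illustrative only, and the difference is taken as female minus male with a hypothesized difference of 0.

```python
import math

def pooled_summary(mean1, s1, n1, mean2, s2, n2):
    """Pooled standard deviation, standard error, t statistic, and d.f."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)   # standard error of the difference
    t = (mean1 - mean2) / se               # hypothesized difference is 0
    return sp, se, t, df

def sds_comparable(s1, s2):
    """Rule of thumb: neither sample SD is more than twice the other."""
    return max(s1, s2) <= 2 * min(s1, s2)

# Summary statistics from Page 5: female first, male second.
ok_to_pool = sds_comparable(9.39, 17.05)
sp, se, t, df = pooled_summary(85.71, 9.39, 21, 102.06, 17.05, 17)
print(f"pool? {ok_to_pool}; sp = {sp:.2f}, SE = {se:.3f}, t = {t:.2f}, d.f. = {df}")
```

Because the inputs are rounded summary statistics, the pooled standard deviation may differ in the last displayed digit from what software computes from the raw data.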

Page 15: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Two-sample t-test in Minitab

• Select Stat. Select Basic Statistics.

• Select 2-sample t to get a Pop-Up window.

• Click on the radio button before Samples in one Column. Put the measurement variable in Samples box, and put the grouping variable in Subscripts box.

• Specify your alternative hypothesis.

• If appropriate, select Assume Equal Variances.

• Select OK.

Page 16: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Pooled two-sample t-test

Two sample T for Fastest

Gender    N    Mean    StDev   SE Mean
female    21   85.71   9.39    2.0
male      17   102.1   17.1    4.1

95% CI for mu (female) - mu (male ): ( -25.2, -7.5)
T-Test mu (female) = mu (male ) (vs not =): T = -3.75  P = 0.0006  DF = 36
Both use Pooled StDev = 13.4

Page 17: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

(Unpooled) two-sample t-test

Two sample T for Fastest

Gender    N    Mean    StDev   SE Mean
female    21   85.71   9.39    2.0
male      17   102.1   17.1    4.1

95% CI for mu (female) - mu (male ): ( -25.9, -6.8)
T-Test mu (female) = mu (male ) (vs not =): T = -3.54  P = 0.0017  DF = 23
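To see where the two-sided P-values in the pooled and unpooled outputs come from, the tail area of the t distribution can be approximated numerically. This is only a sketch using the Python standard library: the helper names `t_pdf` and `two_sided_p` are illustrative, Simpson's rule stands in for whatever method Minitab uses internally, and the T and DF values are copied from the outputs above.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t

p_pooled = two_sided_p(-3.75, 36)     # T and DF from the pooled output
p_unpooled = two_sided_p(-3.54, 23)   # T and DF from the unpooled output
print(f"pooled P = {p_pooled:.4f}, unpooled P = {p_unpooled:.4f}")
```

Both values should land close to the P-values Minitab reports; small differences are expected because T is rounded to two decimals here.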

Page 18: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Assumptions for correct P-values

• Data in each group follow a normal distribution.

• If the pooled t-test is used, the variances of the two groups are equal.

• The samples are independent. That is, who is in the second sample doesn’t depend on who is in the first sample (and vice versa).

Page 19: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Confidence interval for difference in two means

We can be “such-and-such” confident that the difference in the population means falls in the interval

$$\text{difference in sample means} \pm (t^* \times \text{standard error})$$

where the t* multiplier depends on the confidence level and is obtained from the appropriate t distribution.
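A minimal sketch of the interval calculation, using only the Python standard library. The t* multiplier is found here by bisection on a numerically integrated t distribution, which is an assumption of this sketch rather than how statistical software necessarily does it; the helper names (`t_pdf`, `t_cdf`, `t_star`, `two_sample_ci`) are illustrative. The example uses the pooled standard error for the driving-speed data, with the difference taken as female minus male.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, via Simpson's rule on [0, x] (steps even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_star(conf, df):
    """Multiplier t* with central area `conf`, found by bisection."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sample_ci(diff, se, df, conf=0.95):
    """difference in sample means +/- t* x standard error."""
    margin = t_star(conf, df) * se
    return diff - margin, diff + margin

# Pooled standard error from the Page 5 summary statistics.
n_f, s_f, n_m, s_m = 21, 9.39, 17, 17.05
df = n_f + n_m - 2
sp = math.sqrt(((n_f - 1) * s_f ** 2 + (n_m - 1) * s_m ** 2) / df)
se = sp * math.sqrt(1 / n_f + 1 / n_m)

lower, upper = two_sample_ci(85.71 - 102.06, se, df)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```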

Page 20: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Interpreting a confidence interval for the difference in two means…

If the confidence interval contains…    then we conclude …
zero                                    the two means may not differ
only positive numbers                   first mean is larger than second mean
only negative numbers                   first mean is smaller than second mean
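The interpretation rule can be written as a small helper. The function name `interpret_ci` is hypothetical; the two intervals passed to it are the ones shown in the Minitab outputs for the driving-speed and laundry examples.

```python
def interpret_ci(lower, upper):
    """Interpret a confidence interval for (first mean - second mean)."""
    if lower > 0:
        return "first mean is larger than second mean"
    if upper < 0:
        return "first mean is smaller than second mean"
    return "the two means may not differ"   # interval contains zero

speed = interpret_ci(-25.2, -7.5)     # mu(female) - mu(male), pooled output
laundry = interpret_ci(-2.11, 0.47)   # mu(M) - mu(F), laundry output
print(speed)
print(laundry)
```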

Page 21: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Two-sample confidence interval in Minitab

• Select Stat. Select Basic Statistics.

• Select 2-sample t to get a Pop-Up window.

• Click on the radio button before Samples in one Column. Put the measurement variable in Samples box, and put the grouping variable in Subscripts box.

• Specify confidence level.

• If appropriate, select Assume Equal Variances.

• Select OK.

Page 22: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Example

Two sample T for laundry

gender    N    Mean   StDev   SE Mean
M         44   3.07   1.81    0.27
F         44   3.89   3.88    0.58

95% CI for mu (M) - mu (F): ( -2.11, 0.47)
T-Test mu (M) = mu (F) (vs not =): T = -1.27  P = 0.21  DF = 60

Page 23: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Example

Population of all people seeing “10 million”

Sample of n2 = 33 people

Population of all people seeing “80 million”

Sample of n1 = 34 people

Do the average guesses of the population of Turkey differ depending on preliminary information received?

Page 24: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Randomized comparative experiment

• A study in which two or more groups are randomly assigned to a “treatment” to see how the treatment affects some “response.”

• If each “experimental unit” has the same chance of receiving any treatment, then the experiment is called a “completely randomized design.”

Page 25: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Graphical summary of data

[Figure: plot of Guess of Population of Turkey, axis from 0 to 300, shown separately by Form (10, 80).]

Page 26: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Two-sample t-test results

Two sample T for Turkey

Form2    N    Mean    StDev   SE Mean
10       33   12.50   8.50    1.5
80       34   62.8    54.8    9.4

95% CI for mu (10) - mu (80): ( -69.6, -30.9)
T-Test mu (10) = mu (80) (vs <): T = -5.28  P = 0.0000  DF = 34

Page 27: Comparing Two Population Means The Two-Sample T-Test and T-Interval.

Conclusions of Turkey experiment

• There is sufficient evidence, at the 0.05 level, to conclude that the average guesses of the population of Turkey differ between the two forms.

• The one-sided alternative (vs <) and the entirely negative confidence interval indicate that the population mean guess for the “10 million” form is lower than the population mean guess for the “80 million” form.