Lesson Four: Student t Distribution and Comparing Samples.


Transcript of Lesson Four: Student t Distribution and Comparing Samples.


Do You Remember the Coefficient of Variation? We looked at the three samples of BRUS, comparing to the total.

We wondered if they were significantly different from each other.

Hypotheses

Books R US Sample Coefficient of Variation

                       WK 1-14   WK 15-28   WK 29-42
Standard Deviation       8,241     11,502      7,557
Mean                     9,084     19,543     20,574
Coefficient of Var.       0.91       0.59       0.37

Total all Weeks: Mean = 16,400; S.D. = 10,324; C.V. = 0.63
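As a quick check on the table, the coefficient of variation is the standard deviation divided by the mean. The short sketch below recomputes it from the figures above; the dictionary layout and function name are illustrative choices, not part of the lesson.

```python
# Summary figures copied from the Books R US table (standard deviation, mean).
samples = {
    "WK 1-14": (8241, 9084),
    "WK 15-28": (11502, 19543),
    "WK 29-42": (7557, 20574),
    "Total all Weeks": (10324, 16400),
}


def coefficient_of_variation(sd, mean):
    """Return the coefficient of variation: standard deviation / mean."""
    return sd / mean


# Round to two decimals, as the table does.
cv = {name: round(coefficient_of_variation(sd, mean), 2)
      for name, (sd, mean) in samples.items()}

for name, value in cv.items():
    print(f"{name}: C.V. = {value:.2f}")
```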

We asked: Why the Change in Variation?

The first sample looks VERY different from the other two. Let’s develop another formula to compare the two.

One-Tailed Tests about a Population Mean: Large-Sample Case (n ≥ 30)

Test Statistic

Formula #4 (σ known):

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Formula #5 (σ unknown):

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Rejection Rule

Right-tailed: Reject H0 if z > z_α
Left-tailed: Reject H0 if z < -z_α

Example: Two Tail Test

To be 95% confident, you have 5% chance of error. Divide this between both tails: 2.5% on each tail.

[Figure: standard normal curve. Do Not Reject H0 between z = -1.96 and z = 1.96 (0.95 of the area under the curve); Reject H0 in each tail beyond those values (area 0.025 per tail).]

Example: One Tail Test

To be 95% confident, you have 5% chance of error. All 5% is in One Tail. Here H1 is a "greater than" (right tail) hypothesis.

[Figure: standard normal curve. Do Not Reject H0 to the left of z = 1.65 (0.95 of the area under the curve); Reject H0 in the right tail (area 0.05). Z table: the entry 0.4505 sits in row 1.6, column 0.05.]
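The two critical values read from the Z table can be cross-checked in software. This is a minimal sketch using Python's standard library; the helper name `z_critical` is illustrative. The exact one-tail value is closer to 1.645, which the lesson rounds to 1.65.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_critical(alpha, tails):
    """Positive critical z when alpha is split evenly across 1 or 2 tails."""
    return standard_normal.inv_cdf(1 - alpha / tails)


two_tail = z_critical(0.05, tails=2)  # 2.5% in each tail
one_tail = z_critical(0.05, tails=1)  # all 5% in one tail

print(f"two-tail critical z: {two_tail:.3f}")
print(f"one-tail critical z: {one_tail:.3f}")
```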

Example: Books R US

Left tail hypothesis: Reject H0 if z < -z_α, with α = .05. Find the table area by .5000 - .0500 = .4500.

“Critical Value” of Z = -1.65

A value of 1.65 for a “Z Statistic” is found by locating the value closest to .4500 (.4505 in this case, round up), then finding 1.6 on the row heading and 0.05 on the column heading.

Applying the Formula to Calculate Z

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{9{,}084 - 16{,}400}{10{,}324 / \sqrt{14}} = \frac{-7{,}316}{2{,}760} = -2.65$$
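The same arithmetic can be reproduced in a few lines. The sketch below plugs the Books R US figures (first 14 weeks against the all-weeks mean and standard deviation) into the known-σ formula; the function and variable names are illustrative.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Weeks 1-14 compared with the total for all weeks.
z = z_statistic(sample_mean=9084, mu0=16400, sigma=10324, n=14)

critical_value = -1.65           # left-tail critical value at the .05 level
reject_h0 = z < critical_value   # left-tail rejection rule

print(f"z = {z:.2f}, reject H0: {reject_h0}")
```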

Example: Books R US

[Figure: standard normal curve with the critical value at -1.65. The computed z = -2.65 falls in the Reject H0 region; values to the right of -1.65 fall in the Do Not Reject H0 region.]

Left tail hypothesis: Reject H0 if z < -z_α. Here z = -2.65 is less than (further to the left of) -1.65.

Problem Statements

• Tells what is going on
• Tells when it is happening
• Tells who is impacted by it
• Tells where the problem occurs
• Tells how the problem occurs

A problem statement is a question about a possible relationship between the manipulated and responding variables in a situation that implies something to do or try.

The Five-Step Process for Hypothesis Testing (Thinking Stages)

• State the null and alternative hypotheses
• Find the level of significance:
  .01 = scientific research
  .05 = consumer and product research
  .10 = political polling

Hypothesis Test

• Develop the null and alternative hypotheses. H0: the mean of the first sample ≥ μ. H1: the mean of the first sample < μ.
• Specify the level of significance (.05).
• Select the test statistic (Z statistic).
• Determine the critical value for H0: -1.65.
• Collect sample data, compute test statistic. Computed value = -2.65, smaller than -1.65.

Decision: Reject H0; the first sample mean is statistically significantly smaller than μ.

Hypotheses

In our last lesson, we were dealing with the following:

• H0 = no effect, chance differences: x̄ = μ

• H1 = effect or difference exists: x̄ ≠ μ

This is for a two-tail test.

We’ve set a critical value of 1.96 for z, but let’s say that the α level is .05. This is the critical value of the p-value.

Hypotheses

For a one tail test, we might want to see if something is GREATER THAN the mean, or LESS THAN the mean.

• H0 = no effect, chance differences: x̄ ≥ μ
• H1 = there is an effect, it is likely that x̄ < μ in our Books R US data.

Let’s combine this with what we just did with the first sample. Remember that we had Z = -2.65. On the Z table, this gives us 0.4960.

Figure 6-16: The P -value of a z statistic can be approximated by noting which levels from Table D it

falls between. Here, P lies between 0.20 and 0.25.

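[Figure: left-tail rejection region with the critical value at -1.65 (tail area 0.05). The computed z = -2.65 falls in the Reject H0 region. Z table: row 2.6, column 0.05 gives 0.4960.]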

P - Value

In order to REJECT the null, the p-value must be less than the α level, in this example, .05.

.5000 - .4960 = .004

The smaller the P, the stronger the evidence that H0 is false.

Now, we REJECT the null, ACCEPT H1.

Why? Because the P value (.004) is smaller than the critical value of α (.05).
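The table subtraction can be confirmed directly from the normal distribution. This is a small sketch using the standard library; it computes the left-tail area below z = -2.65, which should agree with the table figure of about .004.

```python
from statistics import NormalDist

z = -2.65       # test statistic for weeks 1-14
alpha = 0.05    # level of significance

# Left-tail p-value: area under the standard normal curve below z.
p_value = NormalDist().cdf(z)

reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```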

What if all we have is a sample standard deviation, and a sample mean?

In that case, we use this formula (σ unknown):

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

But with the BRUS data, n < 30 in our sample, so we must use the t Statistic.

Point and Interval Estimates

If the population standard deviation is unknown and the sample is less than 30, we use the t distribution. Formula #6:

$$\bar{X} \pm t \frac{s}{\sqrt{n}}$$

Problem Using Formula #6

In the second 14 weeks of the Books R US data, the mean total sales were $19,543. The standard deviation was $11,502. At the .05 level of significance, what was the confidence interval? Did the value $16,400 fall inside or outside the confidence interval?
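One way to work this problem is sketched below. It assumes a two-tail interval with n = 14, so DF = 13, and uses the table value t = 2.160 that the lesson reads from the Student's t table for 13 degrees of freedom. The variable names are illustrative, and this is a sketch rather than the instructor's worked solution.

```python
import math

sample_mean = 19543   # mean total sales, weeks 15-28
s = 11502             # sample standard deviation
n = 14                # weeks in the sample
t_table = 2.160       # t table value, DF = 13, .05 level, two tails

# Formula #6: sample mean +/- t * s / sqrt(n)
margin = t_table * s / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

value = 16400
inside = lower <= value <= upper

print(f"interval: ${lower:,.0f} to ${upper:,.0f}")
print(f"${value:,} inside the interval: {inside}")
```

Under these assumptions the interval runs from roughly $12,900 to $26,200, so $16,400 falls inside it.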

One-Tailed Tests about a Population Mean: Small-Sample Case (n < 30)

Test Statistic

Formula 7 (σ known):

$$t = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Formula 8 (σ unknown):

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

This test statistic has a t distribution with n - 1 degrees of freedom, or “DF”.

Rejection Rule

Right-tailed: Reject H0 if t > t_α
Left-tailed: Reject H0 if t < -t_α

So, Do we Choose the Z or the t Statistic?

Remember our three sets of weeks? There were 14 in each set. Since there are fewer than 30 observations in a sample, we’ll use the t test. Use this formula:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

This test statistic has a t distribution with n - 1 degrees of freedom, or “DF”. For weeks 1 – 14, the X̄ was $9,084, s was $8,241, μ was $16,400, and n = 14.

So, Do we Choose the Z or the t Statistic? We Use t.

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{\$9{,}084 - \$16{,}400}{\$8{,}241 / \sqrt{14}} = \frac{-\$7{,}316}{\$2{,}202.5} = -3.32$$

We need to find the “critical value” of t: -2.160 (see next slide).
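The t calculation follows the same pattern as the z calculation, with the sample standard deviation s in place of σ. A minimal sketch, with illustrative names, using the weeks 1 – 14 figures and the table value 2.160 for DF = 13:

```python
import math


def t_statistic(sample_mean, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


t = t_statistic(sample_mean=9084, mu0=16400, s=8241, n=14)
df = 14 - 1                          # degrees of freedom, n - 1

t_table = 2.160                      # two-tail table value, DF = 13, .05 level
reject_h0 = abs(t) > t_table         # two-tail rejection rule

print(f"t = {t:.2f} with DF = {df}, reject H0: {reject_h0}")
```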

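STUDENT’S T DISTRIBUTION

DF = n – 1; here DF = 13.

[Table: Student’s t distribution. On the row for 13 df, the critical value at the .05 level is 2.160. The table has columns for both a 2 – tail test and a 1 – tail test.]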

Hypothesis Test

• Develop the null and alternative hypotheses. H0: the mean of the first sample = μ. H1: the mean of the first sample ≠ μ.
• Specify the level of significance (.05).
• Select the test statistic (t statistic).
• Determine the critical value for H0: ±2.160.
• Collect sample data, compute test statistic. Calculated t = -3.32, further to the left of -2.160.

Decision: Reject H0; the first sample mean is statistically significantly different from μ.

Using a P – Value Calculator

For the P-Value of a t statistic, go to http://www.danielsoper.com/statcalc and choose the Student t Distribution. In this case, it is 0.005531 for a two tail test, which is < the critical value of .05, so we reject the null.

Use the same website to calculate the P-Value for a Z statistic.

In many cases, modern computer programs will print the p-Value, so it is important to be able to understand its meaning.
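If no calculator is at hand, the two-tail p-value can be approximated with the standard library alone. The sketch below integrates the Student t density numerically with Simpson's rule; this is a stand-in technique, not the method the website uses, and the function names are illustrative. For t = -3.32 with DF = 13 it should land close to the 0.0055 reported above.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tail_p(t_stat, df, steps=2000):
    """Two-tail p-value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is found with Simpson's rule (steps must be even),
    then doubled because the density is symmetric about zero.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1 - 2 * half_area


p_value = two_tail_p(-3.32, df=13)
print(f"two-tail p-value = {p_value:.4f}, reject H0 at .05: {p_value < 0.05}")
```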

Summary of Formulas

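Coefficient of variation:

$$CV = \frac{s}{\bar{X}}$$

Z score:

$$Z = \frac{X - \mu}{\sigma}$$

Z statistic, σ known:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Z statistic, σ unknown:

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

t statistic, σ known:

$$t = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

t statistic, σ unknown:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Confidence Range for a t Statistic:

$$\bar{X} \pm t \frac{s}{\sqrt{n}}$$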

Comparing Two Samples

• Apply hypothesis testing to different populations and samples in business research situations.

Comparing Two Samples

We want to apply hypothesis testing to different populations and samples in business research situations.

• Example with 2 independent samples when the sample size is 30 or greater.
• Example with 2 independent samples when the sample size is less than 30.

Hypothesis Testing: Single Samples (<30)

• We compared the results of a single sample to a population value
• We determined whether the proposed population value was reasonable
• We used the “Steps in Hypothesis Testing” (handout) to answer our research question about our sample
• One-tailed vs. Two-tailed

Hypothesis Testing Population Means: Large Samples

Is there a difference in the mean amount of residential real estate sold by male agents and female agents in south Florida?

• Let’s select random samples from 2 populations. We wish to investigate if these populations have the same mean.
• We want to determine whether the samples are from the same or equal populations.
• If the 2 populations are the same, we would expect the difference between the 2 sample means to be zero.

2 assumptions needed:
1. Both samples are at least 30
2. The samples are from independent populations

Population Means: Large Samples Formulas

Example A financial analyst wants to compare the turnover rates, in percent, for shares of oil-related stocks versus other stocks, such as GE and IBM. She selected 32 oil-related stocks and 49 other stocks. The mean turnover rate of oil-related stocks is 31.4 percent and the standard deviation 5.1 percent. For the other stocks, the mean rate was computed to be 34.9 percent and the standard deviation 6.7 percent. Is there a significant difference in the turnover rates of the two types of stock? Use the .01 significance level.
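The formulas from the slide did not survive in this transcript, so the sketch below assumes the usual large-sample statistic for two independent samples, z = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2), and a two-tail test at the .01 level. The function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Assumed formula: difference in means over its standard error."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Oil-related stocks versus other stocks (turnover rates in percent).
z = two_sample_z(mean1=31.4, s1=5.1, n1=32, mean2=34.9, s2=6.7, n2=49)

alpha = 0.01
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tail critical value
reject_h0 = abs(z) > z_critical

print(f"z = {z:.2f}, critical value = ±{z_critical:.3f}, reject H0: {reject_h0}")
```

Under these assumptions z is about -2.66, beyond the critical value of about 2.58, so the difference in turnover rates would be judged significant at the .01 level.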

Hypothesis Testing Population Means: Small Samples

Is the mean salary of nurses larger than that of school teachers?

• The sample size is less than 30: “small sample test of means”
• The 2 sample variances are pooled to estimate the population variance, as a weighted mean
• The weights are the degrees of freedom that each sample provides

Assumptions:
1. The sampled populations follow the normal distribution
2. The two samples are from independent populations
3. The standard deviations of the two populations are equal

Population Means: Small Samples Formulas

Example

A recent study compared the time spent together by single- and dual-earner couples. According to the records kept by the wives during the study, the mean amount of time spent together watching television among the single-earner couples was 61 minutes per day, with a standard deviation of 15.5 minutes. For the dual-earner couples, the mean number of minutes spent watching television was 48.4 minutes, with a standard deviation of 18.1 minutes. At the .01 significance level, can we conclude that the single-earner couples on average spend more time watching television together? There were 15 single-earner and 12 dual-earner couples studied.
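The slide formulas are missing from this transcript, so the sketch below assumes the standard pooled-variance t test that the small-sample notes describe: the two sample variances are weighted by their degrees of freedom, and t = (x̄1 - x̄2) / sqrt(sp² (1/n1 + 1/n2)). The one-tail critical value 2.485 for DF = 25 at the .01 level comes from a standard t table, not from the transcript. All names are illustrative.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) for the assumed pooled-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Single-earner couples versus dual-earner couples (minutes per day).
t, df = pooled_t(mean1=61, s1=15.5, n1=15, mean2=48.4, s2=18.1, n2=12)

t_table = 2.485              # assumed table value: one tail, .01 level, DF = 25
reject_h0 = t > t_table      # right-tail rule: single-earner mean is larger

print(f"t = {t:.2f} with DF = {df}, reject H0: {reject_h0}")
```

Under these assumptions t is about 1.95, short of 2.485, so the data would not support the claim at the .01 level.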

Application to Lemonade Stand

Results