The t-distribution. William Gosset lived from 1876 to 1937. Gosset invented the t-test to handle small samples for quality control in brewing. He wrote under the name "Student".

Page 1:

The t-distribution

William Gosset lived from 1876 to 1937

Gosset invented the t-test to handle small samples for quality control in brewing. He wrote under the name "Student".

Page 2:

t test: origin

Founder: W. S. Gosset

Wrote under the pseudonym “Student”

Said to have worked mostly at tea (t) time

Hence known as Student's t test.

Used for small samples, certainly if n < 30

Page 3:

Is there a difference?

between you…means,

who is meaner?

Page 4:

Types

One sample: compare with population

Unpaired: compare with control

Paired: same subjects, pre-post

Page 5:

T-test

1. Test for single mean: Is the sample mean equal to the predefined population mean?

2. Test for difference in means: Is the CD4 level of patients taking treatment A equal to the CD4 level of patients taking treatment B?

3. Test for paired observations: Did the treatment confer any significant benefit?

Page 6:

Test direction

One-tailed t test

Two-tailed t test

Page 7:

Developing the Pooled-Variance t Test (Part 1)

•Setting Up the Hypothesis:

Two Tail: H0: $\mu_1 = \mu_2$, H1: $\mu_1 \neq \mu_2$ OR H0: $\mu_1 - \mu_2 = 0$, H1: $\mu_1 - \mu_2 \neq 0$

Page 8:

Developing the Pooled-Variance t Test (Part 1)

•Setting Up the Hypothesis:

Two Tail: H0: $\mu_1 = \mu_2$, H1: $\mu_1 \neq \mu_2$ OR H0: $\mu_1 - \mu_2 = 0$, H1: $\mu_1 - \mu_2 \neq 0$

Right Tail: H0: $\mu_1 \leq \mu_2$, H1: $\mu_1 > \mu_2$ OR H0: $\mu_1 - \mu_2 \leq 0$, H1: $\mu_1 - \mu_2 > 0$

Page 9:

Developing the Pooled-Variance t Test (Part 1)

•Setting Up the Hypothesis:

Two Tail: H0: $\mu_1 = \mu_2$, H1: $\mu_1 \neq \mu_2$ OR H0: $\mu_1 - \mu_2 = 0$, H1: $\mu_1 - \mu_2 \neq 0$

Right Tail: H0: $\mu_1 \leq \mu_2$, H1: $\mu_1 > \mu_2$ OR H0: $\mu_1 - \mu_2 \leq 0$, H1: $\mu_1 - \mu_2 > 0$

Left Tail: H0: $\mu_1 \geq \mu_2$, H1: $\mu_1 < \mu_2$ OR H0: $\mu_1 - \mu_2 \geq 0$, H1: $\mu_1 - \mu_2 < 0$
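The three ways of setting up the hypothesis map onto three rejection rules. The sketch below is one possible way to express them; the function name `reject_h0` and the numeric values are illustrative assumptions, and the critical value is assumed to have been looked up in a t table for the chosen alpha and tail.

```python
def reject_h0(t_calc: float, t_crit: float, tail: str) -> bool:
    """Return True when t_calc falls in the rejection region.

    t_crit is the positive tabulated value for the chosen alpha and tail
    (one-tailed and two-tailed tables give different values).
    """
    if tail == "two":
        return abs(t_calc) >= t_crit
    if tail == "right":
        return t_calc >= t_crit
    if tail == "left":
        return t_calc <= -t_crit
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Illustrative values only (hypothetical, not from a particular data set).
print(reject_h0(2.5, 2.086, "two"))     # True
print(reject_h0(-2.5, 1.725, "right"))  # False
print(reject_h0(-2.5, 1.725, "left"))   # True
```

A large negative t never rejects a right-tail hypothesis, which is why the direction has to be fixed before looking at the data.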

Page 10:

Mean systolic BP in nephritis is significantly higher than that of a normal person

[Figure: distribution curve, axis from 100 to 140, with a single tail area of 0.05]

Page 11:

Mean systolic BP in nephritis is significantly different from that of a normal person

[Figure: distribution curve, axis from 100 to 140, with an area of 0.025 in each tail]

Page 12:

Statistical Analysis

[Figure: two distributions, labelled control group mean and treatment group mean]

Is there a difference?

Page 13:

What does difference mean?

[Figure: three cases, labelled medium variability, high variability, and low variability]

The mean difference is the same for all three cases

Page 14:

What does difference mean?

[Figure: three cases, labelled medium variability, high variability, and low variability]

Which one shows the greatest difference?

Page 15:

What does difference mean?

A statistical difference is a function of the difference between means relative to the variability.

A small difference between means with large variability could be due to chance.

It is like a signal-to-noise ratio.

[Figure: the low variability case]

Which one shows the greatest difference?

Page 16:

So we estimate

$$\frac{\text{signal}}{\text{noise}} = \frac{\text{difference between group means}}{\text{variability of groups}} = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)} = t\text{-value}$$

[Figure: the low variability case]
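The signal-to-noise idea can be checked numerically. The sketch below uses hypothetical data in which both pairs of groups differ by exactly 2 in their means, and an unpooled standard error for the difference; both the data and that choice of standard error are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def t_value(treatment, control):
    """Signal-to-noise ratio: difference in means over its standard error."""
    signal = mean(treatment) - mean(control)
    # Unpooled standard error of the difference (an assumption of this sketch).
    noise = sqrt(variance(treatment) / len(treatment)
                 + variance(control) / len(control))
    return signal / noise


# Hypothetical data: the mean difference is 2 in both scenarios.
low_var_control = [10, 11, 9, 10, 10]
low_var_treatment = [12, 13, 11, 12, 12]
high_var_control = [4, 16, 10, 7, 13]
high_var_treatment = [6, 18, 12, 9, 15]

t_low = t_value(low_var_treatment, low_var_control)
t_high = t_value(high_var_treatment, high_var_control)
print(f"low variability:  t = {t_low:.2f}")
print(f"high variability: t = {t_high:.2f}")
```

The same mean difference yields a much larger t when the groups are tightly clustered, which is the sense in which the low variability case shows the greatest difference.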

Page 17:

Two sample t-test

Difference between means

Sample size

Variability of data

[Diagram: these three quantities feed into the t-test, which yields t]

Page 18:

Probability - p

With t we check the probability

Reject or do not reject the null hypothesis

You reject if p < 0.05 or still less

The smaller p is, the more significant the difference between means (groups)
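To see how a t value turns into a probability, the sketch below integrates the Student t density numerically with Simpson's rule, using only the standard library. The function names, the step count, and the example values (t = 2.5 with 10 degrees of freedom) are assumptions chosen for illustration; in practice a statistics package or a printed table would normally be used.

```python
from math import gamma, pi, sqrt


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail_p(t: float, df: int, steps: int = 2000) -> float:
    """One-tailed p-value P(T >= t) for t >= 0, via Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the distribution lies above 0; subtract the area from 0 to t.
    return 0.5 - total * h / 3


# Illustrative values (assumptions): t = 2.5 with 10 degrees of freedom.
p_one = upper_tail_p(2.5, 10)
p_two = 2 * p_one
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
print("reject H0 at 0.05 (two-tailed):", p_two < 0.05)
```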

Page 19:

Determining the p-Value

[Figure: standard normal (Z) curve. The area beyond Z = -1.96 or Z = 1.96 is .025 in each tail; the area beyond Z = -2.575 or Z = 2.575 is .005 in each tail.]

Page 20:

[Figure: density f(t) centred at 0, with area .95 between -1.96 and 1.96 and area .025 in each tail; red area = rejection region for 2-sided test]
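The cutoffs 1.96 and 2.575 are quantiles of the standard normal distribution. A short sketch with Python's standard library recovers them from the tail areas; the variable names are illustrative. For a t distribution with few degrees of freedom the cutoffs are wider (for example 2.093 at 19 df), so these normal values apply only as a large-sample limit.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Cutoff that leaves the given area in the upper tail.
for tail_area in (0.025, 0.005):
    cutoff = z.inv_cdf(1 - tail_area)
    print(f"upper-tail area {tail_area}: Z = {cutoff:.3f}")

# Central area between -1.96 and 1.96.
central = z.cdf(1.96) - z.cdf(-1.96)
print(f"area between -1.96 and 1.96: {central:.3f}")
```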

Page 21:

Assumptions

Normal distribution

Equal variance

Random sampling

Page 22:

t-Statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

When the sampled population is normally distributed, the t statistic is Student t distributed with n-1 degrees of freedom.

Page 23:

T-test for single mean

The following are the weights (mg) of 20 rats drawn at random from a large stock. Is it likely that the mean weight of these 20 rats is similar to the mean weight (24 mg) of the whole stock?

9 18 21 26

14 18 22 27

15 19 22 29

15 19 24 30

16 20 24 32

Page 24:

Steps for test for single mean

1. Question to be answered: Is the mean weight of the sample of 20 rats consistent with the stock mean of 24 mg?

n = 20, $\bar{x} = 21.0$ mg, sd = 5.91, $\mu = 24.0$ mg

2. Null hypothesis: The mean weight of rats is 24 mg. That is, the sample mean is equal to the population mean.

3. Test statistic, with (n-1) df:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

4. Comparison with theoretical value: if tab t (n-1) < cal t (n-1), reject H0; if tab t (n-1) > cal t (n-1), do not reject H0.

5. Inference

Page 25:

t-test for single mean

Test statistic

n = 20, $\bar{x} = 21.0$ mg, sd = 5.91, $\mu = 24.0$ mg

$$t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|21.0 - 24.0|}{5.91 / \sqrt{20}} \approx 2.27$$

$t_{.05, 19} = 2.093$. Do not reject H0 if t < 2.093; reject H0 if t >= 2.093.

Inference:

Since 2.27 > 2.093, H0 is rejected: the sample mean differs significantly (p < 0.05) from 24 mg, so there is no evidence that the sample is taken from a population with mean weight of 24 mg.
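The single-mean test can be reproduced directly from the 20 rat weights. This is a minimal sketch using only the standard library; the variable names are illustrative, and the tabulated value 2.093 is taken from the text rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Rat weights in mg, as listed in the example.
weights = [9, 18, 21, 26, 14, 18, 22, 27, 15, 19,
           22, 29, 15, 19, 24, 30, 16, 20, 24, 32]
mu0 = 24.0       # stock mean under H0 (mg)
t_table = 2.093  # tabulated t at 0.05 with 19 df, as quoted in the text

n = len(weights)
x_bar = mean(weights)
s = stdev(weights)  # sample SD (n - 1 in the denominator)
t_calc = (x_bar - mu0) / (s / sqrt(n))

print(f"n = {n}, mean = {x_bar:.1f}, sd = {s:.2f}, t = {t_calc:.2f}")
print("reject H0" if abs(t_calc) >= t_table else "do not reject H0")
```

The sign of t is negative because the sample mean lies below 24 mg; its absolute value, about 2.27, is what gets compared with the tabulated value.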

Page 26:

Page 27:

T-test for difference in means

Given below are the 24-hour total energy expenditures (MJ/day) in groups of lean and obese women. Examine whether the obese women’s mean energy expenditure is significantly higher.

Lean: 6.1 7.0 7.5 7.5 5.5 7.6 7.9 8.1 8.1 8.1 8.4 10.2 10.9

Obese: 8.8 9.2 9.2 9.7 9.7 10.0 11.5 11.8 12.8

Page 28:

Null Hypothesis

Obese women’s mean energy expenditure is equal to the lean women’s mean energy expenditure.

Data Summary

Lean Obese

N 13 9

Mean 8.10 10.30

S 1.38 1.25

Page 29:

Solution

H0: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)

H1: $\mu_1 - \mu_2 \neq 0$ ($\mu_1 \neq \mu_2$)

$\alpha = 0.05$

df = 13 + 9 - 2 = 20

Critical Value(s): $t = \pm 2.086$

[Figure: t distribution centred at 0; reject H0 below -2.086 or above 2.086, with area .025 in each tail]

Page 30:

Calculating the Test Statistic:

•Compute the Test Statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad df = n_1 + n_2 - 2$$

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

$(\mu_1 - \mu_2)$ = Hypothesized Difference (usually zero when testing for equal means)

Page 31:

Developing the Pooled-Variance t Test

•Calculate the Pooled Sample Variance as an Estimate of the Common Population Variance:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

$S_p^2$ = Pooled Variance

$S_1^2$ = Variance of Sample 1

$S_2^2$ = Variance of Sample 2

$n_1$ = Size of Sample 1

$n_2$ = Size of Sample 2

Page 32:

First, estimate the common variance as a weighted average of the two sample variances, using the degrees of freedom as weights:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(13 - 1)(1.38)^2 + (9 - 1)(1.25)^2}{(13 - 1) + (9 - 1)} \approx 1.765$$

Page 33:

Calculating the Test Statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{(10.3 - 8.1) - 0}{\sqrt{1.765 \left( \frac{1}{9} + \frac{1}{13} \right)}} = 3.82$$

tab t with 9 + 13 - 2 = 20 df: $t_{0.05, 20} = 2.086$

Page 34:

Page 35:

T-test for difference in means

Inference: The calculated t (3.82) is higher than the tabulated t at 0.05 with 20 df, i.e. 2.086. This implies that there is evidence that the mean energy expenditure in the obese group is significantly (p < 0.05) higher than that of the lean group.
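The pooled-variance calculation can be retraced from the summary statistics alone (N, mean, and S for each group). The sketch below treats the obese group as sample 1 and the lean group as sample 2, which is an arbitrary labelling; variable names are illustrative, and the tabulated value 2.086 is taken from the text.

```python
from math import sqrt

# Summary statistics from the data summary (obese = group 1, lean = group 2).
n1, mean1, s1 = 9, 10.30, 1.25
n2, mean2, s2 = 13, 8.10, 1.38
t_table = 2.086  # tabulated t at 0.05 with 20 df, as quoted in the text

# Pooled variance: sample variances weighted by their degrees of freedom.
df = n1 + n2 - 2
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df

# Test statistic for a hypothesized difference of zero.
t_calc = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))

print(f"df = {df}, pooled variance = {sp2:.3f}, t = {t_calc:.2f}")
print("reject H0" if abs(t_calc) >= t_table else "do not reject H0")
```

Working from the rounded standard deviations gives a pooled variance near 1.77, and the resulting t still rounds to 3.82.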

Page 36:

Example

Suppose we want to test the effectiveness of a program designed to increase scores on the quantitative section of the Graduate Record Exam (GRE). We test the program on a group of 8 students. Prior to entering the program, each student takes a practice quantitative GRE; after completing the program, each student takes another practice exam. Based on their performance, was the program effective?

Page 37:

Each subject contributes 2 scores: repeated measures design

Student Before Program After Program

1 520 555

2 490 510

3 600 585

4 620 645

5 580 630

6 560 550

7 610 645

8 480 520

Page 38:

Can represent each student with a single score: the difference (D) between the scores

Student Before Program After Program D

1 520 555 35

2 490 510 20

3 600 585 -15

4 620 645 25

5 580 630 50

6 560 550 -10

7 610 645 35

8 480 520 40

Page 39:

Approach: test the effectiveness of the program by testing the significance of D

Null hypothesis: There is no difference between the scores before and after the program

Alternative hypothesis: program is effective → scores after program will be higher than scores before program → average D will be greater than zero

H0: $\mu_D = 0$

H1: $\mu_D > 0$

Page 40:

So, need to know ∑D and ∑D²:

Student Before Program After Program D D²

1 520 555 35 1225

2 490 510 20 400

3 600 585 -15 225

4 620 645 25 625

5 580 630 50 2500

6 560 550 -10 100

7 610 645 35 1225

8 480 520 40 1600

∑D = 180 ∑D² = 7900

Page 41:

Recall that for single samples:

$$t_{obt} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{\text{score} - \text{mean}}{\text{standard error}}$$

For related samples:

$$t_{obt} = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$$

where:

$$s_{\bar{D}} = \frac{s_D}{\sqrt{N}}$$

and

$$s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{N}}{N - 1}}$$

Page 42:

Standard deviation of D:

$$s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{N}}{N - 1}} = \sqrt{\frac{7900 - \frac{(180)^2}{8}}{8 - 1}} = 23.45$$

Mean of D:

$$\bar{D} = \frac{\sum D}{N} = \frac{180}{8} = 22.5$$

Standard error:

$$s_{\bar{D}} = \frac{s_D}{\sqrt{N}} = \frac{23.45}{\sqrt{8}} = 8.2908$$

Page 43:

$$t_{obt} = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$$

Under H0, $\mu_D = 0$, so:

$$t_{obt} = \frac{\bar{D}}{s_{\bar{D}}} = \frac{22.5}{8.2908} = 2.714$$

From Table B.2: for α = 0.05, one-tailed, with df = 7,

t critical = 1.895

2.714 > 1.895 → reject H0

The program is effective.
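The paired calculation can be retraced from the before and after scores. This sketch follows the same route as the example (∑D and ∑D², then the standard deviation, standard error, and t); the variable names are illustrative, and the critical value 1.895 is taken from the text.

```python
from math import sqrt

before = [520, 490, 600, 620, 580, 560, 610, 480]
after = [555, 510, 585, 645, 630, 550, 645, 520]
t_critical = 1.895  # alpha = 0.05, one-tailed, df = 7, as quoted above

# One difference score per student.
d = [a - b for a, b in zip(after, before)]
n = len(d)
sum_d = sum(d)
sum_d2 = sum(x * x for x in d)

mean_d = sum_d / n
s_d = sqrt((sum_d2 - sum_d**2 / n) / (n - 1))  # standard deviation of D
se_d = s_d / sqrt(n)                           # standard error of mean D
t_obt = mean_d / se_d                          # mu_D = 0 under H0

print(f"sum D = {sum_d}, sum D^2 = {sum_d2}")
print(f"mean D = {mean_d:.1f}, s_D = {s_d:.2f}, SE = {se_d:.4f}, t = {t_obt:.3f}")
print("reject H0" if t_obt > t_critical else "do not reject H0")
```

Because the alternative is one-tailed (scores should go up), t is compared with the critical value without taking its absolute value.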

Page 44:

Page 45:

t-Value

t is a measure of: How difficult is it to believe the null hypothesis?

High t: Difficult to believe the null hypothesis; accept that there is a real difference.

Low t: Easy to believe the null hypothesis; no difference has been demonstrated.

Page 46:

In Conclusion!

Student's t-test will be used:
--- When sample size is small
and for the following situations:
(1) to compare a single sample mean with the population mean
(2) to compare the sample means of two independent samples
(3) to compare the sample means of paired samples