1 Experimental Statistics - week 2 Review: 2-sample t-tests paired t-tests Thursday: Meet in 15 Clements!! Bring Cody and Smith book


Experimental Statistics - week 2

Review: 2-sample t-tests, paired t-tests

Thursday: Meet in 15 Clements!! Bring Cody and Smith book


p-Value

$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$

Reject $H_0$ if $t < -t_\alpha$

Suppose t = -2.39 is observed from data for the test above.

Note: “Large negative values” of t make us believe the alternative is true.

p-value: the probability of an observation as extreme or more extreme than the one observed when the null is true. In the figure, it is the area under the t curve to the left of the observed value of t, -2.39.
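The lower-tail area can be computed in SAS with the PROBT function. The following is a minimal sketch only: the slide does not state the degrees of freedom, so the value df = 15 is an assumption chosen purely for illustration.

```sas
DATA pvalue;
  t  = -2.39;           /* observed test statistic from the slide */
  df = 15;              /* ASSUMED degrees of freedom (not given on the slide) */
  p  = PROBT(t, df);    /* lower-tail area P(T <= t) under the null */
  PUT 'p-value = ' p 6.4;
RUN;
```

The result is written to the SAS log. For an upper-tailed alternative the p-value would instead be 1 - PROBT(t, df).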


Note:

-- if the p-value is less than or equal to $\alpha$, then we reject the null at the $\alpha$ significance level

-- the p-value is the smallest level of significance at which the null hypothesis would be rejected


Find the p-values for Examples 1 and 2


Two Independent Samples

• Assumptions: Measurements from each population are
  – Mutually independent
      Independent within each sample
      Independent between samples
  – Normally distributed (or the Central Limit Theorem can be invoked)

• Analysis differs based on whether the 2 populations have the same standard deviation


Two Cases

• Population standard deviations equal
  – Can obtain a better estimate of the common standard deviation by combining or “pooling” individual estimates

• Population standard deviations unequal
  – Must estimate each standard deviation
  – Very good approximate tests are available

If Unsure, Do Not Assume Equal Standard Deviations


Equal Population Standard Deviations

Test Statistic

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad df = n_1 + n_2 - 2$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad s_p = \sqrt{s_p^2}$$
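The pooled calculation can be carried out from summary statistics in a DATA step. This is a sketch under stated assumptions: the sample sizes, means, and standard deviations below are hypothetical values, not data from the course, and the null hypothesis is taken to be $\mu_1 - \mu_2 = 0$.

```sas
DATA pooled;
  /* Hypothetical summary statistics, for illustration only */
  n1 = 10; ybar1 = 20.0; s1 = 2.0;
  n2 = 12; ybar2 = 18.5; s2 = 2.4;
  df  = n1 + n2 - 2;
  sp2 = ((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / df;      /* pooled variance */
  sp  = SQRT(sp2);                                   /* pooled standard deviation */
  t   = (ybar1 - ybar2) / (sp * SQRT(1/n1 + 1/n2));  /* H0: mu1 - mu2 = 0 */
  p   = 2 * (1 - PROBT(ABS(t), df));                 /* two-sided p-value */
  PUT df= sp= t= p=;
RUN;
```

With raw data rather than summary statistics, PROC TTEST does this calculation directly.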


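Behrens-Fisher Problem

If $\sigma_1 \ne \sigma_2$,

$$\frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \not\sim t$$

(the statistic does not follow an exact t distribution)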


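Satterthwaite’s Approximate t Statistic

If $\sigma_1 \ne \sigma_2$,

$$t' = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \;\dot\sim\; t \qquad \text{(i.e. approximate t)}$$

$$df = \frac{(a + b)^2}{\frac{a^2}{n_1 - 1} + \frac{b^2}{n_2 - 1}}, \qquad a = \frac{s_1^2}{n_1}, \quad b = \frac{s_2^2}{n_2} \qquad \text{(Approximate t df)}$$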
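The approximate t statistic and its degrees of freedom can be computed the same way from summary statistics. A minimal sketch, assuming hypothetical sample values and a null hypothesis of $\mu_1 - \mu_2 = 0$:

```sas
DATA satterthwaite;
  /* Hypothetical summary statistics, for illustration only */
  n1 = 10; ybar1 = 20.0; s1 = 2.0;
  n2 = 12; ybar2 = 18.5; s2 = 4.8;
  a  = s1**2 / n1;
  b  = s2**2 / n2;
  t  = (ybar1 - ybar2) / SQRT(a + b);                 /* approximate t statistic */
  df = (a + b)**2 / (a**2/(n1 - 1) + b**2/(n2 - 1));  /* approximate t df */
  p  = 2 * (1 - PROBT(ABS(t), df));                   /* PROBT accepts non-integer df */
  PUT t= df= p=;
RUN;
```

The approximate df is generally not an integer and lies between the smaller of $n_1 - 1$ and $n_2 - 1$ and $n_1 + n_2 - 2$.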


Often-Recommended Strategy for Tests on Means

Test whether $\sigma_1 = \sigma_2$ (F-test)
– If the test is not rejected, use the 2-sample t statistic, assuming equal standard deviations
– If the test is rejected, use Satterthwaite’s approximate t statistic

NOTE: This is Not a good strategy
– the F-test is highly susceptible to non-normality

Recommended Strategy:
– If uncertain about whether the standard deviations are equal, use Satterthwaite’s approximate t statistic


Example 3: Comparing the Mean Breaking Strengths of 2 Plastics

Plastic A:

Plastic B:

.= , s.=y , = n AAA 3332835

Assumptions:
– Mutually independent measurements
– Normal distributions for measurements from each type of plastic

.= , s.=y , = n AAA 9472640

Question: Is there a difference between the 2 plastics in terms of mean breaking strength?


Example 3 - solution


New diet - Is it effective?

Design:

50 people: randomly assign 25 to go on the diet and 25 to eat normally for the next month.

Assess results by comparing weights at the end of 1 month.

Diet: $\bar{X}_D$, $S_D$        No Diet: $\bar{X}_{ND}$, $S_{ND}$

Run a 2-sample t-test using the guidelines we have discussed.

Is this a good design?


Better Design:

Randomly select subjects and measure them before and after 1 month on the diet.

Subject   Before   After   Difference
1         150      147       3
2         210      195      15
:         :        :         :
n         187      190      -3

Procedure: Calculate differences, and analyze the differences using a 1-sample test

“Paired t-Test”
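In SAS, the paired analysis can be requested with the PAIRED statement of PROC TTEST. The sketch below uses only the three rows shown on the slide (the elided subjects are omitted, and the last row is labeled 3 here rather than n), so it illustrates the mechanics rather than the full analysis. The data set name diet and the variable names are assumptions.

```sas
DATA diet;
  INPUT subject before after;
  diff = before - after;      /* difference analyzed by the 1-sample test */
  DATALINES;
1 150 147
2 210 195
3 187 190
;
PROC TTEST DATA=diet;
  PAIRED before*after;        /* paired t-test on before - after */
  TITLE 'Diet data - paired t-test';
RUN;
```

Running a 1-sample test on the computed differences (PROC TTEST with VAR diff) gives the same t statistic, which is the procedure described on the slide.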


Example 4: International Gymnastics Judging

Question: Do judges from a contestant’s country rate their own contestant higher than do foreign judges?

i.e. test $H_0: \mu_N = \mu_F$ vs. $H_a: \mu_N > \mu_F$

Data:

Contestant        1    2    3    4    5    6    7    8    9    10   11   12
Native Judge      6.8  4.5  8.0  7.2  8.7  4.5  6.6  5.8  6.0  8.8  8.7  4.4
Foreign Judges    6.7  4.3  8.1  7.2  8.3  4.6  5.4  5.9  6.1  9.1  8.7  4.3


Example 4 solution
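One possible way to run this analysis in SAS is sketched below, treating each contestant as a pair and testing the upper-tailed alternative. The data set and variable names are assumptions, and the SIDES=U option assumes SAS 9.2 or later; in earlier releases, halve the two-sided p-value when the observed difference is in the direction of the alternative.

```sas
DATA gymnastics;
  INPUT contestant native foreign;
  DATALINES;
1 6.8 6.7
2 4.5 4.3
3 8.0 8.1
4 7.2 7.2
5 8.7 8.3
6 4.5 4.6
7 6.6 5.4
8 5.8 5.9
9 6.0 6.1
10 8.8 9.1
11 8.7 8.7
12 4.4 4.3
;
PROC TTEST DATA=gymnastics SIDES=U;
  PAIRED native*foreign;      /* differences are native - foreign */
  TITLE 'Gymnastics judging - paired t-test';
RUN;
```

Working the 12 differences by hand gives a mean difference of about 0.108 and a standard deviation of about 0.387, so t is approximately 0.97 on 11 df. This is below the one-sided 5% critical value of 1.796, so the null hypothesis is not rejected at the 0.05 level.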


Introduction to SAS Programming Language


Fertilizer Data

A researcher studies the effect of two fertilizer brands on the growth of plants. Thirteen plants are grown under identical conditions, except that 7 plants are randomly selected to receive Brand 1 and the remaining 6 are fertilized using Brand 2. The data for this experiment are as follows, where the outcome measurement is the height of the plant after 3 weeks of growth (you may assume the heights to be normally distributed):

Brand 1     Brand 2
51.0 cm     54.0 cm
53.3        56.1
55.6        52.1
51.0        56.4
55.5        54.0
53.0        52.9
52.1


The Fertilizer data set as SAS needs to see it

A 51.0
A 53.3
A 55.6
A 51.0
A 55.5
A 53.0
A 52.1
B 54.0
B 56.1
B 52.1
B 56.4
B 54.0
B 52.9


SAS file for FERTILIZER data

Case 1: Data within SAS FILE

DATA one;
  INPUT brand $ height;
  DATALINES;
A 51.0
A 53.3
A 55.6
A 51.0
A 55.5
A 53.0
A 52.1
B 54.0
B 56.1
B 52.1
B 56.4
B 54.0
B 52.9
;
PROC TTEST;
  CLASS brand;
  VAR height;
  TITLE 'Fertilizer Data - 2-sample t-test';
RUN;


Brief Discussion of Components of the SAS File:

DATA Step

  DATA STATEMENT - the first DATA statement names the data set whose variables are defined in the INPUT statement -- in the above, we create data set 'one'

   INPUT STATEMENT - 2 forms

1.  Freefield - can be used when data values are separated by 1 or more blanks

       INPUT   NAME $  AGE SEX $   SCORE;          ($ indicates character variable)

  2.  Formatted - data occur in fixed columns

       INPUT    NAME $ 1-20  AGE 22-24  SEX  $ 26   SCORE 28-30;  

DATALINES STATEMENT - indicates that the next records in the file contain the actual data. A line containing only a semicolon after the last data record marks the end of the data.
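The two INPUT forms can be compared side by side. This is an illustrative sketch: the data set names and the records are hypothetical, and the formatted records are laid out so that each value falls in the columns named in the INPUT statement (name in 1-20, age in 22-24, sex in 26, score in 28-30).

```sas
/* Freefield input: values separated by one or more blanks */
DATA free;
  INPUT name $ age sex $ score;
  DATALINES;
Smith 34 M 88
Jones 29 F 92
;

/* Formatted input: values read from fixed columns */
DATA fixed;
  INPUT name $ 1-20 age 22-24 sex $ 26 score 28-30;
  DATALINES;
Mary Ann Smith        34 F  88
Bob Jones             29 M  92
;
PROC PRINT DATA=fixed;
RUN;
```

Formatted input allows a character value such as a full name to contain blanks, which freefield input would read as separate values.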


SPECIFYING THE ANALYSIS -- PROC STATEMENTS

GENERAL FORM

  PROC xxxxx;                         implies procedure is to be run on most recently created data set
  PROC xxxxx DATA = data set name;

Note: I did not have to specify DATA=one in the above example

Example PROCs:

  PROC REG - regression analysis
  PROC ANOVA - analysis of variance
  PROC GLM - general linear model
  PROC MEANS - basic statistics, t-test for H0: μ = 0
  PROC PLOT - plotting
  PROC TTEST - t-tests
  PROC UNIVARIATE - descriptive stats, box-plots, etc.
  PROC BOXPLOT - boxplots
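As a small illustration of PROC MEANS used for a t-test that a mean is zero, the sketch below requests the T and PRT statistics. The data set name diffs and the three values are assumptions made up for the example.

```sas
DATA diffs;
  INPUT d @@;                 /* @@ reads several values from one line */
  DATALINES;
3 15 -3
;
PROC MEANS DATA=diffs N MEAN STD T PRT;
  VAR d;                      /* T and PRT give the t statistic and two-sided p-value */
RUN;
```

Because DATA=diffs is given explicitly, the procedure does not depend on which data set was created most recently.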


PROC TTEST

• Proc TTEST data = fn ;

Class … ; (specify the classification variable)

Var … / options; (specify the variable for which the means are compared)

Run;


SAS Syntax

• Every command MUST end with a semicolon
  – Commands can continue over two or more lines

• Variable names are 1-8 characters (up to 32 characters in SAS Version 7 and later), made up of letters, numerals, and underscores, beginning with a letter or underscore, with no blanks or special characters
  – Note: values for character variables can exceed 8 characters

• Comments
  – Begin with *, end with ;
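These rules can be seen together in a short program. This is a sketch with a hypothetical data set; the /* ... */ comment form is an additional SAS comment style not mentioned on the slide.

```sas
* This is a comment statement. It ends at the semicolon;
DATA demo;
  INPUT plant_id
        height;               /* one statement continued over two lines */
  DATALINES;
1 51.0
2 53.3
;
PROC PRINT DATA=demo; RUN;
```

Several statements may also share one line, as in the last line, since each is ended by its own semicolon.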


Titles and Labels

• TITLE '…' ;
  – Up to 10 title lines: TITLE 'include your title here';
  – Can be placed in Data Steps or Procs

• LABEL name = '…' ;
  – Can be in a DATA STEP or PROC PRINT
  – Include ALL labels, then a single ;

Note: For class assignments, place descriptive titles and labels on the output.
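A brief sketch of titles and labels in use follows. The label text and the two-row data set are assumptions for illustration; note that PROC PRINT displays labels only when its LABEL option is given.

```sas
DATA one;
  INPUT brand $ height;
  LABEL brand  = 'Fertilizer brand'
        height = 'Plant height after 3 weeks (cm)';   /* all labels, then a single semicolon */
  DATALINES;
A 51.0
B 54.0
;
PROC PRINT DATA=one LABEL;
  TITLE1 'Fertilizer Data';
  TITLE2 'Listing with descriptive labels';
RUN;
```

TITLE1 through TITLE10 set the individual title lines, and TITLE alone is equivalent to TITLE1.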


Case 2: Data in External File

FILENAME f1 'complete directory/file specification';

FILENAME f1 'fertilizer.data';
DATA one;
  INFILE f1;
  INPUT brand $ height;
PROC TTEST;
  CLASS brand;
  VAR height;
  TITLE 'Fertilizer Data - 2-sample t-test';
RUN;


PC SAS on Campus

Library

BIC

Student Center

http://support.sas.com/rnd/le/index.html

SAS Learning Edition $125