Ttests Programming in R

The first part of these notes will address ttesting basics.

The second part of these notes will address z test (or proportion testing) basics.

Ttests in R

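The term “Ttest” comes from the application of the t-distribution to evaluate a hypothesis. The t-distribution is used when the sample size is too small (fewer than about 30 observations) for the sample standard deviation s to be a reliable substitute for the population standard deviation, so that s/SQRT(n) cannot simply stand in for the true standard error of the mean.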

In practice, even hypothesis tests with sample sizes greater than 30, which utilize the normal distribution, are commonly referred to as “ttests”.

Note: a “t-statistic” and a “z-score” are conceptually similar – both convert measurements into standardized scores which follow a roughly normal distribution.

A side note of interest from Wikipedia:

The t-statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness Brewery in Dublin, Ireland. Gosset had been hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes. Gosset devised the t-test as a way to cheaply monitor the quality of beer. He published the test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret.

Ttests take three forms:

1. One Sample Ttest - compares the mean of the sample to a given number.

• e.g. Is average monthly revenue per customer who switches > $50?

Formal Hypothesis Statement examples:

H0: μ ≤ $50
H1: μ > $50

H0: μ = $50
H1: μ ≠ $50

Example:

After a massive outbreak of salmonella, the CDC determined that the source was a particular manufacturer of ice cream. The CDC sampled 9 production runs of the manufacturer, with the following results (all in MPN/g):

.593 .142 .329 .691 .231 .793 .519 .392 .418

Use this data to determine if the average level of salmonella is greater than .3 MPN/g, which is considered to be dangerous.

First, identify the hypothesis statements, the Type I and Type II errors, and your choice of alpha.

Then, do the computation by hand…
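As a hedged sketch of the hand computation for the salmonella example, the R code below works through the one sample t-statistic step by step. The hypotheses (H0: mean ≤ 0.3, H1: mean > 0.3) and alpha = 0.05 are assumptions made for illustration, and the variable names are not from the original notes.

```r
# Salmonella levels (MPN/g) from the 9 sampled production runs
salmonella <- c(.593, .142, .329, .691, .231, .793, .519, .392, .418)

mu0   <- 0.3                 # null value: the danger threshold
alpha <- 0.05                # assumed significance level (illustrative)

n    <- length(salmonella)
xbar <- mean(salmonella)     # sample mean
s    <- sd(salmonella)       # sample standard deviation
se   <- s / sqrt(n)          # estimated standard error of the mean

t_stat  <- (xbar - mu0) / se # one sample t-statistic
df      <- n - 1             # degrees of freedom
p_value <- pt(t_stat, df, lower.tail = FALSE)  # upper-tail p-value for H1: mean > mu0

round(c(mean = xbar, sd = s, se = se, t = t_stat, df = df), 4)
p_value < alpha              # TRUE means H0 is rejected at the assumed alpha
```

Each intermediate quantity is kept as its own variable so the result can be checked against a calculator one step at a time.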

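#here, the syntax is:

t.test(vector to be analyzed, second vector to be analyzed*, mu = null value, alternative = alternative hypothesis, paired = TRUE**)

* the second vector is supplied only for two sample and paired ttests

** paired = TRUE for a paired ttest

A one sample t test is run when only one vector is supplied.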
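The salmonella example can be run through t.test as a minimal sketch. The one-sided alternative ("greater") reflects the question of whether the mean exceeds .3 MPN/g; the object names are illustrative only.

```r
# Salmonella levels (MPN/g) from the 9 sampled production runs
salmonella <- c(.593, .142, .329, .691, .231, .793, .519, .392, .418)

# One sample t test of H0: mean <= 0.3 against H1: mean > 0.3
result <- t.test(salmonella, mu = 0.3, alternative = "greater")

result$statistic   # t-statistic
result$parameter   # degrees of freedom
result$p.value     # one-sided p-value
```

Printing `result` itself shows the full report, including the sample mean and a one-sided confidence bound.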

2. Two Sample Ttest - compares the mean of the first sample minus the mean of the second sample to a given number.

• e.g. Is there a difference in the production output of two facilities?

Formal Hypothesis Statement examples:

H0: μa - μb = 0

H1: μa - μb ≠ 0

When dealing with two sample or paired ttests, it is important to check the following assumptions:

1. The samples are independent (for a paired ttest, the pairs are independent of one another)

2. The samples have approximately equal variance

3. The distribution of each sample is approximately normal

Note – if the assumptions are violated and/or if the sample sizes are very small, we first try a transformation (e.g., take the log or the square root). If this does not work, then we engage in non-parametric analysis: the Wilcoxon Rank Sum test (two independent samples) or the Wilcoxon Signed Rank test (paired samples).
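One possible way to check these assumptions in R is sketched below. The two samples are simulated, hypothetical data (not from the notes), and the choice of var.test and shapiro.test as the checks is an assumption; plots such as boxplots or normal Q-Q plots are a common alternative.

```r
# Hypothetical production output for two facilities (simulated for illustration)
set.seed(1)
facility_a <- rnorm(15, mean = 100, sd = 10)
facility_b <- rnorm(15, mean = 105, sd = 10)

# Assumption 2: approximately equal variance (F test of the variance ratio)
var_check <- var.test(facility_a, facility_b)

# Assumption 3: approximate normality of each sample (Shapiro-Wilk test)
norm_a <- shapiro.test(facility_a)
norm_b <- shapiro.test(facility_b)

# Non-parametric fallback for two independent samples: Wilcoxon Rank Sum
rank_sum <- wilcox.test(facility_a, facility_b)

c(equal_var = var_check$p.value,
  normal_a  = norm_a$p.value,
  normal_b  = norm_b$p.value,
  rank_sum  = rank_sum$p.value)
```

For paired data, wilcox.test(x, y, paired = TRUE) runs the Wilcoxon Signed Rank test instead. Small p-values from the variance or normality checks suggest the corresponding assumption is doubtful.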

# here the syntax is:

t.test(vector to be tested ~ two level factor, data = data, var.equal = FALSE*)

plot(vector to be tested ~ two level factor, data = data)

*If the variances are similar, this would be set to TRUE. The plot call draws side-by-side boxplots of the two groups.
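A minimal sketch of the two sample syntax, using a hypothetical data frame of simulated production output for two facilities (the data and column names are assumptions for illustration):

```r
# Hypothetical data: output of 20 production runs at each of two facilities
set.seed(42)
production <- data.frame(
  output   = c(rnorm(20, mean = 100, sd = 8), rnorm(20, mean = 106, sd = 8)),
  facility = factor(rep(c("A", "B"), each = 20))
)

# Unequal variances allowed (Welch test), which is the default in t.test
welch <- t.test(output ~ facility, data = production, var.equal = FALSE)

# Pooled-variance version, appropriate when the variances are similar
pooled <- t.test(output ~ facility, data = production, var.equal = TRUE)

welch$method;  welch$parameter    # Welch degrees of freedom are usually fractional
pooled$method; pooled$parameter   # pooled degrees of freedom are n1 + n2 - 2
```

The two versions normally give similar t-statistics when the group sizes are equal; the main difference is in the degrees of freedom.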

3. Paired Sample Ttest - compares the mean of the differences in the observations to a given number.

e.g. Is there a difference in the production output of a facility after the implementation of new procedures?

Formal Hypothesis Statement example:

H0: μdiff = 0

H1: μdiff ≠ 0

#here, the syntax is:

t.test(vector to be analyzed, vector to be analyzed, paired = TRUE, alternative = "greater"*)

*the alternative hypothesis could also be "less". The default is "two.sided" (not equal).
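As a hedged illustration of the paired syntax, the sketch below uses made-up before/after output figures for eight production lines (hypothetical data). It also shows that a paired ttest is the same as a one sample ttest on the differences.

```r
# Hypothetical output of 8 production lines before and after new procedures
before <- c(102, 98, 110, 105, 95, 100, 108, 97)
after  <- c(106, 101, 111, 109, 99, 104, 107, 103)

# Paired ttest of H1: mean(after - before) > 0
paired_res <- t.test(after, before, paired = TRUE, alternative = "greater")

# Equivalent one sample ttest on the differences
diff_res <- t.test(after - before, mu = 0, alternative = "greater")

paired_res$statistic
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
```

The order of the two vectors matters for a one-sided alternative: the differences are computed as first vector minus second vector.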

Z testing…or proportion based testing…

The testing formula for a one sample proportion is a simple z calculation:

Z = (sample estimate – Null value)/Null Standard Error

For a proportion, this would be:

Z = (p - po)/SQRT(po(1 - po)/n)

Proportion tests in R

Example of a one sample proportion test:

If 30% of cars on a street are found to be speeding, the city will install “traffic calming” devices.

John used his radar gun to measure the speeds of 400 cars on his street. He found that 32% were speeding. Will John get “traffic calming” devices on his street?
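The z calculation for John's street can be sketched as follows. The hypotheses (H0: p ≤ 0.30, H1: p > 0.30) are an assumed reading of the city's rule, the count of 128 speeders is simply 32% of 400, and the variable names are illustrative.

```r
n        <- 400      # cars measured
speeding <- 128      # 32% of 400 cars were speeding
p0       <- 0.30     # null value: the city's threshold

p_hat <- speeding / n                     # sample proportion
se0   <- sqrt(p0 * (1 - p0) / n)          # null standard error
z     <- (p_hat - p0) / se0               # one sample z statistic
p_value <- pnorm(z, lower.tail = FALSE)   # upper-tail p-value for H1: p > p0

# The same test with prop.test (no continuity correction);
# its X-squared statistic is the square of z
res <- prop.test(speeding, n, p = p0, alternative = "greater", correct = FALSE)

c(z = z, p_value = p_value, x_squared = unname(res$statistic))
```

With correct = FALSE, prop.test reports the same one-sided p-value as the hand calculation.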

object1 <- table(factor)

sum(object1)

prop.test(object1[factor level], totaln, correct=FALSE, p= null hypothesis)

Example:

loveatfirst.count <- table(PSU$atfirst)

prop.test(loveatfirst.count[3], 227, correct=FALSE, p=0.45)

Note that the “3” indicates the third level of the factor – which is “Yes”.
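Since the PSU data set is not reproduced in these notes, here is a self-contained sketch of the same workflow on a hypothetical three-level factor (the counts are invented for illustration). It picks the level by name with [[ ]], which returns a plain count; this is a design choice for the sketch, not the notes' own method.

```r
# Hypothetical survey answers (invented counts, 227 responses in total)
answers <- factor(rep(c("Maybe", "No", "Yes"), times = c(30, 90, 107)))

answer.count <- table(answers)   # counts per level
total.n <- sum(answer.count)     # total number of responses

# Selecting by name avoids having to remember the position of "Yes"
yes.count <- answer.count[["Yes"]]

# One sample proportion test of H0: p = 0.45 (two-sided by default)
res <- prop.test(yes.count, total.n, correct = FALSE, p = 0.45)

res$estimate   # sample proportion of "Yes"
res$p.value
```

Selecting by name also protects against the level order changing if the factor is re-coded.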

Answer the following:

1. Identify the Null and Alternative Hypotheses

2. Identify the Type I and Type II errors, including the implications

3. What is an appropriate alpha value?

4. What is the associated p-value?

5. What is your conclusion?

2. Two Sample Test - compares the proportion of the first sample minus the proportion of the second sample to a given number. It is of common interest to test whether two population proportions are equal.

• e.g. Is there a difference in the percentage of students who pass a standardized test between those who took a prep course and those who did not?

Formal Hypothesis Statement examples:

H0: pa - pb = 0
H1: pa - pb ≠ 0

H0: pa - pb ≤ 0
H1: pa - pb > 0

Before you undertake a two sample test, there are a few conditions to check:

1. The two samples must be independent

2. The number of individuals with each trait of interest and the number without the trait of interest must be at least 10 in each sample.

#here, the code is pretty easy…just make the 2x2 table and then apply the prop.test function:

FactorVar1.by.FactorVar2 <- table(FactorVar1, FactorVar2)

prop.test(FactorVar1.by.FactorVar2, correct=FALSE)

Example:

# recode WtFeel into "Right" / "Wrong"; anything else becomes ""
PSU$Wt <- ifelse(PSU$WtFeel=="RightWt", "Right",
          ifelse(PSU$WtFeel=="OverWt" | PSU$WtFeel=="UnderWt", "Wrong", ""))

# keep only the rows that were recoded (drops "" and missing values)
PSU <- PSU[which(PSU$Wt != ""), ]

sex.by.wt <- table(PSU$Sex, PSU$Wt)

prop.test(sex.by.wt, correct=FALSE)
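Because the PSU data are not included here, the sketch below runs a two sample proportion test on hypothetical pass counts for the prep course example (all numbers are invented). It checks the "at least 10" condition, calls prop.test with counts rather than a 2x2 table, and compares the result with the usual pooled z statistic, which is assumed to be the textbook formula.

```r
# Hypothetical results: students passing a standardized test
passed <- c(prep = 150, no_prep = 130)   # number who passed in each group
n      <- c(prep = 200, no_prep = 200)   # group sizes

# Condition check: at least 10 with and 10 without the trait in each sample
all(passed >= 10, n - passed >= 10)

# Two sample proportion test of H0: pa - pb = 0 (no continuity correction)
res <- prop.test(x = passed, n = n, correct = FALSE)

# Pooled z statistic computed by hand
p_hat   <- passed / n
p_pool  <- sum(passed) / sum(n)                     # pooled proportion under H0
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n)) # pooled standard error
z       <- unname((p_hat[1] - p_hat[2]) / se_pool)

c(z = z, z_squared = z^2, x_squared = unname(res$statistic))
res$p.value
```

Without the continuity correction, the X-squared statistic from prop.test equals the square of the pooled z statistic, so the two approaches give the same two-sided p-value.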

Answer the following:

1. Identify the Null and Alternative Hypotheses

2. Identify the Type I and Type II errors, including the implications

3. What is an appropriate alpha value?

4. Using the formula on page 529, determine the test statistic. What is the associated p-value?

5. What is your conclusion?
