Imagine a bag that contained 90 white marbles and 10 black marbles. If you drew 10 marbles, how...

Page 1:

Imagine a bag that contained 90 white marbles and 10 black marbles.

If you drew 10 marbles, how many would you expect to come up white, and how many black?

We expect 9 white marbles and 1 black. But there is some probability that we will get 8/2 and some probability we will get 7/3 …
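The chance of each split can be worked out directly. The following is a minimal sketch, assuming the 10 marbles are drawn without replacement (the slide does not say), in which case the number of black marbles follows a hypergeometric distribution. The name `prob_black` is illustrative only.

```python
from math import comb

WHITE, BLACK, DRAWS = 90, 10, 10  # bag contents and sample size from the slide

def prob_black(k):
    """Probability of exactly k black marbles in DRAWS draws.

    Assumes drawing WITHOUT replacement (an assumption, not stated on the slide).
    """
    return comb(BLACK, k) * comb(WHITE, DRAWS - k) / comb(WHITE + BLACK, DRAWS)

# Probability of every possible number of black marbles, 0 through 10
probs = [prob_black(k) for k in range(DRAWS + 1)]

# Long-run average number of black marbles per draw of 10
expected_black = sum(k * p for k, p in enumerate(probs))

# Show the splits the slide mentions: 10/0, 9/1, 8/2, 7/3
for k in range(4):
    print(f"{DRAWS - k} white / {k} black: {probs[k]:.3f}")
print(f"expected black marbles: {expected_black:.2f}")
```

The expected count of black marbles works out to 1 (and white to 9), matching the slide, while the other splits each keep some probability of their own.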

Page 2:

What do we do?

         Expected   Observed
White    9          ???
Black    1          ???

• We can compare what we would expect by chance to what we actually observed.

• We can make a probabilistic statement about the chances of observing what we did based on our expectations.

• Finally, we test the hypothesis that there is no real difference between what we observed and what we expected.

Page 3:

Basic Assumption of the Null Hypothesis

• There is no difference in the population—the difference you observe is just the chance variation of your sample.

• Expected score - Observed score = zero + SE

• We are comparing observed values (the frequency actually observed in our sample, written "fo") to some set of expected-by-chance frequencies (written "fe").

Page 4:

Chi Square (χ²)

• The test statistic for testing hypotheses that compare 2 or more nominal or ordinal categories

• The Chi Square Statistic compares nominal values in a cross-tabulation table, making what are called row by column comparisons or "r x c" tables.

Page 5:

DON’T WRITE ANY FORMULAS ON THE FOLLOWING SLIDES. JUST FOLLOW MY DISCUSSION.

Page 6:

The formula for χ² is:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

OR, sometimes written

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where $f_o$ is the observed frequency of each category in each cell of a table and $f_e$ is the expected value of each category in each cell.
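The formula translates almost line for line into code. The sketch below is illustrative rather than definitive: the function name `chi_square` is made up for this example, and the observed counts (7 white, 3 black) are hypothetical, since the marble slide leaves the observed column as "???".

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [9, 1]   # white, black: what chance predicts for a draw of 10
observed = [7, 3]   # HYPOTHETICAL draw, not taken from the slides

stat = chi_square(observed, expected)
print(f"chi-square = {stat:.3f}")
```

When the observed counts match the expected counts exactly, every term is zero and so is the statistic. The further the counts drift apart, the larger it gets.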

Page 7:

NOTE that χ² works with frequencies in each cell.

We will compare what we observe in each cell to what we would expect in that cell by chance alone.

Page 8:

The Chi Square statistic tests:

Whether the difference between what you observe and what chance would predict is due to sampling error.

The greater the deviation of what we observe from what we would expect by chance, the greater the probability that the difference is NOT due to chance. Doesn't this sound like "Principle #1" from the 300Reader for significance tests? What is "Principle #2"?

Page 9:

DIFFERENCE BETWEEN EXPENSIVE AND CHEAP SODA

Consumer Reports routinely finds that many people who claim they can taste the difference between brands actually can't: they are influenced by the label.

How would you test the idea that people cannot really tell the difference, that they are really responding to the price label information, that people in general lack the ability to taste a difference? How do we disentangle the label effect from taste?

Page 10:

Study Design: Sample 150 Dr. Pepper drinkers. Place before them 3 bottles filled with Wal-Mart’s “Dr. Thunder” drink. However, one bottle is labeled with “Dr. Pepper”, another is labeled as “Wal-Mart’s Dr. Thunder”, and the third is labeled “Caramel and spices with carbonation”.

All 150 Subjects taste each bottle and state preference.

What is the null? ==> Subjects are no more likely to choose one bottle than another. In other words, 1/3 of subjects (50 people) choose each option.
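The expected counts under this null are just the sample split evenly across the labels. A small sketch of that arithmetic (variable names are illustrative):

```python
n_subjects = 150
labels = [
    "Dr. Pepper",
    "Wal-Mart's Dr. Thunder",
    "Caramel and spices with carbonation",
]

# Under the null, each label is equally likely to be preferred
expected = {label: n_subjects / len(labels) for label in labels}

for label, count in expected.items():
    print(f"{label}: {count:.0f}")
```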

Page 11:

               Dr. Pepper   Dr. Thunder   Carbonated Caramel
Observed fo
Expected fe    50           50            50

Page 12:

The Full Table

            Dr. Pepper   Dr. Thunder   Carb. Caramel
OBSERVED    77           41            32
EXPECTED    50           50            50

Page 13:

Step 1. Hypothesis: Null = the proportions preferring each bottle should be equal IF indeed the drinks are equal, that is, IF one's preference is NOT influenced by label. Here, chance would predict 50 people in each group if label did not matter. The O : E ratios should be the same in all 3 columns if label does not matter. Our alternative hypothesis is that preferences will follow the status of the labels: Label #1 > #2 > #3.

Page 14:

Step 2. The Distribution. Since we are interested in analyzing a crosstab, the χ² distribution is appropriate: we are doing a row by column [r x c] analysis.

Step 3. Level of Significance: Set the significance level at .05.

The chi square distribution changes shape by degrees of freedom (i.e., how "free" is each cell entry?). If the sum of ten numbers is known to be 400 and numbers 1-9 sum to 360, then the tenth number is NOT "free to vary" (i.e., it must be 40). In the "ten numbers" situation there are 9 degrees of freedom (i.e., not all ten numbers are "free to vary"). You need this information to use a chi square table.

Page 15:

FORMULA for df: χ² degrees of freedom = (# rows - 1) * (# columns - 1)

We have a 2 by 3 table (2 rows, 3 columns), so (rows - 1) = (2 - 1) = 1 and (columns - 1) = (3 - 1) = 2. So df = 1 TIMES 2 = 2. Equivalently, with a single set of 3 categories compared to fixed expected counts, df = (3 - 1) = 2.
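Both the df formula and the "ten numbers" idea come down to one line of arithmetic each. The sketch below is only an illustration, and the function name `degrees_of_freedom` is invented for it.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df for an r x c table: (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)

# The soda table: 2 rows (observed, expected) by 3 columns (labels)
df = degrees_of_freedom(2, 3)
print(f"df = {df}")

# The "ten numbers" example: once the total and nine values are known,
# the tenth value is fixed, so only 9 of the 10 are free to vary.
total = 400
sum_of_first_nine = 360
tenth = total - sum_of_first_nine
print(f"tenth number must be {tenth}")
```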

Step 4: Compute χ² from your data:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Page 16:

                Dr. Pepper        Dr. Thunder      Carb. Caramel
OBSERVED        77                41               32
EXPECTED        50                50               50
(O - E)         27                -9               -18
(O - E)²        729               81               324
(O - E)²/E      729/50 = 14.58    81/50 = 1.62     324/50 = 6.48

χ² = 14.58 + 1.62 + 6.48 = 22.68
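The same table can be rebuilt column by column in a few lines, which is a handy way to check the arithmetic. This is a sketch using the observed and expected counts from the slide; the variable names are illustrative.

```python
labels = ["Dr. Pepper", "Dr. Thunder", "Carb. Caramel"]
observed = [77, 41, 32]
expected = [50, 50, 50]

# One (O - E)^2 / E term per column
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)

for label, o, e, c in zip(labels, observed, expected, contributions):
    print(f"{label:14s} O={o:3d} E={e:3d} O-E={o - e:4d} (O-E)^2={(o - e) ** 2:4d} term={c:.2f}")
print(f"chi-square = {chi_sq:.2f}")
```

The Dr. Pepper column alone contributes well over half of the total, which shows where the departure from chance is concentrated.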

Page 18:

Step 5: Decision: Look up the "critical" value of chi square in a chi-square table. In our situation (df = 2), the critical value at the .05 level of significance is 5.991. Since the chi-square value we obtained, 22.68, is beyond the critical value of 5.991, it falls in the rejection region.
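The table lookup can be checked by hand in this particular case. With exactly 2 degrees of freedom, the chi-square distribution has a simple closed form: the probability of exceeding a value x is exp(-x / 2). The sketch below relies on that shortcut, which holds only for df = 2; any other df needs a printed table or a statistics library.

```python
from math import exp, log

ALPHA = 0.05      # significance level from Step 3
chi_sq = 22.68    # statistic computed in Step 4

# Valid ONLY for df = 2, where P(chi-square > x) = exp(-x / 2)
critical = -2 * log(ALPHA)     # value with 5% of the distribution beyond it
p_value = exp(-chi_sq / 2)     # chance of a statistic this large under the null

reject_null = chi_sq > critical
print(f"critical value = {critical:.3f}")
print(f"p-value = {p_value:.6f}")
print(f"reject null: {reject_null}")
```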

Step 6: State Conclusion. Reject the null hypothesis because there is less than a 5% chance of committing a type 1 error. We will never know if we have committed a type 1 error. Could we have committed a type 2 error?