Making Inferences and Justifying Conclusions · Making Inferences and Justifying Conclusions Roxy...

Making Inferences and Justifying Conclusions Roxy Peck Cal Poly, San Luis Obispo NCTM 2016 San Francisco 1


Making Inferences and Justifying Conclusions

Roxy Peck

Cal Poly, San Luis Obispo

NCTM 2016 San Francisco


Common Core State Standards in Mathematics

S-IC 3 Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.

S-IC 4 Use data from a sample survey to estimate a population mean or proportion; develop a margin of error through the use of simulation models for random sampling.

S-IC 5 Use data from a randomized experiment to compare two treatments; use simulations to decide if differences between parameters are significant.

S-IC 6 Evaluate reports based on data.


Common Core State Standards in Mathematics

These standards include difficult (but important) statistical concepts.

Concepts of random selection, random assignment, study design, sampling variability, margin of error, and statistical significance are not just for AP Statistics anymore! They are now part of the “for all” part of the high school curriculum.

In most Common Core schools (and “Common Core-like” schools), every high school teacher of mathematics is now being asked to develop students’ statistical thinking as well as their mathematical thinking.

This is a big challenge!

So Where Do We Start??

In this session, we will consider class activities/demonstrations (depending on your access to technology) that address:

The difference between observational studies and experiments.

The difference between random selection and random assignment.

How study design relates to the types of conclusions that can be drawn.

Using simulation to develop the concept of margin of error.

Using simulation to develop the concept of statistical significance.

But we won’t have time for all of these, so we will go very quickly through the first three and then focus on the last two.


Observational Studies versus Experiments

Observational study

Observe characteristics of a sample selected from one or more populations

Goal is to use sample data to learn about the corresponding population

Important that the sample be representative of the population

Experiment

Study how a response variable behaves under different experimental conditions

Person conducting the experiment decides what the experimental conditions will be and who will be in each experimental group

Important to have comparable experimental groups


Observational Studies versus Experiments

Observational studies (including sample surveys)

Want random selection from the population of interest, since it is important to have a sample that is representative of the population.

Random selection enables generalizing from the sample to the population.

Experiments

Want random assignment of “subjects” to experimental conditions to create comparable experimental groups.

Random assignment enables drawing a cause-and-effect conclusion (changes in the experimental conditions cause change in response).

Experiments may or may not include random selection of subjects.
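The distinction can be made concrete with a minimal sketch. The population of ID numbers and the function names below are made up for illustration: random selection decides who is in the study, while random assignment decides which experimental condition each subject receives.

```python
import random

def random_selection(population, sample_size, rng):
    """Random selection: choose which members of the population are studied."""
    return rng.sample(population, sample_size)

def random_assignment(subjects, rng):
    """Random assignment: split the subjects into two comparable groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(2016)            # fixed seed so the demo is repeatable
population = list(range(1, 1001))    # hypothetical population of 1,000 ID numbers

# An experiment that uses both: select 20 subjects, then assign them to groups.
sample = random_selection(population, 20, rng)
treatment, control = random_assignment(sample, rng)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

An observational study would stop after the selection step; an experiment with volunteer subjects would use only the assignment step.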


So What is Randomization??

Random selection

Random assignment

Randomization

Let’s keep it simple and not confuse students!


Give students lots of practice doing things like this…


Margin of Error and Statistical Significance

Observational Studies and Sample Surveys

Question of interest: How far off might my estimate be?

Experiments

Question of interest: Could this have happened by chance when there is no difference in the response to the different experimental conditions?


CCSS limits these concepts:

Margin of error: observational studies

Statistical significance: experiments


Using Simulation to Develop Concept of Margin of Error

How far off might my estimate be?

Study on facial stereotyping (thanks to Allan Rossman and Beth Chance for this example).

Reference: Psychonomic Bulletin & Review, 2007, 14(5), 901-907.


Bob or Tim?

One of these men is named Bob and one is named Tim. Study participants were asked “Which man is named Tim and which is named Bob?”


Bob or Tim?

Want to use data to estimate the proportion of U.S. adults that would choose the man on the left as Tim.

We will assume that this group is representative of the population of adults in the U.S.

For this group, the proportion who chose the man on the left as Tim is:

But I don’t have an internet connection, so I am going to pretend that we are a group of 100 people and that 78 picked the man on the left as Tim. With a class where I would have internet access, I would use the real class data. The proportion who choose the man on the left as Tim is pretty consistently around 0.80.


Motivating Margin of Error

Based on my sample of 100 people, my estimate of the proportion of U.S. adults who would choose the man on the left as Tim is 0.78. But I don’t expect this to be exactly equal to the actual population proportion. How close can I expect my estimate to be to the actual value?

Margin of error is the maximum likely error. It would not be likely that my estimate would be off by more than this amount. “Likely” is defined in terms of 95%: if I were to take many samples from the population and use each sample to estimate the population value, 95% of these estimates would differ from the actual value by less than this amount.


Motivating Margin of Error

How do we get a sense of how far off my estimate is likely to be?

Create a BIG hypothetical population with a proportion of “successes” that is equal to my sample proportion.

Take a random sample of the same size as my original sample from the big hypothetical population and calculate the proportion for this simulated sample.

Repeat many times to get a collection of simulated sample proportions.

Look at the simulated sample proportions to see how far off they tended to be from the known proportion for my BIG hypothetical population.

The margin of error based on the simulated sample proportions is a reasonable estimate of the margin of error I should associate with my original estimate.
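For classes that prefer code to an applet, the steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the applet's implementation: the function name, the number of repetitions, and the seed are assumptions, and drawing each response independently with probability 0.78 stands in for sampling from the BIG hypothetical population.

```python
import random

def simulated_margin_of_error(sample_proportion, n, reps=10_000, seed=1):
    """Estimate a 95% margin of error for a sample proportion by simulation."""
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        # Each draw is a "success" with probability equal to the sample proportion.
        successes = sum(rng.random() < sample_proportion for _ in range(n))
        # How far this simulated sample proportion is from the known population value.
        errors.append(abs(successes / n - sample_proportion))
    errors.sort()
    # The distance that about 95% of the simulated proportions stay within.
    return errors[int(0.95 * reps) - 1]

moe = simulated_margin_of_error(0.78, 100)   # Bob or Tim: 78 of 100
print(f"simulated margin of error: {moe:.2f}")
```

Because simulated proportions with n = 100 move in steps of 0.01, the result typically lands on 0.07 or 0.08, and different seeds can give slightly different values.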


Motivating Margin of Error

http://www.rossmanchance.com/ISIapplets.html


Motivating Margin of Error

95% of simulated sample proportions were between 0.71 and 0.84. Since the actual proportion in the BIG hypothetical population was 0.78, we could say that about 95% of the simulated sample proportions were within about 0.07 of the actual population value.

Margin of error is 0.07.

So we think that our estimate of 0.78 is probably within about 0.07 of the actual proportion of adults who would choose the man on the left as Tim.


Motivating Margin of Error

Extension: If you have access to technology, have students each carry out a simulation to get their own margin of error estimates. Have them compare with other students so that they see that the simulation method tends to produce consistent results.

This also works for simulating the margin of error for estimating a population mean. But to create the BIG hypothetical population to sample from, we create a population that consists of a large number of copies of our sample (which we think is representative of the population). Sampling from this BIG hypothetical population is essentially equivalent to sampling with replacement from the original sample.
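The population-mean version can be sketched the same way, sampling with replacement from the original sample. The data values and the function name below are invented for illustration only; they do not come from the session.

```python
import random

def bootstrap_margin_of_error(sample, reps=5_000, seed=1):
    """Estimate a 95% margin of error for a sample mean by resampling."""
    rng = random.Random(seed)
    n = len(sample)
    sample_mean = sum(sample) / n
    errors = []
    for _ in range(reps):
        # Sampling with replacement from the original sample stands in for
        # sampling from many copies of it (the BIG hypothetical population).
        resample = rng.choices(sample, k=n)
        errors.append(abs(sum(resample) / n - sample_mean))
    errors.sort()
    return errors[int(0.95 * reps) - 1]

# Hypothetical data, invented only for this illustration (hours of sleep).
hours_of_sleep = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 9.0, 7.0, 6.5, 8.5, 7.0, 6.0]
moe = bootstrap_margin_of_error(hours_of_sleep)
print(f"sample mean: {sum(hours_of_sleep) / len(hours_of_sleep):.2f}")
print(f"simulated margin of error: {moe:.2f}")
```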


Motivating Margin of Error

“For all” stops here.

For AP Stat, can motivate margin of error this way and then move on to a more traditional approach. For example, using the large-sample formula for the margin of error when estimating a population proportion, the margin of error for the Bob or Tim example (n = 100, sample proportion 0.78) is 0.08, compared to the 0.07 from the simulation.
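The 0.08 figure can be checked with a short calculation. The sketch below assumes the usual large-sample formula, 1.96 times the square root of p(1 - p)/n, which the slide refers to but does not write out; the function name is hypothetical.

```python
import math

def large_sample_margin_of_error(sample_proportion, n, z=1.96):
    """95% margin of error for a proportion: z * sqrt(p * (1 - p) / n)."""
    return z * math.sqrt(sample_proportion * (1 - sample_proportion) / n)

moe = large_sample_margin_of_error(0.78, 100)
print(round(moe, 2))  # 0.08
```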


Using Simulation to Develop Concept of Statistical Significance

Could this have happened by chance when…?

Study to determine if reducing body temperature for three days would improve survival for newborn babies whose brains were temporarily deprived of oxygen as a result of complications at birth.

Reference: The New England Journal of Medicine, October 13, 2005, 1574-1584.

Infants were randomly assigned to a cooling group (102 infants) or a control group (103 infants), which makes this study an experiment.

Using Simulation to Develop Concept of Statistical Significance

Death or moderate to severe disability occurred in 45 of 102 infants in the cooling group (44%).

Death or moderate to severe disability occurred in 64 of 103 infants in the control group (62%).

Could this difference have happened just by chance if there is no real difference in the death and disability rates for the two experimental conditions? If not, we say that the difference is statistically significant.


Using Simulation to Develop Concept of Statistical Significance

Could this have happened by chance?

By chance, we mean due just to the way infants were assigned to the two groups.

If the cooling treatment has no effect, then the difference in the survival rates is just because more of the infants who were going to survive happened to be assigned to the cooling group. Is this a plausible explanation for the difference?

Let’s explore…


Using Simulation to Develop Concept of Statistical Significance

Start with a simpler version so that students understand the method:

4 of 10 in cooling group (40%)

6 of 10 in control group (60%)

Applet from http://www.rossmanchance.com/ISIapplets.html


Using Simulation to Develop Concept of Statistical Significance

[Applet screenshot: data from original groups]

[Applet screenshot: 20 infants re-randomized into 2 groups]

Could have happened by chance.


Using Simulation to Develop Concept of Statistical Significance

Now with real data

Unlikely to have occurred just by chance


Using Simulation to Develop Concept of Statistical Significance

Conclusions

The difference between the death and disability proportions for the two experimental conditions (cooling, control) is statistically significant.

By statistically significant, we mean that it is unlikely that we would observe a difference this large just due to chance.

Sample size plays an important role: a difference of -0.20 was not significant with sample sizes of 10, but a difference of -0.18 is significant with sample sizes of around 100.
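The two cases compared above can be reproduced with a short re-randomization simulation. This is a sketch under stated assumptions rather than the applet's own code: the function name, the repetition count, and the seed are made up, and the tail is taken as one-sided (differences at least as negative as the observed one).

```python
import random

def rerandomization_p_value(events_a, n_a, events_b, n_b, reps=10_000, seed=1):
    """Fraction of re-randomizations with a difference at least as negative as observed."""
    rng = random.Random(seed)
    observed = events_a / n_a - events_b / n_b
    total_events = events_a + events_b
    # 1 = death or disability, 0 = otherwise. If the treatment has no effect,
    # the outcomes are fixed and only the group labels change.
    outcomes = [1] * total_events + [0] * (n_a + n_b - total_events)
    count = 0
    for _ in range(reps):
        rng.shuffle(outcomes)                      # re-randomize infants to groups
        diff = sum(outcomes[:n_a]) / n_a - sum(outcomes[n_a:]) / n_b
        if diff <= observed + 1e-12:               # tolerance for float comparison
            count += 1
    return count / reps

small = rerandomization_p_value(4, 10, 6, 10)       # simplified version
real = rerandomization_p_value(45, 102, 64, 103)    # cooling experiment data
print(f"groups of 10:        {small:.3f}")
print(f"groups of about 100: {real:.3f}")
```

The first proportion comes out large (a difference of -0.20 happens often by chance with groups of 10), while the second is small, in line with the conclusions above.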


Using Simulation to Develop Concept of Statistical Significance

Extension: If you have access to technology, have students each carry out a simulation to draw their own conclusion about statistical significance. Have them compare with other students so that they see that the simulation method tends to produce consistent results.

This also works for simulating a difference in means for numerical data. If the treatment has no effect, we assume that each subject's numerical response would be the same no matter which treatment group the subject was assigned to. We investigate the question “could this have happened by chance when there is no treatment effect?” by randomly reassigning the observed response values to experimental groups and calculating the difference in means.
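A minimal sketch of the numerical version follows. The response values are invented for illustration, the function name is hypothetical, and the sketch counts differences at least as large as the observed one in either direction, which is one reasonable choice among several.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, reps=10_000, seed=1):
    """Fraction of random reassignments with a difference in means
    at least as large (in absolute value) as the observed one."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(reps):
        # Reassign the observed responses to the two groups at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed) - 1e-12:     # tolerance for float comparison
            count += 1
    return count / reps

# Hypothetical responses, invented only for this illustration.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.4, 11.8, 12.9, 10.7, 12.2]
p_value = permutation_p_value(treatment, control)
print(f"observed difference in means: {mean(treatment) - mean(control):.2f}")
print(f"proportion of reassignments at least this extreme: {p_value:.3f}")
```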


Using Simulation to Develop Concept of Statistical Significance

“For all” stops here.

For AP Stat, can motivate the idea of significance and the meaning of a p-value using this approach and then move on to a more traditional approach. For example, the p-value for the large-sample two-proportion z test for the cooling experiment data is 0.005, compared to 0.01 from the simulation.
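The 0.005 figure can be checked with a short calculation. The sketch below assumes the standard pooled two-proportion z test and a one-sided alternative (cooling lowers the death and disability rate); the slide does not say which alternative was used, so that choice is an assumption, as is the function name.

```python
import math

def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled large-sample z statistic and one-sided p-value for two proportions."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Area in one tail of the standard normal distribution beyond |z|.
    p_one_sided = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, p_one_sided

z, p = two_proportion_z_test(45, 102, 64, 103)   # cooling vs. control
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
```

With these counts the one-sided p-value rounds to 0.005; doubling it for a two-sided test would give roughly 0.01.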


Concluding Remarks

The two class activities/demonstrations (depending on your access to technology in the classroom) can be used to develop an understanding of margin of error and statistical significance that is consistent with the intent of the Common Core State Standards.

In a more advanced setting, such as AP Statistics, these activities can be used as a starting point to develop an understanding of the concepts before moving on to more formal methods for computing margin of error and p-values.


Thanks for attending this session!

Comments or questions?

[email protected]
