CAUSE Webinar: Introducing Math Majors to Statistics Allan Rossman and Beth Chance Cal Poly – San Luis Obispo April 8, 2008

Page 1

CAUSE Webinar: Introducing Math Majors to Statistics

Allan Rossman and Beth Chance

Cal Poly – San Luis Obispo

April 8, 2008

Page 2

Outline

- Goals
- Guiding principles
- Content of an example course
- Assessment
- Examples (four)

Page 3

Goals

Redesign introductory statistics course for mathematically inclined students in order to:

- Provide a balanced introduction to the practice of statistics at an appropriate mathematical level
- Offer a better alternative than the “Stat 101” or “Math Stat” sequence for math majors’ first statistics course

Page 4

Guiding principles (Overview)

1. Put students in role of active investigator
2. Motivate with real studies, genuine data
3. Repeatedly experience entire statistical process from data collection to conclusion
4. Emphasize connections among study design, inference technique, scope of conclusions
5. Use variety of computational tools
6. Investigate mathematical underpinnings
7. Introduce probability “just in time”

Page 5

Principle 1: Active investigator

- Curricular materials consist of investigations that lead students to discover statistical concepts and methods
- Students learn through constructing own knowledge, developing own understanding
  - Need direction, guidance to do that
- Students spend class time engaged with these materials, working collaboratively, with technology close at hand

Page 6

Principle 2: Real studies, genuine data

- Almost all investigations focus on a recent scientific study, existing data set, or student-collected data
- Statistics as a science
- Frequent discussions of data collection issues and cautions
- Wide variety of contexts, research questions

Page 7

Real studies, genuine data

- Popcorn and lung cancer
- Historical smoking studies
- Night lights and myopia
- Effect of observer with vested interest
- Kissing the right way
- Do pets resemble their owners
- Who uses shared armrest
- Halloween treats
- Heart transplant mortality
- Lasting effects of sleep deprivation
- Sleep deprivation and car crashes
- Fan cost index
- Drive for show, putt for dough
- Spock legal trial
- Hiring discrimination
- Comparison shopping
- Computational linguistics

Page 8

Principle 3: Entire statistical process

First two weeks:

- Data collection
  - Observation vs. experiment (confounding, random assignment vs. random sampling, bias)
- Descriptive analysis
  - Segmented bar graph
  - Conditional proportions, relative risk, odds ratio
- Inference
  - Simulating randomization test for p-value, significance
  - Hypergeometric distribution, Fisher’s exact test

Repeat, repeat, repeat, …

- Random assignment → dotplots/boxplots/means/medians → randomization test
- Sampling → bar graph → binomial → normal approximation

Page 9

Principle 4: Emphasize connections

- Emphasize connections among study design, inference technique, scope of conclusions
- Appropriate inference technique determined by randomness in data collection process
  - Simulation of randomization test (e.g., hypergeometric)
  - Repeated sampling from population (e.g., binomial)
- Appropriate scope of conclusion also determined by randomness in data collection process
  - Causation
  - Generalizability

Page 10

Principle 5: Variety of computational tools

- For analyzing data, exploring statistical concepts
- Assume that students have frequent access to computing
  - Not necessarily every class meeting in computer lab
- Choose right tool for task at hand
  - Analyzing data: statistics package (e.g., Minitab)
  - Exploring concepts: applets (interactivity, visualization)
  - Immediate updating of calculations: spreadsheet (Excel)

Page 11

Principle 6: Mathematical underpinnings

- Primary distinction from “Stat 101” course
- Some use of calculus but not much
- Assume some mathematical sophistication
  - E.g., function, summation, logarithm, optimization, proof
- Often occurs as follow-up homework exercises
- Examples
  - Counting rules for probability
    - Hypergeometric, binomial distributions
  - Principle of least squares, derivatives to find minimum
    - Univariate as well as bivariate setting
  - Margin-of-error as function of sample size, population parameters, confidence level
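The margin-of-error example can be explored numerically. The sketch below assumes the usual large-sample formula for a proportion, z* · sqrt(p(1 − p)/n); the slides do not state which formula the course uses, and the function name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p, confidence):
    """Large-sample margin of error for a sample proportion (assumed formula)."""
    # z* is the standard normal critical value for the given confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the margin of error.
moe_100 = margin_of_error(100, 0.5, 0.95)
moe_400 = margin_of_error(400, 0.5, 0.95)
```

Varying one argument at a time shows how the margin of error responds to sample size, the population proportion, and the confidence level.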

Page 12

Principle 7: Probability “just in time”

- Whither probability?
  - Not the primary goal
  - Studied as needed to address statistical issues
- Often introduced through simulation
  - Tactile and then computer-based
  - Addressing “how often would this happen by chance?”
- Examples
  - Hypergeometric distribution: Fisher’s exact test for 2×2 table
  - Binomial distribution: Sampling from random process
  - Continuous probability models as approximations

Page 13

Content of Example Course (ISCAM)

Columns: Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, Chapter 6

Data Collection (left to right):

- Observation vs. experiment, confounding, randomization
- Random sampling, bias, precision, nonsampling errors
- Paired data
- Independent random samples
- Bivariate

Descriptive Statistics (left to right):

- Conditional proportions, segmented bar graphs, odds ratio
- Quantitative summaries, transformations, z-scores, resistance
- Bar graph
- Models, probability plots, trimmed mean
- Scatterplots, correlation, simple linear regression

Probability (left to right):

- Counting, random variable, expected value
- Empirical rule
- Bernoulli processes, rules for variances, expected value
- Normal, Central Limit Theorem

Sampling/Randomization Distribution:

- Chapter 1: Randomization distribution for $\hat{p}_1 - \hat{p}_2$
- Chapter 2: Randomization distribution for $\bar{x}_1 - \bar{x}_2$
- Chapter 3: Sampling distribution for $X$, $\hat{p}$
- Chapter 4: Large sample sampling distributions for $\bar{x}$, $\hat{p}$
- Chapter 5: Sampling distributions of $\hat{p}_1 - \hat{p}_2$, OR, $\bar{x}_1 - \bar{x}_2$
- Chapter 6: Chi-square statistic, F statistic, regression coefficients

Model (left to right):

- Hypergeometric
- Binomial
- Normal, t
- Normal, t, log-normal
- Chi-square, F, t

Statistical Inference:

- Chapter 1: p-value, significance, Fisher’s Exact Test
- Chapter 2: p-value, significance, effect of variability
- Chapter 3: Binomial tests and intervals, two-sided p-values, type I/II errors
- Chapter 4: z-procedures for proportions, t-procedures, robustness, bootstrapping
- Chapter 5: Two-sample z- and t-procedures, bootstrap, CI for OR
- Chapter 6: Chi-square for homogeneity, independence, ANOVA, regression

Page 14

Assessments

- Investigations with summaries of conclusions
- Worked out examples
- Practice problems
  - Quick practice, opportunity for immediate feedback, adjustment to class discussion
- Homework exercises
- Technology explorations (labs)
  - e.g., comparison of sampling variability with stratified sampling vs. simple random sampling
- Student projects
  - Student-generated research questions, data collection plans, implementation, data analyses, report

Page 15

Example 1: Friendly Observers

- Psychology experiment
  - Butler and Baumeister (1998) studied the effect of observer with vested interest on skilled performance

                         A: vested interest   B: no vested interest   Total
Beat threshold                    3                     8               11
Do not beat threshold             9                     4               13
Total                            12                    12               24

$\hat{p}_A = .250$, $\hat{p}_B = .667$

- How often would such an extreme experimental difference occur by chance, if there was no vested interest effect?

Page 16

Example 1: Friendly Observers

- Students investigate this question through
  - Hands-on simulation (playing cards)
  - Computer simulation (Java applet)
  - Mathematical model: counting techniques

$$
p\text{-value} = P(X \le 3) = \frac{\binom{11}{3}\binom{13}{9} + \binom{11}{2}\binom{13}{10} + \binom{11}{1}\binom{13}{11} + \binom{11}{0}\binom{13}{12}}{\binom{24}{12}} \approx .0498
$$
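The card simulation and the counting calculation for this example can both be sketched in a few lines. This is an illustration only, not the applet the slide refers to: the function names are hypothetical, and the repetition count and seed are arbitrary choices.

```python
import random
from math import comb


def exact_p_value(successes=11, failures=13, group_size=12, observed=3):
    """Hypergeometric P(X <= observed): 12 of the 24 subjects land in group A."""
    total = comb(successes + failures, group_size)
    favorable = sum(
        comb(successes, k) * comb(failures, group_size - k)
        for k in range(observed + 1)
    )
    return favorable / total


def simulated_p_value(reps=10_000, seed=1):
    """Shuffle 11 'beat' cards and 13 'did not beat' cards; deal 12 to group A."""
    rng = random.Random(seed)
    cards = [1] * 11 + [0] * 13
    hits = 0
    for _ in range(reps):
        rng.shuffle(cards)
        if sum(cards[:12]) <= 3:  # group A gets 3 or fewer successes
            hits += 1
    return hits / reps


p_exact = exact_p_value()
p_sim = simulated_p_value()
```

The empirical proportion should land close to the exact hypergeometric value, which is the link between the tactile simulation and the probability model.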

Page 17

Example 1: Friendly Observers

- Focus on statistical process
  - Data collection, descriptive statistics, inferential analysis
  - Arising from genuine research study
- Connection between the randomization in the design and the inference procedure used
- Scope of conclusions depends on study design
  - Cause/effect inference is valid
- Use of simulation motivates the derivation of the mathematical probability model
- Investigate/answer real research questions in first two weeks

Page 18

Example 2: Sleep Deprivation

- Physiology experiment
  - Stickgold, James, and Hobson (2000) studied the long-term effects of sleep deprivation on a visual discrimination task (3 days later!)

sleep condition    n    Mean    StDev   Median   IQR
deprived          11    3.90    12.17    4.50    20.7
unrestricted      10   19.82    14.73   16.55    19.53

- How often would such an extreme experimental difference occur by chance, if there was no sleep deprivation effect?

Page 19

Example 2: Sleep Deprivation

- Students investigate this question through
  - Hands-on simulation (index cards)
  - Computer simulation (Minitab)
  - Mathematical model

[Figure: simulated randomization distribution with the observed difference in means, 15.92, marked; p-values shown: .0072 and .002]
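The index-card simulation for a difference in means can be sketched as follows. The slides give only summary statistics, so the numbers used here are placeholder values for illustration and are not the study's data; the function name `randomization_p_value`, the repetition count, and the seed are likewise assumptions.

```python
import random
from statistics import mean


def randomization_p_value(group_a, group_b, reps=10_000, seed=1):
    """One-sided empirical p-value for mean(group_b) - mean(group_a)."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(reps):
        # Re-deal the pooled responses to the two groups at random.
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / reps


# Placeholder values for illustration only, NOT the study's data.
toy_deprived = [1.0, 2.0, 3.0]
toy_unrestricted = [7.0, 8.0, 9.0]
toy_p = randomization_p_value(toy_deprived, toy_unrestricted)
```

With these placeholder values only 1 of the 20 possible re-assignments is as extreme as the observed split, so the empirical p-value should be near 0.05.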

Page 20

Example 2: Sleep Deprivation

- Experience the entire statistical process again
- Develop deeper understanding of key ideas (randomization, significance, p-value)
- Tools change, but reasoning remains same
  - Tools based on research study, question – not for their own sake
- Simulation as a problem solving tool
  - Empirical vs. exact p-values

Page 21

Example 3: Infants’ Social Evaluation

- Sociology study
  - Hamlin, Wynn, Bloom (2007) investigated whether infants would prefer a toy showing “helpful” behavior to a toy showing “hindering” behavior
  - Infants were shown a video with these two kinds of toys, then asked to select one
- 14 of 16 10-month-olds selected helper
- Is this result surprising enough (under null model of no preference) to indicate a genuine preference for the helper toy?

Page 22

Example 3: Infants’ Social Evaluation

- Simulate with coin flipping
- Then simulate with applet

Page 23

Example 3: Infants’ Social Evaluation

- Then learn binomial distribution, calculate exact p-value

$$
p\text{-value} = P(X \ge 14) = \binom{16}{14}(.5)^{14}(1-.5)^{2} + \binom{16}{15}(.5)^{15}(1-.5)^{1} + \binom{16}{16}(.5)^{16}(1-.5)^{0} \approx .0021
$$

[Figure: Distribution Plot, Binomial, n=16, p=0.5; horizontal axis: X = number who choose helper toy; vertical axis: Probability; shaded probability at X ≥ 14 is 0.00209]
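Both the coin-flipping simulation and the exact binomial calculation for 14 or more helper choices out of 16 can be sketched in code. This is an illustration under the null model of no preference (p = 0.5), not the applet from the webinar; the function names, repetition count, and seed are assumptions.

```python
import random
from math import comb


def binomial_tail(n=16, k=14, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def coin_flip_p_value(n=16, k=14, reps=100_000, seed=1):
    """Flip n fair coins repeatedly; count how often at least k come up heads."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        if heads >= k:
            hits += 1
    return hits / reps


p_exact = binomial_tail()
p_sim = coin_flip_p_value()
```

The exact tail probability is 137/65536, which rounds to the .0021 reported on the slide; the simulated proportion should be close to it.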

Page 24

Example 3: Infants’ Social Evaluation

- Learn probability distribution to answer inference question from research study
- Again the analysis is completed with
  - Tactile simulation
  - Technology simulation
  - Mathematical model
- Modeling process of statistical investigation
  - Examination of methodology, further questions in study
- Follow-ups
  - Different number of successes
  - Different sample size

Page 25

Example 4: Sleepless Drivers

- Sociology case-control study
  - Connor et al (2002) investigated whether those in recent car accidents had been more sleep deprived than a control group of drivers

                               No full night’s sleep   At least one full night’s   Sample sizes
                               in past week            sleep in past week
“case” drivers (crash)                  61                      510                    571
“control” drivers (no crash)            44                      544                    588

Page 26

Example 4: Sleepless Drivers

- Sample proportion that were in a car crash
  - Sleep deprived: .581
  - Not sleep deprived: .484
- Odds ratio: 1.48
- How often would such an extreme observed odds ratio occur by chance, if there was no sleep deprivation effect?

[Figure: segmented bar graph (0% to 100%) of crash vs. no crash for “No full night’s sleep in past week” and “At least one full night’s sleep in past week”]
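The descriptive summaries on this slide follow directly from the 2×2 table of counts. A minimal sketch, using the counts from the table and hypothetical variable names:

```python
# Counts from the 2x2 table in the slides.
a, b = 61, 510  # "case" drivers: no full night's sleep, at least one
c, d = 44, 544  # "control" drivers: same two columns

# Proportion in a crash within each sleep group (column-wise).
prop_crash_deprived = a / (a + c)
prop_crash_rested = b / (b + d)

# Odds ratio as the cross-product ratio of the table.
odds_ratio = (a * d) / (b * c)
```

Rounded, these reproduce the .581, .484, and 1.48 shown on the slide.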

Page 27

Example 4: Sleepless Drivers

- Students investigate this question through
  - Computer simulation (Minitab)
    - Empirical sampling distribution of odds-ratio
    - Empirical p-value
  - Approximate mathematical model

[Figure: empirical sampling distribution of the odds ratio, with the observed value 1.48 marked]

Page 28

Example 4: Sleepless Drivers

- SE(log-odds) = $\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$, where a, b, c, d are the four cell counts of the 2×2 table
- Confidence interval for population log odds: sample log-odds ± z* SE(log-odds)
- Back-transformation
  - 90% CI for odds ratio: 1.05 – 2.08
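The interval on this slide can be reproduced from the four cell counts: compute the interval on the log scale, then exponentiate the endpoints. A minimal sketch, with a hypothetical function name and the table counts as inputs:

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, confidence=0.90):
    """Confidence interval for an odds ratio, built on the log scale."""
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Back-transform the log-scale endpoints to the odds-ratio scale.
    return exp(log_or - z_star * se), exp(log_or + z_star * se)


lower, upper = odds_ratio_ci(61, 510, 44, 544)
```

Rounded to two decimals, the endpoints match the 90% interval of 1.05 to 2.08 reported on the slide.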

Page 29

Example 4: Sleepless Drivers

- Students understand process through which they can investigate statistical ideas
- Students piece together powerful statistical tools learned throughout the course to derive new (to them) procedures
  - Concepts, applications, methods, theory

Page 30

For more information

- Investigating Statistical Concepts, Applications, and Methods (ISCAM), Cengage Learning, www.cengage.com
- Instructor resources: www.rossmanchance.com/iscam/
  - Solutions to investigations, practice problems, homework exercises
  - Instructor’s guide
  - Sample syllabi
  - Sample exams