Tests for Continuous Outcomes II
Overview of common statistical tests

| Outcome variable | Independent observations | Correlated observations | Assumptions |
|---|---|---|---|
| Continuous (e.g., blood pressure, age, pain score) | Ttest; ANOVA; linear correlation; linear regression | Paired ttest; repeated-measures ANOVA; mixed models/GEE modeling | Outcome is normally distributed (important for small samples). Outcome and predictor have a linear relationship. |
| Binary or categorical (e.g., breast cancer yes/no) | Chi-square test; relative risks; logistic regression | McNemar’s test; conditional logistic regression; GEE modeling | Chi-square test assumes sufficient numbers in each cell (>=5) |
| Time-to-event (e.g., time-to-death, time-to-fracture) | Kaplan-Meier statistics; Cox regression | n/a | Cox regression assumes proportional hazards between groups |
Continuous outcome (means)
Outcome variable: continuous (e.g., blood pressure, age, pain score)
Are the observations correlated?
Independent:
Ttest: compares means between two independent groups
ANOVA: compares means between more than two independent groups
Pearson’s correlation coefficient (linear correlation): shows linear correlation between two continuous variables
Linear regression: multivariate regression technique when the outcome is continuous; gives slopes or adjusted means
Correlated:
Paired ttest: compares means between two related groups (e.g., the same subjects before and after)
Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups
Alternatives if the normality assumption is violated (and small n), non-parametric statistics:
Wilcoxon signed-rank test: non-parametric alternative to the paired ttest
Wilcoxon rank-sum test (=Mann-Whitney U test): non-parametric alternative to the ttest
Kruskal-Wallis test: non-parametric alternative to ANOVA
Spearman rank correlation coefficient: non-parametric alternative to Pearson’s correlation coefficient
Divalproex vs. placebo for treating bipolar depression
Davis et al. “Divalproex in the treatment of bipolar depression: A placebo controlled study.” J Affective Disorders 85 (2005) 259-266.
Repeated-measures ANOVA
Statistical question: Do subjects in the treatment group have greater reductions in depression scores over time than those in the control group?
What is the outcome variable? Depression score
What type of variable is it? Continuous
Is it normally distributed? Yes
Are the observations correlated? Yes, there are multiple measurements on each person
How many time points are being compared? >2
→ repeated-measures ANOVA
Repeated-measures ANOVA
For before and after studies, a paired ttest will suffice.
For more than two time periods, you need repeated-measures ANOVA.
Running serial paired ttests is incorrect, because this strategy inflates your type I error.
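To make the type I error inflation concrete, here is a rough sketch that counts the paired comparisons among four hypothetical time points and computes the chance of at least one false positive. It assumes the tests are independent and each is run at alpha = 0.05; repeated measurements on the same subjects are not independent, so treat the number as an illustration rather than an exact rate.

```python
from itertools import combinations


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical time points; any four repeated measurements give the same count.
time_points = ["time1", "time2", "time3", "time4"]
pairs = list(combinations(time_points, 2))

fwer = familywise_error_rate(len(pairs))
print(f"{len(pairs)} paired comparisons -> family-wise error rate of about {fwer:.3f}")
```

With six comparisons, the nominal 5% error rate grows to roughly one in four under these assumptions.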
Repeated-measures ANOVA
Answers the following questions, taking into account the correlation within subjects:
Are there significant differences across time periods?
Are there significant differences between groups (=your categorical predictor)?
Are there significant differences between groups in their changes over time?
Two groups (e.g., treatment vs. placebo)

| id | group | time1 | time2 | time3 | time4 |
|---|---|---|---|---|---|
| 1 | A | 31 | 29 | 15 | 26 |
| 2 | A | 24 | 28 | 20 | 32 |
| 3 | A | 14 | 20 | 28 | 30 |
| 4 | B | 38 | 34 | 30 | 34 |
| 5 | B | 25 | 29 | 25 | 29 |
| 6 | B | 30 | 28 | 16 | 34 |

Hypothetical data: measurements of depression scores over time in treatment (A) and placebo (B).
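As a small sketch, the group means at each time point can be computed directly from the hypothetical data above. The dictionary layout and the function name `group_means` are illustrative choices, not part of the original slides.

```python
from statistics import mean

# Hypothetical depression scores from the table: id -> (group, [time1..time4])
data = {
    1: ("A", [31, 29, 15, 26]),
    2: ("A", [24, 28, 20, 32]),
    3: ("A", [14, 20, 28, 30]),
    4: ("B", [38, 34, 30, 34]),
    5: ("B", [25, 29, 25, 29]),
    6: ("B", [30, 28, 16, 34]),
}


def group_means(records):
    """Return {group: [mean at time1, ..., mean at time4]}."""
    by_group = {}
    for group, scores in records.values():
        by_group.setdefault(group, []).append(scores)
    # zip(*rows) turns the per-subject rows into per-time-point columns
    return {g: [mean(col) for col in zip(*rows)] for g, rows in by_group.items()}


means = group_means(data)
for group, values in sorted(means.items()):
    print(group, [round(v, 1) for v in values])
```

These per-group means are the values a mean plot by group would display; they do not by themselves test for significance.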
Profile plots by group
[Figure: profile plots over time for groups A and B]
Mean plots by group
[Figure: mean plots over time for groups A and B]
Repeated measures ANOVA tells you if and how these two profile plots differ…
Possible questions…
Overall, are there significant differences between time points?
From plots: looks like some differences (time3 and 4 look different)
Do the two groups differ at any time points?
From plots: certainly at baseline; some difference everywhere
Do the two groups differ in their responses over time?**
From plots: their response profile looks similar over time, though A and B are closer by the end.
Repeated-measures ANOVA…
Overall, are there significant differences between time points? → Time factor
Do the two groups differ at any time points? → Group factor
Do the two groups differ in their responses over time?** → Group x time factor
From rANOVA analysis…
Overall, are there significant differences between time points? No, Time not statistically significant (p=.1743)
Do the two groups differ at any time points? No, Group not statistically significant (p=.1408)
Do the two groups differ in their responses over time?** No, not even close; Group*Time (p-value>.60)
rANOVA
Time is significant.
Group*time is significant.
Group is not significant.
rANOVA
Time is not significant.
Group*time is not significant.
Group IS significant.
rANOVA
Time is significant.
Group is not significant.
Time*group is not significant.
Homeopathy vs. placebo in treating pain after surgery
[Figure: mean pain assessments by visual analogue scales (VAS) on the day of surgery and on days 1-7 after surgery (morning and evening)]
p>.05; rANOVA (Group x Time)
Copyright ©1995 BMJ Publishing Group Ltd. Lokken, P. et al. BMJ 1995;310:1439-1442
Pint of milk vs. control on bone acquisition in adolescent females
[Figure: mean (SE) percentage increases in total body bone mineral and bone density over 18 months. P values are for the differences between groups by repeated measures analysis of variance]
Copyright ©1997 BMJ Publishing Group Ltd. Cadogan, J. et al. BMJ 1997;315:1255-1260
Counseling vs. control on smoking in pregnancy
P<.05; rANOVA
Copyright ©2000 BMJ Publishing Group Ltd. Hovell, M. F et al. BMJ 2000;321:337-342
Review Question 1
In a study of depression, I measured depression score (a continuous, normally distributed variable) at baseline; 1 month; 6 months; and 12 months. What statistical test will best tell me whether or not depression improved between baseline and the end of the study?
a. Repeated-measures ANOVA.
b. One-way ANOVA.
c. Two-sample ttest.
d. Paired ttest.
e. Wilcoxon rank-sum test.
Review Question 2
In the same depression study, what statistical test will best tell me whether or not two treatments for depression had different effects over time?
a. Repeated-measures ANOVA.
b. One-way ANOVA.
c. Two-sample ttest.
d. Paired ttest.
e. Wilcoxon rank-sum test.
Example: class data
Political Leanings and Rating of Obama
r=.79, p<.0001
Example 2: pain and injection pressure
r=.75, p<.0001
Correlation coefficient
Statistical question: Is injection pressure related to pain?
What is the outcome variable? VAS pain score
What type of variable is it? Continuous
Is it normally distributed? Yes
Are the observations correlated? No
Are groups being compared? No; the independent variable is also continuous
→ correlation coefficient
New concept: Covariance

$$\operatorname{cov}(X,Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$
Interpreting Covariance
Covariance between two random variables:
cov(X,Y) > 0: X and Y tend to move in the same direction
cov(X,Y) < 0: X and Y tend to move in opposite directions
cov(X,Y) = 0: X and Y have no linear relationship (independent variables have zero covariance, but zero covariance does not by itself imply independence)
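A minimal sketch of the sample covariance, using small made-up data sets chosen to show a positive value, a negative value, and a zero value. The last example (Y = X squared) is fully determined by X yet has zero covariance, because covariance only picks up linear association. The function name `covariance` and all data are illustrative.

```python
def covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


x = [-2, -1, 0, 1, 2]

same_direction = covariance(x, [2 * v for v in x])       # Y rises with X
opposite_direction = covariance(x, [-2 * v for v in x])  # Y falls as X rises
parabola = covariance(x, [v * v for v in x])             # Y depends on X, but not linearly

print(same_direction, opposite_direction, parabola)
```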
Correlation coefficient
Pearson’s Correlation Coefficient is standardized covariance (unitless):

$$r = \frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}}$$
Correlation
Measures the relative strength of the linear relationship between two variables
Unit-less
Ranges between –1 and 1
The closer to –1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker any linear relationship
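As a sketch, Pearson's r can be computed as the covariance divided by the square root of the product of the two variances. The helper names and the toy data below are illustrative only.

```python
from math import sqrt


def sample_cov(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Standardized covariance: cov(x, y) / sqrt(var(x) * var(y))."""
    # The variance of a variable is its covariance with itself.
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


x = [1, 2, 3, 4, 5]
r_perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])
r_perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])
r_moderate = pearson_r(x, [2, 1, 4, 3, 5])

print(r_perfect_positive, r_perfect_negative, r_moderate)
```

Rescaling either variable (for example, switching units) changes the covariance but leaves r unchanged, which is what "unit-less" means here.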
Scatter Plots of Data with Various Correlation Coefficients
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, r = +1, and r = 0]
** Next 4 slides from “Statistics for Managers” 4th Edition, Prentice-Hall 2004
Linear Correlation
[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]
Linear Correlation
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
Linear Correlation
[Figure: scatter plots of Y against X showing no relationship]
Review Problem 3
What’s a good guess for the Pearson’s correlation coefficient (r) for this scatter plot?
a. –1.0
b. +1.0
c. 0
d. -.5
e. -.1
Example: class data
Political Leanings and Rating of Obama
Expected Obama Rating = 3.0 + .66*political bent
Example 2: pain and injection pressure
R-squared = correlation coefficient squared. Meaning: the percent of variance in Y that is “explained by” X.
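As a quick worked example, squaring the correlation reported earlier for pain and injection pressure (r = .75) gives the share of variance explained. The variable names are illustrative.

```python
r = 0.75            # correlation between injection pressure and pain, as reported above
r_squared = r ** 2  # proportion of variance in Y "explained by" X

print(f"R-squared = {r_squared:.4f}, i.e. about {r_squared:.0%} of the variance in pain")
```

So a correlation of .75 corresponds to a little over half of the variance in pain being "explained by" injection pressure.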
Simple linear regression
Statistical question: Does injection pressure “predict” pain?
What is the outcome variable? VAS pain score
What type of variable is it? Continuous
Is it normally distributed? Yes
Are the observations correlated? No
Are groups being compared? No; the independent variable is also continuous
→ simple linear regression
Linear regression
In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (=predictor) variable (X) and the other the dependent (=outcome) variable (Y).
What is “Linear”?
Remember this: Y=mX+B?
[Figure: a straight line with slope m and Y-intercept B]
What’s Slope?
A slope of 0.66 means that every 1-unit change in X yields a .66-unit change in Y.
Simple linear regression
The linear regression model:
Ratings of Obama = 3.0 + 0.66*(political bent)
intercept = 3.0; slope = 0.66
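A minimal sketch of how the fitted line is used for prediction, with the intercept and slope taken from the model above. The input values are hypothetical, since the scale of the political bent variable is not stated here.

```python
INTERCEPT = 3.0  # expected rating when political bent is 0
SLOPE = 0.66     # change in expected rating per 1-unit change in political bent


def expected_rating(political_bent):
    """Expected Obama rating from the fitted line."""
    return INTERCEPT + SLOPE * political_bent


# Hypothetical inputs, chosen only to illustrate the slope.
low, high = 4, 5
difference = expected_rating(high) - expected_rating(low)
print(f"Expected rating at {low}: {expected_rating(low):.2f}")
print(f"Expected rating at {high}: {expected_rating(high):.2f}")
print(f"Difference for a 1-unit change: {difference:.2f}")
```

Whatever pair of inputs one unit apart is chosen, the difference in expected rating equals the slope.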
Simple linear regression
Expected Sleep = 7.5 - 0.03*Hours of homework/week
Every additional hour of weekly homework costs you about 2 minutes of sleep per night (14 minutes of sleep per week) (p=.12).
Sleep versus Homework
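The "about 2 minutes" figure follows from converting the slope into minutes. This sketch assumes that Expected Sleep is measured in hours per night, which the slide does not state explicitly.

```python
slope_hours_per_night = 0.03  # magnitude of the slope; assumed to be in hours of sleep per night

minutes_per_night = slope_hours_per_night * 60
minutes_per_week = minutes_per_night * 7

print(f"{minutes_per_night:.1f} minutes per night, {minutes_per_week:.1f} minutes per week")
```

The unrounded values are 1.8 minutes per night and 12.6 minutes per week; the slide's 14 minutes per week comes from rounding to 2 minutes per night first.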
Simple linear regression
Expected Wake-up Time = 8:06 - 0:11*Hours of exercise/week
Every additional hour of weekly exercise costs you about 11 minutes of sleep in the morning (p=.0015).
Wake-up Time versus Exercise
More about the model…
The distribution of baby weights at Stanford ~ N(3400, 360000)
Your “Best guess” at a random baby’s weight, given no information about the baby, is what?
3400 grams
But, what if you have relevant information? Can you make a better guess?
Predictor variable X=gestation time
Assume that babies that gestate for longer are born heavier, all other things being equal.
Pretend (at least for the purposes of this example) that this relationship is linear.
Example: suppose a one-week increase in gestation, on average, leads to a 100-gram increase in birth-weight
Y depends on X
[Figure: scatter plot of Y = birth-weight (g) against X = gestation time (weeks), with the best-fit line]
Best fit line is chosen such that the sum of the squared (why squared?) distances of the points (Yi’s) from the line is minimized.
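The least-squares criterion has a closed-form solution for a single predictor: the slope is the sum of cross-products of deviations divided by the sum of squared deviations in X, and the intercept makes the line pass through the point of means. The sketch below uses three made-up (gestation, weight) points; the data and function names are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared distances of the points from the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: gestation time (weeks) and birth-weight (g)
weeks = [20, 30, 40]
grams = [2100, 2900, 4000]

intercept, slope = least_squares_fit(weeks, grams)
best = sum_squared_errors(weeks, grams, intercept, slope)
nudged = sum_squared_errors(weeks, grams, intercept, slope + 1)
print(f"intercept={intercept}, slope={slope}, SSE={best}, SSE with nudged slope={nudged}")
```

Nudging the slope away from the fitted value raises the sum of squared errors, which is the sense in which the fitted line is "best".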
Prediction
A new baby is born that had gestated for just 30 weeks. What’s your best guess at the birth-weight?
Are you still best off guessing 3400? NO!
At 30 weeks…
[Figure: scatter plot of Y = birth-weight (g) against X = gestation time (weeks); at X = 30 the marked point is (x,y) = (30,3000)]
At 30 weeks…
The babies that gestate for 30 weeks appear to center around a weight of 3000 grams.
In Math-Speak… E(Y/X=30 weeks)=3000 grams
Note the conditional expectation
But…
Note that not every Y-value (Yi) sits on the line. There’s variability.
Yi = 3000 + random error_i
In fact, babies that gestate for 30 weeks have birth-weights that center at 3000 grams, but vary around 3000 with some variance σ².
Approximately what distribution do birth-weights follow? Normal. Y/X=30 weeks ~ N(3000, σ²)
If X=20, 30, or 40…
[Figure: Y = baby weights (g) against X = gestation times (weeks), marked at X = 20, 30, and 40]
Y/X=20 weeks ~ N(2000, σ²)
Y/X=30 weeks ~ N(3000, σ²)
Y/X=40 weeks ~ N(4000, σ²)
The standard error of Y given X is the average variability around the regression line at any given value of X. It is assumed to be equal at all values of X.
[Figure: Y = baby weights (g) against X = gestation times (weeks), with Sy/x marked around the regression line at X = 20, 30, and 40]
![Page 60: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/60.jpg)
Linear Regression Model
Y’s are modeled…
$Y_i = 100 \cdot X + \text{random error}_i$
- $100 \cdot X$: fixed – exactly on the line
- $\text{random error}_i$: follows a normal distribution
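To make the model concrete, here is a hedged sketch that generates data from it and then recovers the line by ordinary least squares. The error standard deviation (300 g), the sample size, and the function name `fit_line` are illustrative assumptions, not values from the slides. The last quantity returned is an estimate of $S_{y/x}$, the spread around the line.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard deviation: spread around the fitted line (n - 2 df)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(sse / (n - 2))
    return intercept, slope, s_yx

rng = random.Random(1)
TRUE_SIGMA = 300.0  # assumed error SD in grams (hypothetical)
weeks = [rng.uniform(20, 40) for _ in range(2000)]
# Fixed part (100 * X) plus a normally distributed random error
weights = [100.0 * x + rng.gauss(0.0, TRUE_SIGMA) for x in weeks]

intercept, slope, s_yx = fit_line(weeks, weights)
print(f"slope={slope:.1f} g/week, intercept={intercept:.0f} g, S_y/x={s_yx:.0f} g")
```

The fitted slope should land close to 100 g/week and the residual standard deviation close to the assumed 300 g.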
![Page 61: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/61.jpg)
Review Problem 4
Using the regression equation: Y/X = 100 grams/week * X weeks
What is the expected weight of a baby born at 22 weeks?
a. 2000g
b. 2100g
c. 2200g
d. 2300g
e. 2400g
![Page 63: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/63.jpg)
Review Problem 5
Our model predicts that:
a. All babies born at 22 weeks will weigh 2200 grams.
b. Babies born at 22 weeks will have a mean weight of 2200 grams with some variation.
c. Both of the above.
d. None of the above.
![Page 65: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/65.jpg)
Assumptions (or the fine print)
Linear regression assumes that…
1. The relationship between X and Y is linear
2. Y is distributed normally at each value of X
3. The variance of Y at every value of X is the same (homogeneity of variances)
![Page 66: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/66.jpg)
Non-homogeneous variance
[Figure: birth-weight Y (100 g) versus gestation time X (weeks).]
![Page 67: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/67.jpg)
Residual
Residual = observed value – predicted value
At 33.5 weeks gestation, predicted baby weight is 3350 grams.
This baby was actually 3380 grams.
His residual is +30 grams (3380 grams – 3350 grams).
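The residual calculation can be written out as a short sketch. The function names are hypothetical, and the line used for prediction is the 100 grams/week equation from the earlier slides.

```python
def predicted_weight(weeks):
    """Expected birth weight (g) under the slides' line: 100 g per week."""
    return 100.0 * weeks

def residual(observed, predicted):
    """Residual = observed value - predicted value."""
    return observed - predicted

pred = predicted_weight(33.5)
res = residual(3380.0, pred)
print(pred, res)  # 3350.0 30.0
```

A positive residual means the baby was heavier than the line predicts; a negative one means lighter.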
![Page 68: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/68.jpg)
Review Problem 6
A medical journal article reported the following linear regression equation:
Cholesterol = 150 + 2*(age past 40)
Based on this model, what is the expected cholesterol for a 60 year old?
a. 150
b. 370
c. 230
d. 190
e. 200
![Page 70: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/70.jpg)
Review Problem 7
If a particular 60 year old in your study sample had a cholesterol of 250, what is his/her residual?
a. +50
b. -50
c. +60
d. -60
e. 0
![Page 72: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/72.jpg)
A t-test is linear regression!
In our class, average drinking among Democrats (politics 6-10, n=17) was 2.4 drinks/week; among Republicans (n=4), it was 0.3 drinks/week.
We can evaluate these data with a t-test *assuming alcohol consumption is normally distributed*:
$$t_{19} = \frac{2.4 - 0.3}{\text{s.e.}} = \frac{2.1}{0.96} \approx 2.3; \qquad p = 0.036$$
![Page 73: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/73.jpg)
As a linear regression…
alcohol = 0.3 + 2.1*(1=Democrat; 0=not)
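The equivalence can be checked numerically. The sketch below uses made-up drinking data (not the class data) and hypothetical function names. It compares a pooled, equal-variance two-sample t statistic with the t statistic for the slope when the outcome is regressed on a 0/1 group indicator. The match holds for the pooled form of the t-test; an unequal-variance t-test would differ slightly.

```python
import math

def pooled_t(group1, group0):
    """Equal-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n1, n0 = len(group1), len(group0)
    m1, m0 = sum(group1) / n1, sum(group0) / n0
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n0))
    return (m1 - m0) / se

def regress_on_indicator(group1, group0):
    """Least-squares fit of outcome on x (1 = group1, 0 = group0)."""
    xs = [1] * len(group1) + [0] * len(group0)
    ys = list(group1) + list(group0)
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, slope / se_slope

# Hypothetical drinks/week (illustrative only)
democrats = [0, 1, 2, 2, 3, 3, 4, 5]
republicans = [0, 0, 0, 1]

t_test = pooled_t(democrats, republicans)
intercept, slope, t_slope = regress_on_indicator(democrats, republicans)
print(intercept, slope, t_test, t_slope)
```

The intercept is the mean of the group coded 0, the slope is the difference in group means, and the two t statistics agree.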
![Page 74: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/74.jpg)
ANOVA is linear regression!
A categorical variable with more than two groups:
E.g.: very right, middle, very left (mutually exclusive)
$Y = \alpha \text{ (= value for very right)} + \beta_1 \cdot (1 \text{ if middle}) + \beta_2 \cdot (1 \text{ if very left})$
This is called “dummy coding”, where multiple binary variables are created to represent being in each category (or not) of a categorical variable.
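A small sketch of dummy coding, using made-up data and hypothetical names (`LEVELS`, `dummy_code`, `predict`). Rather than running a fitting routine, it relies on a standard property of this model: when a single categorical predictor is dummy coded, the least-squares intercept equals the reference group's mean and each coefficient equals that group's mean minus the reference mean.

```python
from statistics import mean

LEVELS = ("very right", "middle", "very left")  # first level is the reference

def dummy_code(category):
    """Return (is_middle, is_very_left); the reference level maps to (0, 0)."""
    if category not in LEVELS:
        raise ValueError(f"unknown category: {category!r}")
    return (int(category == "middle"), int(category == "very left"))

# Hypothetical outcome values by political leaning (made-up data)
data = [
    ("very right", 0), ("very right", 1),
    ("middle", 2), ("middle", 4),
    ("very left", 3), ("very left", 5),
]

group_mean = {lvl: mean(y for c, y in data if c == lvl) for lvl in LEVELS}
alpha = group_mean["very right"]          # value for the reference group
beta1 = group_mean["middle"] - alpha      # middle vs. very right
beta2 = group_mean["very left"] - alpha   # very left vs. very right

def predict(category):
    d_middle, d_left = dummy_code(category)
    return alpha + beta1 * d_middle + beta2 * d_left

for lvl in LEVELS:
    print(lvl, dummy_code(lvl), predict(lvl))
```

Each prediction reproduces the corresponding group mean, which is why ANOVA can be expressed as a regression on dummy variables.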
![Page 75: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/75.jpg)
Multiple Linear Regression
More than one predictor…
$Y = \alpha + \beta_1 X + \beta_2 W + \beta_3 Z$
Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
![Page 76: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/76.jpg)
Functions of multivariate analysis:
- Control for confounders
- Test for interactions between predictors (effect modification)
- Improve predictions
![Page 77: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/77.jpg)
Example: multivariate linear regression
What predicts wake-up time? Fit a multivariate model with both exercise and alcohol in the model…
Expected Wake-up Time = 7:54 - 0:10*Hours of exercise/week + 0:04*drinks/week
- R²=44% (we’ve explained 44% of the variance in wake-up time)
- After adjusting for alcohol, you lose 10 minutes of sleep in the morning for each additional hour of exercise (p<.05)...
- After adjusting for exercise, you gain 4 minutes of sleep in the morning for every weekly drink (p>.05)...
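The fitted equation can be turned into a small calculator. This is a sketch with a hypothetical function name; the coefficients are the ones reported on the slide, converted to minutes.

```python
def expected_wakeup(exercise_hours_per_week, drinks_per_week):
    """Expected wake-up time as 'H:MM' from the slide's fitted equation."""
    minutes = 7 * 60 + 54                    # intercept: 7:54
    minutes -= 10 * exercise_hours_per_week  # 10 minutes earlier per hour of exercise
    minutes += 4 * drinks_per_week           # 4 minutes later per weekly drink
    hours, mins = divmod(int(round(minutes)), 60)
    return f"{hours}:{mins:02d}"

print(expected_wakeup(0, 0))  # 7:54
print(expected_wakeup(3, 5))  # 7:44
print(expected_wakeup(4, 5))  # 7:34
```

Comparing the last two calls shows the "held constant" interpretation: with drinks fixed at 5 per week, one more hour of exercise moves the expected wake-up time 10 minutes earlier.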
![Page 78: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/78.jpg)
Review Problem 8
A medical journal article reported the following linear regression equation:
Cholesterol = 150 + 2*(age past 40) + 10*(gender: 1=male, 0=female)
Based on this model, what is the expected cholesterol for a 60 year-old man?
a. 150
b. 370
c. 230
d. 190
e. 200
![Page 80: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/80.jpg)
Table 3. Relationship of Combinations of Macronutrients to BP (SBP and DBP) for 11 342 Men, Years 1 Through 6 of MRFIT: Multiple Linear Regression Analyses

Linear Regression Coefficient (z Score)

| Variable | SBP | DBP |
|---|---|---|
| Model 1 | | |
| Total protein, % kcal | -0.0346 (-1.10) | -0.0568 (-3.17) |
| Cholesterol, mg/1000 kcal | 0.0039 (2.46) | 0.0032 (3.51) |
| Saturated fatty acids, % kcal | 0.0755 (1.45) | 0.0848 (2.86) |
| Polyunsaturated fatty acids, % kcal | 0.0100 (0.24) | -0.0284 (-1.22) |
| Starch, % kcal | 0.1366 (4.98) | 0.0675 (4.34) |
| Other simple carbohydrates, % kcal | 0.0327 (1.35) | 0.0006 (0.04) |
| Model 2 | | |
| Total protein, % kcal | -0.0344 (-1.10) | -0.0489 (-2.77) |
| Cholesterol, mg/1000 kcal | 0.0034 (2.14) | 0.0029 (3.19) |
| Saturated fatty acids, % kcal | 0.0786 (1.73) | 0.1051 (4.08) |
| Polyunsaturated fatty acids, % kcal | 0.0029 (0.08) | -0.0230 (-1.07) |
| Starch, % kcal | 0.1149 (4.65) | 0.0608 (4.35) |

Models controlled for baseline age, race (black, nonblack), education, smoking, serum cholesterol.
Circulation. 1996 Nov 15;94(10):2417-23.
![Page 81: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/81.jpg)
Linear Regression Coefficient (z Score)

| Variable | SBP | DBP |
|---|---|---|
| Total protein, % kcal | -0.0346 (-1.10) | -0.0568 (-3.17) |

Translation: controlled for other variables in the model (as well as baseline age, race, etc.), every 1-percentage-point increase in the share of calories coming from protein correlates with a 0.0346 mmHg decrease in systolic BP (not significant).
In math terms: $\text{SBP} = -0.0346 \cdot (\%\ \text{protein}) + \beta_{\text{age}} \cdot (\text{Age}) + \dots$
Also (from a separate model), every 1-percentage-point increase in the share of calories coming from protein correlates with a 0.0568 mmHg decrease in diastolic BP (significant).
$\text{DBP} = -0.0568 \cdot (\%\ \text{protein}) + \beta_{\text{age}} \cdot (\text{Age}) + \dots$
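The "NS" versus "significant" labels follow from the z scores in parentheses. As a rough sketch, assuming a two-sided test at the 0.05 level and a standard normal reference distribution (reasonable with more than 11,000 men, but an assumption here), a z score converts to a p-value as follows. The function name is hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Coefficient and z score for total protein, taken from the table
for outcome, coef, z in (("SBP", -0.0346, -1.10), ("DBP", -0.0568, -3.17)):
    p = two_sided_p(z)
    verdict = "significant" if p < 0.05 else "NS"
    print(f"{outcome}: coef={coef} z={z} p={p:.4f} ({verdict})")
```

A z score beyond about 1.96 in absolute value corresponds to p < 0.05, which is why the DBP coefficient is flagged as significant and the SBP coefficient is not.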
![Page 82: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/82.jpg)
Other types of multivariate regression
Multiple linear regression is for normally distributed outcomes
Logistic regression is for binary outcomes
Cox proportional hazards regression is used when time-to-event is the outcome
![Page 83: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/83.jpg)
Cautions about multivariate modeling…
- Overfitting
- Residual confounding
![Page 84: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/84.jpg)
Overfitting
In multivariate modeling, you can get highly significant but meaningless results if you put too many predictors in the model.
The model is fit perfectly to the quirks of your particular sample, but has no predictive ability in a new sample.
Example (hypothetical): In a randomized trial of an intervention to speed bone healing after fracture, researchers built a multivariate regression model to predict time to recovery in a subset of women (n=12). An automatic selection procedure came up with a model containing age, weight, use of oral contraceptives, and treatment status; the predictors were all highly significant and the model had a nearly perfect R-square of 99.5%.
This is likely an example of overfitting. The researchers have fit a model to exactly their particular sample of data, but it will likely have no predictive ability in a new sample.
Rule of thumb: You need at least 10 subjects for each additional predictor variable in the multivariate regression model.
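The rule of thumb is easy to apply as a quick check. The sketch below uses hypothetical function names and the 10-subjects-per-predictor figure from the slide; it is a screening heuristic, not a formal test.

```python
def max_predictors(n_subjects, subjects_per_predictor=10):
    """Largest number of predictors the rule of thumb allows."""
    return n_subjects // subjects_per_predictor

def looks_overfit(n_subjects, n_predictors, subjects_per_predictor=10):
    """True if the model has more predictors than the rule of thumb allows."""
    return n_predictors > max_predictors(n_subjects, subjects_per_predictor)

# Bone-healing example: 12 women, 4 predictors
print(max_predictors(12), looks_overfit(12, 4))  # 1 True
```

With only 12 subjects the rule allows a single predictor, so the four-predictor model in the hypothetical example is well past the limit.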
![Page 85: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/85.jpg)
Overfitting
Pure noise variables still produce good R2 values if the model is overfitted. The distribution of R2 values from a series of simulated regression models containing only noise variables. (Figure 1 from: Babyak, MA. What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. Psychosomatic Medicine 66:411-421 (2004).)
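The same point can be illustrated with a small simulation. This is a sketch in the spirit of the figure, not a reproduction of it: the sample size (20), the predictor counts (1 and 6), the number of simulations, and all function names are assumptions chosen for illustration. Both the outcome and the predictors are pure noise, and R² is computed from a least-squares fit carried out by Gram-Schmidt orthogonalization.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def r_squared(y, predictors):
    """R^2 of a least-squares fit of y on an intercept plus the predictors."""
    n = len(y)
    basis = []  # mutually orthogonal columns spanning intercept + predictors
    for col in [[1.0] * n] + predictors:
        v = list(col)
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
    # Remove from y everything the orthogonal basis can explain
    resid = list(y)
    for b in basis:
        coef = dot(resid, b) / dot(b, b)
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - dot(resid, resid) / sst

def mean_noise_r2(n, k, n_sims, rng):
    """Average R^2 when y and all k predictors are independent noise."""
    total = 0.0
    for _ in range(n_sims):
        y = [rng.gauss(0, 1) for _ in range(n)]
        xs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
        total += r_squared(y, xs)
    return total / n_sims

rng = random.Random(42)
r2_one = mean_noise_r2(20, 1, 500, rng)
r2_six = mean_noise_r2(20, 6, 500, rng)
print(f"mean R^2, 1 noise predictor:  {r2_one:.2f}")
print(f"mean R^2, 6 noise predictors: {r2_six:.2f}")
```

With 20 subjects, six noise predictors explain a substantial share of the variance on average even though none of them is related to the outcome.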
![Page 86: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/86.jpg)
Overfitting example, class data…
PREDICTORS OF EXERCISE HOURS PER WEEK (multivariate model):

| Variable | Beta | p-value |
|---|---|---|
| Intercept | -14.74660 | 0.0257 |
| Coffee | 0.23441 | 0.0004 |
| wakeup | -0.51383 | 0.0715 |
| engSAT | -0.01025 | 0.0168 |
| mathSAT | 0.03064 | 0.0005 |
| writingLove | 0.88753 | <.0001 |
| sleep | 0.37459 | 0.0490 |

R-Square = 0.8192
N=20, 7 parameters in the model!
![Page 87: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/87.jpg)
Univariate models…

| Variable | Beta | p-value |
|---|---|---|
| Coffee | 0.05916 | 0.3990 |
| Wakeup | -0.06587 | 0.8648 |
| MathSAT | -0.00021368 | 0.9731 |
| EngSAT | -0.01019 | 0.1265 |
| Sleep | -0.41185 | 0.4522 |
| WritingLove | 0.38961 | 0.0279 |
![Page 88: Tests for Continuous Outcomes II. Overview of common statistical tests Outcome Variable Are the observations correlated? Assumptions independentcorrelated.](https://reader030.fdocuments.us/reader030/viewer/2022013011/56649dd95503460f94ace2be/html5/thumbnails/88.jpg)
Residual confounding
You cannot completely wipe out confounding simply by adjusting for variables in multiple regression unless variables are measured with zero error (which is usually impossible).
Residual confounding can lead to statistically significant effects of moderate size if measurement error is high.
Hypothetical Example: In a case-control study of lung cancer, researchers identified a link between alcohol drinking and cancer in smokers only. The OR was 1.3 for 1-2 drinks per day (compared with none) and 1.5 for 3+ drinks per day. Though the authors adjusted for number of cigarettes smoked per day in multivariate (logistic) regression, we cannot rule out residual confounding by level of smoking (which may be tightly linked to alcohol drinking).
Questions to ask yourself:
- Is the effect moderate in size?
- Are there strong confounders in play?
- Was the exposure, outcome, or strong confounder measured with considerable error/lack of precision?
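Residual confounding can be demonstrated with a simulation. The sketch below is a linear-regression analogue of the hypothetical example (the example itself uses logistic regression), and every number and name in it is an assumption made for illustration. The exposure has no true effect on the outcome; both are driven by a confounder. Adjusting for the true confounder removes the association, while adjusting for a noisy measurement of it leaves a spurious effect behind.

```python
import random

def slope(xs, ys):
    """Simple least-squares slope of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

def residuals(xs, ys):
    """Residuals of ys after a simple regression on xs."""
    b = slope(xs, ys)
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def adjusted_slope(exposure, outcome, covariate):
    """Exposure coefficient adjusted for one covariate, via residualization."""
    return slope(residuals(covariate, exposure), residuals(covariate, outcome))

rng = random.Random(7)
n = 5000
confounder = [rng.gauss(0, 1) for _ in range(n)]      # e.g. true smoking level
exposure = [c + rng.gauss(0, 1) for c in confounder]  # e.g. drinking, tied to smoking
outcome = [c + rng.gauss(0, 1) for c in confounder]   # depends on the confounder only
measured = [c + rng.gauss(0, 1) for c in confounder]  # confounder recorded with error

crude = slope(exposure, outcome)
adj_true = adjusted_slope(exposure, outcome, confounder)
adj_noisy = adjusted_slope(exposure, outcome, measured)
print(f"crude={crude:.2f} adjusted(true)={adj_true:.2f} adjusted(noisy)={adj_noisy:.2f}")
```

The adjusted estimate based on the error-prone measurement shrinks relative to the crude estimate but does not reach zero, which is the moderate, spurious effect the slide warns about.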