Missing Data in Randomized Control Trials
John W. Graham
The Prevention Research Center and Department of Biobehavioral Health
Penn State University
[email protected]/NCER Summer Research Training Institute, July 2008
Sessions in Four Parts
(1) Introduction: Missing Data Theory
(2) Attrition: Bias and Lost Power
(3) A brief analysis demonstration: Multiple Imputation with NORM and Proc MI
(4) Hands-on Intro to Multiple Imputation
Recent Papers
Graham, J. W., Cumsille, P. E., & Elek-Fisk, E. (2003). Methods for handling missing data. In J. A. Schinka & W. F. Velicer (Eds.). Research Methods in Psychology (pp. 87-114). Volume 2 of Handbook of Psychology (I. B. Weiner, Editor-in-Chief). New York: John Wiley & Sons.
Graham, J. W. (2009, in press). Missing data analysis: making it work in the real world. Annual Review of Psychology, 60.
Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330-351.
Schafer, J. L., & Graham, J. W. (2002). Missing data: our view of the state of the art. Psychological Methods, 7, 147-177.
Part 1: A Brief Introduction to Analysis with Missing Data
Problem with Missing Data
Analysis procedures were designed for complete data
. . .
Solution 1
Design new model-based procedures
Missing Data + Parameter Estimation in One Step
Full Information Maximum Likelihood (FIML)
SEM and Other Latent Variable Programs (Amos, LISREL, Mplus, Mx, LTA)
Solution 2
Data-based procedures, e.g., Multiple Imputation (MI)
Two Steps
Step 1: Deal with the missing data (e.g., replace missing values with plausible values)
Produce a product
Step 2: Analyze the product as if there were no missing data
FAQ
Aren't you somehow helping yourself with imputation?
. . .
NO. Missing data imputation . . .
does NOT give you something for nothing
DOES let you make use of all data you have
. . .
FAQ
Is the imputed value what the person would have given?
NO. When we impute a value . .
We do not impute for the sake of the value itself
We impute to preserve important characteristics of the whole data set
. . .
We want . . .
unbiased parameter estimation e.g., b-weights
Good estimate of variability e.g., standard errors
best statistical power
Causes of Missingness
Ignorable
MCAR: Missing Completely At Random
MAR: Missing At Random
Non-Ignorable
MNAR: Missing Not At Random
MCAR (Missing Completely At Random)
MCAR 1: Cause of missingness completely random process (like coin flip)
MCAR 2: Cause uncorrelated with variables of interest
Example: parents move
No bias if cause omitted
MAR (Missing At Random)
Missingness may be related to measured variables
But no residual relationship with unmeasured variables
Example: reading speed
No bias if you control for measured variables
MNAR (Missing Not At Random)
Even after controlling for measured variables ...
Residual relationship with unmeasured variables
Example: drug use is the reason for absence
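The three mechanisms can be made concrete with a small simulation. This is a sketch under stated assumptions, not part of the original talk: the variables `x` and `y`, the true slope of 0.50, the sample size, and the cutoff rules used to create missingness are all hypothetical, and the MNAR rule is deliberately extreme. It estimates the slope of y on x from the cases where y is observed.

```python
import random
from statistics import fmean

def slope(xs, ys):
    """Least-squares slope of y on x."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(2008)          # fixed seed so the run is repeatable
n = 20000                          # hypothetical sample size
x = [rng.gauss(0, 1) for _ in range(n)]           # measured variable
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]      # variable of interest

# Each rule keeps the (x, y) pairs in which y is observed.
pairs = list(zip(x, y))
mcar = [p for p in pairs if rng.random() < 0.5]   # coin flip
mar = [p for p in pairs if p[0] < 0]              # depends on measured x only
mnar = [p for p in pairs if p[1] < 0]             # depends on y itself

slopes = {
    name: slope([p[0] for p in kept], [p[1] for p in kept])
    for name, kept in (("MCAR", mcar), ("MAR", mar), ("MNAR", mnar))
}
for name, b in slopes.items():
    print(f"{name}: slope of y on x among observed cases = {b:.2f} (true 0.50)")
```

Under the MCAR and MAR rules the slope should stay close to 0.50, because the cause of missingness is either random or already in the model. Under this extreme MNAR rule the slope is pulled toward zero.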
MNAR Causes
The recommended methods assume missingness is MAR
But what if the cause of missingness is not MAR?
Should these methods be used when MAR assumptions are not met?
. . .
YES! These Methods Work!
Suggested methods work better than “old” methods
Multiple causes of missingness
Only small part of missingness may be MNAR
Suggested methods usually work very well
Methods: "Old" vs MAR vs MNAR
MAR methods (MI and ML) are ALWAYS at least as good as, usually better than "old" methods
(e.g., listwise deletion)
Methods designed to handle MNAR missingness are NOT always better than MAR methods
Analysis: Old and New
Old Procedures: Analyze Complete Cases (listwise deletion)
may produce bias
you always lose some power (because you are throwing away data)
reasonable if you lose only 5% of cases
often lose substantial power
Analyze Complete Cases (listwise deletion)
1 1 1 1
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
very common situation
only 20% (4 of 20) data points missing
but discard 80% of the cases
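The arithmetic behind this slide can be checked directly. The sketch below assumes the pattern is read as 5 cases by 4 variables, with 1 meaning observed and 0 meaning missing; that layout is an inference from the "4 of 20" and "80%" figures.

```python
# Observation pattern: rows are cases, columns are variables.
# Assumption: 1 = value observed, 0 = value missing.
pattern = [
    [1, 1, 1, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]

n_cases = len(pattern)
n_points = sum(len(row) for row in pattern)

# Share of individual data points that are missing.
missing_points = sum(row.count(0) for row in pattern)
pct_points_missing = 100 * missing_points / n_points

# Listwise deletion keeps only cases with every variable observed.
complete_cases = [row for row in pattern if all(row)]
pct_cases_discarded = 100 * (n_cases - len(complete_cases)) / n_cases

print(f"{pct_points_missing:.0f}% of data points missing")
print(f"{pct_cases_discarded:.0f}% of cases discarded")
```

Only one case in five is complete, so listwise deletion throws away four cases to remove four missing values.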
Other "Old" Procedures
Pairwise deletion
May be of occasional use for preliminary analyses
Mean substitution
Never use it
Regression-based single imputation
generally not recommended ... except ...
Recommended Model-Based Procedures
Multiple Group SEM (Structural Equation Modeling)
Latent Transition Analysis (Collins et al.)
A latent class procedure
Recommended Model-Based Procedures
Raw Data Maximum Likelihood SEM, aka Full Information Maximum Likelihood (FIML)
Amos (James Arbuckle)
LISREL 8.5+ (Jöreskog & Sörbom)
Mplus (Bengt Muthén)
Mx (Michael Neale)
Amos 7, Mx, Mplus, LISREL 8.8
Structural Equation Modeling (SEM) Programs
In Single Analysis ...
Good Estimation
Reasonable standard errors
Windows Graphical Interface
Limitation with Model-Based Procedures
That particular model must be what you want
Recommended Data-Based Procedures
EM Algorithm (ML parameter estimation)
Norm-Cat-Mix, EMcov, SAS, SPSS
Multiple Imputation
NORM, Cat, Mix, Pan (Joe Schafer)
SAS Proc MI
LISREL 8.5+
Amos 7
EM Algorithm
Expectation - Maximization
Alternate between
E-step: predict missing data
M-step: estimate parameters
Excellent (ML) parameter estimates
But no standard errors
must use bootstrap or multiple imputation
Multiple Imputation
Problem with Single Imputation: Too Little Variability
Because of Error Variance
Because covariance matrix is only one estimate
Too Little Error Variance
Imputed value lies on regression line
Imputed Values on Regression Line
Restore Error . . .
Add random normal residual
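Restoring error variance by adding a random normal residual can be sketched as follows. This is a minimal illustration with hypothetical simulated data (the names `xs`, `ys`, the true slope of 0.5, and the every-other-case missingness are assumptions), not the procedure used by NORM or Proc MI. It compares imputed values that lie on the regression line with imputed values that have a residual added.

```python
import random
from statistics import fmean, variance

def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual SD."""
    mx, my = fmean(xs), fmean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (sse / (len(xs) - 2)) ** 0.5

def impute(xs, ys, rng, add_residual):
    """Fill missing y (None) from the regression of y on x in complete cases."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, resid_sd = fit_line([x for x, _ in complete], [y for _, y in complete])
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            y = a + b * x                      # lies exactly on the regression line
            if add_residual:
                y += rng.gauss(0, resid_sd)    # restore error variance
        filled.append(y)
    return filled

rng = random.Random(1)                                   # fixed seed for repeatability
xs = [rng.gauss(0, 1) for _ in range(2000)]
true_y = [0.5 * x + rng.gauss(0, 1) for x in xs]
ys = [y if i % 2 == 0 else None for i, y in enumerate(true_y)]   # half missing

on_line = impute(xs, ys, rng, add_residual=False)
restored = impute(xs, ys, rng, add_residual=True)
print(f"variance of complete data:        {variance(true_y):.2f}")
print(f"variance, imputed on the line:    {variance(on_line):.2f}")
print(f"variance, residual added:         {variance(restored):.2f}")
```

The on-the-line version understates the variance of y. Adding the residual addresses only the first of the two problems on the slide; the regression line itself is still a single estimate.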
Regression Line only One Estimate
Covariance Matrix (Regression Line) only One Estimate
Obtain multiple plausible estimates of the covariance matrix
ideally draw multiple covariance matrices from population
Approximate this with
Bootstrap
Data Augmentation (Norm)
MCMC (SAS 8.2, 9)
Data Augmentation
stochastic version of EM
EM
E (expectation) step: predict missing data
M (maximization) step: estimate parameters
Data Augmentation
I (imputation) step: simulate missing data
P (posterior) step: simulate parameters
Data Augmentation
Parameters from consecutive steps ...
too related
i.e., not enough variability
after 50 or 100 steps of DA ...
covariance matrices are like random draws from the population
Multiple Imputation Allows:
Unbiased Estimation
Good standard errors
provided number of imputations (m) is large enough
too few imputations: reduced power with small effect sizes
[Figure: Power Falloff (FMI = .50, ρ = .10). Y-axis: Percent Power Falloff (0 to 14); x-axis: m Imputations (100, 85, 70, 55, 40, 25, 10).]
From Graham, J.W., Olchowski, A.E., & Gilreath, T.D. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8, 206-213.
Part 2: Attrition: Bias and Loss of Power
Relevant Papers
Graham, J.W. (in press). Missing data analysis: making it work in the real world. Annual Review of Psychology, 60.
Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330-351.
Hedeker, D., & Gibbons, R.D. (1997). Application of random-effects pattern-mixture models for missing data in longitudinal studies, Psychological Methods, 2, 64-78.
Graham, J.W., & Collins, L.M. (2008). Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power. Annual Meetings of the Society for Prevention Research, San Francisco, CA. (available upon request)
Graham, J.W., Palen, L.A., et al. (2008). Attrition: MAR & MNAR missingness, and estimation bias. Annual Meetings of the Society for Prevention Research, San Francisco, CA. (available upon request)
What if the cause of missingness is MNAR?
Problems with this statement
MAR & MNAR are widely misunderstood concepts
I argue that the cause of missingness is never purely MNAR
The cause of missingness is virtually never purely MAR either.
MAR vs MNAR
"Pure" MCAR, MAR, MNAR never occur in field research
Each requires untenable assumptions
e.g., that all possible correlations and partial correlations are r = 0
MAR vs MNAR
Better to think of MAR and MNAR as forming a continuum
MAR vs MNAR NOT even the dimension of interest
MAR vs MNAR: What IS the Dimension of Interest?
How much estimation bias?
when cause of missingness cannot be included in the model
Bottom Line ...
All missing data situations are partly MAR and partly MNAR
Sometimes it matters ...
bias affects statistical conclusions
Often it does not matter
bias has tolerably little effect on statistical conclusions
(Collins, Schafer, & Kam, Psych Methods, 2001)
Methods: "Old" vs MAR vs MNAR
MAR methods (MI and ML) are ALWAYS at least as good as, usually better than "old" methods
(e.g., listwise deletion)
Methods designed to handle MNAR missingness are NOT always better than MAR methods
Yardstick for Measuring Bias
\[
\text{Standardized Bias} = \frac{(\text{average parameter est}) - (\text{population value})}{\text{Standard Error (SE)}} \times 100
\]
|bias| < 40 considered small enough to be tolerable
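As a quick worked example of this yardstick, the sketch below computes standardized bias and applies the cutoff of 40. The function names and the numbers passed in are hypothetical and chosen only for illustration.

```python
def standardized_bias(average_estimate, population_value, standard_error):
    """Bias expressed as a percentage of one standard error."""
    return 100 * (average_estimate - population_value) / standard_error

def is_tolerable(bias, cutoff=40):
    """Apply the |bias| < 40 rule of thumb."""
    return abs(bias) < cutoff

# Hypothetical values: average estimate .52, population value .50, SE .10.
bias = standardized_bias(average_estimate=0.52, population_value=0.50, standard_error=0.10)
print(f"standardized bias = {bias:.1f}, tolerable: {is_tolerable(bias)}")
```

A standardized bias of 40 means the average estimate sits four tenths of a standard error away from the population value.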
A little background for Collins, Schafer, & Kam (2001; CSK)
Example model of interest: X → Y
X = Program (prog vs control)
Y = Cigarette Smoking
Z = Cause of missingness: say, Rebelliousness (or smoking itself)
Factors to be considered:
% Missing (e.g., % attrition)
rYZ
rZR
rYZ
Correlation between cause of missingness (Z), e.g., rebelliousness (or smoking itself), and the variable of interest (Y), e.g., Cigarette Smoking
rZR
Correlation between cause of missingness (Z), e.g., rebelliousness (or smoking itself), and missingness on variable of interest, e.g., Missingness on the Smoking variable
Missingness on Smoking (often designated: R or RY)
Dichotomous variable:
R = 1: Smoking variable not missing
R = 0: Smoking variable missing
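The indicator R and the correlation rZR can be computed from raw data as sketched below. The data values and variable names (`z`, `smoking`) are hypothetical, and `None` is assumed to mark a missing response.

```python
from statistics import fmean

def pearson(a, b):
    """Pearson correlation between two equal-length numeric lists."""
    ma, mb = fmean(a), fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

# Hypothetical data: z is the cause of missingness (e.g., rebelliousness),
# smoking is the variable of interest, None marks a missing value.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
smoking = [0.0, 1.0, None, 1.0, None, None]

# R = 1 when the smoking variable is not missing, R = 0 when it is missing.
r = [0 if s is None else 1 for s in smoking]
r_zr = pearson(z, r)
print(f"R = {r}")
print(f"rZR = {r_zr:.3f}")
```

With this coding, a negative rZR means that cases with higher Z are more likely to be missing on smoking; the strength of the relationship is what matters for bias.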
CSK Study Design (partial)
Simulations manipulated
amount of missingness (25% vs 50%)
rYZ (r = .40, r = .90)
rZR held constant
r = .45 with 50% missing (applies to "MNAR-Linear" missingness)
CSK Results (partial) (MNAR Missingness)
25% missing, rYZ = .40 ... no problem
25% missing, rYZ = .90 ... no problem
50% missing, rYZ = .40 ... no problem
50% missing, rYZ = .90 ... problem
* "no problem" = bias does not interfere with inference
These results apply to the regression coefficient for X → Y with "MNAR-Linear" missingness (see CSK, 2001, Table 2)
But Even CSK Results Too Conservative
Not considered by CSK: rZR
In their simulation rZR = .45
Even with 50% missing and rYZ = .90, bias can be acceptably small
Graham et al. (2008): Bias acceptably small (standardized bias < 40) as long as rZR < .24
rZR < .24 Very Plausible
Study                                        rZR
_________                                    _____
HealthWise (Caldwell, Smith, et al., 2004)   .106
AAPT (Hansen & Graham, 1991)                 .093
Botvin1                                      .044
Botvin2                                      .078
Botvin3                                      .104
All of these yield standardized bias < 10 (estimated)
CSK and Follow-up Simulations
Results very promising
Suggest that even MNAR biases are often tolerably small
But these simulations still too narrow
Beginnings of a Taxonomy of Attrition
Causes of Attrition on Y (main DV)
Case 1: not Program (P), not Y, not PY interaction
Case 2: P only
Case 3: Y only . . . (CSK scenario)
Case 4: P and Y only
Beginnings of a Taxonomy of Attrition
Causes of Attrition on Y (main DV)
Case 5: PY interaction only
Case 6: P + PY interaction
Case 7: Y + PY interaction
Case 8: P, Y, and PY interaction
Taxonomy of Attrition
Cases 1-4
often little or no problem
Cases 5-8
Jury still out (more research needed)
Very likely not as much of a problem as previously thought
Use diagnostics to shed light
Use of Missing Data Diagnostics
Diagnostics based on pretest data not much help
Hard to predict missing distal outcomes from differences on pretest scores
Longitudinal Diagnostics can be much more helpful
Hedeker & Gibbons (1997)
Plot main DV over time for four groups:
for Program and Control
for those with and without last wave of data
Much can be learned
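The four-group diagnostic described above can be tabulated before plotting. The sketch below is one possible way to do it, assuming each record holds a condition label and one score per wave, with `None` for a missing wave; the record layout, the function name, and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def diagnostic_means(records):
    """Mean of the main DV at each wave, by condition and dropout status.

    Each record is (condition, scores); scores has one entry per wave and
    None marks a missing wave. A "dropper" is missing the last wave.
    """
    groups = defaultdict(list)
    for condition, scores in records:
        status = "dropper" if scores[-1] is None else "stayer"
        groups[(condition, status)].append(scores)

    means = {}
    for key, rows in groups.items():
        per_wave = []
        for wave in zip(*rows):                       # one tuple of scores per wave
            observed = [s for s in wave if s is not None]
            per_wave.append(fmean(observed) if observed else None)
        means[key] = per_wave
    return means

# Toy data: three waves, two conditions.
records = [
    ("program", [1.0, 2.0, 3.0]),
    ("program", [2.0, 4.0, None]),
    ("control", [1.0, 1.0, 1.0]),
    ("control", [3.0, None, None]),
]
for key, per_wave in sorted(diagnostic_means(records).items()):
    print(key, per_wave)
```

Plotting the four resulting lines over time shows whether droppers and stayers diverge, and whether they diverge differently in the program and control conditions.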
Empirical Examples
Hedeker & Gibbons (1997)
Drug treatment of psychiatric patients
Hansen & Graham (1991)
Adolescent Alcohol Prevention Trial (AAPT)
Alcohol, smoking, other drug prevention among normal adolescents (7th – 11th grade)
Empirical Example Used by Hedeker & Gibbons (1997)
IV: Drug Treatment vs. Placebo Control
DV: Inpatient Multidimensional Psychiatric Scale (IMPS)
1 = normal
2 = borderline mentally ill
3 = mildly ill
4 = moderately ill
5 = markedly ill
6 = severely ill
7 = among the most extremely ill
[Figure, from Hedeker & Gibbons (1997): IMPS (low = better outcomes; y-axis 2.5 to 5.5) by Weeks of Treatment (0, 1, 3, 6), for Placebo Control and Drug Treatment.]
Longitudinal Diagnostics: Hedeker & Gibbons Example
Treatment
droppers do BETTER than stayers
Control
droppers do WORSE than stayers
Example of Program X DV interaction
But in this case, pattern would lead to suppression bias
Not as bad for internal validity in presence of significant program effect
AAPT (Hansen & Graham, 1991)
IV: Normative Education Program vs Information Only Control
DV: Cigarette Smoking (3-item scale)
Measured at one-year intervals
7th grade – 11th grade
[Figure: AAPT. Cigarette Smoking (high = more smoking; arbitrary scale) across five grade levels, with two lines each for Control and Program.]
Longitudinal Diagnostics: AAPT Example
Treatment
droppers do WORSE than stayers
little steeper increase
Control
droppers do WORSE than stayers
little steeper increase
Little evidence for Prog X DV interaction
Very likely MAR methods allow good conclusions (CSK scenario holds)
Use of Auxiliary Variables
Reduces attrition bias
Restores some power lost due to attrition
What Is an Auxiliary Variable?
A variable correlated with the variables in your model
but not part of the model
not necessarily related to missingness
used to "help" with missing data estimation
Best auxiliary variables:
same variable as main DV, but measured at waves not used in analysis model
Model of Interest
[Path diagram: X → Y, with residual (res) on Y.]
Benefit of Auxiliary Variables
Example from Graham & Collins (2008)
X Y Z
1 1 1   500 complete cases
1 0 1   500 cases missing Y
X, Y variables in the model (Y sometimes missing)
Z is auxiliary variable
Benefit of Auxiliary Variables
Effective sample size (N')
Analysis involving N cases, with auxiliary variable(s)
gives statistical power equivalent to N' complete cases without auxiliary variables
Benefit of Auxiliary Variables
It matters how highly Y and Z (the auxiliary variable) are correlated
For example
                                             increase
rYZ = .40   N = 500 gives power of N' = 542   ( 8%)
rYZ = .60   N = 500 gives power of N' = 608   (22%)
rYZ = .80   N = 500 gives power of N' = 733   (47%)
rYZ = .90   N = 500 gives power of N' = 839   (68%)
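The percentage increases in this table follow directly from N and N'. The short sketch below only reproduces that arithmetic from the figures on the slide; it does not derive N' itself, since the slide gives no formula for it.

```python
n = 500  # complete cases, from the slide

# Effective sample sizes N' reported on the slide, keyed by rYZ.
effective_n = {0.40: 542, 0.60: 608, 0.80: 733, 0.90: 839}

increase = {r_yz: 100 * (n_prime - n) / n for r_yz, n_prime in effective_n.items()}
for r_yz, pct in increase.items():
    print(f"rYZ = {r_yz:.2f}: N' = {effective_n[r_yz]}, increase = {pct:.1f}%")
```

The gain grows quickly at the high end: moving rYZ from .80 to .90 adds more effective cases than moving from .40 to .60.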
[Figure: Effective Sample Size by rYZ. Y-axis: Effective Sample Size (500 to 1000); x-axis: rYZ (0.1 to 0.9).]
Conclusions
Attrition CAN be bad for internal validity
But often it's NOT nearly as bad as often feared
Don't rush to conclusions, even with rather substantial attrition
Examine evidence (especially longitudinal diagnostics) before drawing conclusions
Use MI and ML missing data procedures!
Use good auxiliary variables to minimize impact of attrition
Part 3: Illustration of Missing Data Analysis: Multiple Imputation with NORM and Proc MI
Multiple Imputation: Basic Steps
Impute
Analyze
Combine results
Imputation and Analysis
Impute 40 datasets
a missing value gets a different imputed value in each dataset
Analyze each data set with USUAL procedures
e.g., SAS, SPSS, LISREL, EQS, STATA
Save parameter estimates and SE’s
Combine the Results: Parameter Estimates to Report
Average of estimate (b-weight) over 40 imputed datasets
Combine the Results: Standard Errors to Report
Weighted sum of:
“within imputation” variance
average squared standard error
usual kind of variability
“between imputation” variance
sample variance of parameter estimates over 40 datasets
variability due to missing data
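The combining step can be sketched in a few lines. The version below assumes the "weighted sum" refers to the standard combining rule (Rubin's rules), in which the between-imputation variance is weighted by (1 + 1/m); the slide does not spell out the weights. The function name and the five example values are hypothetical, and m = 5 is used instead of 40 only to keep the example short.

```python
from statistics import fmean, variance

def pool(estimates, standard_errors):
    """Combine one parameter's results across m imputed datasets."""
    m = len(estimates)
    pooled_estimate = fmean(estimates)                  # average b-weight
    within = fmean([se ** 2 for se in standard_errors]) # average squared SE
    between = variance(estimates)                       # sample variance of estimates
    total = within + (1 + 1 / m) * between              # assumed weighting (Rubin's rules)
    return pooled_estimate, total ** 0.5

# Hypothetical b-weights and SEs saved from five imputed datasets.
estimates = [0.42, 0.47, 0.39, 0.45, 0.44]
standard_errors = [0.10, 0.11, 0.10, 0.12, 0.10]

b, se = pool(estimates, standard_errors)
print(f"pooled b-weight = {b:.3f}, pooled SE = {se:.3f}")
```

The pooled SE is larger than the square root of the within-imputation variance alone; the extra amount reflects uncertainty due to missing data.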
Materials for SPSS Regression
Starting place: http://methodology.psu.edu
downloads (you will need to get a free user ID to download all our free software)
missing data software
Joe Schafer's Missing Data Programs
John Graham's Additional NORM Utilities
http://mcgee.hhdev.psu.edu/missing/index.html