The Hype and Futility of Measuring Implementation Fidelity v5
7/31/2019 The Hype and Futility of Measuring Implementation Fidelity v5
The Hype and Futility of Measuring Implementation Fidelity (in GRTs)
David Judkins
Presentation at Evaluation 2009, Orlando
-
The Hype
"Effectiveness research is now at the point of sophistication wherein black-box outcomes studies are no longer acceptable." (Mowbray, Holter, Teague and Bybee, 2003)
89,900 Google hits on October 10, 2009, for the phrase "what works best for whom"
-
Lofty Goals
What social programs, policies, and
interventions work?
For whom do they work, and under what
conditions?
And why do they work, or fall short?
Preface to Learning More from Social Experiments, edited by Howard Bloom
(34,000 Google hits on book title)
-
And why do they work, or fall short?
Bloom expands on the question in an MDRC announcement about the book publication:
"But, in the past, there have been questions that randomized experiments have not been able to address effectively. What component of a social policy made it successful?"
-
Can such questions be answered?
I will argue that the answer is generally
negative
Worse, attempting to answer them compromises the first objective: determining whether the intervention works at all
(All of this is in the context of group randomized trials, GRTs)
-
Thesis
It is better not to attempt to measure fidelity in GRTs
It is counter-productive to try to answer questions about efficacy (intervention effects under ideal conditions) in a trial designed to measure effectiveness (intervention effects under realistic conditions)
Other forms of measurement about the intervention process, in the hopes of learning more about alternate interventions, are also vain and wasteful
-
Outline
Opportunities & perspectives
The preconditions for useful fidelity measurement
Operational challenges in fidelity measurement
Statistical issues in the estimation of fidelity-adjusted intervention effectiveness
A case study
Another perspective
-
Opportunities
Many educational, social, and behavioral
interventions are complex
Multidimensional
Incorporate aspects of culturally accepted best practices (traditions and fads)
Require the participation of trained intervenors and of intervention subjects over extended periods of time
Can never be detailed enough to handle every eventuality
-
Cynic's Perspective
If there is failure:
Blame the subjects
Blame the intervenors
Disqualify or discount the work of control intervenors who, by virtue of superior skill, appear to infringe on the developer's recipe, possibly merely by implementing the culturally accepted best practices
-
Kaspar Hauser (Jeder für sich und Gott gegen alle)
The wolf child raised in isolation from most of humanity would make an ideal foil for many educational and parenting interventions
-
The Thrifty Perspective
It is urgent that an effective intervention be
found
Limited number of fresh ideas in circulation
Limited dollars for research
It would be nice to be able to learn, from an experiment designed to measure the impact of a complex intervention A3B7C2, what the effect of A1B22C4 would be
-
Preconditions for Fidelity Measurement
Well-defined intervention
A way of splitting an intervention into components that could be recombined in alternative strengths and mixtures
Some theory about what aspects of interactions between intervenor and subject are relevant and consequential
-
Operational Challenges
Choice of informant
Subject
Intervenor
Trainer/senior intervenor adviser
Neutral observer
How to make fidelity measurement reliable, valid, and cost-effective?
-
Intervenor Informant
Likely to think they are doing just fine if asked to summarize their fidelity
[Figure: four histograms of Project Director rating (1 to 5) against Frequency. Panels: Let's Begin with the Letter People: ECE; Play & Learning Strategies (PALS): PE; Partners for Literacy: ECE; Partners for Literacy: PE]
-
Intervenor Informant (2)
If asked to keep detailed logs, they will
likely do a poor job
For those who do a good job on detailed
activity reporting, it will probably detract
from their effectiveness
-
Trainer/Advisor Informant
Can have vested interest
Blind neither to treatment status nor
outcome outlook
Possible to read the writing on the wall and rate the intervenors with unfavorable average outcomes as having low fidelity, thereby protecting the fidelity-adjusted effectiveness of the intervention
-
Trainer/Advisor Informant (2)
Even if unbiased, how sound an opinion can be formed from initial training and occasional (often telephone) contact with intervenors?
-
Neutral Observer
Very costly
Need staff who fully understand the
intervention model
Need extensive training for consistent rating
Usually need travel
Results in strong pressure for additional
clustering
-
Neutral Observer (2)
High costs of training, travel, and salary directly reduce power for the primary effectiveness research by reducing the subject sample size (for a fixed budget)
Pressure for stronger clustering indirectly reduces power by reducing the number of intervenors and/or intervention sites
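A minimal sketch of the two power losses described on this slide, using the standard design-effect formula for equal-sized clusters. The intraclass correlation, cluster counts, and cluster sizes below are hypothetical values chosen for illustration; none of them come from the talk.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc

def se_of_effect(clusters_per_arm, cluster_size, icc, sd=1.0):
    """Standard error of a two-arm difference in means under clustering."""
    n_per_arm = clusters_per_arm * cluster_size
    var_of_arm_mean = sd ** 2 * design_effect(cluster_size, icc) / n_per_arm
    return math.sqrt(2.0 * var_of_arm_mean)

ICC = 0.15  # hypothetical intraclass correlation

# Hypothetical baseline design: 40 intervenors per arm, 10 subjects each.
baseline = se_of_effect(clusters_per_arm=40, cluster_size=10, icc=ICC)
# Direct loss: observation costs eat into the budget, so fewer subjects.
fewer_subjects = se_of_effect(clusters_per_arm=40, cluster_size=8, icc=ICC)
# Indirect loss: same subject count, packed into half as many intervenors.
more_clustered = se_of_effect(clusters_per_arm=20, cluster_size=20, icc=ICC)

print(f"baseline SE:       {baseline:.3f}")
print(f"fewer subjects SE: {fewer_subjects:.3f}")
print(f"more clustered SE: {more_clustered:.3f}")
```

Under these assumed values, both changes raise the standard error of the impact estimate, and concentrating the same subjects in fewer clusters costs more precision than dropping a fifth of the subjects.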
-
Statistical Issues
Most advocates of fidelity measurement have unwarranted optimism about the ability of statisticians to do anything useful with the data
Of course, one can always hunt for the statistician who will provide rosy promises of artful analyses
-
Statistical Issues (2)
The statistician who offers multi-level causal path mediated analyses will be loved by many, but as Tukey said:
"The data may not contain the answer. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."
-
The Best We (Statisticians and Econometricians) Can Offer
Requires heroic assumptions. Either:
Randomization provides an instrumental variable for fidelity; or
The collection of measured covariates is rich enough to render fidelity conditionally independent of potential outcomes
-
Heroic Assumption #1
Can only render the mediating role of one (one!)
unidimensional summary of fidelity
By definition, Z is an instrumental variable for the effect of X on Y if the only effect of Z on Y is through X
In other words, one must be able to rule out a priori that there could be any effects of Z on Y that do not run through X
In the context of fidelity-adjusted effect estimation, this
means that there is a unique plausible summarization of
fidelity
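The instrumental-variable condition on this slide can be written out as a sketch, taking Z as random assignment, X as the single fidelity summary, and Y as the outcome. The potential-outcome notation Y(z, x) and the ratio form for binary Z (often called the Wald estimand) are standard textbook notation assumed here; neither appears on the slide.

```latex
% Exclusion restriction: assignment Z affects Y only through fidelity X
\[
  Y(z, x) = Y(x) \quad \text{for all } z, x
\]
% Ratio (Wald) form of the IV estimand when Z is binary
\[
  \beta_{IV} = \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}
                    {E[X \mid Z = 1] - E[X \mid Z = 0]}
\]
```

The numerator is the intent-to-treat (ITT) effect on the outcome; the denominator is the effect of assignment on the chosen fidelity summary. Each alternative summary of fidelity changes the denominator, which is why the approach needs a unique plausible summary.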
-
Heroic Assumption #1 (cont.)
Might not be so heroic if the intervention is very
simple and nearly instantaneous
Then a binary measure of fidelity might be the unique plausible choice
Or if the intervention is purely unidimensional,
perhaps a uniquely plausible ordinal measure of
fidelity could be developed
-
Heroic Assumption #1 (cont.)
The little-recognized but ironic kick is that even if you make this assumption, the formal hypothesis tests for fidelity-adjusted intervention effectiveness based on the IV approach yield the same star pattern as the original analysis
The point estimate will be altered, but if the ITT analysis found no statistically significant treatment effect, an IV analysis with randomization as the instrumental variable will yield the same finding
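A small sketch of why the stars do not move. It uses the ratio (Wald) form of the IV estimate and, as a simplifying assumption, treats the first-stage difference in fidelity as a known constant, so the standard error is rescaled by the same factor as the estimate. A fuller delta-method standard error would differ slightly. The project-level numbers are made up for illustration.

```python
import math
import statistics

def diff_in_means(treated, control):
    """Difference in means and its standard error (unequal variances)."""
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(control) / len(control))
    return diff, se

def itt_and_wald_iv(y_treated, y_control, x_treated, x_control):
    """ITT and Wald IV estimates with their z statistics.

    Assumption: the first-stage difference is treated as fixed, so the
    IV standard error is the ITT standard error divided by that difference.
    """
    itt, itt_se = diff_in_means(y_treated, y_control)
    first_stage, _ = diff_in_means(x_treated, x_control)
    iv = itt / first_stage
    iv_se = itt_se / abs(first_stage)
    return {"itt": itt, "itt_z": itt / itt_se, "iv": iv, "iv_z": iv / iv_se}

# Hypothetical project-level outcome means (y) and fidelity scores (x).
y_treated = [0.3, -0.1, 0.4, 0.2, 0.0, 0.5]
y_control = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
x_treated = [0.6, 0.8, 0.5, 0.7, 0.9, 0.4]
x_control = [0.1, 0.0, 0.2, 0.1, 0.0, 0.1]

result = itt_and_wald_iv(y_treated, y_control, x_treated, x_control)
print(result)
```

The IV point estimate is larger in magnitude than the ITT estimate whenever the first-stage difference is below 1, but the z statistic is unchanged, so the significance verdict is the same.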
-
Heroic Assumption #2
If one relies upon the adequacy of covariate measurement, one quickly runs up against sample size problems
A typical group randomized trial will have only a few dozen intervenors per arm (maybe just one or two dozen, and I have seen fewer than one dozen)
-
Heroic Assumption #2 (cont.)
If we agree that it would probably take on the order of 30 covariates to fully explain why some intervenors are more faithful than others (the propensity scoring approach) or more effective than others (the ANCOVA approach), then we need on the order of 1,000 intervenors before we even consider interactions among the covariates
-
Heroic Assumption #2 (cont.)
However, instrument designers generally have no clue how to design intervenor background questionnaires that would explain intervenor fidelity
And if we knew how to measure intervenor
effectiveness, then the entire experiment would be
unnecessary
-
CASE STUDY
-
CLIO
Randomized field trial of curricula for Even
Start Centers
5-arm study: 4 active, 1 control
Three fidelity measurements:
Local Even Start center director
Curriculum designer
Neutral observer
-
Fidelity Instrument Development
Several of the top national experts in the evaluation of early education interventions designed the neutral observer instruments and training
Curriculum designers were consulted
Curriculum designers had ongoing contact with intervenors through technical assistance contracts
-
Correlations Between Developer and Observer Fidelity Ratings
Across 96 active projects for early childhood curriculum:
o 0.48 in year 1
o 0.39 in year 2
Across 48 active projects for parenting curriculum:
o 0.10 in year 1
o -0.01 in year 2
-
[Figures: relationship between developer-rated fidelity and emergent child English literacy, one plot per arm (A2, B2, A1, B1, control); axes labeled Fidelity and Outcome]
-
[Figures: relationship between observer-rated fidelity and emergent child English literacy, one plot per arm (A2, B2, A1, B1, control); axes labeled Fidelity and Outcome]
-
Methods and Results
Multiplied arm indicators by fidelity scores (constrained to lie between 0 and 1) in a multi-level model
Generally similar results
Fidelity-adjusted estimates not always larger than
ITT estimates!
Two more stars
One positive
One negative!
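The fidelity adjustment named on this slide can be sketched as the construction of regressors: each active arm's 0/1 indicator is replaced by that project's fidelity score clipped to the unit interval. The arm labels follow the figure titles; giving control projects all-zero regressors, and the helper names, are assumptions made for illustration. The multi-level model itself is not shown.

```python
ACTIVE_ARMS = ["A1", "A2", "B1", "B2"]  # control is the reference category

def clip_to_unit_interval(value):
    """Constrain a fidelity score to lie between 0 and 1."""
    return min(1.0, max(0.0, value))

def fidelity_weighted_indicators(arm, fidelity):
    """Arm indicators multiplied by the project's clipped fidelity score.

    Assumption: control projects get zeros in every active-arm column,
    whatever fidelity rating they received.
    """
    score = clip_to_unit_interval(fidelity)
    return [score if arm == active else 0.0 for active in ACTIVE_ARMS]

# Hypothetical projects: (arm, raw fidelity score).
projects = [("A1", 0.8), ("B2", 0.35), ("control", 0.6), ("B1", 1.4)]
design_rows = [fidelity_weighted_indicators(arm, f) for arm, f in projects]
for (arm, _), row in zip(projects, design_rows):
    print(arm, row)
```

With plain 0/1 indicators the coefficient on an arm is its ITT effect; with these weighted columns the coefficient is read as the effect at full fidelity, which is why the adjusted estimates are usually expected to be larger than the ITT estimates.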
-
Case Study Wrap Up
A lot of money spent with little discernible return
We still don't know how to develop good preschool curricula for Even Start projects
-
Other voices
Peter Schochet, Mathematica Policy Research, in a recent IES white paper, final line:
"Thus, these classroom practice mediators may be of little help in confirming the study's conceptual model and identifying teacher practices that are most associated with student learning gains."
-
Josh Angrist, "Instrumental Variables Methods in Experimental Criminological Research: What, Why, and How?" 2004. Journal of Experimental Criminology.
"Especially noteworthy is the fact that, in marked contrast with an unfortunate trend in education research, criminologists do not appear to have been afflicted with what social scientist Tom Cook (2001) calls 'sciencephobia.' This is a tendency to eschew rigorous quantitative research designs in favor of a softer approach that emphasizes process over outcomes."