The Goldilocks Challenge: learning to structure a ‘right-fit’ and effective monitoring and evaluation system

Dean Karlan
Professor of Economics, Yale University
President, Innovations for Poverty Action
The Goldilocks Problem
Overview
• Where to begin?
  – Theory of change
• What to evaluate?
  – Monitoring vs. impact evaluation
  – What is monitoring?
  – When is an impact evaluation appropriate?
  – When is an impact evaluation not appropriate?
• How to measure?
  – The core problem
  – Framework for ‘right-fit’ M&E systems
    • Credible (accurate data, appropriate analysis)
    • Actionable
    • Responsible
    • Transportable
Theory of Change
A theory of change explains the “why” of a program by telling a sequential story:
• what goes in,
• what gets done,
• what comes out, and
• how the world thus (hopefully) changes for the better
Clearly articulating this helps organizations design sound programs and lays the foundation for right-fit data collection.
Theory of change
Example from the Deworm the World Initiative
INPUT: Staff time, deworming tablets
↓
ACTIVITY: Trainings, deworming days
↓
OUTPUT: Number of students dewormed
↓
OUTCOME: Decreased worm prevalence
↓
IMPACT: Higher incomes
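The chain above can be written down as a tiny data structure, which makes it easier to attach indicators to each stage later. A minimal sketch (the stage names and representation are our own, not from the talk):

```python
# Theory of change as an ordered stage -> description mapping,
# using the Deworm the World example from the slide.
# (Python 3.7+ dicts preserve insertion order.)
theory_of_change = {
    "input":    "Staff time, deworming tablets",
    "activity": "Trainings, deworming days",
    "output":   "Number of students dewormed",
    "outcome":  "Decreased worm prevalence",
    "impact":   "Higher incomes",
}

for stage, description in theory_of_change.items():
    print(f"{stage.upper():>8}: {description}")
```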
Monitoring vs. Impact Evaluation
Monitoring asks: What did the program/policy/business use, do, and produce? (inputs, activities, outputs)

Impact evaluation asks: How have lives changed compared to how they would have changed had the program/policy/business not happened? (outcomes, impact)
A new program starts with tinkering to figure out implementation; once we know what we’re doing, monitoring takes over.
What is monitoring?
• Collecting and analyzing high-quality, actionable data on program implementation
• Monitoring can help:
  – Internal learning: answer questions related to program progress and program improvement
  – External reporting: demonstrate accountability and transparency
Once we know what we’re doing, ask: Is there evidence?
• Yes → Keep on doing that!
• No → Can you generate it?
  – Yes → Monitoring AND impact evaluation!
  – No → Stick with monitoring while tinkering to figure out implementation.
When is an impact evaluation appropriate?
Design issues:
• Sample size is sufficient
  – Often the deal breaker
  – Note that this means enough separable units
• Timing is right
  – Not yet implemented
  – Some path for expansion
• Knowledge gap worth filling
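To see why sample size is often the deal breaker, a back-of-the-envelope power calculation helps. The sketch below uses the standard normal-approximation formula n ≈ 2(z₁₋α/₂ + z₁₋β)² / MDE² for a two-arm comparison of means; the effect sizes are illustrative assumptions, not figures from the talk.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(mde_sd, alpha=0.05, power=0.80):
    """Approximate units needed per arm to detect an effect of
    mde_sd standard deviations, at the given significance and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / mde_sd ** 2)

# A small effect (0.2 SD) needs roughly 400 units per arm;
# a large effect (0.5 SD) needs far fewer.
print(sample_size_per_arm(0.2))  # 393
print(sample_size_per_arm(0.5))  # 63
```

Note that the units here are *separable* units: if treatment is assigned at the village level, n counts villages, not people.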
When is an impact evaluation appropriate?
Practical issues:
• There is budget and capacity to do it well
  – Quasi-experimental methods often cost just as much as RCTs: randomizing is not expensive, surveying is
  – Field work is done well and results are disseminated
• The partner cares deeply about the answer to the research question and will use the results

Proof of concept:
• The intervention has been implemented before and the process refined
  – but the idea needs to be validated
  – and there is a clear return on investment
When is an impact evaluation not appropriate?
Design issues:
• Not enough separable units
  – Sample size is too small
  – Macro policy (monetary policy, trade, etc.)
• Spillovers rampant and unmeasurable
• No resource constraint that can be randomized well
  – Example: refugee camp

Ethical issues:
• When we know the answer already
  – We don’t keep running RCTs on vaccines: they work, and continuing to evaluate is unethical
How to measure? The Core Problem
• Too little data
• Too much data
• Wrong data
From which set would you choose?
[Photos: a display of 24 jams vs. a display of 6 jams]
Jams: The results

Sheena Iyengar and Mark Lepper. “When Choice is Demotivating: Can One Desire Too Much of a Good Thing?” Journal of Personality and Social Psychology, 2000, Vol. 79, No. 6, 995-1006

[Bar chart: number of people who stopped at the booth, 24-jam display vs. 6-jam display; more shoppers stopped at the 24-jam display]
[Bar chart: for each display, the number of people who stopped at the booth and the number who purchased; far more purchases came from the 6-jam display]
Finding the ‘right-fit’
[Graph: usefulness plotted against amount of data; usefulness peaks at a moderate amount of data, the ‘right fit’, and falls off with too little or too much]
Framework for ‘right-fit’ M&E Systems
CART principles:
• Credible: Collect accurate, high-quality data and analyze them appropriately
• Actionable: Commit to act on the data you collect
• Responsible: Ensure the benefits of data collection outweigh the costs
• Transportable: Collect data that generate knowledge for other programs
Credible
• Data must accurately measure what they are supposed to measure
• Appropriate analysis must be conducted in a credible way

Bad quality data and data analyzed badly are similar to snake oil: worse than doing nothing at all!
Accurate Data

Validity: is it what you were trying to measure? To be valid, data should capture the essence of what one is seeking to measure.

Reliability: can you trust the data that are collected? Reliability implies that the same data collection procedure will produce the same data repeatedly.
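One simple way to probe reliability is a test-retest check: administer the same instrument twice to the same respondents and see how closely the answers agree. A minimal sketch with invented survey responses:

```python
# Test-retest reliability check (invented data): the same 8 respondents
# answer the same question in two survey rounds. High agreement
# suggests the data collection procedure is reliable.
round_1 = [3, 5, 4, 2, 5, 1, 4, 3]
round_2 = [3, 5, 4, 2, 4, 1, 4, 3]   # one respondent answers differently

agreement = sum(a == b for a, b in zip(round_1, round_2)) / len(round_1)
print(agreement)  # 0.875
```

In practice a correlation or more formal statistic (e.g. Cohen's kappa) would be used, but the idea is the same: the same procedure should produce the same data.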
Appropriate Analysis
• Impact implies causality; it tells us how an organization has changed the world around it
• That means comparing what happened to what would have happened in the absence of the program:

  what happened with the program
  - what would have happened without the program
  = IMPACT of the program

• To measure impact it is necessary to find a way to credibly estimate the counterfactual, i.e. how program participants would have fared if the program had not occurred
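As a worked example of the subtraction above, with invented numbers:

```python
# Invented numbers for illustration: suppose participants average 62
# on a test after the program, and we credibly estimate they would
# have averaged 56 without it (the counterfactual).
outcome_with_program = 62.0
counterfactual_without_program = 56.0

impact = outcome_with_program - counterfactual_without_program
print(impact)  # 6.0
```

The hard part is never the subtraction; it is producing a credible estimate of the counterfactual term.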
Appropriate Analysis: Impact Evaluation
Appropriate Analysis: Constructing the counterfactual
• The counterfactual is often constructed by selecting a group not affected by the program
• Non-randomized: argue that a certain excluded group mimics the counterfactual
• Randomized: use random assignment of the program to create a control group which mimics the counterfactual
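The randomized approach can be sketched in a few lines (an invented setup, not code from the talk): shuffle the eligible units, split them in half, and the control half's average outcome estimates the counterfactual.

```python
# Random assignment: split a sample into treatment and control groups
# so the control group's average outcome serves as an estimate of
# the counterfactual. Units and sizes are illustrative.
import random

random.seed(0)
units = list(range(1000))          # e.g. 1,000 eligible students
random.shuffle(units)
treatment = set(units[:500])       # first half receives the program
control = set(units[500:])         # second half mimics the counterfactual

print(len(treatment), len(control))  # 500 500
```

Because assignment is random, the two groups are comparable on average in both observed and unobserved characteristics, which is what non-randomized designs must argue for rather than guarantee.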
Appropriate Analysis: Impact of a Remedial Education Program (Balsakhi)

Method                      Impact Estimate
Pre-post                     26.42*
Simple difference            -5.05*
Difference-in-difference      6.82*
Regression                    1.92
Randomized experiment        5.87*

*: statistically significant at the 5% level
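A small simulation with made-up numbers (not the Balsakhi data) shows why a subset of these estimators can disagree so sharply: if the program targets weaker students and scores rise over time anyway, pre-post picks up the time trend and simple difference picks up the selection, while differencing out both recovers the true effect.

```python
# Illustrative data-generating process: true impact of 5 points,
# a secular trend of +10 points, and negative selection of -8 points
# (the program targets students with weaker baseline scores).
import random

random.seed(1)
true_impact = 5.0
trend = 10.0
selection = -8.0

part_pre  = [50 + selection + random.gauss(0, 5) for _ in range(2000)]
part_post = [x + trend + true_impact + random.gauss(0, 5) for x in part_pre]
non_pre   = [50 + random.gauss(0, 5) for _ in range(2000)]
non_post  = [x + trend + random.gauss(0, 5) for x in non_pre]

mean = lambda xs: sum(xs) / len(xs)

pre_post = mean(part_post) - mean(part_pre)                 # picks up trend + impact (~15)
simple_diff = mean(part_post) - mean(non_post)              # picks up selection + impact (~-3)
diff_in_diff = pre_post - (mean(non_post) - mean(non_pre))  # recovers impact (~5)
print(round(pre_post, 1), round(simple_diff, 1), round(diff_in_diff, 1))
```

Difference-in-difference works here only because selection affects levels, not trends; when treated and comparison groups trend differently, it too is biased, which is the case for randomization.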
RCTs may be the gold standard for measuring impact, but they aren’t the right fit for every organization or every project.
Actionable
Organizations should ask three questions of each and every piece of data that they want to collect:
• Is there a specific action that we will take based on the findings?
• Do we have the resources necessary to implement that action?
• Do we have the commitment required to take that action?
Responsible
• The responsibility principle can help organizations assess tradeoffs in a number of different areas, including:
  – Data collection methods: Is there a cheaper or more efficient method of data collection that does not compromise quality?
  – Use of respondents’ time: Does the information to be gained justify taking a beneficiary’s time to answer?
  – Resource use: Is the total amount of spending on data collection justified, given the information it will provide, when compared to the amount spent on other areas of the organization (such as administrative and programmatic costs)?
Transportable
• Your analysis should help others too
• Sharing successes and failures
• External validity
  – The Balsakhi program helped inform the development of an education program in Ghana
Two line summary
Data: Have a plan! Preferably a good one.
Thank you!
Please refer to http://www.poverty-action.org/goldilocks for additional resources and case studies.