
Page 1

The Goldilocks Challenge: learning to structure a ‘right-fit’ and effective monitoring and evaluation system
Dean Karlan
Professor of Economics, Yale University
President, Innovations for Poverty Action

Page 2

The Goldilocks Problem

Presenter Notes:
Like Goldilocks, nonprofit organizations have to navigate many choices and challenges to build monitoring and evaluation systems. What kind of evaluation should the organization undertake? How can organizations develop systems that work just right?
Page 3

Overview

• Where to begin?
  – Theory of change
• What to evaluate?
  – Monitoring vs. impact evaluation
  – What is monitoring?
  – When is an impact evaluation appropriate?
  – When is an impact evaluation not appropriate?
• How to measure?
  – The core problem
  – Framework for ‘right-fit’ M&E systems: Credible (accurate data, appropriate analysis), Actionable, Responsible, Transportable

Page 4

Theory of Change

A theory of change explains the “why” of a program by telling a sequential story:
• what goes in,
• what gets done,
• what comes out, and
• how the world thus (hopefully) changes for the better.

Clearly articulating this helps organizations design sound programs and lays the foundation for right-fit data collection.

Presenter Notes:
Developing a theory of change typically begins in one of two ways. For new programs, an organization begins by identifying the problem or need it intends to address, or the change it wants to make through the program. Once an organization has agreed on the problem to solve or the change it intends to effect, it works backwards (often called backwards mapping) to lay out the series of actions needed to produce those results. At this point the program defines the specific activities it will undertake, the goods and services (outputs) it will deliver as a result of those activities, and the intended social changes (impact) it will create. At every step of the way, the assumptions underlying each step are clearly laid out and examined.

For organizations that already have a program in place, developing a theory of change is more like self-reflection. Such self-reflection can help strengthen programs by forcing organizations to challenge their assumptions and by asking whether sufficient evidence exists to support the current strategy. Such moments can be reassuring, but they can also help identify crucial gaps in current operations.

No matter when an organization starts developing a theory of change, the theory should articulate the assumptions that must hold for the program to work.
Page 5

Theory of change

Example from the Deworm the World Initiative

INPUT: Staff time, deworming tablets
→ ACTIVITY: Trainings, deworming days
→ OUTPUT: Number of students dewormed
→ OUTCOME: Decreased worm prevalence
→ IMPACT: Higher incomes
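To make the chain concrete in machine-readable form, here is a minimal sketch of this theory of change as plain Python data. It is not from the slides, and the "assumption" strings are illustrative, echoing the presenter note that the assumptions linking each step should be written down:

```python
# The Deworm the World chain as plain data (illustrative sketch; the
# "assumption" strings are ours, not from the slides).
from dataclasses import dataclass

@dataclass
class Step:
    stage: str        # INPUT, ACTIVITY, OUTPUT, OUTCOME, or IMPACT
    description: str
    assumption: str   # what must hold for this step to lead to the next

theory_of_change = [
    Step("INPUT", "Staff time, deworming tablets",
         "tablets and trainers actually reach the schools"),
    Step("ACTIVITY", "Trainings, deworming days",
         "teachers administer the tablets correctly"),
    Step("OUTPUT", "Number of students dewormed",
         "dewormed students were carrying worm infections"),
    Step("OUTCOME", "Decreased worm prevalence",
         "better health translates into schooling and, later, earnings"),
    Step("IMPACT", "Higher incomes", "end of the chain"),
]

for step in theory_of_change:
    print(f"{step.stage}: {step.description} (assumes: {step.assumption})")
```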

Page 6

Monitoring vs. Impact Evaluation

Inputs → Activities → Outputs → Outcomes → Impact

Monitoring asks: What did the program/policy/business use, do, and produce? (the left of the chain: inputs, activities, outputs)

Impact evaluation asks: How have lives changed compared to how they would have changed had the program/policy/business not happened? (the right of the chain: outcomes, impact)

Presenter Notes:
A note about terminology.
Page 7

[Flowchart: while a program is still figuring out implementation, it is in the ‘tinkering’ stage and the right tool is monitoring; this continues until ‘we know what we’re doing.’]

Page 8

What is monitoring?

• Collecting and analyzing high-quality, actionable data on program implementation

• Monitoring can help with:
  – Internal learning: answering questions related to program progress and program improvement
  – External reporting: demonstrating accountability and transparency

Page 9

[Flowchart, extending the previous one. While still figuring out implementation, the program is tinkering: monitoring! Once ‘we know what we’re doing’, ask: is there evidence? Yes: keep on doing that! No: can you generate it? Yes: monitoring AND impact evaluation! No: monitoring!]

Page 10

When is an impact evaluation appropriate?

Design issues:
• Sample size is sufficient
  – Often the deal breaker
  – Note that this means enough separable units (see the sketch after this list)
• Timing is right
  – Not yet implemented
  – Some path for expansion
• Knowledge gap worth filling
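On the sample-size point: a quick back-of-the-envelope power calculation shows why this is often the deal breaker. The sketch below is not from the slides; it uses the textbook two-arm approximation for detecting a difference in means, with the minimum detectable effect (MDE) expressed in standard-deviation units:

```python
# A sketch of a standard two-arm power calculation (not from the slides).
# n per arm = 2 * (z_{1-alpha/2} + z_{power})^2 / MDE^2, with MDE in SD units.
from scipy.stats import norm

def n_per_arm(mde_sd, alpha=0.05, power=0.80):
    """Sample size per arm for a two-sided test of a difference in means."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_power = norm.ppf(power)          # 0.84 for 80% power
    return 2 * (z_alpha + z_power) ** 2 / mde_sd ** 2

print(round(n_per_arm(0.20)))  # a 0.2 SD effect needs ~392 per arm (round up)
print(round(n_per_arm(0.50)))  # a 0.5 SD effect needs only ~63 per arm
```

If a program reaches, say, 40 villages and the village is the unit of randomization, no plausible effect size is detectable: that is the "enough separable units" constraint.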

Page 11

When is an impact evaluation appropriate?

Practical issues:
• There is budget and capacity to do it well
  – Quasi-experimental methods often cost just as much as RCTs: randomizing is not expensive, surveying is
  – Field work is done well and results are disseminated
• The partner cares deeply about the answer to the research question and will use the results

Proof of concept:
• The intervention has been implemented before and the process refined,
• but the idea needs to be validated,
• and there is a clear return on investment

Page 12

When is an impact evaluation not appropriate?

Design issues:
• Not enough separable units
  – Sample size is too small
  – Macro policy (monetary policy, trade, etc.)
• Spillovers are rampant and unmeasurable
• No resource constraint that can be randomized well
  – Example: refugee camp

Ethical issues:
• When we know the answer already
  – We don’t keep running RCTs on vaccines; they work, and continuing to evaluate would be unethical

Page 13

How to measure? The Core Problem

• Too little data
• Too much data
• Wrong data

Presenter Notes:
Too little: some organizations do not collect enough (of the right kind of) data, which means they cannot fulfill what should be their top priority: using data to learn, innovate, and improve program implementation over time.
Too much: some collect more data than they actually have the resources to analyze, resulting in wasted time and effort that could have been spent more productively elsewhere.
Wrong: many organizations track changes in outcomes over time, but not in a way that allows one to know whether the organization caused the change, or whether it just happened to happen alongside the program. This distinction matters greatly for knowing whether to continue the program.
Page 14

From which set would you choose?

[Images: a display of 6 jams vs. a display of 24 jams]

Presenter Notes:
Shoppers were offered a coupon for jam and then shown a display of either 6 jams or 24 jams.
Page 15

Jams: The results

Sheena Iyengar and Mark Lepper. “When Choice is Demotivating: Can One Desire Too Much of a Good Thing?” Journal of Personality and Social Psychology, 2000, Vol. 79, No. 6, 995-1006

[Bar chart: number of people (0 to 160) who stopped at the booth, 24 jams vs. 6 jams]

Page 16

Jams: The results

[Bar chart, same study: number of people who stopped at the booth and who purchased, 24 jams vs. 6 jams]

Presenter Notes:
Data are like jams: too much is overwhelming and can impede decision-making. So when thinking about how to find the right fit in data collection, remember that more is not always better.
Page 17

Finding the ‘right-fit’

[Chart: usefulness (vertical axis) as a function of the amount of data (horizontal axis)]

Presenter Notes:
How can organizations find right-fit monitoring and evaluation systems that support learning, action, and responsibility? As with Goldilocks’ search for the best porridge, chair, and bed, the key is to find the right data. More is not always better. Nor is less. And simply the middle is not the answer either. What is the right balance?
Page 18

Framework for ‘right-fit’ M&E Systems

CART principles:
• Credible: collect accurate, high-quality data and analyze them appropriately
• Actionable: commit to act on the data you collect
• Responsible: ensure the benefits of data collection outweigh the costs
• Transportable: collect data that generate knowledge for other programs

Presenter Notes:
Organizations need a framework to help them wade through the decisions they will encounter, whether they are setting up a whole monitoring and evaluation system from scratch; reforming an old, tired, and poorly fit system; or simply designing a small survey.
Page 19

Credible
• Data must accurately measure what they are supposed to measure
• Appropriate analysis must be conducted in a credible way

Bad-quality data and data analyzed badly are similar to snake oil: worse than doing nothing at all!

Presenter Notes:
There are two elements to credible data. Bad-quality data or bad analysis are similar to snake oil: worse than doing nothing at all.
Page 20

Accurate Data

Validity: is it what you were trying to measure? To be valid, data should capture the essence of what one is seeking to measure.

Reliability: can you trust the data that are collected? Reliability implies that the same data collection procedure will produce the same data repeatedly.

Presenter Notes:
Validity: to know someone’s age, asking their age is usually fairly straightforward. But many concepts are far less clear. Consider a simple example: an organization would like to ask a question that measures the number of times a person has sought medical attention in the last month. The concept the organization wants to measure is care seeking. What are all the types of activities that could fall under this concept? Doctor and hospital visits? Homeopathy? Traditional healers? Massage therapists? Depending on the context, the way this question is asked could result in many different answers.

Reliability: one simple way of thinking about reliability is to consider the use of physical instruments in data collection. Suppose a study uses scales to measure the weight of respondents. If the scales are calibrated differently each day, they will not produce a reliable estimate of weight. For survey questions, reliability implies that a survey question will be interpreted and answered the same way by different respondents. Administrative data or data from third-party sources also need to go through quality checks.
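In practice, many reliability problems can be caught with routine automated checks before the data reach analysis. A minimal sketch of such checks, with made-up field names and thresholds:

```python
# Routine quality checks on incoming survey records (illustrative sketch;
# field names and valid ranges are made up): flag values a correctly
# administered instrument should never produce, plus duplicate submissions.
records = [
    {"id": 101, "age": 34, "weight_kg": 62.0},
    {"id": 102, "age": -3, "weight_kg": 71.5},   # impossible age
    {"id": 101, "age": 34, "weight_kg": 62.0},   # duplicate submission
]

def quality_flags(rows):
    seen, flags = set(), []
    for r in rows:
        if not 0 <= r["age"] <= 120:
            flags.append((r["id"], "age out of range"))
        if not 2 <= r["weight_kg"] <= 300:
            flags.append((r["id"], "weight out of range"))
        if r["id"] in seen:
            flags.append((r["id"], "duplicate id"))
        seen.add(r["id"])
    return flags

print(quality_flags(records))
# [(102, 'age out of range'), (101, 'duplicate id')]
```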
Page 21

Appropriate Analysis

Page 22

Appropriate Analysis: Impact Evaluation

• Impact implies causality; it tells us how an organization has changed the world around it
• That means comparing what happened to what would have happened in the absence of the program:

      what happened with the program
    - what would have happened without the program
    = IMPACT of the program

• To measure impact, it is necessary to find a way to credibly estimate the counterfactual, i.e. how program participants would have fared if the program had not occurred

Page 23

Appropriate Analysis: Constructing the counterfactual

• The counterfactual is often constructed by selecting a group not affected by the program
• Non-randomized: argue that a certain excluded group mimics the counterfactual
• Randomized: use random assignment of the program to create a control group which mimics the counterfactual
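Why does random assignment work? With enough units, chance alone balances the two groups on every characteristic, including ones nobody measured, so the control group can stand in for the counterfactual. A simulated sketch (all numbers are made up):

```python
# Random assignment in simulation: the control group ends up with nearly the
# same average baseline characteristics as the treatment group, even though
# we never matched on anything. All numbers are simulated.
import random

random.seed(1)
villages = [{"baseline_income": random.gauss(100, 20)} for _ in range(1000)]
random.shuffle(villages)                       # random assignment
treatment, control = villages[:500], villages[500:]

def mean_income(group):
    return sum(v["baseline_income"] for v in group) / len(group)

print(round(mean_income(treatment), 1))  # ~100
print(round(mean_income(control), 1))    # ~100, nearly identical
```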

Page 24

Appropriate Analysis: Impact of a Remedial Education Program (Balsakhi)

Method                      Impact Estimate
Pre-post                     26.42*
Simple Difference            -5.05*
Difference-in-Difference      6.82*
Regression                    1.92
Randomized Experiment         5.87*

* Statistically significant at the 5% level
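The spread in that table comes from selection bias and background trends, which is easy to reproduce in a toy simulation. The sketch below uses made-up numbers, not the Balsakhi data: the true effect is 5, participants start worse off, and everyone improves over time.

```python
# Why the estimators disagree (illustrative simulation, not the Balsakhi data).
import random

random.seed(0)
TRUE_EFFECT, TREND = 5.0, 10.0   # program effect; secular improvement for all

def child(treated):
    base = random.gauss(40 if treated else 60, 5)  # selection: treated start lower
    post = base + TREND + (TRUE_EFFECT if treated else 0.0) + random.gauss(0, 2)
    return base, post

treated = [child(True) for _ in range(5000)]
comparison = [child(False) for _ in range(5000)]

def avg(xs):
    return sum(xs) / len(xs)

pre_t, post_t = avg([p for p, _ in treated]), avg([q for _, q in treated])
pre_c, post_c = avg([p for p, _ in comparison]), avg([q for _, q in comparison])

print("Pre-post:", round(post_t - pre_t, 1))             # ~15: trend + effect conflated
print("Simple difference:", round(post_t - post_c, 1))   # ~-15: selection bias
print("Diff-in-diff:", round((post_t - pre_t) - (post_c - pre_c), 1))  # ~5
```

Here difference-in-differences happens to recover the truth because both simulated groups share the same trend; real comparison groups offer no such guarantee, which is why the randomized experiment is the benchmark.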

Page 25

RCTs may be the gold standard for measuring impact, but they aren’t the right fit for every organization or every project.

Page 26

Actionable

Organizations should ask three questions of each and every piece of data that they want to collect:
• Is there a specific action that we will take based on the findings?
• Do we have the resources necessary to implement that action?
• Do we have the commitment required to take that action?

Presenter Notes:
Start with a clear description of the decisions you will make with the data. It is also necessary to determine decision triggers: if we find that our results are above or below a certain level, what will we do? It is also critical that systems can deliver the data in time for the decision to be made.
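The decision-trigger idea can be written down explicitly before any data are collected. A minimal sketch (indicator names, thresholds, and actions are all illustrative):

```python
# Each indicator is collected only because it is tied to a pre-agreed
# threshold and action (all names and numbers here are illustrative).
triggers = [
    # (indicator, threshold, action when the indicator falls below it)
    ("attendance_rate", 0.80, "retrain field staff on mobilization"),
    ("tablets_in_stock", 500, "reorder deworming tablets from supplier"),
]

latest = {"attendance_rate": 0.74, "tablets_in_stock": 1200}

for indicator, threshold, action in triggers:
    if latest[indicator] < threshold:
        print(f"{indicator} = {latest[indicator]} is below {threshold}: {action}")
# attendance_rate = 0.74 is below 0.8: retrain field staff on mobilization
```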
Page 27

Responsible

The responsibility principle can help organizations assess tradeoffs in a number of different areas, including:
• Data collection methods: Is there a cheaper or more efficient method of data collection that does not compromise quality?
• Use of respondents’ time: Does the information to be gained justify taking a beneficiary’s time to answer?
• Resource use: Is the total amount of spending on data collection justified, given the information it will provide, when compared to the amount spent on other areas of the organization (such as administrative and programmatic costs)?

Page 28

Transportable

• Your analysis should help others too
• Sharing successes and failures
• External validity: the Balsakhi program helped inform the development of an education program in Ghana

Page 29

Two-line summary

Data: Have a plan! Preferably a good one.

Page 30

Thank you!

Please refer to http://www.poverty-action.org/goldilocks for additional resources and case studies.