IDEV 624 – Monitoring and Evaluation


Page 1: IDEV 624 – Monitoring and Evaluation

IDEV 624 – Monitoring and Evaluation

Evaluating Program Impact

Elke de Buhr, PhD, Payson Center for International Development

Tulane University

Page 2: IDEV 624 – Monitoring and Evaluation

Process vs. Outcome/Impact Monitoring

[Diagram: Process Monitoring vs. Outcome/Impact Monitoring and Evaluation, referencing the LFM and the USAID Results Framework]

Page 3: IDEV 624 – Monitoring and Evaluation

A Public Health Questions Approach to HIV/AIDS M&E (UNAIDS 2008)

Are we doing the right things? Are we doing them right? Are we doing them on a large enough scale?

Problem Identification
• What is the problem? → Situation Analysis & Surveillance
• What are the contributing factors? → Determinants Research

Understanding Potential Responses
• What interventions and resources are needed? → Needs, Resource, Response Analysis & Input Monitoring (INPUTS)
• What interventions can work (efficacy & effectiveness)? → Efficacy & Effectiveness Studies, Formative & Summative Evaluation, Research Synthesis

Monitoring & Evaluating National Programs
• Are we implementing the program as planned? → Outputs Monitoring (OUTPUTS)
• What are we doing? Are we doing it right? → Process Monitoring & Evaluation, Quality Assessments (ACTIVITIES)
• Are interventions working/making a difference? → Outcome Evaluation Studies (OUTCOMES)

Determining Collective Effectiveness
• Are collective efforts being implemented on a large enough scale to impact the epidemic (coverage; impact)? → Surveys & Surveillance (OUTCOMES & IMPACTS)

Page 4: IDEV 624 – Monitoring and Evaluation

Strategic Planning for M&E: Setting Realistic Expectations

Levels of Monitoring & Evaluation Effort, by number of projects:

• All projects: Input/Output Monitoring
• Most projects: Process Evaluation
• Some projects: Outcome Monitoring / Evaluation
• Few projects*: Impact Monitoring / Evaluation

*Disease impact monitoring is synonymous with disease surveillance and should be part of all national-level efforts, but cannot be easily linked to specific projects

Page 5: IDEV 624 – Monitoring and Evaluation

Monitoring Strategy

• Process: Activities

• Outcome/Impact: Goals and objectives

Page 6: IDEV 624 – Monitoring and Evaluation

Impact Evaluation

Page 7: IDEV 624 – Monitoring and Evaluation

Impact Evaluation

• Impact evaluations are undertaken to find out whether a program has accomplished its intended effects

• Directed at the net effects of an intervention, impact evaluations produce “an estimate of the impact of the intervention uncontaminated by the influence of other processes and events that also may affect the behavior or conditions at which the social program being evaluated is directed” (Rossi/Freeman 1989: 229)

• Ideally, impact assessments establish causality by means of a randomized experiment

Page 8: IDEV 624 – Monitoring and Evaluation

Outcome vs. Impact

• Outcome level: Status of an outcome at some point in time

• Outcome change: Difference between outcome levels at different points in time

• Impact/program effect: Proportion of an outcome change that can be attributed uniquely to a program as opposed to the influence of some other factor

(Rossi/Lipsey/Freeman 2004)

Page 9: IDEV 624 – Monitoring and Evaluation

Outcome vs. Impact (cont.)

• Impact/program effect: the value added or net gain that would not have occurred without the program, and the only part of the outcome for which the program can honestly take credit
– Most demanding evaluation task
– Time-consuming and costly
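To make the distinction concrete, here is a minimal numeric sketch of separating outcome change from program effect using a comparison group; all figures are invented for illustration:

```python
# Hypothetical outcome levels (e.g., % of households above a welfare threshold).
program_before, program_after = 40.0, 60.0        # program group
comparison_before, comparison_after = 40.0, 52.0  # comparison group

outcome_change = program_after - program_before         # 20 points observed
secular_change = comparison_after - comparison_before   # 12 points due to other factors
program_effect = outcome_change - secular_change        # 8 points attributable to the program

print(f"Outcome change: {outcome_change:.0f}, program effect: {program_effect:.0f}")
```

Only the 8-point difference is the net gain the program can honestly take credit for; the other 12 points would have occurred without it.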

Page 10: IDEV 624 – Monitoring and Evaluation

(Rossi/Lipsey/Freeman 2004: 207)

Page 11: IDEV 624 – Monitoring and Evaluation

Outline of an Impact Evaluation

1. Unit of analysis

2. Research question/hypothesis

3. Evaluation design

4. Sampling method

5. Impact indicators

6. Data analysis plan

Page 12: IDEV 624 – Monitoring and Evaluation

1. Unit of Analysis

Page 13: IDEV 624 – Monitoring and Evaluation

Unit of Analysis

• Unit of analysis: The units on which outcome measures are taken in an impact assessment and, correspondingly, the units on which data are available for analysis

• The unit of analysis in impact assessments is determined by
1. the nature of the intervention and
2. the targets to which the intervention is directed

• Can be individuals, households, neighborhoods, organizations, geographic areas, etc.

(Rossi/Lipsey/Freeman 2004)

Page 14: IDEV 624 – Monitoring and Evaluation

What are your program’s units of analysis?

Page 15: IDEV 624 – Monitoring and Evaluation

2. Research Question/Hypothesis

Page 16: IDEV 624 – Monitoring and Evaluation

Hypothesis

• Hypothesis: A formal statement that predicts a relationship between one or more factors and the problem under study

• Support or reject the null hypothesis

• Null = no relationship

• Test:
– Compare the same variable over time
– Compare two or more groups (see the sketch below)
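As a hedged illustration of the second test strategy, comparing two groups: the scores, group sizes, and 0.05 significance level below are assumptions made for the example, not course data.

```python
from scipy import stats

# Hypothetical post-intervention scores; replace with real program data.
treatment = [72, 68, 75, 80, 66, 77, 71, 74]
comparison = [65, 70, 62, 68, 64, 67, 69, 63]

# H0 (null hypothesis): no relationship, i.e., the two groups
# have the same mean outcome.
t_stat, p_value = stats.ttest_ind(treatment, comparison)

alpha = 0.05  # conventional significance level, an assumption here
if p_value < alpha:
    print(f"p = {p_value:.3f}: reject the null hypothesis")
else:
    print(f"p = {p_value:.3f}: fail to reject the null hypothesis")
```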

Page 17: IDEV 624 – Monitoring and Evaluation

Can you formulate a null hypothesis for your program?

Page 18: IDEV 624 – Monitoring and Evaluation

3. Evaluation Design

Page 19: IDEV 624 – Monitoring and Evaluation

Evaluation Designs

• Evaluation strategies:
– Comparisons over time
– Comparisons between groups

• Research designs:
– Pre-test/Post-test designs
– Time series
– Quasi-experiments
– Randomized experiments

Page 20: IDEV 624 – Monitoring and Evaluation

Comparisons Over Time

Pretest/Post-test design:

O1 X O2

Longitudinal designs / Time series:

O1 O2 O3 X O4 O5 O6

O1 X O2 X O3 X O4

Page 21: IDEV 624 – Monitoring and Evaluation

Effect of Intervention?

(Fisher, A.A. and J.R. Foreit, Designing HIV/AIDS Intervention Studies: An Operations Handbook. Population Council, May 2002, p. 56)

Page 22: IDEV 624 – Monitoring and Evaluation

Effect of Intervention?

(Fisher and Foreit, p. 57)

Page 23: IDEV 624 – Monitoring and Evaluation

Effect of Intervention?

(Fisher and Foreit, p. 57)

Page 24: IDEV 624 – Monitoring and Evaluation

Effect of Intervention?

(Fisher and Foreit, p. 58)

Page 25: IDEV 624 – Monitoring and Evaluation

Comparisons Between Groups

Quasi-experimental design:

Experimental group: O1 X O2
Comparison group:   O3      O4

Experimental design (R = random assignment):

Experimental group: R O1 X O2
Control group:      R O3      O4

Page 26: IDEV 624 – Monitoring and Evaluation

Randomized Experiments

• “Flagships of impact assessment” (Rossi/Lipsey/Freeman 2004: 262)

• When conducted well, provide the most credible conclusions about program effects

• Isolate the effects of the intervention being evaluated by ensuring that intervention and control groups are statistically equivalent except for the intervention received

• In practice, it is sufficient if the groups, as aggregates, are comparable with regard to any characteristic relevant to the outcome

Page 27: IDEV 624 – Monitoring and Evaluation

Randomization

• Randomization: Assignment of potential targets to intervention and control groups on the basis of chance, so that every unit in a target population has the same probability as any other of being selected for either group

• Approximations of randomization: Acceptable if the groups being compared do not differ on any characteristic relevant to the intervention or the expected outcomes (→ Quasi-experiments)

(Rossi/Lipsey/Freeman 2004)
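A minimal sketch of randomization as defined above, giving every unit the same probability of ending up in either group; the unit labels and seed are hypothetical:

```python
import random

units = [f"household_{i}" for i in range(1, 21)]  # hypothetical target units

random.seed(42)        # fixed seed so the assignment is reproducible
random.shuffle(units)  # chance ordering: every unit equally likely in any position

midpoint = len(units) // 2
intervention_group = units[:midpoint]
control_group = units[midpoint:]

print("Intervention:", intervention_group)
print("Control:", control_group)
```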

Page 28: IDEV 624 – Monitoring and Evaluation

Feasible?

• Randomized experiments are not feasible for all impact assessments

• Results may be ambiguous if
– the program is in the early stages of implementation
– interventions change in ways experiments cannot easily capture

• In addition, the method may
– be perceived as unfair or unethical (requires withholding services from parts of the target population)
– be too resource intensive (technical expertise, time, costs, etc.)
– cause disruption in program procedures for delivering services, creating an artificial situation

Page 29: IDEV 624 – Monitoring and Evaluation

Quasi-Experimental Designs

• Often used when it is not feasible to randomly assign targets to intervention and control groups

• Types of quasi-experimental designs: matched controls, statistical controls, reflexive controls, etc.

• Threats to validity: Selection bias, secular trends, interfering events, maturation

Page 30: IDEV 624 – Monitoring and Evaluation

Threats to Validity

Page 31: IDEV 624 – Monitoring and Evaluation

Threats to Internal Validity

• INTERNAL VALIDITY: Any changes observed in the dependent variable are due to the effect of the independent variable, not to some other independent variables (extraneous variables, alternative explanations, rival hypotheses). The extraneous variables need to be controlled for in order to be sure that any results are due to the treatment; only then is the study internally valid.

• Threat of History: Study participants may have had outside learning experiences that enhanced their knowledge of a topic, and thus score better when assessed after an intervention, independent of the impact of the intervention. (No control group)

• Threat of Maturation: Study participants may have matured in their ability to understand concepts and developed learning skills over time, and thus score better when assessed after an intervention, independent of the impact of the intervention. (No control group)

• Threat of Mortality: Study participants may drop out and not participate in all measures. Those who drop out are likely to differ from those who continue to participate. (No pretest)

• Threat of Testing: Study participants might do better on the posttest compared to the pretest simply because they take the same test a second time.

• Threat of Instrumentation: The posttest may have been revised or otherwise modified compared to the pretest, so the two tests are no longer comparable.

• John Henry Effect: The control group may try extra hard after not becoming part of the “chosen” group (compensatory rivalry).

• Resentful Demoralization of Control Group: Opposite of the John Henry Effect. The control group may be demoralized and perform below normal after not becoming part of the “chosen” group.

• Compensatory Equalization: The control group may feel disadvantaged for not being part of the “chosen” group and receive extra resources to keep everybody happy. This can cloud the effect of the intervention.

• Statistical Regression: A threat to validity in cases in which the researcher uses extreme groups as study participants, selected based on test scores. Due to the role that chance plays in test scores, the scores of students who score at the bottom of the normal curve are likely to go up, and the scores of those who score at the top are likely to go down, when they are assessed a second time.

• Differential Selection: Experimental and control groups differ in their characteristics. This may influence the results.

• Selection-Maturation Interaction: Combines the threats to validity described as differential selection and maturation. If experimental and control groups differ in important respects, for example age, differences in achievement might be due to this maturational characteristic rather than the treatment.

• Experimental Treatment Diffusion: Close proximity of treatment and control groups might result in treatment diffusion. This clouds the effect of the intervention.

Page 32: IDEV 624 – Monitoring and Evaluation

Threats to Validity Matrix

Design                                     History  Maturation  Mortality  Testing  Instrumentation  John Henry  Comp. Equalization  Diff. Selection
One-Shot Case Study                        YES      YES         YES        -        -                -           -                   -
One-Group Pretest-Posttest Design          YES      YES         CONT.      YES      MAYBE            -           -                   -
Time Series Design                         YES      CONT.       CONT.      YES      MAYBE            -           -                   -
Pretest-Posttest Control Group Design      CONT.    CONT.       CONT.      CONT.    CONT.            MAYBE       MAYBE               CONT.
Posttest-Only Control Group Design         CONT.    CONT.       YES        -        -                MAYBE       MAYBE               CONT.
Single-Factor Multiple Treatment Designs   CONT.    CONT.       CONT.      CONT.    CONT.            MAYBE       MAYBE               CONT.
Solomon 4-Group Design                     CONT.    CONT.       CONT.      CONT.    CONT.            MAYBE       MAYBE               CONT.
Factorial Design                           CONT.    CONT.       CONT.      CONT.    CONT.            MAYBE       MAYBE               CONT.
Static-Group Comparison Design             CONT.    CONT.       YES        -        -                MAYBE       MAYBE               YES
Nonequivalent Control Group Design         CONT.    CONT.       CONT.      CONT.    CONT.            MAYBE       MAYBE               CONT.

Page 33: IDEV 624 – Monitoring and Evaluation

Research Designs - Variations

A. Simple Designs

B. Cross-Sectional Studies

C. Longitudinal Studies

D. Experimental Designs

Page 34: IDEV 624 – Monitoring and Evaluation

A. Simple Designs

• One-Shot Case Study:                X O

• One-Group Pretest-Posttest Design:  O X O

• Time Series Design:                 O O O O X O O O O

R = Random assignment of subjects to conditions
X = Experimental treatment
O = Observation of the dependent variable (pretest, posttest, interim measure, etc.)

Page 35: IDEV 624 – Monitoring and Evaluation

B. Cross-Sectional Studies

[Diagram: Group 1, Group 2, and Group 3 compared side by side]

Comparison of groups. One point in time.

Variations: Case-control study

Page 36: IDEV 624 – Monitoring and Evaluation

Case-Control Study

[Diagram: Group 1 (with characteristic) and Group 2 (without characteristic), traced back to earlier event(s)]

Comparison of groups. One point in time.

Major limitations: Cannot be sure that the population has not changed since the event(s).

Page 37: IDEV 624 – Monitoring and Evaluation

C. Longitudinal Studies

[Diagram: the same population measured at three successive points in time]

Comparison of population over time. Repeated measurements.

Variations: Panel study, Cohort study

Page 38: IDEV 624 – Monitoring and Evaluation

Panel Study

[Diagram: Group 1 measured at three successive points in time]

Measures change over time. Repeated data collection from same individuals.

Major limitations: High drop-out rates pose a threat to internal validity.

Page 39: IDEV 624 – Monitoring and Evaluation

Cohort Study

[Diagram: cohort samples (1), (2), and (3) drawn at successive points in time]

Measures change over time. Repeated data collection from same cohort but different individuals.

Major limitations: Measures total change but fluctuations within cohort are not assessed.

Page 40: IDEV 624 – Monitoring and Evaluation

D. Experimental Designs

[Diagram: Group 1 and Group 2 measured at pre-test; Group 1 receives the experimental treatment; both groups measured again at post-test]

Compares group(s) exposed to treatment with a group not exposed to treatment. Measures at two points in time.

Variations: True experimental design, Quasi-experimental design

Page 41: IDEV 624 – Monitoring and Evaluation

True Experimental Design

True Experimental Design

[Diagram: target population randomly assigned to Group 1 and Group 2; both measured at pre-test; Group 1 receives the treatment; both measured again at post-test]

Compares group(s) exposed to treatment with a group not exposed to treatment. Measures at two points in time. Research subjects are assigned randomly to treatment and control groups.

Major limitations: Not feasible for all research; ethical problems.

Page 42: IDEV 624 – Monitoring and Evaluation

True Experimental Designs

• True experimental designs use control groups and random assignment of participants

Variations:
• Pretest-Posttest Control Group Design
• Posttest-Only Control Group Design
• Single-Factor Multiple Treatment Designs
• Solomon 4-Group Design
• Factorial Design

Page 43: IDEV 624 – Monitoring and Evaluation

Pretest-Posttest Control Group Design

• The randomly assigned experimental group receives the treatment and the control group receives no treatment or an alternative treatment

R O X O
R O   O
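One common analysis of this design regresses the outcome on group, time, and their interaction; the interaction coefficient estimates the difference in gains, (O2 − O1) − (O4 − O3). A sketch with invented individual-level data (the column names and values are assumptions):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per subject per measurement wave.
df = pd.DataFrame({
    "score":   [50, 52, 48, 66, 64, 67, 51, 49, 53, 55, 57, 56],
    "treated": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # 1 = experimental group
    "post":    [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],  # 1 = posttest wave
})

# The treated:post interaction estimates the program effect
# as the difference in pre-to-post gains between the groups.
model = smf.ols("score ~ treated * post", data=df).fit()
print(model.params["treated:post"])
```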

Page 44: IDEV 624 – Monitoring and Evaluation

Posttest-Only Control Group Design

• Like the previous design, but without a pretest

R X O
R   O

Page 45: IDEV 624 – Monitoring and Evaluation

Single-Factor Multiple Treatment Designs

• Extension of the Pretest-Posttest Control Group Design
• The sample is assigned randomly to one of several conditions

R O X1 O
R O X2 O
R O    O

Page 46: IDEV 624 – Monitoring and Evaluation

Solomon 4 – Group Design

• Developed by researchers who worried about the effect of pretesting on the validity of the results.

R O X O
R O   O
R   X O
R     O

Page 47: IDEV 624 – Monitoring and Evaluation

Factorial Design

• Allows the evaluator to include more than one independent variable.

• Tests for the effects of different kinds of variables that might be expected to influence outcomes (gender, age, etc.). A two-way ANOVA sketch follows.

Two independent variables:
A
B
A x B

Three independent variables:
A
B
C
A x B, A x C, B x C
A x B x C
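A hedged sketch of analyzing a two-factor design with a two-way ANOVA, testing the main effects of A and B and the A x B interaction; the factors (a treatment crossed with gender) and scores are invented for illustration:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical 2x2 factorial data: factor A (treatment) crossed with factor B (gender).
df = pd.DataFrame({
    "score": [70, 72, 68, 75, 60, 62, 64, 61, 78, 80, 77, 82, 63, 65, 61, 66],
    "A": ["treat"] * 8 + ["control"] * 8,
    "B": (["male"] * 4 + ["female"] * 4) * 2,
})

# Main effects of A and B plus the A x B interaction.
model = smf.ols("score ~ C(A) * C(B)", data=df).fit()
print(anova_lm(model, typ=2))
```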

Page 48: IDEV 624 – Monitoring and Evaluation

Quasi-Experimental Design

[Diagram: target population divided into Group 1 and Group 2 without random assignment; both measured at pre-test; Group 1 receives the treatment; both measured again at post-test]

Compares group(s) exposed to treatment with a group not exposed to treatment. Measures at two points in time. Random assignment not possible.

Major limitations: Not a true experiment. Threats to validity (→ selection bias).

Page 49: IDEV 624 – Monitoring and Evaluation

Quasi-Experimental Designs

• Quasi-experimental designs lack the random assignment of experimental designs.

Variations:

• Static-Group Comparison Design:

X O
---------
  O

• Nonequivalent Control Group Design:

O X O
---------
O   O

Page 50: IDEV 624 – Monitoring and Evaluation

Choosing an Evaluation Design

Page 51: IDEV 624 – Monitoring and Evaluation

Impact Evaluation Strategy

• Comparison
– Same group (over time)
– Different groups

• The design balances accuracy and reliability with cost and feasibility

What is a “good enough” research design?

Page 52: IDEV 624 – Monitoring and Evaluation

Research Design Flow-Chart

[Flow chart: Research Design branches into Observational Study and Experimental Study. Observational studies are Cross-Sectional or Longitudinal, with methods such as Survey Research and Participant Observation. Experimental studies are Single Group, True Experiment, or Quasi-Experiment, with methods such as Clinical Experiment and Natural Experiment.]

Page 53: IDEV 624 – Monitoring and Evaluation

Comparison Group Flow Chart

(Methodologist Toolchest, Version 3.0)

Page 54: IDEV 624 – Monitoring and Evaluation

4. Sampling Methods

Page 55: IDEV 624 – Monitoring and Evaluation

Sample Selection

• Sample size

• Sampling frame

• Sample selection = sampling
– Probability sampling
– Nonprobability sampling

Page 56: IDEV 624 – Monitoring and Evaluation

Sampling Methods

• Census vs. Sampling
– A census measures all units in a population
– Sampling identifies and measures a subset of individuals within the population

• Probability vs. Non-Probability Sampling
– Probability sampling results in a sample that is representative of the target population
– A non-probability sample is not representative of any population

Page 57: IDEV 624 – Monitoring and Evaluation

Probability Sampling

• Sample representative of the target population; large sample size
– Simple random/systematic sampling
– Stratified random/systematic sampling
– Cluster sampling
– Experimental and quasi-experimental designs

Advantages:
• Findings representative of the population
• Advanced statistical analysis possible

Disadvantages:
• Costly and time consuming (depending on target population)
• Significant training needs
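To illustrate two of the methods listed above, a minimal sketch of simple random and stratified random sampling; the sampling frame, strata, and sample sizes are hypothetical:

```python
import random

random.seed(7)  # reproducible example

# Hypothetical sampling frame: 1,000 households tagged by region (the stratum).
frame = [{"id": i, "region": random.choice(["north", "south"])}
         for i in range(1000)]

# Simple random sampling: every unit has an equal chance of selection.
simple_sample = random.sample(frame, k=100)

# Stratified random sampling: draw 10% within each region so each
# stratum is represented in proportion to its size.
stratified_sample = []
for region in ("north", "south"):
    stratum = [unit for unit in frame if unit["region"] == region]
    stratified_sample.extend(random.sample(stratum, k=len(stratum) // 10))

print(len(simple_sample), len(stratified_sample))
```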

Page 58: IDEV 624 – Monitoring and Evaluation

5. Impact Indicators

Page 59: IDEV 624 – Monitoring and Evaluation

Concepts, Variables and Indicators

            Example 1           Example 2                                          Example 3
Concepts    Size                Economic well-being                                Health
Variables   Area                Income per capita                                  Life expectancy
Indicators  Square kilometers   Purchasing Power Parity (PPP) GNP ($) per capita   Average years of life if born in 1970

(Phuong Pham, Introduction to Quantitative Analysis)

Page 60: IDEV 624 – Monitoring and Evaluation

Indicator Criteria

1. Measurable (able to be recorded and analyzed in quantitative or qualitative terms)

2. Precise (defined the same way by all people)

3. Consistent (not changing over time so that it always measures the same thing)

4. Sensitive (changing proportionally in response to actual changes in the condition or item being measured)

Page 61: IDEV 624 – Monitoring and Evaluation

Categorical vs. Continuous Variables

• Continuous variables
– A variable that can be measured (weight, height, age, etc.)

• Categorical variables
– A variable that cannot be measured but can be categorized (ethnic group, age group, educational level, socio-economic class, etc.)

Page 62: IDEV 624 – Monitoring and Evaluation

6. Data Analysis Plan

Page 63: IDEV 624 – Monitoring and Evaluation

Data Analysis

• Type of variable
– Categorical
– Continuous

• Type of data analysis
– Descriptive analysis
– Hypothesis testing

Page 64: IDEV 624 – Monitoring and Evaluation

Descriptive Analysis vs. Hypothesis Testing

• Descriptive data analysis
– Organizing and summarizing data

• Statistical inference
– The procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population

(Phuong Pham, Introduction to Quantitative Analysis)
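A brief sketch contrasting the two kinds of analysis; the sample values and the benchmark tested against are assumptions for the example:

```python
import pandas as pd
from scipy import stats

# Hypothetical sample drawn from the target population.
sample = pd.Series([4.2, 5.1, 3.8, 4.9, 5.3, 4.4, 4.8, 5.0])

# Descriptive analysis: organize and summarize the data in hand.
print(sample.describe())  # count, mean, std, quartiles, etc.

# Statistical inference: reach a conclusion about the population,
# e.g., test whether the population mean differs from a benchmark of 4.0.
t_stat, p_value = stats.ttest_1samp(sample, popmean=4.0)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```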

Page 65: IDEV 624 – Monitoring and Evaluation

Exercise

Page 66: IDEV 624 – Monitoring and Evaluation

Exercise

• Outline an Outcome and/or Impact Evaluation for your program

• Include a description of:

1. Unit of analysis

2. Research question/hypothesis

3. Evaluation design

4. Sampling method

5. Impact indicators

6. Data analysis plan