Problems with the Design and Implementation of Randomized Experiments By Larry V. Hedges Northwestern University Presented at the 2009 IES Research Conference



Problems with the Design and Implementation

of Randomized Experiments

By Larry V. Hedges

Northwestern University

Presented at the 2009 IES Research Conference


Hard Answers to Easy Questions

By Larry V. Hedges

Northwestern University

Presented at the 2009 IES Research Conference


Easy Question

Isn’t it ok if I just match (schools) on some variable before randomizing?

(You know, lots of people do it)

This is a simple question, but answering it requires serious thinking about design and analysis


What Does this Question Mean?

Generally, adding matching or blocking variables means adding another (blocking) factor to the design

The exact consequences depend on the design you started with:

• Individually randomized (completely randomized design)

• Cluster randomized (hierarchical design)

• Multicenter or matched (randomized blocks design)


Individually Randomized (Completely Randomized) Design

In this case you are adding a blocking factor crossed with treatment (p blocks)

In other words, the design becomes a (generalized) randomized block design

            Blocks
      1     2     …     p
T
C


Individually Randomized (Completely Randomized) Design

How does this impact the analysis?

Think about a balanced design with 2n students per block and p blocks and the ANOVA partitioning of sums of squares and degrees of freedom

Original partitioning

SSTotal = SST + SSWT

dfTotal = dfT + dfWT

2pn – 1 = 1 + (2pn – 2)

Original test statistic

F = SST/(SSWT/dfWT)


Individually Randomized (Completely Randomized) Design

New partitioning

SSTotal = SST + SSB + SSBxT + SSWC

dfTotal = dfT + dfB + dfBxT + dfWC

2pn – 1 = 1 + (p – 1) + (p – 1) + 2p(n – 1)

New test statistic?

F = SST/(SSWC/dfWC)

or

F = SST/(SSBxT/dfBxT)

It depends on the inference model


Individually Randomized (Completely Randomized) Design

Original Design               Blocked Design
SS = SST + SSWT               SS = SST + (SSB + SSBxT + SSWC)
df = dfT + dfWT               df = dfT + (dfB + dfBxT + dfWC)
2pn – 1 = 1 + (2pn – 2)       2pn – 1 = 1 + (p – 1) + (p – 1) + 2p(n – 1)
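The two degrees-of-freedom partitions can be checked numerically. The following is a minimal sketch, not part of the presentation: the function name and the example sizes are hypothetical, and it assumes a balanced design with p blocks and n students per treatment arm in each block.

```python
def df_partition_individual(p: int, n: int) -> dict:
    """Degrees of freedom for a balanced, individually randomized design
    with p blocks and n students per treatment arm in each block
    (2n students per block, 2pn students in total)."""
    total = 2 * p * n - 1
    # Original design: treatment plus within-treatment error.
    original = {"T": 1, "WT": 2 * p * n - 2}
    # Blocked design: the within-treatment df split into three pieces.
    blocked = {
        "T": 1,
        "B": p - 1,
        "BxT": p - 1,
        "WC": 2 * p * (n - 1),
    }
    # Both partitions must account for every degree of freedom.
    assert sum(original.values()) == total
    assert sum(blocked.values()) == total
    return {"total": total, "original": original, "blocked": blocked}


if __name__ == "__main__":
    # Hypothetical sizes: 10 blocks, 20 students per arm per block.
    print(df_partition_individual(p=10, n=20))
```

With these hypothetical sizes the within-cell error term keeps 380 of the 398 original error df, while the block-by-treatment term has only 9.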


Inference Models

I will mention two inference models

• Conditional inference model

• Unconditional inference model

These inference models determine the type of inference (generalization) you wish to make

The inference model chosen has implications for the statistical analysis procedure chosen

The inference model determines the natural random effects


Inference Models

Conditional Inference Model

Generalization is to the blocks actually in the experiment (or those just like them)

Blocks in the experiment are the universe (population)

Generalization to other blocks depends on extra-statistical considerations (which blocks are just like them? How do you know?)

Generalization obviously cannot be model free

Inference Models

Unconditional Inference Model

Generalization is to a universe (of blocks) including blocks not in the experiment

Blocks in the experiment are a sample of blocks in the universe (population)

If blocks in the experiment can be considered a representative sample, inference to the population of blocks is by sampling theory

If blocks are not a probability sample, generalization gets tricky (what is the universe? How do you know?)


Inference Models

You can think of the inference model as linked to the sampling model for blocks

If the blocks observed are a (random) sample of blocks, then they are a source of random variation

If blocks observed are the entire universe of relevant blocks, then they are not a source of random variation

The statistical analysis can be chosen independently of the inference model, but if it doesn’t include all sources of random variation, inferences will be compromised

Inference Models and Statistical Analyses: Individually Randomized Design

Blocks are fixed effects under the conditional inference model

In this case the correct test statistic is

FC = SST/(SSWC/dfWC)

and the F-distribution has 1 and 2p(n – 1) df

Block effects are random under the unconditional inference model

In this case the correct test statistic is

FU = SST/(SSBxT/dfBxT)

and the F-distribution has 1 and (p – 1) df
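To make the two test statistics concrete, here is a minimal sketch that computes FC and FU from raw data in a balanced blocked design. It is an illustration only: the function name, the data layout, and the tiny data set are hypothetical, and no p-values are computed (those would need the F-distribution with the df given above).

```python
def blocked_anova_f(data):
    """Return (F_C, F_U) for a balanced two-arm blocked design.

    data[t][b] is the list of n outcomes for arm t (0 = control,
    1 = treatment) in block b; every cell must have the same size.
    """
    p = len(data[0])      # number of blocks
    n = len(data[0][0])   # students per arm per block

    def mean(xs):
        return sum(xs) / len(xs)

    cell = [[mean(data[t][b]) for b in range(p)] for t in range(2)]
    trt = [mean(cell[t]) for t in range(2)]
    blk = [mean([cell[0][b], cell[1][b]]) for b in range(p)]
    grand = mean(trt)

    # Sums of squares for treatment, block-by-treatment, and within cells.
    ss_t = p * n * sum((trt[t] - grand) ** 2 for t in range(2))
    ss_bxt = n * sum(
        (cell[t][b] - trt[t] - blk[b] + grand) ** 2
        for t in range(2) for b in range(p)
    )
    ss_wc = sum(
        (y - cell[t][b]) ** 2
        for t in range(2) for b in range(p) for y in data[t][b]
    )

    df_bxt = p - 1
    df_wc = 2 * p * (n - 1)
    f_c = ss_t / (ss_wc / df_wc)     # conditional: blocks fixed
    f_u = ss_t / (ss_bxt / df_bxt)   # unconditional: blocks random
    return f_c, f_u


if __name__ == "__main__":
    # Hypothetical data: 2 blocks, 2 students per arm per block.
    example = [
        [[1, 3], [2, 4]],    # control arm, blocks 1 and 2
        [[4, 6], [8, 12]],   # treatment arm, blocks 1 and 2
    ]
    f_c, f_u = blocked_anova_f(example)
    print(f"F_C = {f_c:.3f}, F_U = {f_u:.3f}")
```

Both statistics share the numerator SST (which has 1 df); only the error term in the denominator changes with the inference model.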

Inference Models and Statistical Analyses: Individually Randomized Design

You can see that the error term in the test has (a lot) more df under the fixed effects model: 2p(n – 1) versus (p – 1)

What you can’t see is that (if there is a treatment effect) the average value of the F-statistic is typically also larger under the fixed effects model

It is bigger by a factor proportional to [an expression in n, ω, and ρ that is not recoverable from this transcript]

where ω = σBxT²/σB² is a treatment heterogeneity parameter and ρ is the intraclass correlation
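One way to see why the fixed effects F-statistic tends to be larger is to compare the expected values of the two error mean squares. The derivation below is a sketch under assumptions that are not taken from the slides: the usual (unrestricted) expected mean squares for a balanced design with n students per cell, and ρ defined as σB²/(σB² + σW²), where σW² is the within-cell variance. It is not presented as the expression the speaker showed.

```latex
\begin{align*}
E[MS_{WC}]  &= \sigma_W^2, \\
E[MS_{BxT}] &= \sigma_W^2 + n\,\sigma_{BxT}^2, \\
\frac{E[MS_{BxT}]}{E[MS_{WC}]}
  &= 1 + \frac{n\,\sigma_{BxT}^2}{\sigma_W^2}
   = 1 + \frac{n\,\omega\,\rho}{1-\rho}
   = \frac{1-\rho+n\,\omega\,\rho}{1-\rho}.
\end{align*}
```

Because FC and FU share the same numerator, a larger error mean square in FU means a smaller F. Under these assumptions the gap grows with the cell size n, the heterogeneity ω, and the intraclass correlation ρ, and it vanishes when ω = 0.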

Possible Statistical Analyses: Individually Randomized Design

Possible statistical analyses

1. Ignore the blocking

2. Include blocks as fixed effects

3. Include blocks as random effects

Consequences depend on whether you want to make a conditional or unconditional inference

Making Unconditional Inferences: Individually Randomized Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: Will inflate significance levels of tests for treatment effects substantially

2. Include blocks as fixed effects

Bad idea: Will inflate significance levels of tests for treatment effects substantially

3. Include blocks as random effects

Correct significance levels (but less power than conditional analysis)

Making Conditional Inferences: Individually Randomized Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: May deflate actual significance levels of tests for treatment effects substantially (unless ρ = 0)

2. Include blocks as fixed effects

Correct significance levels and more powerful test than for unconditional analysis

3. Include blocks as random effects

Bad idea: May deflate significance levels and reduce power


Cluster Randomized (Hierarchical) Design

The issues about blocking in the cluster randomized design are the same as in the individually randomized design

The inference model will determine the most appropriate statistical analysis

Examining the properties of the statistical analysis may also reveal the weakness of the design for a given inference purpose

For example, a small number of blocks may provide only very uncertain inference to a universe of blocks based on sampling arguments


Cluster Randomized (Hierarchical) Design

In this case you are adding a blocking factor crossed with treatment (p blocks) but clusters are still nested within treatments [here Cij is the jth cluster in the ith block]

Note that there are m clusters in each treatment per block

                 Block 1                             …               Block p
     C11, …, C1m    C1(m+1), …, C1(2m)     …     Cp1, …, Cpm    Cp(m+1), …, Cp(2m)
T                   ---                    …                    ---
C    ---                                   …     ---


Cluster Randomized (Hierarchical) Design

How does this impact the analysis?

Think about a balanced design with 2mn students per block and p blocks and the ANOVA partitioning of sums of squares and degrees of freedom

Original partitioning

SSTotal = SST + SSC + SSWC:T

dfTotal = dfT + dfC + dfWC:T

2pmn – 1 = 1 + 2(pm – 1) + 2pm(n – 1)

Original test statistic

F = SST/(SSC/dfC)


Cluster Randomized (Hierarchical) Design

New partitioning

SSTotal = SST + SSB + SSBxT + SSC:BxT + SSWC

dfTotal = dfT + dfB + dfBxT + dfC:BxT + dfWC

2mpn – 1 = 1+ (p – 1) +(p – 1) +2p(m – 1) +2pm (n – 1)

New test statistic?

F = SST/(SSWC/dfWC)

or

F = SST/(SSC:BxT/dfC:BxT)

Inference Models and Statistical Analyses: Cluster Randomized Design

Blocks are fixed under the conditional inference model, but clusters are typically random

In this case the correct test statistic is

FC = SST/(SSC:BxT/dfC:BxT)

and the F-distribution has 1 & 2p(m – 1) df

Blocks are random under the unconditional inference model, and clusters are typically random as well

In this case there is no exact ANOVA test if there are block-by-treatment interactions, but a conservative test uses the test statistic

FU = SST/(SSB/dfB)

and the F-distribution has 1 and (p – 1) df (large sample tests, e.g., based on HLM, are available)

Inference Models and Statistical Analyses: Cluster Randomized Design

You can see that the error term has more df under the fixed effects model

If there is a treatment effect the average value of the F-statistic is also larger under the fixed effects model

It is bigger by a factor proportional to [an expression in m, n, ωB, ρB, and ρC that is not recoverable from this transcript]

where ωB = σBxT²/σB² is a treatment heterogeneity parameter and ρB and ρC are the block and cluster level intraclass correlations, respectively

Possible Statistical Analyses: Cluster Randomized Design

Possible statistical analyses

1. Ignore the blocking

2. Include blocks as fixed effects

3. Include blocks as random effects

Consequences depend on whether you want to make a conditional or unconditional inference

Making Unconditional Inferences: Cluster Randomized Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: Will inflate significance levels of tests for treatment effects substantially

2. Include blocks as fixed effects

Bad idea: Will inflate significance levels of tests for treatment effects substantially

3. Include blocks as random effects

Correct significance levels but less power than conditional analysis

Making Conditional Inferences: Cluster Randomized Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: May deflate actual significance levels of tests for treatment effects substantially

2. Include blocks as fixed effects

Correct significance levels and more powerful test than for unconditional analysis

3. Include blocks as random effects

Not such a bad idea: significance levels unaffected


Multi-center (Randomized Blocks) Design

The issues about blocking in the multicenter (randomized blocks) design are the same as in the cluster randomized design

The inference model will determine the most appropriate statistical analysis

Examining the properties of the statistical analysis may also reveal the weakness of the design for a given inference purpose

For example, a small number of blocks may provide only very uncertain inference to a universe of blocks based on sampling arguments


Multi-center (Randomized Blocks) Design

In this case you are adding a blocking factor crossed with treatment (p blocks) and clusters, but clusters are still nested within blocks [here Cij is the jth cluster in the ith block]

Note that there are m clusters in each treatment per block and n individuals in each treatment in each cluster

          Block 1          …          Block p
     C11    …    C1m       …     Cp1    …    Cpm
T
C


Multi-center (Randomized Blocks) Design

How does this impact the analysis?

Think about a balanced design with p blocks, 2mn students per block, and n individuals per cell, and the ANOVA partitioning of sums of squares and degrees of freedom

Original partitioning

SSTotal = SST + SSC + SSTxC + SSWC

dfTotal = dfT + dfC + dfTxC + dfWC

2pmn – 1 = 1 + (pm – 1) + (pm – 1) + 2pm(n – 1)

Original test statistic

F = SST/(SSTxC/dfTxC)


Multi-center (Randomized Blocks) Design

New partitioning

SSTotal = SST + SSB + SSC:B + SSBxT + SSC:BxT + SSWC

dfTotal = dfT + dfB + dfC:B + dfBxT + dfC:BxT + dfWC

2mpn – 1 = 1 + (p – 1) + p(m – 1) + (p – 1) + p(m – 1) + 2pm(n – 1)

New test statistic?

F = SST/(SSWC/dfWC)

or

F = SST/(SSC:BxT/dfC:BxT)

or

F = SST/(SSBxT/dfBxT)
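The degrees-of-freedom bookkeeping for this design can be checked the same way. This is a minimal sketch with a hypothetical function name and hypothetical sizes. It assumes the treatment-by-cluster-within-block term has p(m – 1) df, which is the value that makes the terms sum to 2mpn – 1 and that matches the denominator df quoted for the conditional test in this design.

```python
def df_partition_randomized_blocks(p: int, m: int, n: int) -> dict:
    """Degrees of freedom for a balanced multicenter design with p blocks,
    m clusters per block, and n individuals per treatment arm per cluster
    (2pmn individuals in total)."""
    total = 2 * p * m * n - 1
    parts = {
        "T": 1,
        "B": p - 1,
        "C:B": p * (m - 1),       # clusters within blocks
        "BxT": p - 1,
        "C:BxT": p * (m - 1),     # assumed: treatment x cluster within block
        "WC": 2 * p * m * (n - 1),
    }
    # Every degree of freedom must be accounted for exactly once.
    assert sum(parts.values()) == total
    return {"total": total, "parts": parts}


if __name__ == "__main__":
    # Hypothetical sizes: 4 blocks, 3 clusters per block, 5 per arm per cluster.
    print(df_partition_randomized_blocks(p=4, m=3, n=5))
```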

Inference Models and Statistical Analyses: Randomized Blocks Design

Blocks are fixed under the conditional inference model, but clusters are typically random

In this case the correct test statistic is

FC = SST/(SSC:BxT/dfC:BxT)

and the F-distribution has 1 & p(m – 1) df

Blocks are random under the unconditional inference model, and clusters are typically random as well

In this case the correct test statistic is

FU = SST/(SSBxT/dfBxT)

and the F-distribution has 1 & (p – 1) df

Inference Models and Statistical Analyses: Randomized Blocks Design

You can see that the error term has more df under the fixed effects model

If there is a treatment effect the average value of the F-statistic is also larger under the fixed effects model

It is bigger by a factor proportional to [an expression in m, n, ωB, ωC, ρB, and ρC that is not recoverable from this transcript]

where ωB = σBxT²/σB² and ωC = σCxT²/σC² are treatment heterogeneity parameters and ρB and ρC are the block and cluster level intraclass correlations, respectively

Possible Statistical Analyses: Randomized Blocks Design

Possible statistical analyses

1. Ignore the blocking

2. Include blocks as fixed effects

3. Include blocks as random effects

Consequences depend on whether you want to make a conditional or unconditional inference

Making Unconditional Inferences: Randomized Blocks Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: Will inflate significance levels of tests for treatment effects substantially

2. Include blocks as fixed effects

Bad idea: Will inflate significance levels of tests for treatment effects substantially

3. Include blocks as random effects

Correct significance levels but less power than conditional analysis

Making Conditional Inferences: Randomized Blocks Design

Possible statistical analyses

1. Ignore the blocking

Bad idea: May deflate actual significance levels of tests for treatment effects substantially

2. Include blocks as fixed effects

Correct significance levels and more powerful test than for unconditional analysis

3. Include blocks as random effects

Bad idea: May deflate significance levels and reduce power


Another Easy Question

There was some attrition from my study after assignment. Does that cause a serious problem?

This is another simple question, but the answer is far from simple. One answer can be framed using concepts of experimental design


Post Assignment Attrition

A different question has a simple answer:

Does that (attrition) cause a problem in principle?

The simple answer to that question is YES!

Randomized experiments with attrition no longer give model-free, unbiased estimates of the causal effect of treatment

Whether the bias is serious depends on the model that generates the missing data


Post Assignment Attrition

The design is changed by adding a crossed factor corresponding to missingness like this

     Observed    Missing
T
C

Now we can see a problem with estimating the treatment effect from only the observed part of the design: the observed treatment effect is only part of the total treatment effect


Post Assignment Attrition

Suppose that the means are given by the μ’s and the proportions are given by the π’s, with πT and πC the proportions missing

          Observed                Missing
     Proportion    Mean      Proportion    Mean
T    1 – πT        μTO       πT            μTM
C    1 – πC        μCO       πC            μCM


Post Assignment Attrition

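With πT and πC denoting the proportions missing in each group, the treatment effect on all individuals randomized is

$\delta = (1 - \pi_T)\mu_{TO} + \pi_T\mu_{TM} - (1 - \pi_C)\mu_{CO} - \pi_C\mu_{CM}$

When the proportion of dropouts is equal in T and C, so that

πT = πC = π

the mean of the treatment effect on all individuals randomized is

$\delta = (1 - \pi)(\mu_{TO} - \mu_{CO}) + \pi(\mu_{TM} - \mu_{CM})$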


Post Assignment Attrition

Rewriting this we see that the average treatment effect for individuals assigned to treatment is

$\delta = (1 - \pi)\delta_O + \pi\delta_M$

where δO is the treatment effect among the individuals that are observed, δM is the treatment effect among the individuals that are not observed, δ is the treatment effect among all individuals assigned, and π is the proportion missing

Thus bounds on δM imply bounds on δ


Post Assignment Attrition

No estimate of the treatment effect is possible without an estimate of the treatment effect among the missing individuals

One possibility is to model (assume) that we know something about the treatment effect in the missing individuals

We can assume a range of values to get bounds on the possible treatment effect
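The idea of assuming a range for the effect among the missing individuals can be sketched in a few lines. This is an illustration only, for the equal-attrition case: it takes π to be the common dropout proportion, so that δ = (1 – π)δO + πδM, and the function names and example numbers are hypothetical.

```python
def overall_effect(delta_obs: float, delta_miss: float, pi_missing: float) -> float:
    """Effect among all individuals randomized, as a weighted average of
    the effect among observed and among missing individuals."""
    return (1 - pi_missing) * delta_obs + pi_missing * delta_miss


def effect_bounds(delta_obs, pi_missing, delta_miss_low, delta_miss_high):
    """Bounds on the overall effect implied by an assumed range
    [delta_miss_low, delta_miss_high] for the effect among the missing."""
    low = overall_effect(delta_obs, delta_miss_low, pi_missing)
    high = overall_effect(delta_obs, delta_miss_high, pi_missing)
    return low, high


if __name__ == "__main__":
    # Hypothetical: observed effect 23, 20% attrition in both groups,
    # and an assumed effect among dropouts between -10 and 23.
    print(effect_bounds(delta_obs=23, pi_missing=0.2,
                        delta_miss_low=-10, delta_miss_high=23))
```

The width of the resulting interval is π times the width of the assumed range for δM, so the bounds tighten as attrition shrinks.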


Post Assignment Attrition

When the attrition rate is not the same in the treatment groups (πT ≠ πC) the analysis is trickier

One idea is to convince ourselves that the treatment effect for those who drop out is the same as for those who do not

         Observed    Missing
         Mean        Mean
T        90          33
C        67          10
T – C    23          23



Post Assignment Attrition

This does not assure that attrition has not altered the treatment effect

We have to know both μTM and μCM to identify the treatment effect; knowing δM = (μTM – μCM) is not enough

         Observed        Missing         Total
         n     Mean      n     Mean      n      Mean
T        10    90        90    33        100    39
C        90    67        10    10        100    61
T – C          23              23               –23
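The totals in this table follow from weighting each group's observed and missing means by their counts. A minimal sketch of that arithmetic, using the numbers from the table and a hypothetical helper name:

```python
def total_mean(n_obs: int, mean_obs: float, n_miss: int, mean_miss: float) -> float:
    """Mean over everyone randomized to a group: observed and missing
    means weighted by the number of individuals in each."""
    return (n_obs * mean_obs + n_miss * mean_miss) / (n_obs + n_miss)


# Numbers from the table: 10 observed / 90 missing in T, 90 / 10 in C.
mean_t = total_mean(10, 90, 90, 33)   # about 38.7, shown as 39
mean_c = total_mean(90, 67, 10, 10)   # about 61.3, shown as 61
effect_all = mean_t - mean_c          # about -22.6, shown as -23

if __name__ == "__main__":
    print(round(mean_t, 1), round(mean_c, 1), round(effect_all, 1))
```

The effect is 23 among the observed and 23 among the missing, yet about –23 among all individuals randomized, because the two groups weight the observed and missing cells very differently.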


Post Assignment Attrition

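Suppose that $B^L_{TM}$ and $B^L_{CM}$ are lower bounds on the means for missing individuals in the treatment and control groups and $B^U_{TM}$ and $B^U_{CM}$ are the upper bounds

Then, with πT and πC the proportions missing, the lower and upper bounds on the treatment effect are

Lower: $(1 - \pi_T)\mu_{TO} + \pi_T B^L_{TM} - (1 - \pi_C)\mu_{CO} - \pi_C B^U_{CM}$

Upper: $(1 - \pi_T)\mu_{TO} + \pi_T B^U_{TM} - (1 - \pi_C)\mu_{CO} - \pi_C B^L_{CM}$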
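These bounds are straightforward to compute once bounds on the missing means are chosen. The sketch below is illustrative: the function name is hypothetical, πT and πC are taken to be the proportions missing, and the 0 to 100 score range used in the example is an assumption, not something stated in the presentation.

```python
def treatment_effect_bounds(pi_t, mu_to, pi_c, mu_co, b_tm, b_cm):
    """Bounds on the treatment effect among all individuals randomized.

    pi_t, pi_c : proportions missing in treatment and control
    mu_to, mu_co : observed means in treatment and control
    b_tm, b_cm : (lower, upper) bounds on the means of the missing
                 individuals in treatment and control
    """
    observed_part = (1 - pi_t) * mu_to - (1 - pi_c) * mu_co
    # Lower bound: worst case for treatment, best case for control.
    lower = observed_part + pi_t * b_tm[0] - pi_c * b_cm[1]
    # Upper bound: best case for treatment, worst case for control.
    upper = observed_part + pi_t * b_tm[1] - pi_c * b_cm[0]
    return lower, upper


if __name__ == "__main__":
    # Observed means and attrition rates from the earlier example table;
    # missing means assumed only to lie on a hypothetical 0-100 scale.
    print(treatment_effect_bounds(0.9, 90, 0.1, 67, (0, 100), (0, 100)))
```

With 90 percent attrition in the treatment group the bounds are very wide, which illustrates how little the observed data alone say about the overall effect.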


Post Assignment Attrition

Note that none of the results on attrition involve sampling or estimation error

Results get more complex if we take this into account, but the basic ideas are those presented here


Conclusions

Many simple questions arise in connection with field experiments

The answers to these questions often require thinking through complex aspects of

• the design

• the inference model

• assumptions about missing data

No correct answers are possible without recognizing these complexities