Statistical power in experiments in which samples of participants respond to samples of stimuli

Jake Westfall, University of Colorado Boulder
David A. Kenny, University of Connecticut
Charles M. Judd, University of Colorado Boulder

• Studies involving participants responding to stimuli (hypothetical data matrix):

Subject #   Responses to stimuli
1           4 6 7 3 8 8 7 9 5 6
2           4 7 8 4 6 9 6 7 4 5
3           3 6 7 4 5 7 5 8 3 4
...

• Just in domain of implicit prejudice and stereotyping:
– IAT (Greenwald et al.)
– Affective Priming (Fazio et al.)
– Shooter task (Correll et al.)
– Affect Misattribution Procedure (Payne et al.)
– Go/No-Go task (Nosek et al.)
– Primed Lexical Decision task (Wittenbrink et al.)
– Many non-paradigmatic studies

Hard questions

• “How many stimuli should I use?”
• “How similar or variable should the stimuli be?”
• “When should I counterbalance the assignment of stimuli to conditions?”
• “Is it better to have all participants respond to the same set of stimuli, or should each participant receive different stimuli?”
• “Should participants make multiple responses to each stimulus, or should every response by a participant be to a unique stimulus?”

Power analysis in crossed designs

• Power determined by several parameters:
– 1 effect size (Cohen’s d)
– 2 sample sizes
  • p = # of participants
  • q = # of stimuli
– Set of Variance Partitioning Coefficients (VPCs)
  • VPCs describe what proportion of the random variation in the data comes from which sources
  • Different designs depend on different VPCs
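As a rough illustration of what a VPC is, the sketch below converts a set of variance components into proportions of the total random variation. The component names and numeric values are hypothetical, chosen only for illustration; they are not taken from the slides.

```python
def variance_partitioning_coefficients(variance_components):
    """Return each variance component as a proportion of the total random variance."""
    total = sum(variance_components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in variance_components.items()}


# Hypothetical variance components (illustrative values only, not from the slides).
components = {
    "participant": 2.0,
    "stimulus": 1.0,
    "participant_x_stimulus": 1.0,
    "residual": 4.0,
}

vpcs = variance_partitioning_coefficients(components)
for name, vpc in vpcs.items():
    print(f"{name}: {vpc:.3f}")
```

By construction the VPCs sum to 1, so each one can be read directly as a share of the random variation.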

Four common experimental designs

For power = 0.80, need q ≈ 50

For power = 0.80, need p ≈ 20

Maximum attainable power

• In crossed designs, power asymptotes at a maximum theoretically attainable value that depends on:
– Effect size
– Number of stimuli
– Stimulus variability

• Under realistic assumptions, maximum attainable power can be quite low!
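The ceiling can be sketched numerically under some simplifying assumptions that are not taken from the slides. The sketch assumes two conditions, q stimuli in total split evenly between them, and each stimulus appearing in only one condition. It also uses a normal approximation instead of the noncentral t distribution. With unlimited participants, only stimulus variance is left in the standard error of the standardized condition difference, which then equals 2·sqrt(V_S / q), where V_S is the stimulus VPC. The function name and the value V_S = 0.1 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def max_attainable_power(d, q, stimulus_vpc, alpha=0.05):
    """Approximate power in the limit of infinitely many participants.

    Assumptions (not from the slides): two conditions, q stimuli in total
    split evenly between them, each stimulus used in only one condition,
    and a normal approximation in place of the noncentral t distribution.
    """
    if q <= 0 or not 0 < stimulus_vpc <= 1:
        raise ValueError("need q > 0 and 0 < stimulus_vpc <= 1")
    # With unlimited participants only stimulus variance remains, so the
    # standardized difference between condition means has SE 2*sqrt(V_S/q).
    standard_error = 2 * sqrt(stimulus_vpc / q)
    noncentrality = d / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided test; the lower rejection region is ignored in this sketch.
    return NormalDist().cdf(noncentrality - z_crit)


# Hypothetical stimulus VPC of 0.1 and a medium effect size (d = 0.5).
for q in (10, 20, 50, 100):
    print(q, round(max_attainable_power(d=0.5, q=q, stimulus_vpc=0.1), 3))
```

Because the normal approximation ignores the limited degrees of freedom, it tends to overstate power when q is small. An exact calculation would use the noncentral t distribution.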

To obtain max. power = 0.9…

Pessimist: q=86

Realist: q=20 to 50

Optimist: q=11

Implications of maximum attainable power

• Think hard about your experimental stimuli before you begin collecting data!
– Once data collection begins, maximum attainable power is pretty much determined.
• Even the most optimistic assumptions imply that we should use at least 11 stimuli per between-stimulus condition
– Based on achieving max. power = 0.9 to detect a medium effect size (d = 0.5)

What about time-consuming stimulus presentation?

• Assume that responses to each stimulus take about 10 minutes (e.g., film clips).

• Power analysis says we need q=60 to reach power=0.8 (based on having p=60)

• But then it would take about 10 hours (60 stimuli × 10 minutes) for a participant to respond to every stimulus!

• The highest feasible number of responses per participant is, say, 6 (about one hour)

• Are we doomed to have low power? No!
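The time budget in this example is simple arithmetic, spelled out in the short sketch below. The function name is hypothetical; the numbers are the ones given on the slide.

```python
def session_hours(n_responses, minutes_per_response):
    """Total session length in hours for a given number of responses."""
    return n_responses * minutes_per_response / 60


minutes_per_response = 10          # e.g., film clips
stimuli_needed = 60                # q from the power analysis
max_responses_per_participant = 6  # highest feasible number of responses

full_session = session_hours(stimuli_needed, minutes_per_response)
feasible_session = session_hours(max_responses_per_participant, minutes_per_response)

print(f"Every participant sees every stimulus: {full_session:.0f} hours")
print(f"Feasible session: {feasible_session:.0f} hour")
```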

Stimuli-within-Block designs

Standard error reduced by factor of 2.3!
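One way to picture such a design is sketched below, assuming that participants and stimuli are each split into disjoint blocks and that every participant responds only to the stimuli in their own block. The function name is hypothetical. The choice of 10 blocks follows from the example's numbers (p = 60, q = 60, 6 responses per participant), not from the slide itself. In practice, participants and stimuli would be randomly assigned to blocks; the sketch uses consecutive indices for simplicity.

```python
def assign_blocks(n_participants, n_stimuli, n_blocks):
    """Map each participant index to the list of stimulus indices in their block."""
    if n_participants % n_blocks or n_stimuli % n_blocks:
        raise ValueError("block count must divide both sample sizes")
    participants_per_block = n_participants // n_blocks
    stimuli_per_block = n_stimuli // n_blocks
    design = {}
    for block in range(n_blocks):
        # Stimuli belonging to this block only.
        stimuli = list(range(block * stimuli_per_block, (block + 1) * stimuli_per_block))
        first = block * participants_per_block
        for participant in range(first, first + participants_per_block):
            design[participant] = stimuli
    return design


# 60 participants and 60 stimuli in 10 blocks: 6 responses per participant.
design = assign_blocks(n_participants=60, n_stimuli=60, n_blocks=10)
print(len(design[0]), "responses per participant")
```

Each participant's session stays at about one hour, while all 60 stimuli are still used across the experiment.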

The end

URL for power app: JakeWestfall.org/power/

Article reference:

Westfall, J., Kenny, D. A., & Judd, C. M. (in press). Statistical Power and Optimal Design in Experiments in Which Samples of Participants Respond to Samples of Stimuli. Journal of Experimental Psychology: General.
