Sampling and case selection

Course: Honors Thesis Capstone, Government Department
Date: Fall 2010
Professor: Michael Nelson

description

Lecture for Government Students on research methods


Page 1: Sampling and case selection


Page 2: Sampling and case selection
Page 3: Sampling and case selection

This time... empirical methods

• sampling

• small-N causal inference

Page 4: Sampling and case selection

sampling

• probability sampling

• non-probability sampling

• sampling “challenges”

Page 5: Sampling and case selection

Groups in Sampling

The Theoretical Population

The Study Population

The Sampling Frame

The Sample
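The nesting of these four groups can be sketched in a few lines of Python. Everything here is hypothetical: the unit names, the sizes, and the assumption that the sampling frame is an incomplete listing of the study population. It is a minimal illustration, not a procedure taken from the lecture.

```python
import random

rng = random.Random(2010)  # fixed seed so the sketch is repeatable

# Theoretical population: every unit the theory is about (hypothetical IDs).
theoretical_population = [f"unit_{i}" for i in range(1000)]

# Study population: the part that is actually accessible (assumed: the first 600).
study_population = theoretical_population[:600]

# Sampling frame: the list you can draw from; assumed here to miss some units.
sampling_frame = [u for u in study_population if rng.random() < 0.9]

# Sample: a simple random sample (without replacement) drawn from the frame.
sample = rng.sample(sampling_frame, k=50)
```

Each group is a subset of the one before it, so any unit missing from the frame has zero chance of ending up in the sample.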

Page 6: Sampling and case selection

probability sampling from Henry

Page 7: Sampling and case selection

general sampling strategies from Patton

Page 8: Sampling and case selection

sampling & case selection challenges

• Population Size

• Sampling Bias

• probability of selection is correlated with the independent variable (IV); you will get the same relationship, but the sample is systematically non-representative

• Selection Bias

• a subset of sampling bias; probability of selection is correlated with the dependent variable (DV)

• underestimates the relationship (regression line b instead of a)

• Non-response Bias

• possibility that you are unable to collect data; data set is unrepresentative

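[Figure: two scatterplots of y against x. Each marks the population ("pop"), the cases the sample "gets", and the cases it "misses", with regression lines a and b.]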
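The contrast between selecting on the IV and selecting on the DV can be checked with a small simulation. This is a sketch under stated assumptions: the population relationship (y = 2x plus normal noise), the cutoffs at zero, and the helper name `ols_slope` are all invented for illustration.

```python
import random

def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(0)
TRUE_SLOPE = 2.0  # assumed population relationship: y = 2x + noise

population = []
for _ in range(20000):
    x = rng.gauss(0, 1)
    y = TRUE_SLOPE * x + rng.gauss(0, 2)
    population.append((x, y))

# Selection tied to the IV: keep only cases with high x.
selected_on_iv = [(x, y) for x, y in population if x > 0]
# Selection tied to the DV: keep only cases with high y.
selected_on_dv = [(x, y) for x, y in population if y > 0]

slope_all = ols_slope(population)
slope_iv = ols_slope(selected_on_iv)
slope_dv = ols_slope(selected_on_dv)
```

Under these assumptions, `slope_all` and `slope_iv` should both land near the assumed value of 2, while `slope_dv` should come out clearly smaller: the flatter line b rather than line a.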

Page 9: Sampling and case selection

Causal inference for small-N research

• properties of small-N research

• case study purposes & types

• strategies

Page 10: Sampling and case selection

Case selection

• For quantitative research, selection should be random

• For qualitative research, selection often must be done intentionally (King, Keohane and Verba, 1994).

Page 11: Sampling and case selection
Page 12: Sampling and case selection

properties of small-n research

• intensive

• field research in natural settings

• many kinds of data: observation, interview, archives

• typically: case-centered, not variable-centered

Page 13: Sampling and case selection

Case selection strategies

Page 14: Sampling and case selection

Case studies and research design

from Gerring and McDermott (2007)

Page 15: Sampling and case selection

Gerring on case studies

Research Goals             Case Study      Cross-Case Study
1. Hypothesis              Generating      Testing
2. Validity                Internal        External
3. Causal Insight          Mechanisms      Effects
4. Scope of Proposition    Deep            Broad

Empirical Factors          Case Study      Cross-Case Study
5. Populations of Cases    Heterogeneous   Homogenous
6. Causal Strength         Strong          Weak
7. Useful Variation        Rare            Common
8. Data Availability       Concentrated    Dispersed

Additional Factors         Case Study      Cross-Case Study
1. Causal Complexity       ?               ?
2. State of the Field      ?               ?

Page 16: Sampling and case selection
Page 17: Sampling and case selection
Page 18: Sampling and case selection

Case study purposes & types: case selection as sampling

1. Descriptive Case Study: atheoretical; goal is to understand the case itself

2. Plausibility Probe: does the empirical phenomenon exist; focus on availability of data; concern with plausibility of finding relationships between variables of interest

3. Hypothesis-Generating Case Study: seeks to find a generalization about cause and effect

4. Hypothesis-Testing Case Studies

4.1. Critical Case

4.2. Rival Hypotheses

4.3. ....

Page 19: Sampling and case selection

Generating Hypotheses

Page 20: Sampling and case selection

Extreme cases

• Represent unusual values of the dependent or independent variables

• Used for hypothesis generation

• Not intended to be representative

Page 21: Sampling and case selection

Deviant cases

• Cases that deviate from the typical population

• A “high residual” case (outlier)

• Useful for generating hypotheses, especially new explanations for the outcome (dependent variable) of interest

Page 22: Sampling and case selection

Hypothesis-Testing Strategies: case selection

1. goal: establish the relationship between two or more variables

2. selection advice:

2.1. choose cases that minimize variability in the other variables that might impact the relationship you are investigating

2.2. representative sample

Page 23: Sampling and case selection

hypothesis-testing case studies

critical case

rival hypotheses

Page 24: Sampling and case selection

Selecting the typical case

• Look for cases that are “typical” of other cases

• Idea is that these cases are “low residual” cases

• Useful for hypothesis testing.
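The “high residual” (deviant) and “low residual” (typical) ideas can be made concrete by fitting a line across cases and ranking them by distance from it. The six cases and their values below are made up for illustration, and `fit_line` is a hypothetical helper, not something from the lecture.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical cases: name -> (independent variable, outcome)
cases = {
    "A": (1.0, 2.1),
    "B": (2.0, 3.9),
    "C": (3.0, 6.0),
    "D": (4.0, 8.2),
    "E": (5.0, 4.0),   # built to sit far from the overall trend
    "F": (6.0, 12.1),
}

slope, intercept = fit_line(list(cases.values()))

# Residual = observed outcome minus the outcome the line predicts.
residuals = {name: y - (intercept + slope * x) for name, (x, y) in cases.items()}

deviant_case = max(residuals, key=lambda name: abs(residuals[name]))  # largest |residual|
typical_case = min(residuals, key=lambda name: abs(residuals[name]))  # smallest |residual|
```

With this toy data the deviant case is "E" and the typical case is "B". In a real project the fitted model would come from the cross-case analysis, and the choice of which residuals count as "high" remains a judgment call.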

Page 25: Sampling and case selection

Select diverse cases

• Select cases that represent the full range of variation

• Useful for hypothesis generation and hypothesis testing

• Represent variation in the population but not necessarily the distribution of that population

Page 26: Sampling and case selection

Influential case

• Cases with influential configurations of the independent variables are chosen

• Useful for verifying the status of a highly influential case

• Not necessarily representative

Page 27: Sampling and case selection

Crucial case

• Cases that are likely to represent an outcome of interest

• Choice usually requires qualitative assessment of crucialness

• Useful for hypothesis testing

• Should be highly representative

Page 28: Sampling and case selection

Selecting cases on the Independent Variable

• You select cases based on the values of one or more independent variables

• Requires that you know a little bit about all of the potential cases

• Requires you act as if you don’t know the values of the dependent variable

Page 29: Sampling and case selection

Mill’s Methods

agreement

difference

Page 30: Sampling and case selection

Most Similar cases

• Cases are selected based on their similarity on variables other than the independent variable the hypothesis is testing and the outcome of interest

• Useful for hypothesis testing and generation

• Not necessarily representative of the broader population

• Most Similar Systems analysis involves a non-equivalent group design:

N   O   X   O

N   O       O

Page 31: Sampling and case selection

Thad’s example: income inequality and civil war

[Diagram: Income Inequality, Poverty, Colonial Past, and External Threat, each linked to Civil War]

Page 32: Sampling and case selection

Case          Income Inequality   Poverty   Colonial Past   External Threat   Civil War?
Costa Rica    Moderate            Yes       Yup             Nope              No
El Salvador   High                Yes       Yup             Nope              Yes
Cuba          High                Yes       Yup             Nope              Yes

adapted from Thad Kousser, UCSD
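The logic of this most-similar comparison can be sketched as a small check over the table: which variables are held constant across the cases, and which one varies together with the outcome. The rows are copied from the table above; the dictionary keys and the function name `classify_variables` are shorthand invented for this sketch.

```python
cases = {
    "Costa Rica": {"income_inequality": "Moderate", "poverty": "Yes",
                   "colonial_past": "Yup", "external_threat": "Nope", "civil_war": "No"},
    "El Salvador": {"income_inequality": "High", "poverty": "Yes",
                    "colonial_past": "Yup", "external_threat": "Nope", "civil_war": "Yes"},
    "Cuba": {"income_inequality": "High", "poverty": "Yes",
             "colonial_past": "Yup", "external_threat": "Nope", "civil_war": "Yes"},
}

def classify_variables(cases, outcome):
    """Split explanatory variables into those held constant and those that track the outcome."""
    rows = list(cases.values())
    variables = [v for v in rows[0] if v != outcome]
    outcomes = {row[outcome] for row in rows}
    constant, tracks_outcome = [], []
    for var in variables:
        values = {row[var] for row in rows}
        if len(values) == 1:
            constant.append(var)  # same in every case: controlled by the design
            continue
        pairs = {(row[var], row[outcome]) for row in rows}
        # One-to-one pairing: each value of var goes with exactly one outcome, and vice versa.
        if len(pairs) == len(values) == len(outcomes):
            tracks_outcome.append(var)
    return constant, tracks_outcome

constant, tracks_outcome = classify_variables(cases, "civil_war")
```

Here poverty, colonial past, and external threat come back as constant, and income inequality is the only variable that tracks civil war. That pattern is consistent with the hypothesis; it is not proof of it, since three cases cannot rule out variables missing from the table.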

Page 33: Sampling and case selection

Case selection challenges

Page 34: Sampling and case selection

Case study challenges

• Motive behind the selection of case studies is not obvious (Is it convenience? Or is it because they are good stories?). Without understanding this, the project is at best useless and at worst terribly misleading.

• Generalizability – Can the lessons learned from this case be applied to a larger class?

• Falsifiability – Results are presented in such a way that it would be difficult for an impartial researcher to replicate the project and arrive at the same result.

• No or Negative Degrees of Freedom: The researcher has more explanatory variables (moving pieces) than observations.

• Selection on the Dependent Variable: Choosing cases because of their performance on the outcome of interest.
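The degrees-of-freedom problem is simple arithmetic. The sketch below borrows the residual degrees-of-freedom count from a linear model with an intercept (n - k - 1) as an analogy; that formula, the function name, and the example numbers are assumptions added here, not part of the slide.

```python
def degrees_of_freedom(n_observations, n_explanatory, intercept=True):
    """Residual degrees of freedom for a linear model: n - k, minus 1 if an intercept is fit."""
    return n_observations - n_explanatory - (1 if intercept else 0)

# Hypothetical small-N design: 3 cases, 4 explanatory variables.
df_small_n = degrees_of_freedom(3, 4)

# Hypothetical large-N design: 150 cases, the same 4 explanatory variables.
df_large_n = degrees_of_freedom(150, 4)
```

The small-N design gives -2 and the large-N design gives 145. A zero or negative count means the cases cannot distinguish among the candidate explanations without adding observations or dropping variables.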

Page 35: Sampling and case selection

Strategies: remember threats to internal & external validity!

• History, maturation, instrumentation (data limitations)

• Selection bias

• KKV give the example of a business school student who wants a high-paying job and selects for his study sample only those graduates earning high salaries. He then relates salary to number of accounting courses. By excluding graduates with low salaries, he paradoxically underestimates the effect of additional accounting courses on income.

Page 36: Sampling and case selection

Geddes on selection bias

Page 37: Sampling and case selection

Geddes, continued

Page 38: Sampling and case selection

Strategies: combining with large-N

1. Goal: Increase number of observations

1.1. Comparative case with large-N analysis of embedded units

2. Goal: Study causal mechanisms

2.1. Large-N study establishes relationships between variables (causal effect)

2.2. Small-N study establishes causal mechanism, looking at intervening steps (causal mechanism)

2.3. Note: causal explanation requires an understanding of both the causal effect and the causal mechanism

3. Goal: Study of spuriousness

3.1. Large-N study establishes relationships between variables (causal effect)

3.2. Small-N study engages claims of spuriousness

4. Goal: Study of deviant cases

4.1. Large-N study establishes deviant cases

4.2. Small-N study examines deviant cases

5. Goal: Establish generality of findings

5.1. Small-N study suggests X causes Y, but lacks external validity

5.2. Large-N study looks to establish the generality of findings

Page 39: Sampling and case selection

Strategies: Increasing leverage for causal inference in case studies

1. Congruence Method: Test a hypothesis by understanding a case; looks for fit between theory and case; involves multiple independent variables

2. Pattern Matching: Type of congruence testing, usually focused on a single independent variable; compares alternative theories with respect to multiple outcomes

3. Process Tracing: Focus is on establishing the causal mechanism, by examining fit of theory to intervening causal steps; how does “X” produce a series of conditions that come together in some way (or don’t) to produce “Y”?

4. Counterfactual Analysis: Gain leverage through rigorous, disciplined thought experiments

Page 40: Sampling and case selection

Strategies: structured, focused comparison

1. “the comparison is focused because it deals selectively with only certain aspects of a historical case... and structured because it employs general questions to guide the data collection analysis in that historical case” - Alexander George

2. Steps (Kaarbo and Beasley)

2.1. Identify the research question

2.2. Identify variables (usually from existing theory)

2.3. Select cases: comparable cases with variation in the values of the dependent variable, selected from across population subgroups (aids external validity)

2.4. Define and specify your measurement strategy for concepts, including a “codebook” for the questions you employ in data collection

2.5. “Code-write cases”

2.6. Comparison (search for patterns) and implications for theory