Causal Inference and Ambiguous Manipulations

Post on 25-Feb-2016


description

Causal Inference and Ambiguous Manipulations. Richard Scheines, Grant Reaber, Peter Spirtes, Carnegie Mellon University. 1. Motivation. Wanted: Answers to Causal Questions: Does attending Day Care cause Aggression? Does watching TV cause obesity?

Transcript of Causal Inference and Ambiguous Manipulations

1

Causal Inference and Ambiguous Manipulations

Richard Scheines, Grant Reaber, Peter Spirtes

Carnegie Mellon University

2

1. Motivation

Wanted: Answers to Causal Questions:

• Does attending Day Care cause Aggression?

• Does watching TV cause obesity?

• How can we answer these questions empirically?

• When and how can we estimate the size of the effect?

• Can we know our estimates are reliable?

3

Causation & Intervention

P(Lung Cancer | Tar-stained teeth = no)

P(Lung Cancer | Tar-stained teeth set= no)

Conditioning is not the same as intervening

Show Teeth Slides

4

[Figure: two causal graphs over Gender and CEO Earnings; the second adds an intervention variable I.]

5

Causal Inference: Experiments

Gold Standard: Randomized Clinical Trials

- Intervene: Randomly assign treatment

- Observe Response

Estimate P(Response | Treatment assigned)

6

Causal Inference: Observational Studies

Collect a sample on

- Potential Causes (X)

- Response (Y)

- Covariates (potential confounders Z)

Estimate P(Y | X, Z)

• Highly unreliable

• We can estimate sampling variability, but we don’t know how to estimate specification uncertainty from data

Individual   Day Care   Aggressiveness
John         A lot      A lot
Mary         None       A little

7

2. Progress 1985 – Present

1. Representing causal structure, and connecting it to probability

2. Modeling Interventions

3. Indistinguishability and Discovery Algorithms

8

Representing Causal Structures

Causal Graph G = {V, E}. Each edge X → Y represents a direct causal claim:

X is a direct cause of Y relative to V

Exposure → Infection → Symptoms

9

Direct Causation

X is a direct cause of Y relative to S iff ∃ z, x1 ≠ x2 such that

P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z)

where Z = S - {X, Y}

X → Y
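The definition of direct causation can be checked mechanically when the manipulated distributions are available. The following is a minimal sketch, not the authors' procedure: the helper `is_direct_cause`, the function `p_symptoms_set`, and all probabilities are hypothetical, chosen to mimic an Exposure, Infection, Symptoms chain in which Symptoms respond only to Infection.

```python
from itertools import product

def is_direct_cause(p_y_set, x_values, z_values, tol=1e-12):
    """Return True iff some z and some x1 != x2 give different
    distributions P(Y | X set= x1, Z set= z) and P(Y | X set= x2, Z set= z).
    p_y_set(x, z) must return a dict mapping each value of Y to its probability."""
    for z in z_values:
        for x1, x2 in product(x_values, repeat=2):
            if x1 == x2:
                continue
            d1, d2 = p_y_set(x1, z), p_y_set(x2, z)
            if any(abs(d1[y] - d2[y]) > tol for y in d1):
                return True
    return False

def p_symptoms_set(infection, exposure):
    # Hypothetical manipulated distribution: Symptoms depend on Infection only,
    # so the exposure argument is deliberately ignored.
    p1 = 0.6 if infection == 1 else 0.05
    return {1: p1, 0: 1 - p1}

# X = Infection, Z = {Exposure}
infection_is_direct = is_direct_cause(
    lambda x, z: p_symptoms_set(infection=x, exposure=z), (0, 1), (0, 1))

# X = Exposure, Z = {Infection}
exposure_is_direct = is_direct_cause(
    lambda x, z: p_symptoms_set(infection=z, exposure=x), (0, 1), (0, 1))
```

Under these assumed numbers, Infection comes out as a direct cause of Symptoms while Exposure does not, because holding Infection fixed screens off any change in Exposure.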

10

Causal Bayes Networks

Smoking [0,1] → Yellow Fingers [0,1]

Smoking [0,1] → Lung Cancer [0,1]

P(S = 0) = .7
P(S = 1) = .3

P(YF = 0 | S = 0) = .99    P(LC = 0 | S = 0) = .95
P(YF = 1 | S = 0) = .01    P(LC = 1 | S = 0) = .05
P(YF = 0 | S = 1) = .20    P(LC = 0 | S = 1) = .80
P(YF = 1 | S = 1) = .80    P(LC = 1 | S = 1) = .20

P(S, YF, LC) = P(S) P(YF | S) P(LC | S)

The joint distribution factors according to the causal graph, i.e.,

P(V) = product over all X in V of P(X | Immediate Causes of(X))
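As a rough illustration of the factorization, the sketch below builds the joint distribution from the conditional probabilities given on this slide and then contrasts conditioning on Yellow Fingers with an ideal intervention on Yellow Fingers. The variable names are illustrative; the treatment of the intervention (cutting the Smoking to Yellow Fingers edge, so Lung Cancer keeps its marginal distribution) follows the graph-surgery idea described later in the deck.

```python
from itertools import product

# Probabilities copied from the slide: S = Smoking, YF = Yellow Fingers, LC = Lung Cancer.
p_s = {0: 0.7, 1: 0.3}
p_yf_given_s = {(0, 0): 0.99, (1, 0): 0.01, (0, 1): 0.20, (1, 1): 0.80}  # key: (yf, s)
p_lc_given_s = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.80, (1, 1): 0.20}  # key: (lc, s)

def joint(s, yf, lc):
    """P(S, YF, LC) = P(S) P(YF | S) P(LC | S)."""
    return p_s[s] * p_yf_given_s[(yf, s)] * p_lc_given_s[(lc, s)]

# The factored joint must sum to one over all eight states.
total = sum(joint(s, yf, lc) for s, yf, lc in product((0, 1), repeat=3))

# Conditioning: P(LC = 1 | YF = 1).
p_yf1 = sum(joint(s, 1, lc) for s, lc in product((0, 1), repeat=2))
p_lc1_given_yf1 = sum(joint(s, 1, 1) for s in (0, 1)) / p_yf1

# Intervening: setting YF removes the S -> YF edge, so LC still depends
# only on S and P(LC = 1 | YF set= 1) equals the marginal P(LC = 1).
p_lc1_set_yf1 = sum(p_s[s] * p_lc_given_s[(1, s)] for s in (0, 1))

print(round(p_lc1_given_yf1, 4), round(p_lc1_set_yf1, 4))
```

With the slide's numbers, seeing yellow fingers roughly doubles the probability of lung cancer, while setting yellow fingers leaves it at its marginal value.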

11

Modeling Ideal Interventions

Interventions on the Effect

[Figure: pre-experimental and post-intervention systems over Room Temperature and Wearing Sweater.]

12

Modeling Ideal Interventions

Interventions on the Cause

[Figure: pre-experimental and post-intervention systems over Room Temperature and Wearing Sweater.]

13

Interventions & Causal Graphs

• Model an ideal intervention by adding an “intervention” variable outside the original system

• Erase all arrows pointing into the variable intervened upon

Pre-intervention graph: Exp → Inf → Rash

Intervene to change Inf

Post-intervention graph? [Figure: Exp, Inf, Rash with an added intervention variable I]

14

Calculating the Effect of Interventions

Pre-manipulation Joint Distribution

P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)

Intervention on Inf

[Graph: Exp → Inf → Rash]

Post-manipulation Joint Distribution

P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)

[Graph: Exp, and I → Inf → Rash]
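The two factorizations on this slide can be compared numerically. This is a small sketch under assumed numbers: every probability below is hypothetical, and `force_inf_1` stands in for P(Inf | I) when the intervention sets Inf to 1 with certainty.

```python
from itertools import product

# Hypothetical CPTs for binary Exp -> Inf -> Rash (illustrative values only).
p_exp = {0: 0.8, 1: 0.2}
p_inf_given_exp = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.3, (1, 1): 0.7}     # key: (inf, exp)
p_rash_given_inf = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.4, (1, 1): 0.6}  # key: (rash, inf)

def pre(exp, inf, rash):
    """Pre-manipulation joint: P(Exp) P(Inf | Exp) P(Rash | Inf)."""
    return p_exp[exp] * p_inf_given_exp[(inf, exp)] * p_rash_given_inf[(rash, inf)]

def post(exp, inf, rash, p_inf_given_i):
    """Post-manipulation joint: P(Exp) P(Inf | I) P(Rash | Inf)."""
    return p_exp[exp] * p_inf_given_i[inf] * p_rash_given_inf[(rash, inf)]

force_inf_1 = {0: 0.0, 1: 1.0}  # assumed intervention: Inf set= 1

# Effect of the intervention on Rash.
p_rash1_post = sum(post(e, i, 1, force_inf_1) for e, i in product((0, 1), repeat=2))

# Exp keeps its marginal after the intervention ...
p_exp1_post = sum(post(1, i, r, force_inf_1) for i, r in product((0, 1), repeat=2))

# ... whereas merely observing Inf = 1 is evidence about Exp.
p_inf1_pre = sum(pre(e, 1, r) for e, r in product((0, 1), repeat=2))
p_exp1_given_inf1_pre = sum(pre(1, 1, r) for r in (0, 1)) / p_inf1_pre
```

In this sketch the intervention changes Rash exactly as P(Rash | Inf) dictates, but it carries no information about Exp, since the arrow from Exp into Inf has been replaced by the arrow from I.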

15

Causal Discovery from Observational Studies

X3 | X2 X1

X2 X3 X1

Causal Markov Axiom (D-separation)

Independence Relations

Equivalence Class of Causal Graphs

X2 X3 X1

X2 X3 X1

Discovery Algorithm

16

Equivalence Class with Latents: PAGs (Partial Ancestral Graphs)

[Figure: a PAG over X1, X2, X3 and some of the causal graphs it represents, including graphs with latent variables T1 and T2, etc.]

Assumptions:

• Acyclic graphs

• Latent variables

• Sample Selection Bias

Equivalence:

• Independence over measured variables

17

Knowing when we know enough to calculate the effect of Interventions

The Prediction Algorithm (SGS, 2000)

Causal Inference from Observational Studies

18

Causal Discovery from Observational Studies

Observed Independence:

X1 _||_ X4

X1 _||_ X3 | X2

X4 _||_ X3 | X2

Observed Independence → Discovery Algorithm → Equivalence Class (PAG) over X1, X2, X3, X4 → Prediction Algorithm

Predictions?

P(X3 | X2 set): yes

P(X2 | X1 set): don’t know

P(X1 | X2 set): yes

….

19

3. The Ambiguity of Manipulation

Assumptions

• Causal graph known (Cholesterol is a cause of Heart Condition)

• No Unmeasured Common Causes

[Graph: Total Blood Cholesterol → Heart Disease]

Therefore, the manipulated and unmanipulated distributions are the same:

P(H | TC = x) = P(H | TC set= x)

20

The Problem with Predicting the Effects of Acting

Problem: the cause is a composite of causes that don’t act uniformly,

e.g., Total Blood Cholesterol (TC) = HDL + LDL

[Figure: Total Blood Cholesterol = HDL + LDL as a cause of Heart Disease, with the component edges marked + and -]

• The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL

• An ideal intervention on TC does not determine a joint distribution for HDL and LDL

21

The Problem with Predicting the Effects of Setting TC

[Figure: Total Blood Cholesterol = HDL + LDL as a cause of Heart Disease, with the component edges marked + and -]

• P(H | TC set1= x) puts NO constraints on P(H | TC set2= x)

• P(H | TC = x) puts NO constraints on P(H | TC set= x)

• Nothing in the data tips us off about our ignorance, i.e., we don’t know that we don’t know.
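A toy calculation may make the ambiguity concrete. Everything below is hypothetical: the HDL and LDL levels, the risk table `p_hd`, and the population mix `p_parts_given_tc` are invented for illustration, with every (HDL, LDL) pair summing to TC = 200.

```python
# Hypothetical P(HD = 1 | HDL, LDL) for three ways of having TC = 200.
p_hd = {(80, 120): 0.05, (60, 140): 0.10, (40, 160): 0.25}  # key: (HDL, LDL)

# Hypothetical observed mix of (HDL, LDL) among people with TC = 200.
p_parts_given_tc = {(80, 120): 0.2, (60, 140): 0.5, (40, 160): 0.3}

def p_hd_under(dist):
    """P(HD = 1) when (HDL, LDL) is distributed according to dist."""
    return sum(weight * p_hd[parts] for parts, weight in dist.items())

# Observational quantity: P(H | TC = 200) averages over the population mix.
observed = p_hd_under(p_parts_given_tc)

# Two different manipulations that both achieve TC = 200.
set1 = p_hd_under({(80, 120): 1.0})  # e.g. a regime that leaves HDL high
set2 = p_hd_under({(40, 160): 1.0})  # e.g. a regime that leaves LDL high

print(observed, set1, set2)
```

All three numbers describe "TC at 200", yet they differ, and data on TC and HD alone would only ever reveal the first.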

22

Examples Abound

Total TV = Violent Junk + PBS, Discovery Channel, as a cause of Social Adjustment (component effects marked + and -)

Total Day Care = Overcrowded, Poor Quality + High Quality, as a cause of Aggressiveness (component effects marked + and -)

23

Possible Ways Out

• Causal Graph is Not Known:

Cholesterol does not really cause Heart Condition

• Confounders (unmeasured common causes) are present:

LDL and HDL are confounders

24

Cholesterol is not really a cause of Heart Condition

Relative to a set of variables S (and a background),

X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2)

• Total Cholesterol is a cause of Heart Disease

25

Cholesterol is not really a cause of Heart Condition

Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?

• TC is logically related to LDL, HDL, so manipulating it once LDL and HDL are set is impossible.

26

LDL, HDL are confounders

[Figure: graph over TC, Heart Disease, HDL, and LDL, with one edge marked ?]

• No way to manipulate TC without affecting HDL, LDL

• HDL, LDL are logically related to TC

27

Logico-Causal Systems

S: Atomic Variables

• independently manipulable

• effects of all manipulations are unambiguous

S’: Defined Variables

• defined logically from variables in S

For example:

S: LDL, HDL, HD, Disease1, Disease2

S’: TC

28

Logico-Causal Systems: Adding Edges

S: LDL, HDL, HD, D1, D2    S’: TC

[Figure: the system over S (D1, D2, LDL, HDL, HD) beside the system over S ∪ S’, which adds TC and an edge from TC to HD marked ?]

TC → HD iff manipulations of TC are unambiguous wrt HD

29

Logico-Causal Systems: Unambiguous Manipulations

TC → HD iff all manipulations of TC are unambiguous wrt HD

For each variable X’ in S’, let Parents(X’) be the set of variables in S that logically determine X’, i.e.,

X’ = f(Parents(X’)), e.g., TC = LDL + HDL

Inv(x’) = the set of all values p of Parents(X’) s.t. f(p) = x’

A manipulation of a variable X’ in S’ to a value x’ wrt another variable Y is unambiguous iff

∀ p1 ≠ p2 [P(Y | p1 ∈ Inv(x’)) = P(Y | p2 ∈ Inv(x’))]
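The definition can be turned into a direct check over a finite domain. This is a sketch under stated assumptions rather than the authors' implementation: the helpers `inv` and `is_unambiguous`, the LDL and HDL grids, and both risk functions are hypothetical.

```python
from itertools import product

def inv(f, parent_domains, x_prime):
    """Inv(x'): all parent value tuples p with f(*p) == x_prime."""
    return [p for p in product(*parent_domains) if f(*p) == x_prime]

def is_unambiguous(f, parent_domains, x_prime, p_y_given_parents, tol=1e-12):
    """True iff P(Y | p) is the same distribution for every p in Inv(x')."""
    dists = [p_y_given_parents(p) for p in inv(f, parent_domains, x_prime)]
    return all(abs(d[y] - dists[0][y]) <= tol for d in dists for y in dists[0])

def tc(ldl, hdl):
    return ldl + hdl  # TC = LDL + HDL, as on the slide

domains = [(100, 120, 140), (40, 60, 80)]  # hypothetical LDL and HDL levels

def risk_by_components(p):
    # Hypothetical: LDL raises and HDL lowers the probability of HD.
    ldl, hdl = p
    r = (ldl - hdl) / 1000
    return {1: r, 0: 1 - r}

def risk_by_total(p):
    # Hypothetical: only the sum LDL + HDL matters for HD.
    ldl, hdl = p
    r = (ldl + hdl) / 1000
    return {1: r, 0: 1 - r}

ways_to_reach_180 = inv(tc, domains, 180)
ambiguous_case = is_unambiguous(tc, domains, 180, risk_by_components)
unambiguous_case = is_unambiguous(tc, domains, 180, risk_by_total)
```

When the components act with opposite signs, the three ways of reaching TC = 180 give different distributions for HD and the check fails; when only the total matters, it passes.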

30

Logico-Causal Systems: Removing Edges

S: LDL, HDL, HD, D1, D2    S’: TC

[Figure: the system over S (D1, D2, LDL, HDL, HD) beside the system over S ∪ S’, which adds TC; two edges are marked ?]

Remove LDL → HD iff LDL _||_ HD | TC
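The edge-removal condition is a conditional independence statement, so it can be tested on a discrete joint distribution. The sketch below is illustrative only: `cond_indep` and `build_joint` are hypothetical helpers, LDL and HDL are assumed uniform and independent over a small grid, and the two risk functions are invented.

```python
from collections import defaultdict
from itertools import product

def cond_indep(joint, tol=1e-12):
    """joint maps (a, b, c) -> probability. Returns True iff A _||_ B | C,
    i.e. P(a, b, c) P(c) == P(a, c) P(b, c) for every combination of values."""
    p_c, p_ac, p_bc = defaultdict(float), defaultdict(float), defaultdict(float)
    for (a, b, c), p in joint.items():
        p_c[c] += p
        p_ac[(a, c)] += p
        p_bc[(b, c)] += p
    a_vals = {a for a, _, _ in joint}
    b_vals = {b for _, b, _ in joint}
    return all(
        abs(joint.get((a, b, c), 0.0) * p_c[c] - p_ac[(a, c)] * p_bc[(b, c)]) <= tol
        for a, b, c in product(a_vals, b_vals, list(p_c))
    )

def build_joint(risk):
    """Joint over (LDL, HD, TC), with LDL and HDL uniform on a hypothetical grid."""
    joint = defaultdict(float)
    for ldl, hdl in product((100, 120, 140), (40, 60, 80)):
        r = risk(ldl, hdl)
        total_chol = ldl + hdl
        joint[(ldl, 1, total_chol)] += r / 9
        joint[(ldl, 0, total_chol)] += (1 - r) / 9
    return dict(joint)

# Hypothetical risk models for HD.
only_total_matters = build_joint(lambda ldl, hdl: (ldl + hdl) / 1000)
components_matter = build_joint(lambda ldl, hdl: (ldl - hdl) / 1000)

remove_edge_when_total = cond_indep(only_total_matters)
remove_edge_when_components = cond_indep(components_matter)
```

In the first model LDL carries no information about HD once TC is known, so the LDL → HD edge would be removed; in the second it would stay.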

31

Logico-Causal Systems: Faithfulness

[Figure: graph over D1, D2, LDL, HDL, HD, and TC]

Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.

Effect of TC on HD unambiguous

Unfaithfulness: LDL _||_ HDL | TC

Because LDL and TC determine HDL, and similarly, HDL and TC determine LDL

32

Effect on Prediction Algorithm

Manipulate   Effect on    Assume manipulation unambiguous   Manipulation maybe ambiguous
Disease 1    Disease 2    None                              None
Disease 1    HD           Can’t tell                        Can’t tell
Disease 1    TC           Can’t tell                        Can’t tell
Disease 2    Disease 1    None                              None
Disease 2    HD           Can’t tell                        Can’t tell
Disease 2    TC           Can’t tell                        Can’t tell
TC           Disease 1    None                              Can’t tell
TC           Disease 2    None                              Can’t tell
TC           HD           Can’t tell                        Can’t tell
HD           Disease 1    None                              Can’t tell
HD           Disease 2    None                              Can’t tell
HD           TC           Can’t tell                        Can’t tell

Observed System: TC, HD, D1, D2

[Figure: graph over D1, D2, LDL, HDL, HD, and TC, with three edges marked ?]

Still sound – but less informative

33

Effect on Prediction Algorithm

Observed System: TC, HD, D1, D2, X

[Figure: graph over D1, D2, LDL, HDL, HD, TC, and X, with one edge marked ?]

Not completely sound

No general characterization of when the Prediction algorithm, suitably modified, is still informative and sound. Conjectures, but no proof yet.

Example:

• If observed system has no deterministic relations

• All orientations due to marginal independence relations are still valid

34

Effect on Causal Inference of Ambiguous Manipulations

Experiments, e.g., RCTs:

Manipulating treatment is

• unambiguous → sound

• ambiguous → unsound

Observational Studies, e.g., Prediction Algorithm:

Manipulation is

• unambiguous → potentially sound

• ambiguous → potentially sound

35

References

• Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press

• Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge Univ. Press

• Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). “Causal Inference,” in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan, Sage Publications, 447-478

• Spirtes, P., and Scheines, R. (2004). Causal Inference of Ambiguous Manipulations. In Proceedings of the Philosophy of Science Association Meetings, 2002.

• Reaber, Grant (2005). The Theory of Ambiguous Manipulations. Masters Thesis, Department of Philosophy, Carnegie Mellon University