1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy &...
-
Upload
sophie-barker -
Category
Documents
-
view
216 -
download
0
Transcript of 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy &...
![Page 1: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/1.jpg)
1
Automatic Causal Discovery
Richard Scheines
Peter Spirtes, Clark Glymour
Dept. of Philosophy & CALD
Carnegie Mellon
![Page 2: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/2.jpg)
2
Outline
1. Motivation
2. Representation
3. Discovery
4. Using Regression for Causal
Discovery
![Page 3: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/3.jpg)
3
1. Motivation
Non-experimental Evidence
Typical Predictive Questions
• Can we predict aggressiveness from Day Care
• Can we predict crime rates from abortion rates 20 years ago
Causal Questions:
• Does attending Day Care cause Aggression?
• Does abortion reduce crime?
Day Care Aggressiveness
John
Mary
A lot
None
A lot
A little
![Page 4: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/4.jpg)
4
Causal Estimation
Manipulated Probability P(Y | X set= x, Z=z)
from
Unmanipulated Probability P(Y | X = x, Z=z)
When and how can we use non-experimental data to tell us about the effect of an intervention?
![Page 5: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/5.jpg)
5
Conditioning vs. Intervening
P(Y | X = x1) vs. P(Y | X set= x1)
Stained Teeth Slides
![Page 6: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/6.jpg)
6
2. Representation
1. Representing causal structure, and
connecting it to probability
2. Modeling Interventions
![Page 7: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/7.jpg)
7
Causation & Association
X is a cause of Y iff
x1 x2 P(Y | X set= x1) P(Y | X set= x2)
X and Y are associated iff
x1 x2 P(Y | X = x1) P(Y | X = x2)
![Page 8: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/8.jpg)
8
Direct Causation
X is a direct cause of Y relative to S, iff z,x1 x2 P(Y | X set= x1 , Z set= z)
P(Y | X set= x2 , Z set= z)
where Z = S - {X,Y} X Y
![Page 9: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/9.jpg)
9
Association
X and Y are associated iff
x1 x2 P(Y | X = x1) P(Y | X = x2) X Y
X Y
X and Y are independent iff X and Y are not associated
![Page 10: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/10.jpg)
10
Causal Graphs
Causal Graph G = {V,E} Each edge X Y represents a direct causal claim:
X is a direct cause of Y relative to V
MatchStruck
MatchTip
Temperature
MatchLights
![Page 11: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/11.jpg)
11
Modeling Ideal Interventions
Ideal Interventions (on a variable X):
• Completely determine the value or distribution of a variable X
• Directly Target only X (no “fat hand”)E.g., Variables: Confidence, Athletic PerformanceIntervention 1: hypnosis for confidenceIntervention 2: anti-anxiety drug (also muscle relaxer)
![Page 12: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/12.jpg)
12
Teeth
StainsSmoking
Pre-experimental SystemPost
Modeling Ideal Interventions
Interventions on the Effect
![Page 13: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/13.jpg)
13
Modeling Ideal Interventions
Teeth
StainsSmoking
Pre-experimental SystemPost
Interventions on the Cause
![Page 14: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/14.jpg)
14
Interventions & Causal Graphs
• Model an ideal intervention by adding an “intervention” variable outside the original system
• Erase all arrows pointing into the variable intervened upon
Exp Inf
Rash
Intervene to change Inf
Post-intervention graph?Pre-intervention graph
Exp Inf Rash
I
![Page 15: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/15.jpg)
15
Calculating the Effect of Interventions
Smoking [0,1]
Lung Cancer[0,1]
Yellow Fingers[0,1]
P(YF,S,L) = P(S) P(YF|S) P(L|S)
P(YF,S,L)m = P(S) P(YF|Manip) P(L|S)
Smoking [0,1]
Lung Cancer[0,1]
Yellow Fingers[0,1]
Manipulation
Replace pre-manipulation causes
with manipulation
![Page 16: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/16.jpg)
16
Calculating the Effect of Interventions
Smoking [0,1]
Lung Cancer[0,1]
Yellow Fingers[0,1]
P(YF,S,L) = P(S) P(YF|S) P(L|S)
P(YF,S,L) = P(S) P(YF|Manip) P(L|S)
Smoking [0,1]
Lung Cancer[0,1]
Yellow Fingers[0,1]
Manipulation
P(L|YF)
P(L| YF set by Manip) Probability Calculus
Probability Calculus
![Page 17: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/17.jpg)
17
Causal Structure
Statistical Predictions
The Markov Condition
Causal Graphs
Z Y X
Independence
X _||_ Z | Y
i.e.,
P(X | Y) = P(X | Y, Z)
Markov Condition
![Page 18: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/18.jpg)
18
Causal Markov Axiom
In a Causal Graph G, each variable V is independent of its non-effects, conditional on its direct causes
in every probability distribution that G can parameterize (generate)
![Page 19: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/19.jpg)
19
Causal Graphs Independence
Acyclic causal graphs: d-separation Causal Markov axiom
Cyclic Causal graphs: Linear structural equation models : d-
separation, not Causal Markov For some discrete variable models: d-
separation, not Causal Markov Non-linear cyclic SEMs : neither
![Page 20: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/20.jpg)
20
Causal Structure Statistical Data
X3 | X2 X1
X2 X3 X1
Causal Markov Axiom (D-separation)
Independence Relations
Acyclic Causal Graph
![Page 21: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/21.jpg)
21
Causal Discovery
Statistical Data Causal Structure
X3 | X2 X1
X2 X3 X1
Causal Markov Axiom(D-separation)
IndependenceRelations
Equivalence Class ofCausal Graphs
X2 X3 X1
X2 X3 X1
Discovery Algorithm
Background Knowledge
e.g., X2 before X3
![Page 22: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/22.jpg)
22
Equivalence Classes
D-separation equivalenceD-separation equivalence over a set ODistributional equivalenceDistributional equivalence over a set O
Two causal models M1 and M2 are distributionally equivalent iff for any parameterization 1 of M1, there is a parameterization 2 of M2 such that M1(1) = M2(2), and vice versa.
![Page 23: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/23.jpg)
23
Equivalence Classes
For example, interpreted as SEM models
M1 and M2 : d-separation equivalent & distributionally equivalent
M3 and M4 : d-separation equivalent & not distributionally equivalent
X1 X2
1
X1 X2
1
2
2
X1 X2
3
'2'1
X3
'3
X1 X2
2
1
X3
3
3
1
2
4
![Page 24: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/24.jpg)
24
D-separation Equivalence Over a set X
Let X = {X1,X2,X3}, then Ga and Gb
1) are not d-separation equivalent, but
2) are d-separation equivalent over X
X3
T1
X2X1
X3
X2X1 T2
Ga Gb
![Page 25: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/25.jpg)
25
D-separation Equivalence
D-separation Equivalence Theorem (Verma and Pearl, 1988)
Two acyclic graphs over the same set of
variables are d-separation equivalent iff they have: the same adjacencies the same unshielded colliders
![Page 26: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/26.jpg)
26
Representations ofD-separation Equivalence Classes
We want the representations to:Characterize the Independence
Relations Entailed by the Equivalence Class
Represent causal features that are shared by every member of the equivalence class
![Page 27: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/27.jpg)
27
Patterns & PAGs
Patterns (Verma and Pearl, 1990): graphical representation of an acyclic d-separation equivalence - no latent variables.
PAGs: (Richardson 1994) graphical representation of an equivalence class including latent variable models and sample selection bias that are d-separation equivalent over a set of measured variables X
![Page 28: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/28.jpg)
28
Patterns
X2 X1
X2 X1
X2 X1
X4 X3
X2 X1
Possible Edges Example
![Page 29: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/29.jpg)
29
Patterns: What the Edges Mean
X2 X1
X2 X1X1 X2 in some members of theequivalence class, and X2 X1 inothers.
X1 X2 (X1 is a cause of X2) inevery member of the equivalenceclass.
X2 X1 X1 and X2 are not adjacent in anymember of the equivalence class
![Page 30: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/30.jpg)
30
Patterns
X2
X4 X3
X1
D-separation Equivalence Class
DAG
??????
![Page 31: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/31.jpg)
31
Patterns
X2
X4 X3
X1
X2
X4 X3
Represents
Pattern
X1 X2
X4 X3
X1
![Page 32: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/32.jpg)
32
Patterns
X2 X3 X1
Not
Represents
Pattern
X2 X3 X1
X2 X3 X1
X2 X3 X1
X2 X3 X1
Not all boolean combinations of orientations of unoriented pattern adjacencies occur in the equivalence class.
![Page 33: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/33.jpg)
33
PAGs: Partial Ancestral Graphs
X2 X1
X2 X1
X2 X1
X2 There is a latent commoncause of X1 and X2
No set d-separates X2 and X1
X1 is a cause of X2
X2 is not an ancestor of X1
X1
X2 X1 X1 and X2 are not adjacent
What PAG edges mean.
![Page 34: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/34.jpg)
34
PAGs: Partial Ancestral Graph
X2
X3
X1
X2
X3
Represents
PAG
X1 X2
X3
X1
X2
X3
T1
X1
X2
X3
X1
etc.
T1
T1 T2
![Page 35: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/35.jpg)
35
Search Difficulties
The number of graphs is super-exponential in the number of observed variables (if there are no hidden variables) or infinite (if there are hidden variables)
Because some graphs are equivalent, can only predict those effects that are the same for every member of equivalence class Can resolve this problem by outputting
equivalence classes
![Page 36: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/36.jpg)
36
What Isn’t Possible
Given just data, and the Causal Markov and Causal Faithfulness Assumptions: Can’t get probability of an effect being
within a given range without assuming a prior distribution over the graphs and parameters
![Page 37: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/37.jpg)
37
What Is Possible
Given just data, and the Causal Markov and Causal Faithfulness Assumptions: There are procedures which are
asymptotically correct in predicting effects (or saying “don’t know”)
![Page 38: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/38.jpg)
38
Overview of Search Methods
Constraint Based Searches TETRAD
Scoring Searches Scores: BIC, AIC, etc. Search: Hill Climb, Genetic Alg., Simulated
Annealing Very difficult to extend to latent variable models
Heckerman, Meek and Cooper (1999). “A Bayesian Approach to Causal Discovery” chp. 4 in Computation, Causation, and Discovery, ed. by Glymour and Cooper, MIT Press, pp. 141-166
![Page 39: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/39.jpg)
39
Constraint-based Search
Construct graph that most closely implies conditional independence relations found in sample
Doesn’t allow for comparing how much better one model is than another
It is important not to test all of the possible conditional independence relations due to speed and accuracy considerations – FCI search selects subset of independence relations to test
![Page 40: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/40.jpg)
40
Constraint-based Search
Can trade off informativeness versus speed, without affecting correctness
Can be applied to distributions where tests of conditional independence are known, but scores aren’t
Can be applied to hidden variable models (and selection bias models)
Is asymptotically correct
![Page 41: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/41.jpg)
41
Search for Patterns
Adjacency:
•X and Y are adjacent if they are dependent conditional on all subsets that don’t include X and Y
•X and Y are not adjacent if they are independent conditional on any subset that doesn’t include X and Y
![Page 42: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/42.jpg)
42
Search
X4 X3
X2
X1 Independencies entailed???
![Page 43: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/43.jpg)
43
Search
X4 X3
X2
X1 Independencies entailed
X1 _||_ X2
X1_||_ X4 | X3
X2_||_ X4 | X3
![Page 44: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/44.jpg)
44
X1
X2
X3 X4
CausalGraph
Independcies
Begin with:
X1
X2
X3 X4
X1 X2
X1 X4 {X3}
X2 X4 {X3}
Search: Adjacency
![Page 45: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/45.jpg)
45
X1
X2
X3 X4
Causal Graph
Independcies
Begin with:
From
X1
X2
X3 X4
X1 X2
X1 X4 {X3}
X2 X4 {X3}
X1
X2
X3 X4
X1
X2
X3 X4
X1
X2
X3 X4
From
From
X1 X2
X1 X4 {X3}
X2 X4 {X3}
![Page 46: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/46.jpg)
46
Search: Orientation in Patterns
X Y Z
X Z | YX Z | Y
Before OrientationY Unshielded
Collider Non-collider
X Y Z
X Y Z
X Y Z
X Y Z
X Y Z
![Page 47: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/47.jpg)
47
Search: Orientation in PAGs
X Y Z
X Z | YX Z | Y
Y Unshielded
Collider Non-collider
X Y Z X Y Z
![Page 48: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/48.jpg)
48
Orientation: Away from Collider
X3
X2
* X1
X1 X3 | X2
1) X1 - X2 adjacent, andinto X2.2) X2 - X3 adjacent, andunoriented.3) X1 - X3 not adjacent
2) X1 - X2 into X2
No Yes
X3
X2
* X1 X3
X2
* X1
Test
Test Conditions
![Page 49: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/49.jpg)
49
Search: Orientation
X4 X3
X2
X1
X4 X3
X2
X1
X4 X3
X2
X1
X4 X3
X2
X1
X4 X3
X2
X1
PAG Pattern
X4 X3
X2
X1
X1 || X2
X1 || X4 | X3
X2 || X4 | X3
After Orientation Phase
![Page 50: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/50.jpg)
50
Knowing when we know enough to
calculate the effect of Interventions
Observation: IQ _||_ Lead
Background Knowledge: Lead prior to IQ
Exposure to Lead
IQ
Exposure to Lead
IQ
SES
Exposure to Lead
IQ
PAG
P(IQ | Lead) P(IQ | Lead set=) P(IQ | Lead) = P(IQ | Lead set=)
![Page 51: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/51.jpg)
51
Knowing when we know enough to
calculate the effect of Interventions
Observation: All pairs associated
Lead _||_ Grades | IQ
Background Lead prior to IQ prior Knowledge to Grades
Exposure to Lead
IQ Grades
PAG
P(IQ | Lead) P(IQ | Lead set=)
P(Grades | IQ) = P(Grades | IQ set=)
Exposure to Lead
IQ
SES
Grades
Exposure to Lead
IQ Grades
P(IQ | Lead) = P(IQ | Lead set=)
P(Grades | IQ) = P(Grades | IQ set=)
![Page 52: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/52.jpg)
52
Knowing when we know enough to calculate the effect of Interventions
• Causal graph known
• Features of causal graph known
• Prediction algorithm (SGS - 1993)
• Data tell us when we know enough – i.e., we know when we don’t know
![Page 53: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/53.jpg)
53
4. Problems with Using Regession for Causal
Inference
![Page 54: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/54.jpg)
54
Regression to estimate Causal Influence
• Let V = {X,Y,T}, where
-measured vars: X = {X1, X2, …, Xn}
-latent common causes of pairs in X U Y: T = {T1, …, Tk}
• Let the true causal model over V be a Structural
Equation Model in which each V V is a linear
combination of its direct causes and independent,
Gaussian noise.
![Page 55: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/55.jpg)
55
Regression to estimate Causal Influence
• Consider the regression equation: Y = b0 + b1X1 + b2X2 + ..…bnXn
• Let the OLS regression estimate bi be the estimated causal influence of Xi on Y.
• That is, holding X/Xi experimentally constant, bi is an estimate of the change in E(Y) that results from an intervention that changes Xi by 1 unit.
• Let the real Causal Influence Xi Y = i
• When is the OLS estimate bi an unbiased estimate of the the real Causal
Influence Xi Y = i ?
![Page 56: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/56.jpg)
56
Regression vs. PAGs to estimate Qualitative Causal Influence
• bi = 0 Xi _||_ Y | X/Xi
• Xi - Y not adjacent in PAG over X U Y S X/Xi, Xi _||_ Y | S
• So for any SEM over V in which • Xi _||_ Y | X/Xi and
S X/Xi, Xi _||_ Y | S
PAG is superior to regression wrt errors of commission
![Page 57: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/57.jpg)
57
Regression Example
X2
Y
X3 X1
T1
True Model
T2
X2
Y
X3 X1
PAG
b1
b2
b3
0
0 X
0 X
![Page 58: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/58.jpg)
58
Regression Bias
If
• Xi is d-separated from Y conditional on X/Xi in the true graph after removing Xi Y, and
• X contains no descendant of Y, then:
bi is an unbiased estimate of i
![Page 59: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/59.jpg)
59
Regression Bias Theorem
If T = , and X prior to Y, then
bi is an unbiased estimate of i
![Page 60: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/60.jpg)
60
Tetrad 4 Demo
www.phil.cmu.edu/projects/tetrad
![Page 61: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/61.jpg)
61
Applications
• Genetic Regulatory Networks
• Pneumonia• Photosynthesis• Lead - IQ • College Retention• Corn Exports
• Rock Classification• Spartina Grass• College Plans• Political Exclusion• Satellite Calibration• Naval Readiness
![Page 62: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/62.jpg)
MS or Phd Projects
• Extending the Class of Models Covered• New Search Strategies• Time Series Models (Genetic Regulatory Networks)• Controlled Randomized Trials vs. Observations Studies
![Page 63: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/63.jpg)
Projects: Extending the Class of Models Covered
1) Feedback systems
2) Feedback systems with latents
3) Conservation, or equilibrium systems
4) Parameterizing discrete latent variable models
![Page 64: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/64.jpg)
Projects: Search Strategies
1) Genetic Algorithms, Simulated Annealing
2) Automatic Discretization
3) Scoring Searches among Latent Variable Models
4) Latent Clustering & Scale Construction
![Page 65: 1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d0a5503460f949dd5f4/html5/thumbnails/65.jpg)
65
References
• Causation, Prediction, and Search, 2nd Edition, (2001), by P. Spirtes, C. Glymour, and R. Scheines ( MIT Press)
• Causality: Models, Reasoning, and Inference, (2000), Judea Pearl, Cambridge Univ. Press
• Computation, Causation, & Discovery (1999), edited by C. Glymour and G. Cooper, MIT Press
• Causality in Crisis?, (1997) V. McKim and S. Turner (eds.), Univ. of Notre Dame Press.
• TETRAD IV: www.phil.cmu.edu/tetrad
• Web Course on Causal and Statistical Reasoning : www.phil.cmu.edu/projects/csr/