Learning Causality
Some slides are from Judea Pearl’s class lecture: http://bayes.cs.ucla.edu/BOOK-2K/viewgraphs.html
A Causal Model: Example
• The statement ‘rain causes mud’ implies an asymmetric relationship: rain will create mud, but mud will not create rain.
• We use ‘→’ to denote such a causal relationship;
• The absence of an arrow between ‘rain’ and ‘other causes of mud’ means that there is no direct causal relationship between them;
[Diagram: Rain → Mud ← Other causes of mud]
Directed (causal) Graphs
• A and B are causally independent;
• C, D, E, and F are causally dependent on A and B;
• A and B are direct causes of C;
• A and B are indirect causes of D, E, and F;
• If C is prevented from changing with A and B, then A and B will no longer cause changes in D, E, and F.
[Diagram: a DAG over the nodes A, B, C, D, E, F; one possible edge layout is sketched below]
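As an illustration, such a graph can be represented by listing each node’s direct causes (parents). A minimal Python sketch, assuming one edge layout consistent with the bullets above (A → C, B → C, C → D, C → E, D → F, E → F), which is not spelled out on the slide:

```python
# One possible edge layout consistent with the bullets above (an assumption):
# A -> C, B -> C, C -> D, C -> E, D -> F, E -> F.
parents = {
    "A": set(), "B": set(),     # A and B are causally independent (no parents)
    "C": {"A", "B"},            # A and B are direct causes of C
    "D": {"C"}, "E": {"C"},     # C mediates the influence of A and B
    "F": {"D", "E"},
}

def ancestors(node, parents):
    """All direct and indirect causes of `node`."""
    found, stack = set(), list(parents[node])
    while stack:
        p = stack.pop()
        if p not in found:
            found.add(p)
            stack.extend(parents[p])
    return found

print(ancestors("F", parents))  # contains A and B: they are indirect causes of F
```

Removing C from this parent map disconnects A and B from D, E, and F, which mirrors the last bullet: blocking the mediator blocks the indirect influence.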
Conditional Independence
Conditional Independence (Notation)
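The symbol ╨ used in later slides (e.g., a ╨ b | Sab) denotes conditional independence. In standard form, for variables X, Y, Z:

```latex
(X \perp\!\!\!\perp Y \mid Z)
  \;\Longleftrightarrow\;
  P(x \mid y, z) = P(x \mid z)
  \quad \text{whenever } P(y, z) > 0
```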
Causal Structure
Causal Structure (cont’d)
• A Causal Structure serves as a blueprint for forming a “causal model” – a precise specification of how each variable is influenced by its parents in the DAG.
• We assume that Nature is at liberty to impose arbitrary functional relationships between each effect and its causes and then to perturb these relationships by introducing arbitrary disturbances;
• These disturbances reflect “hidden” or unmeasurable conditions (see the sketch below).
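A minimal sketch of this idea for the earlier Rain → Mud ← Other-causes example. The functional forms, probabilities, and disturbance distributions are purely illustrative assumptions, not from the slides:

```python
import random

# A toy structural model for Rain -> Mud <- OtherCauses: each variable is a
# function of its parents in the DAG plus its own independent disturbance.
def sample_once():
    u_rain, u_other, u_mud = (random.random() for _ in range(3))  # disturbances
    rain = u_rain < 0.3                        # exogenous: no parents in the DAG
    other = u_other < 0.1                      # e.g. a burst pipe
    mud = (rain or other) and u_mud < 0.9      # effect of its parents, perturbed
    return rain, other, mud

samples = [sample_once() for _ in range(10_000)]
rainy = [m for r, o, m in samples if r]
print(f"P(mud | rain) ~ {sum(rainy) / max(1, len(rainy)):.2f}")
```

The disturbances u_rain, u_other, u_mud play the role of the unmeasured conditions mentioned above: they are never observed, yet they determine how the functional relationships play out in any particular sample.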
Causal Model
Causal Model (Cont’d)
• Once a causal model M is formed, it defines a joint probability distribution P(M) over the variables in the system;
• This distribution reflects some features of the causal structure – e.g., each variable must be independent of its grandparents, given the values of its parents (see the factorization below);
• We are allowed to inspect a subset O ⊆ V of “observed” variables and to ask questions about P[O], the probability distribution over the observations;
• We may then try to recover the topology D of the DAG from features of the probability distribution P[O].
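Concretely, the joint distribution defined by a causal model factorizes according to the DAG, with each variable conditioned only on its parents pa_i:

```latex
P(x_1, x_2, \ldots, x_n) \;=\; \prod_{i=1}^{n} P(x_i \mid pa_i)
```

The grandparent property mentioned above is one consequence of this factorization.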
Inferred Causation
Latent Structure
Structure Preference
Structure Preference (Cont’d)
• The set of independencies entailed by a causal structure imposes limits on its power to mimic other structures;
• L1 cannot be preferred to L2 if there is even one observable dependency that is permitted by L1 and forbidden by L2;
• Equivalently, L1 can be preferred to L2 only if every independence entailed by L2 is also entailed by L1;
• Thus, tests for preference and equivalence can sometimes be reduced to tests of induced dependencies, which can be determined from the topology of the DAGs alone, without concern for the parameters (see the d-separation sketch below).
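The topological test referred to here is d-separation. A minimal sketch, using the standard reduction to reachability in the moralized ancestral graph; the example graph and queries at the end are illustrative:

```python
from itertools import combinations

def d_separated(parents, xs, ys, zs):
    """True if every node in xs is d-separated from every node in ys given zs.

    `parents` maps each node to its set of parents in the DAG. The test uses the
    classical reduction: restrict to the ancestral graph of xs | ys | zs, moralize
    it (marry co-parents, drop directions), delete zs, then check that no
    undirected path connects xs and ys.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Ancestral subgraph of the relevant nodes.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        v = stack.pop()
        if v not in relevant:
            relevant.add(v)
            stack.extend(parents[v])

    # 2. Moralize: connect every pair of parents of a common child, drop directions.
    neighbors = {v: set() for v in relevant}
    for v in relevant:
        for p in parents[v]:
            neighbors[v].add(p); neighbors[p].add(v)
        for p, q in combinations(parents[v], 2):
            neighbors[p].add(q); neighbors[q].add(p)

    # 3. Remove the conditioning set and test reachability from xs to ys.
    frontier, seen = list(xs - zs), set(xs - zs)
    while frontier:
        v = frontier.pop()
        if v in ys:
            return False                 # a connecting path exists => dependent
        for w in neighbors[v] - zs:
            if w not in seen:
                seen.add(w); frontier.append(w)
    return True

# Illustrative queries on the rain/mud DAG: Rain and Other are marginally
# independent, but become dependent once we condition on their common effect Mud.
parents = {"Rain": set(), "Other": set(), "Mud": {"Rain", "Other"}}
print(d_separated(parents, {"Rain"}, {"Other"}, set()))     # True
print(d_separated(parents, {"Rain"}, {"Other"}, {"Mud"}))   # False
```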
Minimality
Consistency
Inferred Causation
Example
The data over {a, b, c, d} reveal two independencies:
1. a is independent of b;
2. d is independent of {a, b} given c;
Assume further that the data reveal no other independencies.
a = having a cold; b = having hay fever; c = having to sneeze; d = having to wipe one’s nose.
Example (Cont’d)
{a, b, c, d} reveal two independencies:
1. a is independent of b;
2. d is independent of {a, b} given c;
[Diagram: candidate structures, labeled as follows]
• minimal
• arbitrary relations between a and b
• not minimal: fails to impose conditional independence between d and {a, b}
• not consistent with the data: imposes marginal independence between d and {a, b}
Stability
The stability condition states that, as we vary the parameters, no independence in P can be destroyed.
In other words, if an independency exists, it will always exist.
Stable distribution
• A probability distribution P is a faithful (stable) distribution if there exists a directed acyclic graph (DAG) D such that every conditional independence relationship that holds in P is implied by D, and vice versa.
IC algorithm (Inductive Causation)
• IC algorithm (Pearl):
  – Based on variable dependencies;
  – Find all pairs of variables that are dependent on each other (applying standard statistical tests to the database);
  – Eliminate (as much as possible) indirect dependencies;
  – Determine the directions of the dependencies;
Comparing abduction, deduction and induction
• Deduction: major premise: All balls in the box are black
minor premise: These balls are from the box
conclusion: These balls are black
• Abduction: rule: All balls in the box are black
observation: These balls are black
explanation: These balls are from the box
• Induction: case: These balls are from the box
observation: These balls are black
hypothesized rule: All balls in the box are black
Deduction:  from A => B and A, conclude B
Abduction:  from A => B and B, conclude possibly A
Induction:  from “whenever A then B (but not vice versa)”, conclude possibly A => B
Induction: from specific cases to general rules. Abduction and deduction: both go from one part of a specific case to another part of the case, using general rules (in different ways).
Source: http://www.csee.umbc.edu/~ypeng/F02671/lecture-notes/Ch15.ppt
IC Algorithm (Cont’d)
• Input:
  – P, a stable distribution on a set V of variables;
• Output:
  – a pattern H(P) compatible with P;
Pattern: a partially directed DAG in which
  • some edges are directed and
  • some edges are undirected;
IC Algorithm: Step 1
• For each pair of variables a and b in V, search for a set Sab such that (a ╨ b | Sab) holds in P – in other words, a and b should be independent in P, conditioned on Sab.
• Construct an undirected graph G such that vertices a and b are connected with an edge if and only if no set Sab can be found (see the sketch below).
[Diagram: if some Sab yields a ╨ b | Sab, no edge is drawn between a and b; if no such Sab exists, a and b are joined by an edge a — b]
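A minimal sketch of Step 1 in Python. The conditional-independence test `ci_test` is a placeholder for whatever statistical test is run on the data, and all names here are illustrative rather than Pearl’s own notation:

```python
from itertools import combinations

def ic_step1(variables, ci_test, max_cond_size=None):
    """Build the undirected skeleton and record the separating sets.

    `ci_test(a, b, S)` should return True when a and b are judged independent
    given the set S (e.g. via a chi-square or partial-correlation test on data).
    """
    max_cond_size = len(variables) - 2 if max_cond_size is None else max_cond_size
    edges, sepset = set(), {}
    for a, b in combinations(variables, 2):
        others = [v for v in variables if v not in (a, b)]
        separated = False
        for size in range(max_cond_size + 1):
            for S in combinations(others, size):
                if ci_test(a, b, set(S)):
                    sepset[(a, b)] = sepset[(b, a)] = set(S)
                    separated = True
                    break
            if separated:
                break
        if not separated:
            edges.add(frozenset((a, b)))   # no Sab found: connect a and b
    return edges, sepset
```

Step 2 below then consumes the recorded separating sets.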
IC Algorithm: Step 2
• For each pair of nonadjacent variables a and b with a common neighbor c, check whether c ∈ Sab.
• If it is, then continue;
• Else, add arrowheads at c, i.e. a → c ← b (see the sketch below).
[Diagram: if c ∈ Sab, the triple a — c — b is left unoriented; if c ∉ Sab, it is oriented a → c ← b]
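A sketch of Step 2, consuming the `edges` and `sepset` produced by the Step 1 sketch above (again illustrative, not a canonical implementation):

```python
def ic_step2(variables, edges, sepset):
    """Orient v-structures a -> c <- b for nonadjacent a, b with common neighbor c."""
    adjacent = lambda x, y: frozenset((x, y)) in edges
    directed = set()                        # oriented edges, stored as (tail, head)
    for c in variables:
        neighbors = [v for v in variables if adjacent(v, c)]
        for i, a in enumerate(neighbors):
            for b in neighbors[i + 1:]:
                if not adjacent(a, b) and c not in sepset.get((a, b), set()):
                    directed.add((a, c))    # a -> c
                    directed.add((b, c))    # b -> c
    undirected = set()
    for e in edges:
        u, v = tuple(e)
        if (u, v) not in directed and (v, u) not in directed:
            undirected.add(e)
    return directed, undirected
```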
Example
[Diagram: the skeleton Rain — Mud — Other causes of mud is oriented as Rain → Mud ← Other causes of mud, since Mud is not in the separating set of Rain and Other causes of mud]
IC Algorithm: Step 3
• In the partially directed graph that results, orient as many of the undirected edges as possible, subject to two conditions (rules R1–R4 below; a sketch applying them follows R4):
  – The orientation should not create a new v-structure;
  – The orientation should not create a directed cycle;
Rules required to obtain a maximally oriented pattern
• R1: Orient b — c into b→c whenever there is an arrow a→b such that a and c are nonadjacent;
[Diagram: a→b — c with a and c nonadjacent becomes a→b→c]
Rules required to obtain a maximally oriented pattern
• R2: Orient a — b into a→b whenever there is a chain a→c→b;
[Diagram: a — b with a→c→b becomes a→b]
Rules required to obtain a maximally oriented pattern
R3: Orient a — b into a→b whenever there are two chains a—c→b and a—d→b such that c and d are nonadjacent;
[Diagram: a — b with a—c→b and a—d→b, c and d nonadjacent, becomes a→b]
Rules required to obtain a maximally oriented pattern
R4: Orient a — b into a→b whenever there are two chains a—c→d and c→d→b such that c and b are nonadjacent;
[Diagram: a — b with a—c→d→b, c and b nonadjacent, becomes a→b]
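A sketch that applies R1–R4 exactly as stated above to a partially directed graph, repeating until no further edge can be oriented; the edge representation is an illustrative choice:

```python
def apply_rules(directed, undirected):
    """Orient undirected edges by repeatedly applying R1-R4 as stated above.

    `directed` holds (tail, head) pairs; `undirected` holds frozensets {x, y}.
    """
    directed, undirected = set(directed), set(undirected)

    def adjacent(x, y):
        return (x, y) in directed or (y, x) in directed or frozenset((x, y)) in undirected

    def undir(x, y):
        return frozenset((x, y)) in undirected

    def orient(x, y):                       # turn x — y into x -> y
        undirected.discard(frozenset((x, y)))
        directed.add((x, y))

    nodes = {v for e in directed for v in e} | {v for e in undirected for v in e}
    changed = True
    while changed:
        changed = False
        for a in nodes:
            for b in nodes:
                if a == b or not undir(a, b):
                    continue
                # R1: an arrow c -> a exists with c and b nonadjacent        => a -> b
                r1 = any((c, a) in directed and not adjacent(c, b)
                         for c in nodes if c not in (a, b))
                # R2: a chain a -> c -> b exists                             => a -> b
                r2 = any((a, c) in directed and (c, b) in directed
                         for c in nodes if c not in (a, b))
                # R3: chains a — c -> b and a — d -> b, c and d nonadjacent  => a -> b
                r3 = any(undir(a, c) and (c, b) in directed and
                         undir(a, d) and (d, b) in directed and not adjacent(c, d)
                         for c in nodes for d in nodes if len({a, b, c, d}) == 4)
                # R4: chains a — c -> d and c -> d -> b, c and b nonadjacent => a -> b
                r4 = any(undir(a, c) and (c, d) in directed and (d, b) in directed
                         and not adjacent(c, b)
                         for c in nodes for d in nodes if len({a, b, c, d}) == 4)
                if r1 or r2 or r3 or r4:
                    orient(a, b)
                    changed = True
    return directed, undirected
```

Together with the Step 1 and Step 2 sketches above, this yields a rough end-to-end IC pipeline.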
IC* Algorithm
• Input: – P, a sampled distribution;
• Output: – core(P), a marked pattern;
Marked Pattern: Four types of edges
IC* Algorithm: Step 1
For each pair of variables a and b, search for a set Sab such that a and b are independent in P, conditioned on Sab. If there is no such Sab, place an undirected link between the two variables, a – b.
IC* Algorithm: Step 2
• For each pair of nonadjacent variables a and b with a common neighbor c, check whether c ∈ Sab:
  – If it is, then continue;
  – If it is not, then add arrowheads pointing at c (i.e. a → c ← b).
• In the partially directed graph that results, add (recursively) as many arrowheads as possible, and mark as many edges as possible, according to the following two rules:
IC* Algorithm: Rule 1
• R1: For each pair of nonadjacent nodes a and b with a common neighbor c, if the link between a and c has an arrowhead into c and if the link between c and b has no arrowhead into c, then add an arrowhead on the link between c and b pointing at b and mark that link to obtain c –* b;
[Diagram: a → c — b becomes a → c –* b, a marked link pointing at b]
IC* Algorithm: Rule 2
• R2: If a and b are adjacent and there is a directed path (composed strictly of marked links) from a to b, then add an arrowhead pointing toward b on the link between a and b;
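Finally, a sketch of these two IC* rules over a simple marked-link representation. Each link records which endpoints carry arrowheads and whether it is marked; this data structure and the helper names are assumptions made for illustration:

```python
def icstar_rules(links):
    """Apply IC* rules R1 and R2 (as stated above) until nothing changes.

    `links` maps frozenset({x, y}) to {"heads": set of endpoints carrying an
    arrowhead, "marked": bool}.
    """
    nodes = {v for e in links for v in e}
    adjacent = lambda x, y: frozenset((x, y)) in links

    def head_into(x, y):       # the link x-y exists and has an arrowhead at y
        e = links.get(frozenset((x, y)))
        return e is not None and y in e["heads"]

    def marked_directed_path(start, target, visited=None):
        # directed path from start to target composed strictly of marked links
        visited = visited or {start}
        for nxt in nodes - visited:
            e = links.get(frozenset((start, nxt)))
            if e and e["marked"] and nxt in e["heads"] and start not in e["heads"]:
                if nxt == target or marked_directed_path(nxt, target, visited | {nxt}):
                    return True
        return False

    changed = True
    while changed:
        changed = False
        for e in list(links):
            x, y = tuple(e)
            for c, b in ((x, y), (y, x)):
                # R1: a, b nonadjacent; a-c has a head at c; c-b has no head at c
                #     => put a head at b on the link c-b and mark it
                for a in nodes - {b, c}:
                    if not adjacent(a, b) and head_into(a, c) and not head_into(b, c):
                        if b not in links[e]["heads"] or not links[e]["marked"]:
                            links[e]["heads"].add(b)
                            links[e]["marked"] = True
                            changed = True
                # R2: c and b adjacent; marked directed path from c to b
                #     => put a head at b on the link c-b
                if b not in links[e]["heads"] and marked_directed_path(c, b):
                    links[e]["heads"].add(b)
                    changed = True
    return links
```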