Weakening the Causal Faithfulness Assumption
Jiji Zhang, Lingnan University
Based on joint work with Peter Spirtes
Markov and Faithfulness Assumptions
Suppose the set of observed variables V is causally sufficient and its causal structure can be properly represented by a DAG over V.
A statement of conditional independence is said to be entailed by a DAG if it is entailed by the Markov property of the DAG.
Causal Markov Assumption: Every conditional independence statement entailed by the causal DAG over V is satisfied by the joint distribution over V.
Causal Faithfulness Assumption: Every conditional independence statement satisfied by the joint distribution over V is entailed by the causal DAG over V.
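As a hedged illustration of these two definitions, the Python sketch below tests d-separation through the moralized ancestral graph (a standard equivalent of what the Markov property of a DAG entails) and then checks a hand-written list of independences against a DAG. The function names, the encoding of a DAG as a node-to-parents dictionary, and the example distribution are assumptions made for this sketch, not part of the talk.

```python
from itertools import combinations

def ancestors(dag, nodes):
    """Return `nodes` plus all their ancestors; `dag` maps node -> set of parents."""
    found, stack = set(nodes), list(nodes)
    while stack:
        for parent in dag[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def d_separated(dag, x, y, s):
    """True iff x and y are d-separated by the set s in the DAG."""
    keep = ancestors(dag, {x, y} | set(s))
    adj = {v: set() for v in keep}
    for child in keep:
        parents = sorted(dag[child])
        for p in parents:                      # parent-child edges, undirected
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(parents, 2):  # connect parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = {x}, [x]
    while stack:                               # search from x without entering s
        for n in adj[stack.pop()]:
            if n not in seen and n not in s:
                seen.add(n)
                stack.append(n)
    return y not in seen

def entailed(dag):
    """All statements (x, y, S), with x < y, that the DAG entails."""
    nodes = sorted(dag)
    out = set()
    for x, y in combinations(nodes, 2):
        rest = [v for v in nodes if v not in (x, y)]
        for k in range(len(rest) + 1):
            for s in combinations(rest, k):
                if d_separated(dag, x, y, set(s)):
                    out.add((x, y, frozenset(s)))
    return out

def is_markov(dag, independences):
    """Every entailed statement holds in the distribution."""
    return entailed(dag) <= independences

def is_faithful(dag, independences):
    """Every statement that holds in the distribution is entailed."""
    return independences <= entailed(dag)

chain = {"X": set(), "Y": {"X"}, "Z": {"Y"}}   # X -> Y -> Z
# Hypothetical distribution: the entailed X _||_ Z | Y plus an extra X _||_ Z.
dist = {("X", "Z", frozenset({"Y"})), ("X", "Z", frozenset())}
print(is_markov(chain, dist), is_faithful(chain, dist))  # True False
```

The enumeration in `entailed` is exponential in the number of variables, so this is only meant for toy graphs.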
Simple Examples of Unfaithfulness
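[Figure: three example DAGs over X, Y, Z. One carries the edge signs −, +, +; another lists the value ranges X: [0, 1], Y: [0, 1, 2], Z: [0, 1]. The captions read:]
Entailed: none; Extra: X ⫫ Z.
Entailed: X ⫫ Z | Y; Extra: X ⫫ Z.
Entailed: X ⫫ Y; Extra: X ⫫ Z; Y ⫫ Z.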
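One familiar way an extra independence of this kind can arise is path cancellation. The sketch below is a minimal illustration that assumes a linear model with independent unit-variance noise over the DAG X → Y → Z plus X → Z; the coefficients `a`, `b`, `c` and the function name are hypothetical choices, not values from the talk. With a direct effect that exactly offsets the indirect one, the covariance of X and Z vanishes even though no independence is entailed by the graph (under Gaussian noise, zero covariance amounts to independence).

```python
def implied_covariances(a, b, c):
    """Covariances implied by X = eX, Y = a*X + eY, Z = b*Y + c*X + eZ,
    where eX, eY, eZ are independent with unit variance."""
    var_x = 1
    cov_xy = a * var_x
    var_y = a * a * var_x + 1
    cov_xz = b * cov_xy + c * var_x          # indirect path plus direct path
    cov_yz = b * var_y + c * cov_xy
    # Covariance of X and Z given Y, multiplied by var_y to stay in integers.
    cov_xz_given_y_scaled = cov_xz * var_y - cov_xy * cov_yz
    return {
        "XY": cov_xy,
        "XZ": cov_xz,
        "YZ": cov_yz,
        "XZ_given_Y_scaled": cov_xz_given_y_scaled,
    }

# Hypothetical coefficients: the direct effect c cancels the indirect effect a*b.
print(implied_covariances(a=1, b=1, c=-1))
# {'XY': 1, 'XZ': 0, 'YZ': 1, 'XZ_given_Y_scaled': -1}
```

Only the marginal covariance of X and Z is zero here; the covariance given Y stays non-zero, so the single extra independence is X ⫫ Z.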
Testing Faithfulness?
• Without knowing the true causal DAG, the Faithfulness assumption is not fully testable.
• But given the Markov assumption, the Faithfulness assumption has a testable consequence: the distribution of V is (Markov and) faithful to some DAG.
• Unfaithfulness is in principle detectable if the distribution is not faithful to any DAG.
It is undetectable if the distribution is faithful to some (false) DAG.
SGS Algorithm
S1. Form the complete undirected graph H over V.
S2. For each pair of variables X and Y, search for S ⊆ V\{X, Y} such that X and Y are independent conditional on S. Remove the edge between X and Y in H iff such a set is found.
S3. For each unshielded triple <X, Y, Z> (i.e., X and Y are adjacent, Y and Z are adjacent, but X and Z are not adjacent),
(1) If X and Z are not independent conditional on any subset of V\{X, Z} that contains Y, then mark the triple as a collider: X → Y ← Z.
(2) If X and Z are not independent conditional on any subset of V\{X, Z} that does not contain Y, then mark the triple as a non-collider (i.e., not X → Y ← Z).
S4. More orientation rules …
Justification of S2
S2. For each pair of variables X and Y, search for S ⊆ V\{X, Y} such that X and Y are independent conditional on S. Remove the edge between X and Y in H iff such a set is found.
• Inference of adjacencies is justified by the Markov assumption.
• Inference of non-adjacencies is justified by a consequence of the Faithfulness assumption.
Adjacency-Faithfulness: For every X, Y ∈ V, if X and Y are adjacent in the true causal DAG, then they are not independent conditional on any subset of V\{X, Y}.
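Steps S1 and S2 can be sketched as follows. This is a minimal illustration that assumes access to a conditional-independence oracle; the names `sgs_skeleton` and `oracle` and the set-of-facts encoding are hypothetical. The example oracle describes a distribution in which X ⫫ Z holds although X and Z are adjacent in the (assumed) true DAG, so Adjacency-Faithfulness fails and S2 removes a true edge.

```python
from itertools import combinations

def sgs_skeleton(variables, independent):
    """Steps S1-S2: start from the complete graph and remove X - Y
    iff some S within V minus {X, Y} makes X and Y independent.
    `independent(x, y, s)` is an assumed oracle, not a statistical test."""
    edges = {frozenset(pair) for pair in combinations(variables, 2)}
    for x, y in combinations(variables, 2):
        rest = [v for v in variables if v not in (x, y)]
        for k in range(len(rest) + 1):
            if any(independent(x, y, frozenset(s)) for s in combinations(rest, k)):
                edges.discard(frozenset((x, y)))
                break
    return edges

# Hypothetical facts: the only independence in the distribution is X _||_ Z,
# while the assumed true DAG has all three pairs adjacent.
facts = {(frozenset(("X", "Z")), frozenset())}

def oracle(x, y, s):
    return (frozenset((x, y)), s) in facts

skeleton = sgs_skeleton(["X", "Y", "Z"], oracle)
print(sorted(sorted(edge) for edge in skeleton))  # [['X', 'Y'], ['Y', 'Z']]
```

The retained edges are all genuine adjacencies, which is what the Markov assumption guarantees; the removed X – Z edge is the kind of inference that needs Adjacency-Faithfulness.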
Justification of S3
S3. For each unshielded triple <X, Y, Z> (i.e., X and Y are adjacent, Y and Z are adjacent, but X and Z are not adjacent),
(1) If X and Z are not independent conditional on any subset of V\{X, Z} that contains Y, then mark the triple as a collider: X → Y ← Z.
(2) If X and Z are not independent conditional on any subset of V\{X, Z} that does not contain Y, then mark the triple as a non-collider (i.e., not X → Y ← Z).
• (1) and (2) are both justified by the Markov assumption.
• What about the Faithfulness assumption?
Justification of S3 (cont'd)
• The antecedent of clause (1) and that of clause (2) do not exhaust the logical possibilities.
• The remaining logical possibility is ruled out by the following consequence of the Faithfulness assumption:
Orientation-Faithfulness: For every unshielded triple <X, Y, Z> in the true causal DAG,
– If X → Y ← Z, then X and Z are not independent conditional on any subset of V\{X, Z} that contains Y.
– Otherwise, X and Z are not independent conditional on any subset of V\{X, Z} that does not contain Y.
[Figure: an unshielded triple over X, Y, Z in which Y is a non-collider]
Entailed: X ⫫ Z | Y; Extra: X ⫫ Z.
First Weakening of Faithfulness
• It follows that, given the Markov and Adjacency-Faithfulness assumptions, violations of Orientation-Faithfulness are detectable, and there is a straightforward test:
S3*. For each unshielded triple <X, Y, Z>,
(1) If X and Z are not independent conditional on any subset of V\{X, Z} that contains Y, then mark the triple as a collider: X → Y ← Z.
(2) If X and Z are not independent conditional on any subset of V\{X, Z} that does not contain Y, then mark the triple as a non-collider (i.e., not X → Y ← Z).
(3) Otherwise, mark the triple as ambiguous or unfaithful.
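The three clauses of S3* can be sketched as a single classification function. This is a hedged illustration, again assuming an independence oracle; `classify_triple` and `oracle_from` are hypothetical names, and the conditioning sets are drawn from the variables other than X and Z.

```python
from itertools import combinations

def classify_triple(x, y, z, variables, independent):
    """Step S3* for an unshielded triple <x, y, z>.
    `independent(x, z, s)` is an assumed conditional-independence oracle."""
    rest = [v for v in variables if v not in (x, z)]
    subsets = [frozenset(s) for k in range(len(rest) + 1)
               for s in combinations(rest, k)]
    indep_with_y = any(independent(x, z, s) for s in subsets if y in s)
    indep_without_y = any(independent(x, z, s) for s in subsets if y not in s)
    # For a triple that is unshielded in the output of S2, at least one of the
    # two flags is true, because some set made x and z independent.
    if not indep_with_y:
        return "collider"        # clause (1)
    if not indep_without_y:
        return "non-collider"    # clause (2)
    return "ambiguous"           # clause (3)

def oracle_from(facts):
    """Build an oracle from a set of (pair, conditioning set) facts."""
    return lambda x, z, s: (frozenset((x, z)), s) in facts

V = ["X", "Y", "Z"]
pair = frozenset(("X", "Z"))
given_y, given_nothing = frozenset({"Y"}), frozenset()

print(classify_triple("X", "Y", "Z", V, oracle_from({(pair, given_nothing)})))
# collider
print(classify_triple("X", "Y", "Z", V, oracle_from({(pair, given_y)})))
# non-collider
print(classify_triple("X", "Y", "Z", V,
                      oracle_from({(pair, given_y), (pair, given_nothing)})))
# ambiguous
```

The last call corresponds to a distribution with both X ⫫ Z | Y and X ⫫ Z, where neither antecedent holds and judgment is suspended.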
Conservative SGS
• Replace S3 with S3*, and we get what we call the Conservative SGS (CSGS) algorithm.
• The CSGS algorithm is correct under the causal Markov and Adjacency-Faithfulness assumptions.
• When Orientation-Faithfulness happens to hold, the output of CSGS is the same as that of SGS.
E-pattern
• We call the (supposed) output of CSGS an extended pattern (e-pattern), which represents a set of patterns (each of which represents a Markov equivalence class of DAGs).
[Figure: four graphs over X, Y, Z, U, W]
Violations of Adjacency-Faithfulness
• Some violations of Adjacency-Faithfulness are also detectable.
• Compare to an undetectable violation:
[Figure: three graphs, two over X, Y, Z and one over X, Y, Z, W, with the captions below]
Extra: X ⫫ Z.
Extra: X ⫫ Z; Y ⫫ Z.
Extra: X ⫫ Z.
Triangle-Faithfulness
Triangle-Faithfulness: For every triangle <X, Y, Z> (i.e., they are adjacent to one another) in the true causal DAG,
(1) If Y is a non-collider on the path <X, Y, Z>, then X and Z are not independent conditional on any subset of V\{X, Z} that does not contain Y.
(2) If Y is a collider on the path <X, Y, Z>, then X and Z are not independent conditional on any subset of V\{X, Z} that contains Y.
• Triangle-Faithfulness is weaker than Adjacency-Faithfulness.
[Figure: three graphs over X, Y, Z]
Further Weakening of Faithfulness
• Another weak condition entailed by the Adjacency-Faithfulness assumption is known as the causal Minimality condition: no proper subgraph of the true causal DAG satisfies the Markov condition with the joint distribution.
• Theorem: Given the causal Markov, Minimality and Triangle-Faithfulness assumptions, any violation of the Faithfulness assumption is detectable.
• What if we only make the Markov, Minimality and Triangle-Faithfulness assumptions?
CSGS under the Weaker Assumptions
• Given the Markov assumption, in the adjacency step S2, the inferred adjacencies are still correct.
• The inferred non-adjacencies, however, are not necessarily correct, since Adjacency-Faithfulness is not assumed. (Mark the non-adjacencies as ‘apparent’).
• Given the Markov and Triangle-Faithfulness assumptions, the orientation step S3* is still correct!
(For an ‘apparently’ unshielded triple <X, Y, Z>, either it is really unshielded or it is a triangle. In the former case, S3* is correct by the Markov assumption; in the latter case, S3* is correct by the Triangle-Faithfulness assumption.)
Testing Adjacency-Faithfulness?
• Therefore, given only the Markov and Triangle-Faithfulness assumptions, CSGS is still correct, provided that we take the non-adjacencies in the output as uninformative.
• Can we somehow test Adjacency-Faithfulness and confirm non-adjacencies if the test returns affirmative?
• What we have for now: take the output of CSGS and check the Markov condition for each pattern represented by the output. If every pattern satisfies the Markov condition, then the non-adjacencies are correct (assuming Minimality in addition to Markov and Triangle-Faithfulness).
Conjecture
• The condition should be improvable. In particular, it is sufficient but not necessary for Adjacency-Faithfulness.
• A necessary condition for Adjacency-Faithfulness is: some pattern represented by the CSGS output satisfies the Markov condition.
• Conjecture: The necessary condition is also sufficient.
That is, assuming Markov, Minimality, and Triangle-Faithfulness, Adjacency-Faithfulness holds iff some pattern represented by the CSGS output satisfies the Markov condition.
Still Further Weakening
• Let G and H be DAGs over V. H is an I-structure of G if every conditional independence entailed by G is also entailed by H. H is a proper I-structure of G if H is an I-structure of G but G is not an I-structure of H.
P-minimality assumption: No proper I-structure of the true causal DAG satisfies the Markov condition with the joint distribution.
• The causal Faithfulness assumption is equivalent to a conjunction of (1) the P-minimality assumption and (2) that the joint distribution is faithful to some DAG.
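Read set-theoretically, these definitions compare the sets of independence statements that DAGs entail. The sketch below is a minimal illustration under that reading; the function names are hypothetical, and the entailed sets are listed by hand for three small DAGs over X, Y, Z rather than computed from the graphs. The example distribution has X ⫫ Z as its only independence, with the complete DAG assumed to be the true one.

```python
def is_proper_i_structure(entailed_h, entailed_g):
    """H is a proper I-structure of G: H entails everything G entails,
    and at least one statement more (strict superset)."""
    return entailed_g < entailed_h

def p_minimality_counterexamples(entailed_true, candidates, distribution):
    """Candidate DAGs that are proper I-structures of the true DAG and are
    still Markov to the distribution (every entailed statement holds in it)."""
    return [name for name, ent in candidates.items()
            if is_proper_i_structure(ent, entailed_true) and ent <= distribution]

# Hand-listed entailed statements (assumed inputs, written as plain strings).
entailed_true = set()                      # a complete DAG entails nothing
candidates = {
    "X -> Y <- Z": {"X _||_ Z"},
    "X -> Y -> Z": {"X _||_ Z | Y"},
}
distribution = {"X _||_ Z"}                # hypothetical: one extra independence

print(p_minimality_counterexamples(entailed_true, candidates, distribution))
# ['X -> Y <- Z']
```

A non-empty result means P-minimality fails for the assumed true DAG: here the collider entails strictly more than the complete DAG and is still Markov to the distribution.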
Still Further Weakening (cont'd)
• The causal Faithfulness assumption is often regarded as a methodological assumption of simplicity; that is only part of its content, namely, the P-minimality assumption.
• Violations of the P-minimality assumption are not detectable; given the P-minimality assumption, violations of (the rest of) the Faithfulness assumption are detectable.
• The causal (SGS-)minimality assumption plus the Triangle-Faithfulness assumption entail the P-minimality assumption.
• Conversely, the P-minimality assumption entails the causal (SGS-)minimality assumption, but does not entail Triangle-Faithfulness.
Example
• Triangle-Faithfulness is violated, but P-minimality is not.
• Assuming Markov and P-minimality, the violation of Triangle-Faithfulness is detectable.
[Figure: seven graphs over X, Y, Z, W; the first is captioned:]
Entailed: Y ⫫ W | {X, Z}; Extra: X ⫫ Z | {Y, W}.
Example (cont'd)
• I suspect that VCSGS (i.e., CSGS in which non-adjacencies are regarded as ambiguous, unless a check of Markov condition in the end confirms them) is also correct under the causal Markov and P-minimality assumptions.
[Figure: a graph over X, Y, Z, W, captioned:]
Entailed: Y ⫫ W | {X, Z}; Extra: X ⫫ Z | {Y, W}.
Output of CSGS: [Figure: a graph over X, Y, Z, W]
Further Questions
• Are there feasible versions (or approximations)?
• How about causal inference without causal sufficiency?
PC and CPC
• The PC algorithm is a much more efficient version of SGS.
• The key efficiency-improving ideas are also applicable to CSGS (when Adjacency-Faithfulness is assumed to hold). The resulting algorithm was called Conservative PC (CPC).
• Joe Ramsey ran simulations and found that, even when the Faithfulness assumption is true, CPC (1) produces significantly fewer errors than PC at moderate sample sizes; (2) outputs about as much correct information as PC does; and (3) runs almost as fast.
Almost Unfaithfulness
• The reason, we think, is that CPC guards not only against strict failures of Orientation-Faithfulness but also against near violations.
• Intuitively, CPC suspends judgments when it detects “almost unfaithfulness” at a given sample size, just as it suspends judgments when it detects unfaithfulness in the large sample limit.
Uniform Consistency
• A negative result due to Robins et al. (2003) is that causal inference can only be pointwise consistent but not uniformly consistent under the Causal Markov and Faithfulness assumptions.
• The basis of their proof is related to almost unfaithfulness.
Uniform Consistency of Inferring Causal Direction
• Suppose that we have the right adjacencies, and use procedures like PC to infer causal directions.
• Robins et al.’s results do not apply here.
• But we can still show that the PC procedure is not uniformly consistent in the inference of causal direction given the right adjacencies.
Uniform Consistency of Inferring Causal Direction (cont'd)
• Our argument is based on a theorem that no procedure can be uniformly consistent in, for example, deciding between an unshielded collider (X → Y ← Z) and an unshielded non-collider without sometimes suspending judgment.
• This argument does not apply to CPC, and we can show that CPC can be made uniformly consistent in its inference of causal directions (given the right adjacencies).
References
P. Spirtes and J. Zhang (forthcoming) “A uniformly consistent estimator of causal effects under the k-triangle-faithfulness assumption”, Statistical Science.
J. Zhang (2013) “A comparison of three Occam’s razors for Markovian causal models”, British Journal for the Philosophy of Science, 64(2): 423-448.
J. Zhang (2008) “Error probabilities for the inference of causal direction”, Synthese 163: 409-418.
J. Zhang and P. Spirtes (2008) “Detection of unfaithfulness and robust causal inference”, Minds and Machines 18(2): 239-271.
J. Ramsey, P. Spirtes, and J. Zhang (2006) “Adjacency-faithfulness and conservative causal inference”, UAI proceedings: 401-408.