Ockham’s Razor in Causal Discovery: A New Explanation

Kevin T. Kelly, Conor Mayo-Wilson. Department of Philosophy, Joint Program in Logic and Computation, Carnegie Mellon University. www.hss.cmu.edu/philosophy/faculty-kelly.php


Page 1: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham’s Razor in Causal Discovery: A New Explanation

Kevin T. Kelly

Conor Mayo-Wilson

Department of Philosophy

Joint Program in Logic and Computation

Carnegie Mellon University

www.hss.cmu.edu/philosophy/faculty-kelly.php

Page 2: Ockham’s Razor in Causal Discovery: A New Explanation

I. Prediction vs. Policy

Page 3: Ockham’s Razor in Causal Discovery: A New Explanation

Predictive Links

Correlation or co-dependency allows one to predict Y from X.

[Figure: Ash trays and Lung cancer. The scientist tells the policy maker: “Ash trays linked to lung cancer!”]

Page 4: Ockham’s Razor in Causal Discovery: A New Explanation

Policy

Policy manipulates X to achieve a change in Y.

[Figure: Ash trays and Lung cancer. “Ash trays linked to lung cancer!” “Prohibit ash trays!”]

Page 5: Ockham’s Razor in Causal Discovery: A New Explanation

Policy

Policy manipulates X to achieve a change in Y.

[Figure: Ash trays and Lung cancer. “We failed!”]

Page 6: Ockham’s Razor in Causal Discovery: A New Explanation

Correlation is not Causation

Manipulation of X can destroy the correlation of X with Y.

[Figure: Ash trays and Lung cancer. “We failed!”]
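The failure can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: a hypothetical hidden variable `smoking` drives both ash trays and cancer, and every probability below is made up for illustration. In the observational data ash trays predict cancer; once ash trays are set by policy, the difference disappears.

```python
import random

def simulate(n, intervene=None, seed=0):
    """Return (ash_tray, cancer) pairs; `intervene` forces the ash-tray value."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smoking = rng.random() < 0.3              # hypothetical hidden common cause
        if intervene is None:
            ash = rng.random() < (0.9 if smoking else 0.1)
        else:
            ash = intervene                       # policy sets ash trays directly
        cancer = rng.random() < (0.2 if smoking else 0.02)  # depends on smoking only
        rows.append((ash, cancer))
    return rows

def cancer_rate(rows, ash_value):
    """Fraction of cancer cases among rows with the given ash-tray value."""
    group = [c for a, c in rows if a == ash_value]
    return sum(group) / len(group)

# Observational data: ash trays are a good predictor of cancer.
observed = simulate(200_000)
gap_observed = cancer_rate(observed, True) - cancer_rate(observed, False)

# Manipulated data: forcing ash trays on or off leaves cancer rates unchanged.
banned = simulate(200_000, intervene=False, seed=1)
allowed = simulate(200_000, intervene=True, seed=2)
gap_manipulated = cancer_rate(allowed, True) - cancer_rate(banned, False)
```

Under these assumed numbers `gap_observed` is clearly positive while `gap_manipulated` is close to zero, which is the sense in which manipulation destroys the correlation.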

Page 7: Ockham’s Razor in Causal Discovery: A New Explanation

Standard Remedy

Randomized controlled study.

[Figure: Ash trays and Lung cancer.]

That’s what happens if you carry out the policy.

Page 8: Ockham’s Razor in Causal Discovery: A New Explanation

Infeasibility

• Expense

• Morality

[Figure: Lead and IQ.]

“Let me force a few thousand children to eat lead.”

Page 9: Ockham’s Razor in Causal Discovery: A New Explanation

Infeasibility

• Expense

• Morality

[Figure: Lead and IQ.]

“Just joking!”

Page 10: Ockham’s Razor in Causal Discovery: A New Explanation

Ironic Alliance

[Figure: Lead and IQ.]

Industry: “Ha! You will never prove that lead affects IQ…”

Page 11: Ockham’s Razor in Causal Discovery: A New Explanation

Ironic Alliance

[Figure: Lead and IQ.]

“And you can’t throw my people out of work on a mere whim.”

Page 12: Ockham’s Razor in Causal Discovery: A New Explanation

Ironic Alliance

[Figure: Lead and IQ.]

“So I will keep on polluting, which will never settle the matter because it is not a randomized trial.”

Page 13: Ockham’s Razor in Causal Discovery: A New Explanation

II. Causes From Correlations

Page 14: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

Patterns of conditional correlation can imply unambiguous causal conclusions (Pearl, Spirtes, Glymour, Scheines, etc.).

“Eliminate protein C!”

Page 15: Ockham’s Razor in Causal Discovery: A New Explanation

Basic Idea

• Causation is a directed, acyclic network over variables.

• What makes a network causal is a relation of compatibility between networks and joint probability distributions.

[Figure: a network G over X, Y, Z and a joint distribution p over X, Y, Z, related by compatibility.]

Page 16: Ockham’s Razor in Causal Discovery: A New Explanation

Compatibility

Joint distribution p is compatible with directed, acyclic network G iff:

• Causal Markov Condition: each variable X is independent of its non-effects given its immediate causes.

• Faithfulness Condition: every conditional independence relation that holds in p is a consequence of the Causal Markov Condition.

[Figure: example network over V, W, X, Y, Z.]

Page 17: Ockham’s Razor in Causal Discovery: A New Explanation

Common Cause

• B yields info about C (Faithfulness);

• B yields no further info about C given A (Markov).

[Figure: A is a common cause of B and C.]

Page 18: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Chain

• B yields info about C (Faithfulness);

• B yields no further info about C given A (Markov).

[Figure: causal chain from B through A to C.]

Page 19: Ockham’s Razor in Causal Discovery: A New Explanation

Common Effect

• B yields no info about C (Markov);

• B yields extra info about C given A (Faithfulness).

[Figure: A is a common effect of B and C.]
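The contrast between the common cause and common effect patterns can be checked numerically. The sketch below assumes a linear model with unit-variance Gaussian noise (the coefficients are illustrative, not from the slides) and uses partial correlation as a stand-in for "info about C given A".

```python
import math
import random

def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)
n = 20_000

# Common cause: B <- A -> C
a1 = [rng.gauss(0, 1) for _ in range(n)]
b1 = [a + rng.gauss(0, 1) for a in a1]
c1 = [a + rng.gauss(0, 1) for a in a1]

# Common effect: B -> A <- C
b2 = [rng.gauss(0, 1) for _ in range(n)]
c2 = [rng.gauss(0, 1) for _ in range(n)]
a2 = [b + c + rng.gauss(0, 1) for b, c in zip(b2, c2)]

# Each pair is (marginal correlation of B and C, partial correlation given A).
common_cause = (corr(b1, c1), partial_corr(b1, c1, a1))
common_effect = (corr(b2, c2), partial_corr(b2, c2, a2))
```

In the common cause case the marginal correlation is sizeable and the partial correlation given A is near zero; in the common effect case the pattern is reversed, which is what makes common effects distinctive.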

Page 20: Ockham’s Razor in Causal Discovery: A New Explanation

Distinguishability

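[Figure: four networks over A, B, C. The common cause network and the two causal chains are indistinguishable; the common effect network is distinctive.]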

Page 21: Ockham’s Razor in Causal Discovery: A New Explanation

Immediate Connections

• There is an immediate causal connection between X and Y iff X is dependent on Y given every subset of variables not containing X and Y (Spirtes, Glymour and Scheines).

[Figure: X and Y directly connected: no intermediate conditioning set breaks the dependency. X and Y connected only through Z and W: some conditioning set breaks the dependency.]

Page 22: Ockham’s Razor in Causal Discovery: A New Explanation

Recovery of Skeleton

• Apply the preceding condition to recover every non-oriented immediate causal connection.

[Figure: the true network and the skeleton recovered from it.]
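The skeleton condition can be sketched in a few lines. The code assumes a perfect conditional independence oracle, which is an idealization; the function names and the hand-written oracle for a three-variable chain are hypothetical and only illustrate the definition.

```python
from itertools import combinations

def skeleton(variables, independent):
    """Keep edge X - Y iff no conditioning set (excluding X, Y) makes them independent."""
    edges = set()
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        separated = any(
            independent(x, y, frozenset(s))
            for k in range(len(others) + 1)
            for s in combinations(others, k)
        )
        if not separated:
            edges.add(frozenset((x, y)))
    return edges

# Hypothetical oracle for the chain X -> Y -> Z:
# the only independence is X from Z given a set containing Y.
def chain_oracle(x, y, cond):
    return {x, y} == {"X", "Z"} and "Y" in cond

found = skeleton(["X", "Y", "Z"], chain_oracle)
```

The search over all conditioning sets is exponential in the number of variables; it is written this way only to mirror the definition directly.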

Page 23: Ockham’s Razor in Causal Discovery: A New Explanation

Orientation of Skeleton

• Look for the distinctive pattern of common effects.

[Figure: a common effect identified in the skeleton, shown against the true network.]

Page 24: Ockham’s Razor in Causal Discovery: A New Explanation

Orientation of Skeleton

• Look for the distinctive pattern of common effects.

• Draw all deductive consequences of these orientations.

[Figure: the common effect is oriented first. A remaining node is not a common effect, so the orientation of its edge must be downward. Shown against the true network.]

Page 25: Ockham’s Razor in Causal Discovery: A New Explanation

Causation from Correlation

The following network is causally unambiguous if all variables are observed.

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

Page 26: Ockham’s Razor in Causal Discovery: A New Explanation

Causation from Correlation

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein, with one arrow in red.]

The red arrow is also immune to latent confounding causes.

Page 27: Ockham’s Razor in Causal Discovery: A New Explanation

Brave New World for Policy

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

Experimental (confounder-proof) conclusions from correlational data!

“Eliminate protein C!”

Page 28: Ockham’s Razor in Causal Discovery: A New Explanation

III. The Catch

Page 29: Ockham’s Razor in Causal Discovery: A New Explanation

Metaphysics vs. Inference

The above results all assume that the true statistical independence relations for p are given.

But they must be inferred from finite samples.

Sample → Inferred statistical dependencies → Causal conclusions

Page 30: Ockham’s Razor in Causal Discovery: A New Explanation

Problem of Induction

• Independence is indistinguishable from sufficiently small dependence at sample size n.

[Figure: data compatible with both independence and dependence.]
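To put a rough number on "sufficiently small", the sketch below assumes bivariate normal data and the standard Fisher z-test for a correlation at the two-sided 5% level; both choices are assumptions made for illustration. It returns the smallest sample correlation that the test would declare significant at sample size n. True dependencies well below this threshold will typically pass for independence.

```python
import math

def rejection_threshold(n, z_crit=1.96):
    """Smallest |r| rejected by a Fisher z-test of zero correlation at sample size n."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    return math.tanh(z_crit / math.sqrt(n - 3))

thresholds = {n: rejection_threshold(n) for n in (103, 10_003, 1_000_003)}
```

The threshold shrinks roughly like 1/√n but never reaches zero, so at every finite sample size some nonzero dependence remains invisible.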

Page 31: Ockham’s Razor in Causal Discovery: A New Explanation

Bridging the Inductive Gap

Assume conditional independence until the data show otherwise.

Ockham’s razor: assume no more causal complexity than necessary.

Page 32: Ockham’s Razor in Causal Discovery: A New Explanation

Inferential Instability

• No guarantee that small dependencies will not be detected later.

• Can have spectacular impact on prior causal conclusions.

Page 33: Ockham’s Razor in Causal Discovery: A New Explanation

Current Policy Analysis

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

“Eliminate protein C!”

Page 34: Ockham’s Razor in Causal Discovery: A New Explanation

As Sample Size Increases…

“Rescind that order!”

[Figure: the protein network with Protein D added through a weak connection.]

Page 35: Ockham’s Razor in Causal Discovery: A New Explanation

As Sample Size Increases Again…

“Eliminate protein C again!”

[Figure: the protein network with Protein D and Protein E added through weak connections.]

Page 36: Ockham’s Razor in Causal Discovery: A New Explanation

As Sample Size Increases Again…

[Figure: the protein network with Protein D and Protein E added through weak connections.]

Etc.

“Eliminate protein C again!”

Page 37: Ockham’s Razor in Causal Discovery: A New Explanation

Typical Applications

• Linear Causal Case: each variable X is a linear function of its parents and a normally distributed hidden variable called an “error term”. The error terms are mutually independent.

• Discrete Multinomial Case: each variable X takes on a finite range of values.

Page 38: Ockham’s Razor in Causal Discovery: A New Explanation

An Optimistic Concession

No unobserved latent confounding causes.

[Figure: Genetics, Smoking, Cancer.]

Page 39: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Flipping Theorem

No matter what a consistent causal discovery procedure has seen so far, there exists a pair G, p satisfying the above assumptions so that the current sample is arbitrarily likely in p and the procedure produces arbitrarily many opposite conclusions in p about an arbitrary causal arrow in G as sample size increases.

[Figure: a sequence of reversals: “Oops, I meant…”]

Page 40: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Flipping Theorem

Every consistent causal inference method is covered.

Therefore, multiple instability is an intrinsic feature of the causal discovery problem.

[Figure: a sequence of reversals: “Oops, I meant…”]

Page 41: Ockham’s Razor in Causal Discovery: A New Explanation

The Crooked Course

"Living in the midst of ignorance and considering themselves intelligent and enlightened, the senseless people go round and round, following crooked courses, just like the blind led by the blind." Katha Upanishad, I. ii. 5.

Page 42: Ockham’s Razor in Causal Discovery: A New Explanation

Extremist Reaction

• Since causal discovery cannot lead straight to the truth, it is not justified.

“I must remain silent.” “Therefore, I win.”

Page 43: Ockham’s Razor in Causal Discovery: A New Explanation

Moderate Reaction

Many explanations have been

offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science.

(Do We Really Know What Makes us Healthy?, NY Times Magazine, Sept. 16, 2007).

Page 44: Ockham’s Razor in Causal Discovery: A New Explanation

Skepticism Inverted

• Unavoidable retractions are justified because they are unavoidable.

• Avoidable retractions are not justified because they are avoidable.

• So the best possible methods for causal discovery are those that minimize causal retractions.

• The best possible means for finding the truth are justified.

Page 45: Ockham’s Razor in Causal Discovery: A New Explanation

Larger Proposal

• The same holds for Ockham’s razor in general when the aim is to find the true theory.

Page 46: Ockham’s Razor in Causal Discovery: A New Explanation

IV. Ockham’s Razor

Page 47: Ockham’s Razor in Causal Discovery: A New Explanation

Which Theory is Right?

???

Page 48: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Says:

Choose the Simplest!

Page 49: Ockham’s Razor in Causal Discovery: A New Explanation

But Why?

Gotcha!

Page 50: Ockham’s Razor in Causal Discovery: A New Explanation

Puzzle

• An indicator must be sensitive to what it indicates.

[Figure label: simple]

Page 51: Ockham’s Razor in Causal Discovery: A New Explanation

Puzzle

• An indicator must be sensitive to what it indicates.

[Figure label: complex]

Page 52: Ockham’s Razor in Causal Discovery: A New Explanation

Puzzle

• But Ockham’s razor always points at simplicity.

[Figure label: simple]

Page 53: Ockham’s Razor in Causal Discovery: A New Explanation

Puzzle

• But Ockham’s razor always points at simplicity.

[Figure label: complex]

Page 54: Ockham’s Razor in Causal Discovery: A New Explanation

Puzzle

• How can a broken compass help you find something unless you already know where it is?

[Figure label: complex]

Page 55: Ockham’s Razor in Causal Discovery: A New Explanation

Standard Accounts

1. Prior Simplicity Bias: Bayes, BIC, MDL, MML, etc.

2. Risk Minimization: SRM, AIC, cross-validation, etc.

Page 56: Ockham’s Razor in Causal Discovery: A New Explanation

1. Bayesian Account

• Ockham’s razor is a feature of one’s personal prior belief state.

• Short run: no objective connection with finding the truth (flipping theorem applies).

• Long run: converges to the truth, but other prior biases would also lead to convergence.

Page 57: Ockham’s Razor in Causal Discovery: A New Explanation

2. Risk Minimization Account

Risk minimization is about prediction rather than truth.

Urges using a false causal theory rather than the known true theory for predictive purposes.

Therefore, not suited to exact science or to practical policy applications.

Page 58: Ockham’s Razor in Causal Discovery: A New Explanation

V. A New Foundation for Ockham’s Razor

Page 59: Ockham’s Razor in Causal Discovery: A New Explanation

Connections to the Truth

• Short-run Reliability: too strong to be feasible when theory matters.

• Long-run Convergence: too weak to single out Ockham’s razor.

[Figure: paths between Simple and Complex for each notion.]

Page 60: Ockham’s Razor in Causal Discovery: A New Explanation

Middle Path

• Short-run Reliability: too strong to be feasible when theory matters.

• “Straightest” convergence: just right?

• Long-run Convergence: too weak to single out Ockham’s razor.

[Figure: paths between Simple and Complex for each notion.]

Page 61: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Problems

• Set K of infinite input sequences.

• Partition of K into alternative theories.

[Figure: K partitioned into T1, T2, T3.]

Page 62: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Methods

• Map finite input sequences to theories or to “?”.

[Figure: K partitioned into T1, T2, T3; a method maps the finite input sequence e to T3.]

Page 63: Ockham’s Razor in Causal Discovery: A New Explanation

Method Choice

[Figure: input history e1, e2, e3, e4 and the corresponding output history over T1, T2, T3.]

At each stage, the scientist can choose a new method (agreeing with past theory choices).

Page 64: Ockham’s Razor in Causal Discovery: A New Explanation

Aim: Converge to the Truth

[Figure: K partitioned into T1, T2, T3.]

Output sequence: T3 ? T2 ? T1 T1 T1 T1 T1 T1 T1 . . .

Page 65: Ockham’s Razor in Causal Discovery: A New Explanation

Retraction

• Choosing T and then not choosing T next.

[Figure: output T followed by T’ or by “?”.]
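The definition can be made concrete with a short helper. This is a sketch: the function name is hypothetical, and outputs are represented as strings with "?" for suspended judgment. It is applied to the start of the output sequence from the convergence slide.

```python
def retraction_times(outputs):
    """Stages at which the method drops a previously chosen theory.

    `outputs` is a list of theory names, with "?" meaning no theory is chosen.
    """
    return [
        i + 1
        for i, (prev, cur) in enumerate(zip(outputs, outputs[1:]))
        if prev != "?" and cur != prev
    ]

times = retraction_times(["T3", "?", "T2", "?", "T1", "T1", "T1"])
```

Moving from "?" to a theory is not a retraction, so this sequence retracts twice (T3 and then T2) before settling on T1.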

Page 66: Ockham’s Razor in Causal Discovery: A New Explanation

Aim: Eliminate Needless Retractions

Truth

Page 67: Ockham’s Razor in Causal Discovery: A New Explanation

Aim: Eliminate Needless Retractions

Truth

Page 68: Ockham’s Razor in Causal Discovery: A New Explanation

Aim: Eliminate Needless Delays to Retractions

theory

Page 69: Ockham’s Razor in Causal Discovery: A New Explanation

Aim: Eliminate Needless Delays to Retractions

[Figure: a theory together with the corollaries and applications built on it.]

Page 70: Ockham’s Razor in Causal Discovery: A New Explanation

Why Timed Retractions?

• Retraction minimization = generalized significance level.

• Retraction time minimization = generalized power.

Page 71: Ockham’s Razor in Causal Discovery: A New Explanation

Easy Retraction Time Comparisons

[Figure: output sequences of Method 1 and Method 2 over theories T1 to T4, compared by whether one method’s retractions are at least as many and at least as late as the other’s.]

Page 72: Ockham’s Razor in Causal Discovery: A New Explanation

Worst-case Retraction Time Bounds

[Figure: a family of output sequences over theories T1 to T4, all covered by the worst-case timed retraction bound (1, 2, ∞).]

Page 73: Ockham’s Razor in Causal Discovery: A New Explanation

Curve Fitting

• Data = open intervals around Y at rational values of X.

Page 74: Ockham’s Razor in Causal Discovery: A New Explanation

Curve Fitting

No effects:

Page 75: Ockham’s Razor in Causal Discovery: A New Explanation

Curve Fitting

First-order effect:

Page 76: Ockham’s Razor in Causal Discovery: A New Explanation

Curve Fitting

Second-order effect:

Page 77: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“There yet?” “Maybe.”


Page 81: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Violation

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“There yet?” “Maybe.”

Page 82: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Violation

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“I know you’re coming!”

Page 83: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Violation

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“Maybe.”

Page 84: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Violation

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“!!!”

“Hmm, it’s quite nice here…”

Page 85: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Violation

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“You’re back! Learned your lesson?”

Page 86: Ockham’s Razor in Causal Discovery: A New Explanation

Violator’s Path

[Figure: nested models Constant, Linear, Quadratic, Cubic.]

“See, you shouldn’t run ahead, even if you are right!”

Page 87: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Path

[Figure: nested models Constant, Linear, Quadratic, Cubic.]
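The two paths can be compared in a toy model. The following sketch rests on simplifying assumptions that are not in the slides: a theory is just the total number of effects (for curve fitting, the polynomial degree), the input stream marks with 1 each stage at which a new effect appears, and the violator is a hypothetical method that anticipates one unseen effect until its patience runs out.

```python
def ockham(e):
    """Conjecture exactly the effects seen so far (the simplest compatible theory)."""
    return sum(e)

def make_violator(patience):
    """Hypothetical violator: predicts one extra, unseen effect for `patience` stages."""
    def method(e):
        return sum(e) + 1 if len(e) < patience else sum(e)
    return method

def retractions(method, stream):
    """Count changes of conjecture as the method reads the stream stage by stage."""
    outputs = [method(stream[:t]) for t in range(len(stream) + 1)]
    return sum(1 for prev, cur in zip(outputs, outputs[1:]) if cur != prev)

# The demon withholds the effect until the violator has given up, then reveals it.
stream = [0, 0, 0, 0, 1, 0, 0]
ockham_count = retractions(ockham, stream)
violator_count = retractions(make_violator(patience=3), stream)
```

On this stream the Ockham method retracts once, when the effect appears, while the violator retracts twice: once when it falls back to the simpler theory and again when the effect arrives. The violator ends up right, but by a more crooked path.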

Page 88: Ockham’s Razor in Causal Discovery: A New Explanation

More General Argument Required

Cover case in which demon has branching paths (causal discovery)

Page 89: Ockham’s Razor in Causal Discovery: A New Explanation

More General Argument Required

Cover case in which scientist lags behind (using time as a cost)

Come on!

Page 90: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Effects


Page 92: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Effects

• May take arbitrarily long to discover.

• But can’t be taken back.


Page 99: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Theories

• True theory determined by which effects appear.

Page 100: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Complexity

More complex

Page 101: Ockham’s Razor in Causal Discovery: A New Explanation

Background Constraints

More complex

Page 102: Ockham’s Razor in Causal Discovery: A New Explanation

Background Constraints

More complex

Page 103: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham’s Razor

• Don’t select a theory unless it is uniquely simplest in light of experience.

Page 104: Ockham’s Razor in Causal Discovery: A New Explanation

Weak Ockham’s Razor

• Don’t select a theory unless it is among the simplest in light of experience.

Page 105: Ockham’s Razor in Causal Discovery: A New Explanation

Stalwartness

• Don’t retract your answer while it is uniquely simplest.


Page 107: Ockham’s Razor in Causal Discovery: A New Explanation

Timed Retraction Bounds

r(M, e, n) = the least timed retraction bound covering the total timed retractions of M along input streams of complexity n that extend e.

[Figure: bounds for M plotted against empirical complexity 0, 1, 2, 3, . . .]

Page 108: Ockham’s Razor in Causal Discovery: A New Explanation

Efficiency of Method M at e

• M converges to the truth no matter what;

• For each convergent M’ that agrees with M up to the end of e, and for each n: r(M, e, n) ≤ r(M’, e, n).

[Figure: bounds for M and M’ plotted against empirical complexity 0, 1, 2, 3, . . .]

Page 109: Ockham’s Razor in Causal Discovery: A New Explanation

M is Beaten at e

There exists a convergent M’ that agrees with M up to the end of e, such that:

• for each n, r(M, e, n) ≥ r(M’, e, n);

• there exists n such that r(M, e, n) > r(M’, e, n).

[Figure: bounds for M and M’ plotted against empirical complexity 0, 1, 2, 3, . . .]

Page 110: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham Efficiency Theorem

Let M be a solution. The following are equivalent:

• M is always strongly Ockham and stalwart;

• M is always efficient;

• M is never weakly beaten.

Page 111: Ockham’s Razor in Causal Discovery: A New Explanation

Example: Causal Inference

• Effects are conditional statistical dependence relations.

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {Y,W}

. . .

Page 112: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery = Ockham’s Razor

X Y Z W

Page 113: Ockham’s Razor in Causal Discovery: A New Explanation

Ockham’s Razor

X Y Z W

X dep Y | {Z}, {W}, {Z,W}

Page 114: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery = Ockham’s Razor

X Y Z W

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {Y,W}

Page 115: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery = Ockham’s Razor

X Y Z W

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Page 116: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery = Ockham’s Razor

X Y Z W

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Z dep W | {X}, {Y}, {X,Y}

Y dep W | {Z}, {X,Z}

Page 117: Ockham’s Razor in Causal Discovery: A New Explanation

Causal Discovery = Ockham’s Razor

X Y Z W

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Z dep W | {X}, {Y}, {X,Y}

Y dep W | {X}, {Z}, {X,Z}

Page 118: Ockham’s Razor in Causal Discovery: A New Explanation

VI. Simplicity Defined

Page 119: Ockham’s Razor in Causal Discovery: A New Explanation

Approach

• Empirical complexity reflects nested problems of induction posed by the problem.

• Hence, simplicity is problem-relative but topologically invariant.

Page 120: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Problems

• Set K of infinite input sequences.

• Partition Q of K into alternative theories.

[Figure: K partitioned into T1, T2, T3.]

Page 121: Ockham’s Razor in Causal Discovery: A New Explanation

Simplicity Concepts

A simplicity concept for (K, Q) is just a well-founded order < on a partition S of K with ascending chains of order type not exceeding omega such that:

1. Each element of S is included in some answer in Q.

2. Each downward union in (S, <) is closed.

3. Incomparable sets share no boundary point.

4. Each element of S is included in the boundary of its successor.

Page 122: Ockham’s Razor in Causal Discovery: A New Explanation

Empirical Complexity Defined

Let K|e denote the set of all possibilities compatible with observations e.

Let (S, <) be a simplicity concept for (K|e, Q).

Define c(w, e) = the length of the longest < path to the cell of S that contains w.

Define c(T, e) = the least c(w, e) such that T is true in w.
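For a finite order, both definitions reduce to a longest-path computation. The sketch below is illustrative only: it hard-codes a hypothetical simplicity order for polynomial laws, ignores the conditioning on e, and identifies a world w with the cell of S that contains it.

```python
from functools import lru_cache

# Hypothetical simplicity order for polynomial laws:
# each cell lists its immediate predecessors under <.
PREDECESSORS = {
    "constant": [],
    "linear": ["constant"],
    "quadratic": ["linear"],
    "cubic": ["quadratic"],
}

@lru_cache(maxsize=None)
def cell_complexity(cell):
    """c(w, e) for any w in `cell`: length of the longest <-path leading up to it."""
    preds = PREDECESSORS[cell]
    return 0 if not preds else 1 + max(cell_complexity(p) for p in preds)

def theory_complexity(cells_where_true):
    """c(T, e): the least complexity among the cells in which theory T is true."""
    return min(cell_complexity(c) for c in cells_where_true)

degree_at_most_two = theory_complexity(["constant", "linear", "quadratic"])
exactly_cubic = theory_complexity(["cubic"])
```

With this order the complexity of a cell coincides with the polynomial degree, which matches the "complexity = degree" application.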

Page 123: Ockham’s Razor in Causal Discovery: A New Explanation

Applications

• Polynomial laws: complexity = degree.

• Conservation laws: complexity = particle types − conserved quantities.

• Causal networks: complexity = number of logically independent conditional dependencies entailed by faithfulness.

Page 124: Ockham’s Razor in Causal Discovery: A New Explanation

General Ockham Efficiency Theorem

Let M be a solution. The following are equivalent:

• M is always strongly Ockham and stalwart;

• M is always efficient;

• M is never beaten.

Page 125: Ockham’s Razor in Causal Discovery: A New Explanation

Conclusions

• Causal truths are necessary for counterfactual predictions.

• Ockham’s razor is necessary for staying on the straightest path to the true theory but does not point at the true theory.

• No evasions or circles are required.

Page 126: Ockham’s Razor in Causal Discovery: A New Explanation

Future Directions

• Extension of unique efficiency theorem to stochastic model selection.

• Latent variables as Ockham conclusions.

• Degrees of retraction.

• Pooling of marginal Ockham conclusions.

• Retraction efficiency assessment of MDL, SRM.

Page 127: Ockham’s Razor in Causal Discovery: A New Explanation

Suggested Reading

• "Ockham’s Razor, Truth, and Information", in Handbook of the Philosophy of Information, J. van Benthem and P. Adriaans, eds., to appear.

• "Ockham’s Razor, Empirical Complexity, and Truth-finding Efficiency", Theoretical Computer Science, 383: 270-289, 2007.

Both available as pre-prints at: www.hss.cmu.edu/philosophy/faculty-kelly.php

Page 128: Ockham’s Razor in Causal Discovery: A New Explanation

1. Prior Simplicity Bias

The simple theory is more plausible now because it was more plausible yesterday.

Page 129: Ockham’s Razor in Causal Discovery: A New Explanation

More Subtle Version

Simple data are a miracle in the complex theory but not in the simple theory.

[Figure: theories P and C.]

Regularity: retrograde motion of Venus at solar conjunction.

“Has to be!”

Page 130: Ockham’s Razor in Causal Discovery: A New Explanation

However…

e would not be a miracle given P(θ).

“Why not this?”

[Figure: theories C and P.]

Page 131: Ockham’s Razor in Causal Discovery: A New Explanation

The Real Miracle

Ignorance about model: p(C) ≈ p(P);

+ Ignorance about parameter setting: p(P(θ) | P) ≈ p(P(θ’) | P);

= Knowledge about C vs. P(θ): p(P(θ)) << p(C).

[Figure: C and P, with P divided among many parameter settings θ.]

Lead into gold. Perpetual motion. Free lunch.

“Sounds good!”
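The arithmetic behind this slide can be spelled out in a small worked case. It assumes, purely for illustration, that the complex theory P has k equally weighted parameter settings θ_1, …, θ_k and that indifference over the two models is exact.

```latex
\begin{align*}
  p(C) &= p(P) = \tfrac{1}{2}
    && \text{(ignorance about model)} \\
  p\bigl(P(\theta_i) \mid P\bigr) &= \tfrac{1}{k}
    && \text{(ignorance about parameter setting)} \\
  p\bigl(P(\theta_i)\bigr) &= p\bigl(P(\theta_i) \mid P\bigr)\, p(P) = \tfrac{1}{2k}
    \;\ll\; \tfrac{1}{2} = p(C)
    && \text{for large } k .
\end{align*}
```

Two professions of ignorance thus combine into a strong prior preference for C over each particular version of P, which is the "miracle" being questioned.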

Page 132: Ockham’s Razor in Causal Discovery: A New Explanation

Standard Paradox of Indifference

Ignorance of red vs. not-red

+ Ignorance over not-red

= Knowledge about red vs. white.

Knognorance = all the privileges of knowledge with none of the responsibilities.

“Sounds good!”

Page 133: Ockham’s Razor in Causal Discovery: A New Explanation

The Ellsberg Paradox

1/3 ? ?

Page 134: Ockham’s Razor in Causal Discovery: A New Explanation

Human Preference

1/3 ? ?

a > b

a ∨ c < c ∨ b

Page 135: Ockham’s Razor in Causal Discovery: A New Explanation

Human View

1/3 ? ?

a > b

a ∨ c < c ∨ b

[Figure labels: knowledge, ignorance.]

Page 136: Ockham’s Razor in Causal Discovery: A New Explanation

Bayesian “Rationality”

1/3 ? ?

a > b

a ∨ c > c ∨ b

[Figure labels: knognorance throughout.]

Page 137: Ockham’s Razor in Causal Discovery: A New Explanation

In Any Event

The coherentist foundations of Bayesianism have nothing to do with short-run truth-conduciveness.

“Not so loud!”

Page 138: Ockham’s Razor in Causal Discovery: A New Explanation

Bayesian Convergence

• Too-simple theories get shot down…

[Figure: updated opinion over theories ordered by complexity.]

Page 139: Ockham’s Razor in Causal Discovery: A New Explanation

Bayesian Convergence

• Plausibility is transferred to the next-simplest theory…

[Figure: updated opinion over theories ordered by complexity. “Blam!” “Plink!”]


Page 142: Ockham’s Razor in Causal Discovery: A New Explanation

Bayesian Convergence

• The true theory is never shot down.

[Figure: updated opinion over theories ordered by complexity. “Blam!” “Zing!”]

Page 143: Ockham’s Razor in Causal Discovery: A New Explanation

Convergence

• But alternative strategies also converge: any theory choice in the short run is compatible with convergence in the long run.

Page 144: Ockham’s Razor in Causal Discovery: A New Explanation

Summary of Bayesian Approach

Prior-based explanations of Ockham’s razor are circular and based on a faulty model of ignorance.

Convergence-based explanations of Ockham’s razor fail to single out Ockham’s razor.

Page 145: Ockham’s Razor in Causal Discovery: A New Explanation

2. Risk Minimization

• Ockham’s razor minimizes expected distance of empirical estimates from the true value.

[Figure: target labeled Truth.]

Page 146: Ockham’s Razor in Causal Discovery: A New Explanation

Unconstrained Estimates

• Centered on truth but spread around it.

[Figure: unconstrained aim. “Pop! Pop! Pop! Pop!”]

Page 147: Ockham’s Razor in Causal Discovery: A New Explanation

Constrained Estimates

• Off-center but less spread.

[Figure: clamped aim at the target labeled Truth.]

Page 148: Ockham’s Razor in Causal Discovery: A New Explanation

Constrained Estimates

• Off-center but less spread.

• Overall improvement in expected distance from truth…

[Figure: clamped aim at the target labeled Truth. “Pop! Pop! Pop! Pop!”]
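The trade-off can be written down exactly for the simplest case. The sketch assumes squared-error loss, estimation of a single mean from n independent draws, and a "clamped" estimator that always reports a fixed value; the numbers are hypothetical.

```python
def mse_unconstrained(sigma, n):
    """Sample mean of n draws: unbiased, so its risk is the variance sigma^2 / n."""
    return sigma ** 2 / n

def mse_clamped(mu, clamp=0.0):
    """Always report `clamp`: zero variance, so its risk is the squared bias."""
    return (mu - clamp) ** 2

mu, sigma, n = 0.1, 1.0, 25   # hypothetical true value, noise level, sample size
clamped_wins = mse_clamped(mu) < mse_unconstrained(sigma, n)
```

With these numbers the clamped estimator has the lower expected squared distance from the truth, even though the theory it embodies (that the parameter is exactly 0) is false.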

Page 149: Ockham’s Razor in Causal Discovery: A New Explanation

Doesn’t Find True Theory

The theory that minimizes estimation risk can be quite false…

Four eyes!

Clamped aim

Page 150: Ockham’s Razor in Causal Discovery: A New Explanation

Makes Sense…

…when loss of an answer is similar in nearby distributions.

[Figure: loss plotted against similarity to p.]

“Close is good enough!”

Page 151: Ockham’s Razor in Causal Discovery: A New Explanation

But Not When Truth Matters

…i.e., when loss of an answer is discontinuous with similarity.

[Figure: loss plotted against similarity to p.]

“Close is no cigar!”