
Page 1: Incorporating Prior Information in Causal Discovery

Rodney O'Donnell, Jahangir Alam, Bin Han, Kevin Korb and Ann Nicholson

Page 2: Outline

• Methods for learning causal models
  – Data mining, elicitation, hybrid approach
• Algorithms for learning causal models
  – Constraint based
  – Metric based (including our CaMML)
• Incorporating priors into CaMML
  – 5 different types of priors
• Experimental design
• Experimental results

Page 3: Learning Causal Bayesian Networks

Elicitation:
• Requires domain knowledge
• Expensive and time-consuming
• Partial knowledge may be insufficient

Data mining:
• Requires a large dataset
• Sometimes the algorithms are “stupid” (no prior knowledge → no common sense)
• Data only tells part of the story

Page 4: A Hybrid Approach

• Combine the domain knowledge and the facts learned from data
  – Elicitation + data mining → causal BN
• Minimize the expert’s effort in domain knowledge elicitation
• Enhance the efficiency of the learning process
  – Reduce / bias the search space

Page 5: Objectives

• Generate different prior specification methods
• Comparatively study the influence of priors on BN structure learning
• Future: apply the methods to the Heart Disease modeling project

Page 6: Causal Learning Algorithms

• Constraint based
  – Pearl & Verma’s algorithm, PC
• Metric based
  – MML, MDL, BIC, BDe, K2, K2+MWST, GES, CaMML
• Priors on structure
  – Optional vs. required
  – Hard vs. soft

Page 7: Priors on Structure

                 Required   Optional   Hard   Soft
K2 (BNT)           yes                 yes
K2+MWST (BNT)                 yes      yes
GES (Tetrad)                  yes      yes
PC (Tetrad)                   yes      yes
CaMML                         yes      yes    yes

Page 8: CaMML

• MML metric based
• MML vs. MDL
  – MML can be derived from Bayes’ theorem (Wallace)
  – MDL is a non-Bayesian method
• Search: MCMC sampling through TOM space
  – TOM = DAG + total ordering
  – TOM is finer than DAG
  – e.g. the DAG B ← A → C has two TOMs (orderings ABC and ACB), while the chain A → B → C has only one TOM (ABC)
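The TOM/DAG distinction can be sketched in a few lines of Python (a hypothetical helper for illustration, not CaMML code): a TOM is a DAG together with a total ordering consistent with its arcs, so counting TOMs means counting linear extensions.

```python
from itertools import permutations

def toms(nodes, arcs):
    """Enumerate the TOMs of a DAG: all total orderings of the nodes
    in which every arc points from an earlier node to a later one."""
    return [order for order in permutations(nodes)
            if all(order.index(a) < order.index(b) for a, b in arcs)]

# B <- A -> C: two TOMs (orderings ABC and ACB)
fork = toms("ABC", [("A", "B"), ("A", "C")])
# A -> B -> C: only one TOM (ordering ABC)
chain = toms("ABC", [("A", "B"), ("B", "C")])
```

The brute-force enumeration is exponential in the number of variables; it is only meant to make the "TOM is finer than DAG" point concrete.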

Page 9: Priors in CaMML: Arcs

Experts may provide priors on pairwise relations:
1. Directed arcs
   – e.g. {A→B 0.7} (soft)
   – e.g. {A→D 1.0} (hard)
2. Undirected arcs
   – e.g. {A─C 0.6} (soft)
3. Combinations, e.g. {A→B 0.7; B→A 0.8; A─C 0.6}
   – Represented by two adjacency matrices: a directed-arc matrix with entries (A,B) = 0.7 and (B,A) = 0.8, and a symmetric undirected-arc matrix with entries (A,C) = (C,A) = 0.6

Page 10: Priors in CaMML: Arcs (continued)

• MML cost for each pair, scoring a candidate network against the expert priors {A→B 0.7; B→A 0.8; A─C 0.6}:
  – AB: −log(0.7) − log(1 − 0.8)
  – AC: −log(1 − 0.6)
  – BC: −log(default arc prior)
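The per-pair costs above can be sketched as follows. This is a minimal illustration, assuming the candidate network contains A→B only (no B→A, no A─C) and a hypothetical default arc prior of 0.5; it is not CaMML's actual implementation.

```python
import math

def pair_cost(arc_present, prior):
    """Cost in nits (-log probability) of stating one arc's status:
    -log(p) if the candidate contains the arc, -log(1 - p) if not."""
    return -math.log(prior if arc_present else 1 - prior)

# Expert priors: A->B 0.7, B->A 0.8, A-C 0.6.
# Candidate network contains A->B but neither B->A nor A-C.
cost_AB = pair_cost(True, 0.7) + pair_cost(False, 0.8)  # -log(0.7) - log(0.2)
cost_AC = pair_cost(False, 0.6)                         # -log(0.4)
cost_BC = pair_cost(True, 0.5)   # B-C has no expert prior: default 0.5
total_cost = cost_AB + cost_AC + cost_BC
```

Note that a confident prior that the candidate agrees with (p close to 1, arc present) costs almost nothing, while disagreeing with it is expensive.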

Page 11: Priors in CaMML: Tiers

• Expert can provide a prior on an additional pairwise relation
• Tier: temporal ordering of variables
  – e.g. Tier {A>>C 0.6; B>>C 0.8}
  – For a TOM ordering the variables A, C, B (A precedes C, but C precedes B):
    I_MML(h) = −log(0.6) − log(1 − 0.8)
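The tier cost for a candidate ordering could be computed as below (hypothetical helper names, a sketch rather than CaMML's API): each tier statement "X >> Y" contributes −log(p) when the ordering agrees and −log(1 − p) when it disagrees.

```python
import math

def tier_cost(order, tier_priors):
    """MML cost (nits) of a total ordering under tier priors.

    tier_priors maps (X, Y) -> p, the expert's probability that
    X precedes Y ("X >> Y") in the true ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(-math.log(p if pos[x] < pos[y] else 1 - p)
               for (x, y), p in tier_priors.items())

# Tier {A>>C 0.6; B>>C 0.8}; candidate TOM orders the variables A, C, B:
cost = tier_cost("ACB", {("A", "C"): 0.6, ("B", "C"): 0.8})
# = -log(0.6) - log(1 - 0.8): A>>C is satisfied, B>>C is violated
```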

Page 12: Priors in CaMML: edPrior

• Expert specifies a single network, plus a confidence
  – e.g. EdConf = 0.7
• Prior is based on edit distance (ED) from this network
  – e.g. for a candidate network at ED = 2 from the expert’s network:
    I_MML(h) = 2 × (log(0.7) − log(1 − 0.7))
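The edit-distance prior translates into a cost that is linear in ED. A minimal sketch, assuming each edit away from the expert's network adds log(conf / (1 − conf)) to the message length:

```python
import math

def ed_prior_cost(edit_distance, conf):
    """edPrior sketch: cost (nits) grows linearly with the edit
    distance from the expert-specified network, at a per-edit rate
    set by the expert's confidence."""
    return edit_distance * (math.log(conf) - math.log(1 - conf))

cost = ed_prior_cost(2, 0.7)  # candidate at ED = 2 with EdConf = 0.7
```

With conf = 0.5 the per-edit rate is zero, i.e. an indifferent expert imposes no structural penalty; higher confidence penalizes each edit more.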

Page 13: Priors in CaMML: KTPrior

• Again, expert specifies a single network, plus a confidence
  – e.g. KTConf = 0.7
• Prior is based on the Kendall-tau edit distance from this network
  – KTEditDist = KT + undirected ED
• Example: expert-specified DAG/TOM ABC vs. a candidate TOM ACB
  – The B-C order in the expert TOM disagrees with the candidate TOM
  – KTEditDist = KT (1) + undirected ED (2) = 3
  – I_MML(h) = 3 × (log(0.7) − log(1 − 0.7))
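The slide's KTEditDist can be sketched as below (hypothetical helpers, with the undirected edit distance supplied as a number rather than computed from the two networks): the Kendall-tau part counts variable pairs whose relative order differs between the expert's TOM and the candidate TOM.

```python
import math
from itertools import combinations

def kendall_tau(expert_order, candidate_order):
    """Number of variable pairs ordered differently in the two TOMs."""
    pos = {v: i for i, v in enumerate(candidate_order)}
    return sum(1 for x, y in combinations(expert_order, 2)
               if pos[x] > pos[y])

def kt_prior_cost(expert_order, candidate_order, undirected_ed, conf):
    """KTPrior sketch: KTEditDist = KT + undirected ED, each unit
    costing log(conf / (1 - conf)) nits."""
    dist = kendall_tau(expert_order, candidate_order) + undirected_ed
    return dist * (math.log(conf) - math.log(1 - conf))

# Expert TOM ABC vs candidate TOM ACB (KT = 1), undirected ED = 2:
cost = kt_prior_cost("ABC", "ACB", 2, 0.7)  # 3 * log(0.7 / 0.3)
```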

Page 14: Experiment 1: Design

• Prior
  – weak, strong
  – correct, incorrect
• Size of dataset
  – 100, 1000, 10k and 100k
  – For each size we randomly generate 30 datasets
• Algorithms
  – CaMML
  – K2 (BNT)
  – K2+MWST (BNT)
  – GES (Tetrad)
  – PC (Tetrad)
• Models: AsiaNet, “Model6” (an artificial model)

Page 15: Models: AsiaNet and “Model6” (network diagrams)

Page 16: Experimental Design

(Design space: priors × algorithms × sample size)

Page 17: Experiment Design: Evaluation

• ED: difference between structures
• KL: difference between distributions
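For reference, the KL metric between two discrete distributions can be sketched as below (the standard definition, not the paper's evaluation code, which compares full joint distributions of the learned and true networks):

```python
import math

def kl_divergence(p, q):
    """KL divergence (nits) between discrete distributions p and q,
    given as aligned lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

identical = kl_divergence([0.5, 0.5], [0.5, 0.5])  # 0.0
different = kl_divergence([0.5, 0.5], [0.9, 0.1])  # positive
```

A learned network matching the true distribution scores 0; larger values indicate a worse fit.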

Page 18: Model6, 1000 samples (results plot)

Page 19: Model6, 10k samples (results plot)

Page 20: AsiaNet, 1000 samples (results plot)

Page 21: Experiment 1: Results

• With default priors, CaMML is comparable to or outperforms the other algorithms
• With full tiers:
  – There is no statistically significant difference between CaMML and K2
  – GES is slightly behind; PC performs poorly
• CaMML is the only method allowing soft priors:
  – With a prior of 0.7, CaMML is comparable to the other algorithms given full tiers
  – With a stronger prior, CaMML performs better
• CaMML performs significantly better with the expert’s priors than with uniform priors

Page 22: Incorporating Prior Information in Causal Discovery Rodney O'Donnell, Jahangir Alam, Bin Han, Kevin Korb and Ann Nicholson.

Expertiment 2:Is CaMML well calibrated?

• Biased prior– Expert’s confidence may not be consistent

with the expert’s skill

e.g, expert 0.99 sure but wrong about a connection

– Biased hard prior– Soft prior and data will eventually overcome

the bad prior

Page 23: Is CaMML well calibrated?

• Question: does CaMML reward well-calibrated experts?
• Experimental design
  – Objective measure: how good is the proposed structure? (ED: 0-14)
  – Subjective measure: the expert’s confidence (0.5 to 0.9999)
  – How good is the learned structure? (KL distance)

Page 24: Effect of expert skill and confidence on quality of learned model

(Plot: expert skill runs from better to worse; annotations mark where overconfidence is penalized, where justified confidence is rewarded, and where the unconfident expert sits)

Page 25: Experiment 2: Results

• CaMML improves on the elicited structure and approaches the true structure
• CaMML improves when the expert’s confidence matches the expert’s skill

Page 26: Conclusions

• CaMML is comparable to other algorithms when given equivalent prior knowledge
• CaMML can incorporate more flexible prior knowledge
• CaMML’s results improve when the expert is skillful or well calibrated

Page 27: Thanks