When are Graphical Causal Models not Good Models? CAPITS 2008 Jan Lemeire September 12 th 2008.
Graphical Causal Models: Determining Causes from Observations
William Marsh
Risk Assessment and Decision Analysis (RADAR)
Computer Science
RADAR Group, Computer Science
Risk Assessment and Decision Analysis
Research areas: software engineering, safety, finance, legal
A new initiative in medical data analysis: DIADEM
Norman Fenton, Group leader
Martin Neil
http://www.dcs.qmul.ac.uk/researchgp/radar/
Outline
Graphical Causal Models
  Bayesian networks: prediction or diagnosis
  Causal induction: learning causes from data
  Causal effect estimation: strength of causal relationships from data
DIADEM project
Bayesian Nets
Detecting Asthma Exacerbations
Aim to assist early detection of asthma episodes in Paediatric A&E
Using only data already available electronically
Network created by: Experts, Data
Bayes’ Theorem
Joint probability:
$P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
Revised belief about A, given evidence B:
$P(A \mid B) \propto P(B \mid A)\,P(A)$
where $P(A)$ is the prior probability of A and $P(B \mid A)$ is the factor to update belief about A, given evidence B.
Bayes’ Theorem (Made Easy)
A person has a positive test result. How likely is it they are infected? 17%
Infection: yes, no
Test: pos, neg
False positive: P(T=pos|I=no) = 5%
Negligible false negative
Infection rate: P(I) = 1%
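The 17% figure can be checked directly from the numbers on this slide. The sketch below is only an illustration: the function name and the choice to treat "negligible false negative" as exactly zero are assumptions, not part of the talk.

```python
def posterior_infected(prior, false_positive_rate, false_negative_rate=0.0):
    """P(I=yes | T=pos) by Bayes' theorem for a binary infection and test."""
    true_positive_rate = 1.0 - false_negative_rate
    # Total probability of a positive test: infected or not infected.
    p_positive = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / p_positive


# Slide values: P(I) = 1%, P(T=pos | I=no) = 5%.
# Assumption: the false negative rate is taken as exactly 0.
p = posterior_infected(prior=0.01, false_positive_rate=0.05)
print(f"P(infected | positive test) = {p:.1%}")
```

Most positive results come from the much larger uninfected group, which is why the posterior stays below 20% even though the test looks accurate.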
Medical Uses of BNs
Diagnosis: differential diagnosis from symptoms
Prediction: likely outcome
Building a BN
  From expert knowledge: expert system
  From data: data mining
Beyond Bayesian Networks
Cause versus Association
Both represent the fever-infection association
‘Causal model’ has an arrow from cause to effect
Infection → Fever, or Fever → Infection?
Joint probability same:
$P(I, F) = P(F \mid I)\,P(I) = P(I \mid F)\,P(F)$
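A small numerical check of the point that both arrow directions encode the same joint distribution. All probabilities below are hypothetical, chosen only for illustration; nothing here comes from the talk's data.

```python
# Hypothetical parameters for the model Infection -> Fever.
P_I = 0.1                          # P(I = yes)
P_F_GIVEN_I = {1: 0.8, 0: 0.1}     # P(F = yes | I)


def bern(p, x):
    """Probability that a binary variable with P(yes) = p takes value x."""
    return p if x else 1.0 - p


# Joint built in the direction Infection -> Fever: P(F | I) P(I).
joint_if = {(i, f): bern(P_F_GIVEN_I[i], f) * bern(P_I, i)
            for i in (0, 1) for f in (0, 1)}

# Parameters of the reversed model Fever -> Infection, derived from the joint.
p_f = {f: joint_if[(0, f)] + joint_if[(1, f)] for f in (0, 1)}
p_i_given_f = {(i, f): joint_if[(i, f)] / p_f[f] for i in (0, 1) for f in (0, 1)}

# Joint built in the direction Fever -> Infection: P(I | F) P(F).
joint_fi = {(i, f): p_i_given_f[(i, f)] * p_f[f] for i in (0, 1) for f in (0, 1)}

same = all(abs(joint_if[k] - joint_fi[k]) < 1e-12 for k in joint_if)
print("Same joint distribution:", same)
```

With only two variables, observational data cannot separate the two graphs; the direction has to come from outside knowledge or from additional variables.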
Causal Induction
Discover causal relationships from data
Sometimes distinguishable
… different conditional independence
[Figure: two causal graphs over A, B, C that imply different conditional independences]
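To make "different conditional independence" concrete, the sketch below compares two structures exactly. The choice of a chain (A → B → C) and a collider (A → B ← C) is an assumption, since the arrows of the slide's own diagrams did not survive, and all probabilities are hypothetical.

```python
from itertools import product

TOL = 1e-12


def bern(p, x):
    return p if x else 1.0 - p


def marginal(joint, keep):
    """Sum the joint over all variables except those at index positions `keep`."""
    out = {}
    for abc, p in joint.items():
        key = tuple(abc[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def a_indep_c(joint):
    """True if P(A, C) = P(A) P(C)."""
    pac, pa, pc = marginal(joint, (0, 2)), marginal(joint, (0,)), marginal(joint, (2,))
    return all(abs(pac[(a, c)] - pa[(a,)] * pc[(c,)]) < TOL for a, c in pac)


def a_indep_c_given_b(joint):
    """True if P(A, B, C) P(B) = P(A, B) P(B, C) for every value."""
    pab, pbc, pb = marginal(joint, (0, 1)), marginal(joint, (1, 2)), marginal(joint, (1,))
    return all(abs(p * pb[(b,)] - pab[(a, b)] * pbc[(b, c)]) < TOL
               for (a, b, c), p in joint.items())


# Chain A -> B -> C (hypothetical numbers).
chain = {(a, b, c): bern(0.3, a) * bern({0: 0.2, 1: 0.9}[a], b) * bern({0: 0.1, 1: 0.8}[b], c)
         for a, b, c in product((0, 1), repeat=3)}

# Collider A -> B <- C (hypothetical numbers).
p_b = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}
collider = {(a, b, c): bern(0.3, a) * bern(0.6, c) * bern(p_b[(a, c)], b)
            for a, b, c in product((0, 1), repeat=3)}

for name, joint in (("chain", chain), ("collider", collider)):
    print(name, "| A indep C:", a_indep_c(joint),
          "| A indep C given B:", a_indep_c_given_b(joint))
```

The two structures leave opposite independence signatures in the data, which is what lets an induction algorithm tell them apart. A chain and a fork (A ← B → C) would share the same signature and remain indistinguishable.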
Causal Induction – Application
Discover causal relationships from data
Need lots of data
Applied to gene regulatory networks
  Data from micro-array experiments
  Recent explanation of limitations
Estimating Causal Effects
Suppose A is a cause of B
What is the causal effect? Is it p(B | A)?
A → B
Benefits of Sports?
Is there a relationship between sport and exam success?
Data available
‘Intelligence’ correlate
Is this the correct test? P(exam=pass|sport) > P(exam=pass|no-sport)
[Diagram: intelligence, sport, exam result]
Benefits of Sports?
When we condition on ‘sport’, the probability for ‘exam result’ changes and the probability for ‘intelligence’ changes
What if I decide to start sport?
observe: p(pass|sport) = 73% > p(pass|no-sport) = 67%
[Diagram: intelligence, sport, exam result]
Intervention v Observation
Causal effect differs from conditional probability
Mostly interested in consequence of change
Causal effects can be measured by a Randomised Control Trial
Causal effect of sport on exam results not identifiable
change: P(pass|do(sport)) < P(pass|do(no sport))
[Diagram: intelligence, sport, exam result]
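The gap between observing and intervening can be illustrated with a toy model in which intelligence influences both sport and exam results. Every number below is hypothetical and will not reproduce the 73% and 67% quoted earlier. The sketch also assumes the full model, including intelligence, is known. In the slide's setting intelligence is unobserved, which is exactly why the effect is not identifiable there.

```python
# Hypothetical model: intelligence -> sport, intelligence -> exam, sport -> exam.
P_INTELLIGENT = 0.5
P_SPORT_GIVEN_I = {True: 0.8, False: 0.2}
# P(pass | sport, intelligent); key is (sport, intelligent).
P_PASS = {(True, True): 0.85, (False, True): 0.90,
          (True, False): 0.40, (False, False): 0.50}


def p_i(i):
    return P_INTELLIGENT if i else 1.0 - P_INTELLIGENT


def p_sport(s, i):
    p = P_SPORT_GIVEN_I[i]
    return p if s else 1.0 - p


def p_pass_observed(sport):
    """P(pass | sport): conditioning also shifts the belief about intelligence."""
    num = sum(P_PASS[(sport, i)] * p_sport(sport, i) * p_i(i) for i in (True, False))
    den = sum(p_sport(sport, i) * p_i(i) for i in (True, False))
    return num / den


def p_pass_do(sport):
    """P(pass | do(sport)): intelligence keeps its prior distribution."""
    return sum(P_PASS[(sport, i)] * p_i(i) for i in (True, False))


print("observe:", p_pass_observed(True), "vs", p_pass_observed(False))
print("change: ", p_pass_do(True), "vs", p_pass_do(False))
```

In this toy model, people who do sport pass more often, yet making someone do sport lowers their chance of passing, because the observed advantage comes entirely from intelligence.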
Benefit of Sport
New observable variable ‘attendance at lectures’
Causal effect of sport on exam results now identifiable
[Diagram: sport (S), attendance (A), exam result (E), intelligence]
$P(E \mid do(S)) = \sum_{A} P(A \mid S) \sum_{S'} P(E \mid S', A)\, P(S')$
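The adjustment formula on this slide can be checked numerically. The sketch below assumes a particular graph that the transcript does not spell out: intelligence → sport, intelligence → exam result, sport → attendance, attendance → exam result, with no direct sport → exam arrow. All probabilities are hypothetical. The estimate uses only the joint distribution of S, A and E, as if intelligence had never been measured.

```python
from itertools import product

# Hypothetical parameters of the assumed graph (binary variables, 1 = yes/pass).
P_I = 0.5
P_S_GIVEN_I = {0: 0.2, 1: 0.8}
P_A_GIVEN_S = {0: 0.9, 1: 0.6}
P_E_GIVEN_A_I = {(0, 0): 0.3, (1, 0): 0.5, (0, 1): 0.7, (1, 1): 0.9}  # key (a, i)


def bern(p, x):
    return p if x else 1.0 - p


# Observational joint P(S, A, E) with intelligence summed out.
observed = {}
for i, s, a, e in product((0, 1), repeat=4):
    p = (bern(P_I, i) * bern(P_S_GIVEN_I[i], s)
         * bern(P_A_GIVEN_S[s], a) * bern(P_E_GIVEN_A_I[(a, i)], e))
    observed[(s, a, e)] = observed.get((s, a, e), 0.0) + p


def prob(**fixed):
    """Probability of the given values of s, a, e under the observed joint."""
    return sum(p for (s, a, e), p in observed.items()
               if all({"s": s, "a": a, "e": e}[k] == v for k, v in fixed.items()))


def p_pass_do_sport(s):
    """P(E=1 | do(S=s)) from observed data only, using the slide's formula."""
    total = 0.0
    for a in (0, 1):
        p_a_given_s = prob(s=s, a=a) / prob(s=s)
        inner = sum(prob(s=s2, a=a, e=1) / prob(s=s2, a=a) * prob(s=s2)
                    for s2 in (0, 1))
        total += p_a_given_s * inner
    return total


def p_pass_do_sport_truth(s):
    """Same quantity computed from the full model, intelligence included."""
    return sum(bern(P_A_GIVEN_S[s], a)
               * sum(P_E_GIVEN_A_I[(a, i)] * bern(P_I, i) for i in (0, 1))
               for a in (0, 1))


for s in (1, 0):
    print(f"do(S={s}): formula {p_pass_do_sport(s):.3f}, truth {p_pass_do_sport_truth(s):.3f}")
```

Under these assumptions the formula recovers the interventional probabilities without ever measuring intelligence, because attendance carries the whole effect of sport on the exam result.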
Estimating Causal Effects
Rules to convert causal to statistical questions
  Generalises e.g. stratification, potential outcomes
  Assumptions: a causal model
  Some assumptions may be testable
Causal model
  Some variables observed, others not measured
  Some causal effects identifiable
Challenges
  Causal models for complex applications
  Statistical implications
Example Application
Royal London trauma service
  Criteria for activation of the trauma team
  Aim to prevent unnecessary trauma team calls
  Extensive records of trauma patient outcomes
US study of 1495 admissions proposed new ‘triage’ criteria
  Significant decrease in overtriage: 51% to 29%
  Insignificant increase in undertriage: 1% to 3%
  None of the patients undertriaged by new criteria died
Does this show safety of new criteria?
DIADEM Project
Digital Economy in Healthcare
Data Information and Analysis for clinical DEcision Making
EPSRC Digital Economy Cluster
Partnership between solution providers and clinical data analysis problem holders
Summarise unsolved data analysis needs, in relation to the analysis techniques available
Join the DIADEM cluster
Cluster Activities and Outcomes
Engage stakeholders and build a community:
  Creation of a community web-site and forum
  Meetings with potential ‘problem holders’
  Workshops
A road map: data and information
Follow-up proposal
A self-sustaining website – health data analytics
Summary
Bayesian networks
  Prediction and diagnosis
Causal induction
  Identify (some) causal relationships from (lots of) data
Causal effects
  Experimental results from … non-experimental data … assumptions (causal model)
Join the DIADEM cluster