Post on 31-Mar-2015
‘what I am after’ from gR2002
Peter Green,
University of Bristol, UK
Why graphical models in R?
• Statistical modelling and analysis do not respect boundaries of model classes
• Software should encourage and support good practice - and graphical models are good practice!
• Data analysis - model-based
• R for ‘reference implementation’ of new methodology
• Open software
Questions
• Scope?
– Digram, MIM, CoCo, TETRAD, Hugin, BUGS?
– Determined by classes of model, or classes of algorithm?
• Market?
– Statistics researcher, statistics MSc, arbitrary Excel user?
• Delivery?
– R package(s), with C code?
[Figure: graphical models and related areas: Markov chains, contingency tables, spatial statistics, sufficiency, regression, covariance selection, statistical physics, genetics, AI]
Contents
• Hierarchical models
• Variable-length parameters
• Models with undirected edges
• Hidden Markov models
• Inference on structure
• Discrete graphical models/PES
• Grappa
Bayesian Hierarchical models
properly integrating out all sources of variation
Repeated measures on children's weights
• Children $i = 1, 2, \dots, k$ have their weights measured on $n_i$ occasions, $t_{ij}$, $j = 1, 2, \dots, n_i$, obtaining weights $y_{ij}$.
• Suppose that, for each child, we have a linear growth equation, with independent normal errors:
$y_{ij} \sim N(\alpha_i + \beta_i t_{ij}, \sigma^2)$
Repeated measures on children's weights, continued
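• Suppose that $(\alpha_i, \beta_i)$ vary across the population according to
$\alpha_i \sim N(\mu_\alpha, \sigma_\alpha^2)$, $\beta_i \sim N(\mu_\beta, \sigma_\beta^2)$
• A Bayesian completes the model by specifying priors on $(\mu_\alpha, \sigma_\alpha^2, \mu_\beta, \sigma_\beta^2, \sigma^2)$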
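The two-level structure above can be made concrete by simulating from it. The following is a minimal sketch, not the author's code: the function name `simulate_growth`, the equally spaced measurement times and all numerical values are assumptions chosen only for illustration, and the hyperparameters are fixed rather than given priors.

```python
import random
import statistics


def simulate_growth(k, n_i, mu_alpha, sigma_alpha, mu_beta, sigma_beta, sigma, rng):
    """Draw one data set from the hierarchical linear growth model.

    All sigma arguments are standard deviations (random.gauss takes a
    standard deviation, not a variance).
    """
    children = []
    for _ in range(k):
        alpha_i = rng.gauss(mu_alpha, sigma_alpha)  # child-specific intercept
        beta_i = rng.gauss(mu_beta, sigma_beta)     # child-specific slope
        # Hypothetical measurement times t_ij = 1, 2, ..., n_i
        times = [float(j) for j in range(1, n_i + 1)]
        # y_ij ~ N(alpha_i + beta_i * t_ij, sigma^2)
        weights = [rng.gauss(alpha_i + beta_i * t, sigma) for t in times]
        children.append({"alpha": alpha_i, "beta": beta_i, "t": times, "y": weights})
    return children


rng = random.Random(1)  # fixed seed so the run is reproducible
children = simulate_growth(k=200, n_i=5, mu_alpha=10.0, sigma_alpha=1.0,
                           mu_beta=2.0, sigma_beta=0.3, sigma=0.5, rng=rng)
mean_slope = statistics.mean(c["beta"] for c in children)
print(round(mean_slope, 2))
```

The average of the simulated slopes should sit close to the population value `mu_beta`, which is the sense in which the child-level parameters "vary across the population".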
Graph for children’s weights
[Figure: DAG with nodes $\{\alpha_i\}$, $\{\beta_i\}$, $\{t_{ij}\}$ and $\{y_{ij}\}$]
Measurement error
Explanatory variables X subject to error - we only observe U on most cases
Mixture modelling
DAG for a mixture model
$y \sim \sum_{j=1}^{k} w_j f(y \mid \theta_j)$
[Figure: DAG with nodes $k$, $w$, $\theta$ and $y$]
Mixture modelling
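DAG for a mixture model, with $z$
$y \sim \sum_{j=1}^{k} w_j f(y \mid \theta_j)$
$(y \mid z = j) \sim f(y \mid \theta_j)$
$p(z = j) = w_j$
[Figure: DAG with nodes $k$, $w$, $\theta$, $z$ and $y$; annotations: length=k, value set = {1,2,…,k}]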
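The equivalence between the mixture density and the version with $z$ can be checked by simulation: draw $z$ with $p(z = j) = w_j$, then draw $y$ from component $z$. This is a minimal sketch assuming normal components; the function name `sample_mixture`, the component parameters and the 0-based labels (0, ..., k-1 instead of 1, ..., k) are illustrative assumptions.

```python
import random


def sample_mixture(n, w, means, sds, rng):
    """Sample n values via z: p(z = j) = w[j], then y | z = j ~ N(means[j], sds[j]^2)."""
    k = len(w)
    zs, ys = [], []
    for _ in range(n):
        z = rng.choices(range(k), weights=w)[0]  # component label
        y = rng.gauss(means[z], sds[z])          # draw from the chosen component
        zs.append(z)
        ys.append(y)
    return zs, ys


rng = random.Random(42)
w = [0.3, 0.7]           # mixture weights (assumed values)
means = [0.0, 5.0]       # component means (assumed values)
sds = [1.0, 1.0]         # component standard deviations (assumed values)
zs, ys = sample_mixture(5000, w, means, sds, rng)

share_first = zs.count(0) / len(zs)   # should be near w[0]
overall_mean = sum(ys) / len(ys)      # should be near sum_j w[j] * means[j]
print(round(share_first, 2), round(overall_mean, 2))
```

Discarding `zs` leaves draws of $y$ from the marginal mixture, which is why adding $z$ to the DAG does not change the model for the data.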
Measurement error using mixture model for population
Modelling with undirected graphs
Directed acyclic graphs are a natural representation of the way we usually specify a statistical model - directionally:
• disease → symptom
• past → future
• parameters → data
• …..
However, sometimes (e.g. spatial models) there is no natural direction
Scottish lip cancer data
The rates of lip cancer in 56 counties in Scotland have been analysed by Clayton and Kaldor (1987) and Breslow and Clayton (1993)
(the analysis here is based on the example in the WinBugs manual)
Scottish lip cancer data (2)
The data include
• the observed and expected cases (expected numbers based on the population and its age and sex distribution in the county),
• a covariate measuring the percentage of the population engaged in agriculture, fishing, or forestry, and
• the "position" of each county expressed as a list of adjacent counties.
Scottish lip cancer data (3)
County  Obs cases  Exp cases  x (% in agric.)  SMR    Adjacent counties
1       9          1.4        16               652.2  5,9,11,19
2       39         8.7        16               450.3  7,10
...     ...        ...        ...              ...    ...
56      0          1.8        10               0.0    18,24,30,33,45,55
Model for lip cancer data (1) Graph
[Figure: DAG with nodes: observed counts, expected counts, covariate, regression coefficient, random spatial effects]
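Model for lip cancer data (2) Distributions
• Data: $O_i \sim \text{Poisson}(\mu_i)$
• Link function: $\log \mu_i = \log E_i + \alpha_0 + \alpha_1 x_i / 10 + b_i$
• Random spatial effects: $p(b_1, \dots, b_n \mid \tau) \propto \tau^{n/2} \exp\left(-\tau \sum_{i \sim j} (b_i - b_j)^2 / 4\right)$
• Priors: $\tau \sim \Gamma(r, d)$; $\alpha_0, \alpha_1 \sim \text{Uniform}$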
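To see how the link function and the spatial prior are evaluated, here is a minimal sketch on a toy map. It assumes the spatial density has the form $\tau^{n/2} \exp(-\tau \sum_{i \sim j} (b_i - b_j)^2 / 4)$ with the sum running over ordered neighbouring pairs (so each adjacency is counted twice); that convention, the function names and the three-region adjacency list are all assumptions made for illustration, not taken from the lip cancer data.

```python
import math


def log_mu(E_i, x_i, b_i, alpha0, alpha1):
    """Link function: log mu_i = log E_i + alpha0 + alpha1 * x_i / 10 + b_i."""
    return math.log(E_i) + alpha0 + alpha1 * x_i / 10 + b_i


def car_log_density(b, adj, tau):
    """Unnormalised log p(b | tau) for the spatial effects.

    adj[i] lists the neighbours of region i (0-based); the double loop
    visits every ordered neighbouring pair, hence the division by 4.
    """
    n = len(b)
    ss = sum((b[i] - b[j]) ** 2 for i in range(n) for j in adj[i])
    return 0.5 * n * math.log(tau) - tau * ss / 4


# Toy example: three regions in a row (hypothetical adjacency)
adj = [[1], [0, 2], [1]]
b = [0.0, 1.0, 3.0]
value = car_log_density(b, adj, tau=2.0)
print(value)

# With no covariate effect and no spatial effect, mu_i equals E_i
print(math.exp(log_mu(E_i=1.4, x_i=16, b_i=0.0, alpha0=0.0, alpha1=0.0)))
```

Neighbouring regions with similar values of $b$ contribute little to the sum, so the prior favours spatially smooth effects, more strongly as `tau` grows.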
Bugs code for lip cancer data
model {
    b[1:regions] ~ car.normal(adj[], weights[], num[], tau)
    b.mean <- mean(b[])
    for (i in 1 : regions) {
        O[i] ~ dpois(mu[i])
        log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i]
        SMRhat[i] <- 100 * mu[i] / E[i]
    }
    alpha1 ~ dnorm(0.0, 1.0E-5)
    alpha0 ~ dflat()
    tau ~ dgamma(r, d)
    sigma <- 1 / sqrt(tau)
}
Note: a declarative, rather than procedural, language
Bugs code for lip cancer data: each model statement matches one line of the code
• O[i] ~ dpois(mu[i]): $O_i \sim \text{Poisson}(\mu_i)$
• log(mu[i]) <- log(E[i]) + alpha0 + alpha1 * x[i] / 10 + b[i]: $\log \mu_i = \log E_i + \alpha_0 + \alpha_1 x_i / 10 + b_i$
• b[1:regions] ~ car.normal(adj[], weights[], num[], tau): $p(b_1, \dots, b_n \mid \tau) \propto \tau^{n/2} \exp\left(-\tau \sum_{i \sim j} (b_i - b_j)^2 / 4\right)$
• tau ~ dgamma(r, d): $\tau \sim \Gamma(r, d)$
WinBugs for lip cancer data
Dynamic traces for some parameters:
[Figure: trace plots of alpha1, tau and mu[1] against iteration, for iterations 16600 to 16950]
WinBugs for lip cancer data
Posterior densities for some parameters:
[Figure: posterior density plots of alpha1, mu[1] and tau, each from a sample of 7000]
Hidden Markov models
e.g. Hidden Markov chain (DLM, state space model)
[Figure: DAG with hidden nodes z0, z1, z2, z3, z4 and observed nodes y1, y2, y3, y4; further label: relative risk parameters]
Hidden Markov models
• Richardson & Green (2000) used a hidden Markov random field model for disease mapping
$y_i \sim \text{Poisson}(\lambda_{z_i} E_i)$
where $y_i$ is the observed incidence, $E_i$ the expected incidence, and $z$ the hidden MRF
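DAG for Potts-based Hidden Markov random field
[Figure: DAG; annotations: spatial fields, length=k]
Distributions for Potts-based Hidden Markov random field
$y_i \sim \text{Poisson}(\lambda_{z_i} E_i)$
$p(z \mid \psi, k) = \exp(\psi U(z) - \delta_k(\psi))$
$\psi \sim U\{0, 0.1, \dots, 1.0\}$
$k \sim U\{1, 2, \dots, 10\}$
$\lambda_1, \dots, \lambda_k \sim \Gamma(\alpha, \beta)$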
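A Potts prior on the hidden field can be explored by brute force on a very small lattice. The sketch below makes several assumptions that are not stated on the slide: $U(z)$ is taken to be the number of neighbouring pairs of sites with the same label, the interaction parameter is called `psi`, and the normalising constant is obtained by enumerating every configuration, which is only feasible for a handful of sites.

```python
import math
from itertools import product


def like_neighbour_pairs(z, edges):
    """U(z): number of neighbouring pairs (i, j) with z[i] == z[j] (assumed definition)."""
    return sum(1 for i, j in edges if z[i] == z[j])


def potts_probabilities(n_sites, k, edges, psi):
    """Exact Potts probabilities by enumerating all k ** n_sites labellings."""
    configs = list(product(range(k), repeat=n_sites))
    weights = [math.exp(psi * like_neighbour_pairs(z, edges)) for z in configs]
    total = sum(weights)  # normalising constant
    return {z: wgt / total for z, wgt in zip(configs, weights)}


# Hypothetical 2 x 2 lattice: sites 0 1 / 2 3, with horizontal and vertical neighbours
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
probs = potts_probabilities(n_sites=4, k=2, edges=edges, psi=0.8)

constant = probs[(0, 0, 0, 0)]       # all sites share a label, U = 4
checkerboard = probs[(0, 1, 1, 0)]   # no like-labelled neighbours, U = 0
print(constant / checkerboard)
```

With `psi` equal to 0 every labelling is equally likely; larger values favour configurations in which neighbouring sites share a label.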
Larynx cancer in females in France
[Figure: maps of SMRs $y_i / E_i$ and of posterior probabilities $p(z_i = 1 \mid y)$]
Ion channel signal restoration
Hodgson, JRSS(B), 1999
DAG for alternating renewal process model for ion channel data
[Figure: DAG with nodes: sojourn time parameters, binary signal, data]
Ion channel model choice
Hodgson and Green, Proc Roy Soc Lond A, 1999
Example: hidden continuous time models
O2 O1 C1 C2
O1 O2
C1 C2 C3
DAG for hidden CTMC model for ion channel data
[Figure: DAG with nodes: model indicator, transition rates, binary signal, data]
Ion channel model DAG
[Figure: DAG with nodes: levels & variances, model indicator, transition rates, hidden state, binary signal, data; shown again alongside the state diagram O1 O2 / C1 C2 C3]
Posterior model probabilities
O1 C1
O2 O1 C1
O2 O1 C1 C2
O1 C1 C2
.41
.12
.36
.10
Simultaneous inference on parameters and structure of a conditional independence (CI) graph:
Bayesian approach:
Place a prior on all graphs and a conjugate prior on the parameters (hyper-Markov laws, Dawid & Lauritzen), then use MCMC to update both graphs and parameters to simulate the posterior distribution
Graph moves
Giudici & Green (Biometrika, 1999) develop a Bayesian methodology for model selection in Gaussian models, assuming decomposability (= graph triangulated = no chordless cycles of length 4 or more)
[Figure: undirected graph on vertices 1 to 7]
Graph moves
We can traverse graph space by adding and deleting single edges
Some are OK, but others make the graph non-decomposable
[Figure: undirected graph on vertices 1 to 7]
Graph moves
Frydenberg & Lauritzen (1989) showed that all decomposable graphs are connected by single-edge moves
Can we test for maintaining decomposability before committing to making the change?
[Figure: undirected graph on vertices 1 to 7]
Deleting edges?
Deleting an edge maintains decomposability if and only if it is contained in exactly one clique of the current graph (Frydenberg & Lauritzen)
[Figure: undirected graph on vertices 1 to 7]
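The deletion test can be sketched directly from its statement: list the maximal cliques and count how many contain the edge. The code below is an illustration, not the authors' implementation. It assumes the example graph is the one whose cliques are 12, 267, 236 and 3456 (read off the junction tree shown later in the slides), and it finds cliques with a plain Bron-Kerbosch search, which is adequate only for small graphs; the function names are hypothetical.

```python
from itertools import combinations


def graph_from_cliques(cliques):
    """Build an adjacency dict (vertex -> set of neighbours) from a list of cliques."""
    adj = {}
    for clique in cliques:
        for v in clique:
            adj.setdefault(v, set())
        for a, b in combinations(clique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def deletion_keeps_decomposable(adj, a, b):
    """Edge (a, b) may be deleted iff it lies in exactly one clique of the current graph."""
    containing = [c for c in maximal_cliques(adj) if a in c and b in c]
    return len(containing) == 1


# Assumed example graph, reconstructed from the cliques 12, 267, 236, 3456
adj = graph_from_cliques([{1, 2}, {2, 6, 7}, {2, 3, 6}, {3, 4, 5, 6}])
print(deletion_keeps_decomposable(adj, 1, 2))  # edge lies in clique 12 only
print(deletion_keeps_decomposable(adj, 2, 6))  # edge lies in both 267 and 236
```

In this graph, deleting the edge between 2 and 6 would leave the cycle 2, 7, 6, 3 without a chord, which is why the test rejects it.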
Adding edges? (Giudici & Green)
Adding an edge (a,b) maintains decomposability if and only if either:
• a and b are in different connected components, or
• there exist sets R and T such that $a \cup R$ and $b \cup T$ are cliques and $R \cap T$ is a separator on the path in the junction tree between them
[Figure: undirected graph on vertices 1 to 7]
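For small graphs, the outcome of the addition test can be cross-checked by brute force: add the edge to a copy of the graph and test chordality directly. The sketch below does not implement the local junction-tree condition; it swaps in a global check that repeatedly removes a simplicial vertex (one whose neighbours form a clique), which succeeds exactly when the graph is decomposable. The example graph (cliques 12, 267, 236, 3456) and the function names are assumptions made for illustration.

```python
from itertools import combinations


def is_decomposable(adj):
    """True iff the graph is chordal, checked by eliminating simplicial vertices."""
    g = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    while g:
        simplicial = next(
            (v for v, nb in g.items()
             if all(b in g[a] for a, b in combinations(nb, 2))),
            None,
        )
        if simplicial is None:
            return False  # no simplicial vertex left: a chordless cycle exists
        for u in g[simplicial]:
            g[u].discard(simplicial)
        del g[simplicial]
    return True


def addition_keeps_decomposable(adj, a, b):
    """Brute-force check: add edge (a, b) to a copy and test decomposability."""
    g = {v: set(nb) for v, nb in adj.items()}
    g[a].add(b)
    g[b].add(a)
    return is_decomposable(g)


# Assumed example graph with cliques 12, 267, 236, 3456
adj = {
    1: {2},
    2: {1, 3, 6, 7},
    3: {2, 4, 5, 6},
    4: {3, 5, 6},
    5: {3, 4, 6},
    6: {2, 3, 4, 5, 7},
    7: {2, 6},
}
print(addition_keeps_decomposable(adj, 1, 7))  # creates the clique 127
print(addition_keeps_decomposable(adj, 1, 5))  # creates the chordless cycle 1, 2, 6, 5
```

The brute-force check rescans the whole graph on every call, whereas the clique and separator condition only inspects part of the junction tree, so the local test is the one suited to use inside an MCMC sampler.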
Once the test is complete, actually committing to adding or deleting the edge is little work
It makes only a (relatively) local change to the junction tree
[Figure: graph on vertices 1 to 7 with its junction tree, before and after the change. Before: cliques 12, 267, 236, 3456 and separators 2, 26, 36. After: cliques 127, 267, 236, 3456 and separators 27, 26, 36]
DNA forensics example (thanks to Julia Mortera)
• A blood stain is found at a crime scene
• A body is found somewhere else!
• There is a suspect
• DNA profiles on all three - crime scene sample is a ‘mixed trace’: is it a mix of the victim and the suspect?
DNA forensics in Hugin
GRAPPA
Grappa code for the mixed-trace forensic problem
vs('alleles',c('8','10','11','x'))
gene.freq<<-c(.184884,.134884,.233721,.446511)

founder('vmg'); founder('vpg')
genotype('vgt','vmg','vpg')

founder('smg'); founder('spg')
genotype('sgt','smg','spg')

query('T2eqv'); query('T1eqs')
by('target','T2eqv','T1eqs')
vs('target',c('SV','SU','UV','UU'))

select('T2mg','vmg','T2eqv')
select('T2pg','vpg','T2eqv')

select('T1mg','smg','T1eqs')
select('T1pg','spg','T1eqs')

genotype('T2gt','T2mg','T2pg')
genotype('T1gt','T1mg','T1pg')

mix('mix','T2gt','T1gt')

compile()
initcliqs()
trav()

prop.evid('vgt','8-10')
prop.evid('sgt','8-11')
prop.evid('mix','8-10-11')

pnmarg('target')
==>>
target=SV  target=SU   target=UV  target=UU
0.7278388  0.09543417  0.1485508  0.02817623
HSSS
Highly Structured Stochastic Systems (HSSS) is the name given to a modern strategy for building statistical models for challenging real-world problems, for computing with them, and for interpreting the resulting inferences.
Complexity is handled by working up from simple local assumptions in a coherent way, and that is the key to modelling, computation, inference and interpretation.
HSSS, cont’d
HSSS emphasises common ideas and structures, such as graphical, hierarchical and spatial models, and techniques, such as Markov chain Monte Carlo methods and local exact computation.
HSSS: new challenges for research
include
• developing diagnostic and analytic tools for model criticism;
• understanding sensitivity of models to local specifications;
• designing new MCMC algorithms;
• identifying limits of causal interpretation in networks representing observational studies;
• introducing nonparametric elements into graphical models;
• extending the theory and methodology to systems that develop over time.
Highly Structured Stochastic Systems book
• Graphical models and causality
– T Richardson/P Spirtes, S Lauritzen, P Dawid, R Dahlhaus/M Eichler
• Spatial statistics
– S Richardson, A Penttinen, H Rue/M Hurn/O Husby
• MCMC
– G Roberts, P Green, C Berzuini/W Gilks
Highly Structured Stochastic Systems book (ctd)
• Biological applications
– N Becker, S Heath, R Griffiths
• Beyond parametrics
– N Hjort, A O’Hagan
... with 30 discussants
editors: N Hjort, S Richardson & P Green
OUP (2003), to appear