Junction Tree Algorithm
Brookes Vision Reading Group
Outline
• Graphical Models
  – What are Graphical Models?
  – Conditional Independence
  – Inference
• Junction Tree Algorithm
  – Moralizing a graph
  – Junction Tree Property
  – Creating a junction tree
  – Inference using junction tree algorithm
Don’t we all know …
• P(A) = 1 if and only if A is certain
• P(A or B) = P(A) + P(B) if A and B are mutually exclusive
• P(A,B) = P(A|B)P(B) = P(B|A)P(A)
• Conditional Independence
  – A is conditionally independent of C given B
  – P(A|B,C) = P(A|B)
Graphical Models
Compact graphical representation of joint probability.
[Figure: A → B]
P(A,B) = P(A)P(B|A)
A ‘causes’ B
Graphical Models
Compact graphical representation of joint probability.
[Figure: B → A]
P(A,B) = P(B)P(A|B)
B ‘causes’ A
Graphical Models
Compact graphical representation of joint probability.
[Figure: undirected edge between A and B]
P(A,B)
A Simple Example
P(A,B,C) = P(A)P(B,C | A)
= P(A) P(B|A) P(C|B,A)
C is conditionally independent of A given B
= P(A) P(B|A) P(C|B)
Graphical Representation ???
Bayesian Network
Directed Graphical Model
$P(U) = \prod_i P(V_i \mid Pa(V_i))$
[Figure: A → B → C]
P(A,B,C) = P(A) P(B | A) P(C | B)
Markov Random Fields
Undirected Graphical Model
[Figure: undirected chain A - B - C, with cliques AB and BC and separator B]
$P(U) = \prod P(\text{Clique}) / \prod P(\text{Separator})$
P(A,B,C) = P(A,B) P(B,C) / P(B)
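The two factorizations of the chain can be checked numerically. The sketch below builds the joint from the directed form P(A) P(B|A) P(C|B) and confirms that it matches P(A,B) P(B,C) / P(B). The probability tables and the helper `marginal` are made up for illustration; they are not from the slides.

```python
from itertools import product

# Hypothetical CPTs for a binary chain A -> B -> C (numbers are invented).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p_c_given_b[b][c]

# Directed factorization: P(A,B,C) = P(A) P(B|A) P(C|B)
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}

def marginal(keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

p_ab, p_bc, p_b = marginal((0, 1)), marginal((1, 2)), marginal((1,))

# Undirected form: P(A,B,C) = P(A,B) P(B,C) / P(B)
max_gap = max(
    abs(p - p_ab[(a, b)] * p_bc[(b, c)] / p_b[(b,)])
    for (a, b, c), p in joint.items()
)
print("largest difference between the two forms:", max_gap)
```

The difference should be zero up to floating-point rounding, because C is conditionally independent of A given B in this chain.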
Bayesian Networks
• A is conditionally independent of B given C
• Bayes ball cannot reach A from B
Markov Random Fields
• A, B, C - (set of) nodes
• C is conditionally independent of A given B
• All paths from A to C go through B
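The path condition above can be tested with a graph search: remove the nodes in B and see whether anything in C is still reachable from A. This is a minimal sketch; the function name `separated` and the adjacency-dict representation are assumptions made for the example.

```python
from collections import deque

def separated(adj, a_nodes, c_nodes, b_nodes):
    """Return True if every path from a_nodes to c_nodes passes through b_nodes."""
    blocked = set(b_nodes)
    frontier = deque(n for n in a_nodes if n not in blocked)
    seen = set(frontier)
    while frontier:
        node = frontier.popleft()
        if node in c_nodes:
            return False          # reached C without crossing B
        for nxt in adj[node]:
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                frontier.append(nxt)
    return True

# Undirected chain A - B - C as an adjacency dict.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
given_b = separated(chain, {"A"}, {"C"}, {"B"})    # B blocks the only path
given_none = separated(chain, {"A"}, {"C"}, set())  # nothing blocks the path
print(given_b, given_none)
```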
Markov Random Fields
A node is conditionally independent of all others given its neighbours.
MAP Estimation
$c^*, s^*, r^*, w^* = \arg\max_{c,s,r,w} P(C=c, S=s, R=r, W=w)$
Computing Marginals
$P(W=w) = \sum_{c,s,r} P(C=c, S=s, R=r, W=w)$
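Both queries can be answered by brute-force enumeration on a small network, which is the baseline the junction tree algorithm improves on. The sketch assumes the structure C → S, C → R, (S, R) → W, which is consistent with the cliques CSR and SRW used later but is not stated in the slides; the CPT numbers are invented.

```python
from itertools import product

# Hypothetical CPTs (all numbers invented); each entry is P(variable = 1 | parents).
p_c = {1: 0.5, 0: 0.5}                      # P(C = c)
p_s1_given_c = {1: 0.1, 0: 0.5}             # P(S = 1 | C = c)
p_r1_given_c = {1: 0.8, 0: 0.2}             # P(R = 1 | C = c)
p_w1_given_sr = {(1, 1): 0.99, (1, 0): 0.9, (0, 1): 0.9, (0, 0): 0.0}

def bern(p_one, value):
    """Probability of a binary value given the probability of value 1."""
    return p_one if value == 1 else 1.0 - p_one

def joint(c, s, r, w):
    """P(C=c, S=s, R=r, W=w) as the product of the four CPTs."""
    return (p_c[c] * bern(p_s1_given_c[c], s) * bern(p_r1_given_c[c], r)
            * bern(p_w1_given_sr[(s, r)], w))

# MAP estimation: the single most probable joint assignment (c, s, r, w).
map_state = max(product((0, 1), repeat=4), key=lambda x: joint(*x))

# Marginal: sum the joint over c, s, r for each value of w.
p_w = {
    w: sum(joint(c, s, r, w) for c, s, r in product((0, 1), repeat=3))
    for w in (0, 1)
}
print(map_state, p_w)
```

Enumeration touches every joint assignment, so its cost grows exponentially with the number of variables.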
Aim
• To perform exact inference efficiently
• Transform the graph into an appropriate data structure
• Ensure joint probability remains the same
• Ensure exact marginals can be computed
Junction Tree Algorithm
• Converts Bayes Net into an undirected tree
  – Joint probability remains unchanged
  – Exact marginals can be computed
• Why ???
  – Uniform treatment of Bayes Net and MRF
  – Efficient inference is possible for undirected trees
Let us recap .. Shall we
$P(U) = \prod_i P(V_i \mid Pa(V_i))$
$= \prod_i a(V_i, Pa(V_i))$, where each factor $a$ is a potential
Let’s convert this to an undirected graphical model
[Figure: directed graph over nodes A, B, C, D]
Let us recap .. Shall we
[Figure: the graph over A, B, C, D with edge directions dropped]
Wait a second … something is wrong here.
The cliques of this graph are inconsistent with the original one.
Node D just lost a parent.
Solution
[Figure: the graph over A, B, C, D with the parents of D connected]
Ensure that a node and its parents are part of the same clique
Marry the parents for a happy family
Now you can make the graph undirected
Solution
[Figure: the moralized, undirected graph over A, B, C, D]
But we have added extra edges, haven’t we ???
Yes. A few conditional independences are lost.
Moralizing a graph
• Marry all unconnected parents
• Drop the edge directions
Ensure joint probability remains the same.
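The two moralization steps can be sketched as follows. The input format (a dict mapping each node to its list of parents) and the diamond-shaped example A → B, A → C, B → D, C → D are assumptions made for illustration; the slides show the graph only as a figure.

```python
from itertools import combinations

def moralize(parents):
    """Return the moral graph as {node: set of neighbours}.

    `parents` maps every node of the DAG to the list of its parents.
    """
    adj = {v: set() for v in parents}
    for child, pa in parents.items():
        for p in pa:                      # drop directions: keep each parent-child edge
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):  # marry parents that share this child
            adj[p].add(q)
            adj[q].add(p)
    return adj

# Assumed example: D has two unconnected parents, B and C.
diamond = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
moral = moralize(diamond)
print(sorted(moral["B"]))  # B gains C as a neighbour
```

After moralization every node sits in a clique together with all of its parents, which is what lets each conditional probability be assigned to a single clique.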
Moralizing a graph
[Figure: network over C, S, R, W; its moral graph has cliques CSR and SRW with separator SR]
Clique Potentials a(Ci)
Separator Potentials a(Si)
Initialize
a(Ci) = 1
a(Si) = 1
Moralizing a graph
[Figure: network over C, S, R, W with cliques CSR and SRW and separator SR]
Choose one node Vi
Find one clique Ci containing Vi and Pa(Vi)
Multiply a(Vi,Pa(Vi)) to a(Ci)
Repeat for all Vi
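The initialization loop above can be sketched for the cliques CSR and SRW. The parent structure (C → S, C → R, (S, R) → W) and all CPT numbers are assumptions for illustration. The final check compares the product of clique potentials divided by the separator potentials against the original product of CPTs.

```python
from itertools import product

VARS = ("C", "S", "R", "W")
parents = {"C": (), "S": ("C",), "R": ("C",), "W": ("S", "R")}  # assumed structure
p_one = {  # invented numbers: P(node = 1 | parent values)
    "C": {(): 0.5},
    "S": {(1,): 0.1, (0,): 0.5},
    "R": {(1,): 0.8, (0,): 0.2},
    "W": {(1, 1): 0.99, (1, 0): 0.9, (0, 1): 0.9, (0, 0): 0.0},
}

def cpt(node, x):
    """a(Vi, Pa(Vi)): P(node = x[node] | parents as assigned in x)."""
    p = p_one[node][tuple(x[q] for q in parents[node])]
    return p if x[node] == 1 else 1.0 - p

cliques = [("C", "S", "R"), ("S", "R", "W")]
separators = [("S", "R")]

def ones(scope):
    """Table over binary variables in `scope`, initialized to 1."""
    return {vals: 1.0 for vals in product((0, 1), repeat=len(scope))}

clique_pot = {c: ones(c) for c in cliques}      # a(Ci) = 1
sep_pot = {s: ones(s) for s in separators}      # a(Si) = 1

for node in VARS:                               # repeat for all Vi
    family = {node, *parents[node]}
    home = next(c for c in cliques if family <= set(c))  # one clique holding Vi and Pa(Vi)
    for vals in clique_pot[home]:
        clique_pot[home][vals] *= cpt(node, dict(zip(home, vals)))

def from_potentials(x):
    num = 1.0
    for c in cliques:
        num *= clique_pot[c][tuple(x[v] for v in c)]
    for s in separators:
        num /= sep_pot[s][tuple(x[v] for v in s)]
    return num

max_gap = 0.0
for vals in product((0, 1), repeat=4):
    x = dict(zip(VARS, vals))
    direct = 1.0
    for v in VARS:
        direct *= cpt(v, x)
    max_gap = max(max_gap, abs(direct - from_potentials(x)))
print("largest difference from the original joint:", max_gap)
```

Each CPT is multiplied into exactly one clique; multiplying it into two cliques would count that factor twice.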
Moralizing a graph
$P(U) = \prod_i a(C_i) / \prod_i a(S_i)$
Now we can form a tree with all the cliques we chose.
That was easy. We’re ready to marginalize.
OR ARE WE ???
A few more examples …
[Figure: two undirected graphs over A, B, C, D]
A few more examples …
[Figure: clique trees for the two graphs. First graph: cliques AB, BD, CD, AC. Second graph: cliques AB, BCD.]
Inconsistency in C
Clearly we’re missing something here
Junction Tree Property
In a junction tree, all cliques on the unique path between cliques Ci and Cj must contain $C_i \cap C_j$
So what we want is a junction tree, right ???
Q. Do all graphs have a junction tree ???
A. NO
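The junction tree property can be tested directly from its definition: for every pair of cliques, walk the unique tree path between them and check that each clique on it contains the intersection. This is a sketch; the function names and the particular chain chosen for the four-clique example (AC, AB, BD, CD) are assumptions for illustration.

```python
from itertools import combinations

def tree_path(tree, start, goal):
    """Return the unique path between two nodes of a tree given as an adjacency dict."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in tree[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None

def has_junction_tree_property(tree):
    """Check that every clique on the path between Ci and Cj contains Ci ∩ Cj."""
    for ci, cj in combinations(tree, 2):
        shared = set(ci) & set(cj)
        if any(not shared <= set(ck) for ck in tree_path(tree, ci, cj)):
            return False
    return True

# Cliques are written as strings of node names, e.g. "BCD" = {B, C, D}.
four_cycle = {"AC": ["AB"], "AB": ["AC", "BD"], "BD": ["AB", "CD"], "CD": ["BD"]}
with_triangle = {"AB": ["BCD"], "BCD": ["AB"]}
cycle_ok = has_junction_tree_property(four_cycle)      # C is missing from AB and BD
triangle_ok = has_junction_tree_property(with_triangle)
print(cycle_ok, triangle_ok)
```

In the first tree, C appears in the cliques at both ends but not in between, so the two copies of C can end up with different marginals.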
Decomposable Graphs
Undirected graph G = (V,E)
Decomposition (A,B,C)
[Figure: node sets A and B separated by C]
• $V = A \cup B \cup C$
• All paths between A and B go through C
• C is a complete subset of V
Decomposable Graphs
• A, B and/or C can be empty
• A, B are non-empty in a proper decomposition
Decomposable Graphs
• G is decomposable if and only if
  – G is complete, OR
  – It possesses a proper decomposition (A,B,C) such that
    · $G_{A \cup C}$ is decomposable
    · $G_{B \cup C}$ is decomposable
Decomposable Graphs
[Figure: two graphs over A, B, C, D. Left: not decomposable. Right: decomposable.]
Decomposable Graphs
[Figure: two graphs over A, B, C, D, E. Left: not decomposable. Right: decomposable.]
An Important Theorem
Theorem: A graph G has a junction tree if and only if it is decomposable.
Proof on white board.
OK. So how do I convert my graph into a decomposable one?
Time for more definitions
• Chord of a cycle
  – An edge between two non-successive nodes of the cycle
• Chordless cycle
  – A cycle of four or more nodes with no chords
• Triangulated graph
  – A graph with no chordless cycles
Another Important Theorem
Theorem: A graph G is decomposable if and only if it is triangulated.
Proof on white board.
Alright. So add edges to triangulate the graph.
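Triangulating a Graph
[Figure: graph over A, B, C, D triangulated by adding an edge between B and C; cliques ABC and BCD with separator BC]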
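One common way to triangulate is node elimination: visit the nodes in some order, connect the remaining neighbours of each node, then remove it. The slides do not say which method they use, so this is a sketch of that standard technique. The elimination order and the four-cycle edge list (A-B, B-D, D-C, C-A) are assumptions.

```python
from itertools import combinations

def triangulate(adj, order):
    """Eliminate nodes in `order`; return (graph with fill-in edges, maximal cliques)."""
    work = {v: set(nb) for v, nb in adj.items()}    # shrinks as nodes are eliminated
    filled = {v: set(nb) for v, nb in adj.items()}  # original edges plus fill-ins
    cliques = []
    for v in order:
        nbrs = work[v]
        for p, q in combinations(nbrs, 2):          # connect all remaining neighbours
            work[p].add(q)
            work[q].add(p)
            filled[p].add(q)
            filled[q].add(p)
        candidate = frozenset(nbrs | {v})
        if not any(candidate <= c for c in cliques):  # keep only maximal cliques
            cliques.append(candidate)
        for n in nbrs:
            work[n].discard(v)
        del work[v]
    return filled, cliques

# Assumed four-cycle A - B - D - C - A, which has no chord.
cycle = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
filled, cliques = triangulate(cycle, ["A", "B", "C", "D"])
print([sorted(c) for c in cliques])
```

With this order the fill-in edge is B-C, giving the cliques ABC and BCD. A different order (for example starting at B) would add A-D instead, which illustrates that triangulation is not unique.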
Some Notes on Triangulation
Can we ensure the joint probability remains unchanged ??
Of course. Every clique found after moralization is still contained in some clique of the triangulated graph.
Use the previous algorithm for initializing potentials.
Aren’t more conditional independences lost ???
Yes. :-(
Some Notes on Triangulation
Is triangulation unique ??
No.
Okay then. Let’s find the best triangulation.
Sadly, that’s NP-hard.
Hang on. We still have a graph. We were promised a tree.
Alright. Let’s form a tree then.
Creating a Junction Tree
[Figure: graph over A, B, C, D, E with cliques ABD, BCD, CDE]
ABD - CDE - BCD: Not a junction tree
ABD - BCD - CDE: Junction tree
Clearly, we’re still missing something here.
Yet Another Theorem
Theorem: A junction tree is a maximum-weight spanning tree (MST) of the clique graph, where the weight of each edge is the cardinality of its separator.
Proof on white board.
Alright. So let’s form an MST.
Forming an MST
[Figure: clique graph with edges ABD-BCD (weight 2), BCD-CDE (weight 2), ABD-CDE (weight 1)]
Forming an MST
[Figure: the resulting tree keeps the two weight-2 edges, ABD-BCD and BCD-CDE]
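The tree above can be found with Kruskal's algorithm run on the clique graph, taking the heaviest separators first. This is a minimal sketch that assumes the cliques come from an already triangulated graph; the function name `junction_tree_edges` is hypothetical.

```python
from itertools import combinations

def junction_tree_edges(cliques):
    """Maximum-weight spanning tree of the clique graph (Kruskal, heaviest edge first)."""
    candidates = sorted(
        ((len(set(a) & set(b)), a, b) for a, b in combinations(cliques, 2)),
        key=lambda edge: edge[0],
        reverse=True,
    )
    parent = {c: c for c in cliques}      # union-find forest over cliques

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    edges = []
    for weight, a, b in candidates:
        if weight == 0:
            continue                      # cliques sharing no node are not linked
        root_a, root_b = find(a), find(b)
        if root_a != root_b:              # adding this edge does not create a cycle
            parent[root_a] = root_b
            edges.append((a, b, weight))
    return edges

edges = junction_tree_edges(["ABD", "BCD", "CDE"])
print(edges)
```

The weight-1 edge between ABD and CDE is rejected because both cliques are already connected through BCD.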
A Quick Recap
[Figure: Asia network over nodes A, S, T, L, B, E, X, D]
1. Marry unconnected parents.
2. Drop directionality of edges.
3. Triangulate the graph.
4. Find the MST clique tree. Voila .. The junction tree.
[Figure: junction tree with cliques SBL, BLE, DBE, XE, TLE, AT]
Whew. Done !!
But where are these marginals we were talking about ?
Inference using JTA
• Modify potentials
• Ensure joint probability is consistent
• Ensure consistency between neighbouring cliques
• Ensure clique potentials = clique marginals
• Ensure separator potentials = separator marginals
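Inference using JTA
[Figure: cliques V and W joined by separator S]
1. $a^*(S) = \sum_{V \setminus S} a(V)$
2. $a^*(W) = a(W)\, a^*(S) / a(S)$
3. $a^{**}(S) = \sum_{W \setminus S} a^*(W)$
4. $a^*(V) = a(V)\, a^{**}(S) / a^*(S)$
Consistency:
$\sum_{V \setminus S} a^*(V) = a^{**}(S) = \sum_{W \setminus S} a^*(W)$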
Inference using JTA
Joint probability remains the same:
$a^*(V)\, a^*(W) / a^{**}(S) = a(V)\, a(W) / a(S)$
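The four update steps can be traced on the smallest possible junction tree: cliques V = {A, B} and W = {B, C} with separator S = {B}. The sketch assumes a chain A → B → C with invented CPTs, and initial potentials a(V) = P(A) P(B|A), a(W) = P(C|B), a(S) = 1. The variable names mirror the starred symbols in the steps.

```python
from itertools import product

# Hypothetical CPTs for a binary chain A -> B -> C (numbers are invented).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

# Initial potentials for V = {A, B}, W = {B, C}, S = {B}.
a_v = {(a, b): p_a[a] * p_b_given_a[a][b] for a, b in product((0, 1), repeat=2)}
a_w = {(b, c): p_c_given_b[b][c] for b, c in product((0, 1), repeat=2)}
a_s = {b: 1.0 for b in (0, 1)}

# 1. a*(S) = sum over V \ S of a(V)
a_s1 = {b: sum(a_v[(a, b)] for a in (0, 1)) for b in (0, 1)}
# 2. a*(W) = a(W) a*(S) / a(S)
a_w1 = {(b, c): a_w[(b, c)] * a_s1[b] / a_s[b] for (b, c) in a_w}
# 3. a**(S) = sum over W \ S of a*(W)
a_s2 = {b: sum(a_w1[(b, c)] for c in (0, 1)) for b in (0, 1)}
# 4. a*(V) = a(V) a**(S) / a*(S)
a_v1 = {(a, b): a_v[(a, b)] * a_s2[b] / a_s1[b] for (a, b) in a_v}

# Reference marginals computed by brute force from the full joint.
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}
p_ab = {(a, b): sum(joint[(a, b, c)] for c in (0, 1)) for a, b in a_v}
p_bc = {(b, c): sum(joint[(a, b, c)] for a in (0, 1)) for b, c in a_w}

gap_v = max(abs(a_v1[k] - p_ab[k]) for k in p_ab)
gap_w = max(abs(a_w1[k] - p_bc[k]) for k in p_bc)
print(gap_v, gap_w)
```

After the two passes the clique potentials equal the clique marginals P(A,B) and P(B,C). In this particular example step 4 leaves a(V) unchanged, because a(V) already held the marginal P(A,B) before any message was passed.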
One Last Theorem
Theorem: After JTA, Potentials = Marginals
Proof on white board.
(Then we can all go home)
Happy Marginalizing