Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

download Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

of 38

Transcript of Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    1/38

    Lecture 10: Junction Tree Algorithm

    CSCI 780 - Machine Learning

    Andrew Rosenberg

    March 4, 2010

    1 / 1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    2/38

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    3/38

    Example of Graphical Model.

    x0x1

    x2

    x3

    x4

    x5

    (xi, i) =m(xi, i)

    m(i)

    3 / 1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    4/38

    Efficient Computation of Marginals

    a

    b

    c

    d

    e

    Pass messages (small tables) around the graph.

    The messages will be small functions that propagatepotentials around an undirected graphical model.

    The inference technique is the Junction Tree Algorithm

    4 / 1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    5/38

    Junction Tree Algorithm

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    5 / 1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    6/38

    Junction Tree Algorithm

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    6 / 1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    7/38

    M li i

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    8/38

    Moralization

    x0x1

    x2

    x3

    x4

    x5

    p(x0)p

    (x1|x0)p

    (x2|x0)p

    (x3|x1)p

    (x4|x2)p

    (x5|x1, x

    4)

    x0

    x1

    x2

    x3

    x4

    x5

    1

    Z(x0, x1)(x0, x2)(x1, x3)(x2, x4)(x1, x4, x5)

    8 / 1

    A th M li ti E l

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    9/38

    Another Moralization Example

    9 / 1

    J ti T Al ith

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    10/38

    Junction Tree Algorithm

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    10/1

    I t d E id

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    11/38

    Introduce Evidence

    Given a moral graph. Identify what is observed xE.

    Reduce Probability functions since we know that some valuesare fixed.

    So only keep probability functions over remaining nodes xF

    p(X) = 1Z(x1, x2, x3, x4)(x4, x5)(x4, x6)(x4, x7)

    p(XF|XE) 1

    Z(x1, x2, x3 = x3, x4 = x4)(x4 = x4, x5)(x4 = x4, x6)(x4 = x4, x7)

    1

    Z(x1, x2)(x5)(x6)(x7)

    Replace potential functions with slices.4 .1

    .12 .15

    Requires a different normalization term

    11/1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    12/38

    Junction Tree Algorithm

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    13/38

    Junction Tree Algorithm

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    13/1

    Junction Trees

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    14/38

    Junction Trees

    Ultimately we want to construct Junction Trees.

    Each node is a clique of variables in a modal graph.Edges connect cliquesThere is a unique path from a node to the rootBetween each connected clique node there is a separator node.Separators contain intersections of variables

    14/1

    Triangulation

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    15/38

    Triangulation

    How do we construct a Junction Tree?

    Need to guarantee that a Junction Graph, made up of thecliques and separators of an undirected graph is a Tree

    To do this, we make triangles (3 node cliques)

    I.e. eliminate any (chordless) cycles of 4 or more nodes.

    15/1

    Triangulation

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    16/38

    Triangulation

    There are potentially many choices for which edge to add.

    Wed like to keep the largest clique size small Small tables.

    However, Triangulation that minimizes the largest clique sizeis NP-complete.

    Suboptimal triangulation is acceptable (poly-time) andgenerally doesnt introduce too many extra dimensions.

    16/1

    Triangulation

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    17/38

    Triangulation

    There are potentially many choices for which edge to add.

    Wed like to keep the largest clique size small Small tables.

    However, Triangulation that minimizes the largest clique sizeis NP-complete.

    Suboptimal triangulation is acceptable (poly-time) andgenerally doesnt introduce too many extra dimensions.

    17/1

    Triangulation Examples

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    18/38

    Triangulation Examples

    18/1

    Triangulation Examples

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    19/38

    Triangulation Examples

    19/1

    Triangulation Examples

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    20/38

    Triangulation Examples

    20/1

    Triangulation Examples

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    21/38

    Triangulation Examples

    21/1

    Triangulation Examples

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    22/38

    g p

    22/1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    23/38

    Junction Tree Algorithm

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    24/38

    g

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    24/1

    Constructing Junction Trees

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    25/38

    g

    Multiple trees can be constructed from the same graph.

    Junction Trees must satisfy the Running IntersectionProperty

    On the path connecting clique node V to clique node W, allother clique nodes must include the nodes in V W.

    25/1

    Constructing Junction Trees

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    26/38

    g

    Multiple trees can be constructed from the same graph.

    Junction Trees must satisfy the Running IntersectionProperty

    On the path connecting clique node V to clique node W, allother clique nodes must include the nodes in V W.

    Also: A Junction Tree has the largest total separator cardinality.

    |(B,C)| + |(C,D)| > |(C,D)| + |(D)|

    26/1

    Forming a Junction Tree

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    27/38

    g

    Given a set of cliques, must connect the nodes, such that theRunning Intersection Property holds.

    A valid Junction Tree maximizes the cardinality of theseparators

    Kruskals algorithm1 Initialize a tree with no edges

    2 Calculate the size of separators between all pairs

    3 Connect two cliques with the largest separator cardinality(that doesnt create a loop

    4 Repeat 3 until all nodes are connected.

    27/1

    Junction Tree Algorithm

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    28/38

    Junction Tree Algorithm

    Moralization

    Introduce Evidence

    Triangulate

    Construct Junction Tree

    Propagate Probabilities

    28/1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    29/38

    Conversion from Directed Graph

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    30/38

    Example Conversion.

    p(X) =1

    Z

    C(xC)

    S(xS)

    We can represent CPTs as clique and separator potential functions(with a normalization term).

    30/1

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    31/38

    Junction Tree Algorithm

  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    32/38

    Send a message from each clique to its separator.

    The message is what the clique thinks the marginal should be

    Normalize the clique by each message from its separatorssuch that it agrees.

    If they agree, finished!

    V\S

    V = S = p(S) = S =

    W\S

    W

    If not....

    32/1

    Junction Tree Algorithm Message Passing

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    33/38

    S =XV\S

    V

    W =

    SS

    W

    V = V

    S =XW\S

    W

    V =

    SS

    V

    W =

    W

    X

    V\S

    V =

    X

    V\S

    SS

    V

    =SS

    X

    V\S

    V

    = S =X

    W\S

    W

    33/1

    Junction Tree algorithm

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    34/38

    When convergence is reached clique potentials aremarginals and separator potentials are submarginals

    p(x) never changes because of this message passing.

    p(x) = 1ZVWS

    = 1ZV

    S

    SW

    S

    = 1ZVW

    S

    This implies that, so long as p(x) is correctly represented in thepotential functions, the junction tree algorithm can be used to

    make each potential correspond to an appropriate marginal withoutchanging the overall probability function.

    34/1

    Converting From DAG to Junction Tree

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    35/38

    Convert the Directed Graph to the Junction Tree

    Initialize separators to 1, and the clique tables to CPTs.

    p(X) = p(x1)p(x2|x1)p(x3|x2)p(x4|x3)p(x5|x3)p(x6|x5)p(x7|x5)

    p(X) =1

    Z

    QC(Xc)Q

    S(XS)

    =1

    1

    p(x1, x2)p(x3|x2)p(x4|x3)p(x5|x3)p(x6|x5)p(x7|x5)

    1 1 1 1 1

    Run JTA to set potential functions to marginals. 35/1

    Evidence in Junction Tree

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    36/38

    Initialize the same way.

    AB = p(A,B)BC = p(C|B)

    B = 1

    Update with a slice instead of the whole table.

    B =

    X

    A

    AB(A = 1) =X

    A

    p(A,B)(A = 1) = p(A = 1,B)

    BC =

    BB

    BC =p(A = 1,B)

    1p(C|B) = p(A = 1,B,C)

    AB = AB = p(A = 1,B)

    Conditionals

    p(B,C|A = 1) =BCPB,C

    BC

    36/1

    Efficiency of The Junction Tree Algorithm

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    37/38

    All steps are efficient.

    1 Construct CPTs

    Polynomial in # of data points

    2 Moralization

    Polynomial in # of nodes (variables)

    3 Introduce Evidence

    Polynomial in # of nodes (variables)4 Triangulate

    Suboptimal=Polynomial, Optimal=NP

    5 Construct Junction Tree

    Polynomial in the number of cliquesIdentifying Cliques = Polynomial in the number of nodes.

    6 Propagate Probabilities

    Polynomial in number of cliques, Exponential in size of cliques

    37/1

    Bye

    http://find/http://goback/
  • 8/3/2019 Andrew Rosenberg- Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

    38/38

    Next

    Clustering Preview.

    38/1

    http://find/http://goback/