0904.3769v5(1)

download 0904.3769v5(1)

of 8

Transcript of 0904.3769v5(1)

  • 8/10/2019 0904.3769v5(1)

    1/8

    arXiv:0904

    .3769v5[math.ST

    ]6May2009

    Orbit-Product Representation and Correction of

    Gaussian Belief Propagation

    Jason K. Johnsona [email protected]

    Vladimir Y. Chernyakb,a [email protected]

    Michael Chertkova [email protected] Division and Center for Nonlinear Studies, LANL, Los Alamos, NM 87545, USAbDepartment of Chemistry, Wayne State University, Detroit, MI 48202, USA

    Abstract

    We present a new view of Gaussian beliefpropagation (GaBP) based on a representa-tion of the determinant as a product over or-bits of a graph. We show that the GaBPdeterminant estimate captures totally back-tracking orbits of the graph and consider howto correct this estimate. We show that themissing orbits may be grouped into equiva-lence classes corresponding to backtracklessorbits and the contribution of each equiv-alence class is easily determined from theGaBP solution. Furthermore, we demon-strate that this multiplicative correction fac-tor can be interpreted as the determinant of abacktrackless adjacency matrix of the graph

    with edge weights based on GaBP. Finally,an efficient method is proposed to computea truncated correction factor including allbacktrackless orbits up to a specified length.

    1. Introduction

    Belief Propagation is a widely used method for infer-ence in graphical models. We study this algorithmin the context of Gaussian graphical models. Therehave been several studies of Gaussian belief propa-gation (GaBP) [13, 11, 10] as well as numerous ap-

    plications [8, 2, 1]. The best known sufficient condi-tion for its convergence is the walk-summable condi-tion [6, 7] (see also [5, 9]), which also provides newinsights into the algorithm by interpreting it as com-puting weighted sums of walks (walk-sums) within thegraph. Our aim in this present paper is to extend

    Appearing inProceedings of the 26th International Confer-ence on Machine Learning, Montreal, Canada, 2009. Copy-right 2009 by the author(s)/owner(s).

    this graphical/combinatorial view of GaBP to includeestimation of the determinant (partition function) ofthe Gaussian graphical model. This work is also in-spired by the loop-series correction method for belief

    propagation [4] that was recently extended to Gaus-sian graphical models [3].

    Our present study leads to a new perspective on GaBPhaving close ties to graphical zeta functions [12]. Wefind that for walk-summable models the determinantmay be represented as a product over all orbits (cyclicwalks) of the graph. The estimate of the determi-nant provided by GaBP only captures totally back-tracking orbits, which can be embedded as orbits inthe computation tree (universal cover) of the graph.The missing orbits may then be grouped into equiv-alence classes corresponding to backtrackless orbits.

    The orbit-product over each such equivalence classmay be simply computed from the solution of GaBP.Also, the product over all backtrackless orbits may beinterpreted as the determinant of a backtrackless adja-cency matrix of the graph with appropriately definededge weights based on the GaBP solution. Finally, wepropose a simple, efficient method to compute trun-cated orbit-products including all orbits up to somespecified length and provide an error-bound on the re-sulting estimates. In certain classes of graphs (e.g.,grids), this leads to an efficient method with complex-ity linear in the number of nodes and the requiredprecision of the determinant estimate.

    This paper differs fundamentally from [3] in that werely heavily on the walk-summable property to developmultiplicative expansions using infinitely many orbitsof the graph, whereas [3] develops additive expansionsover a finite number of generalized loops (which maybe disconnected) using methods of Grassman calcu-lus. Our present approach leads naturally to approx-imation methods (with accuracy guarantees in walk-summable models) based on truncated orbit-products.

    http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5http://arxiv.org/abs/0904.3769v5
  • 8/10/2019 0904.3769v5(1)

    2/8

    Orbit-Products & Gaussian BP

    2. Preliminaries

    2.1. Walks and Orbits of a Graph

    Let G be a graph on vertices (nodes) V ={1, . . . , n}with undirected edges {i, j} G. We may alsotreat each undirected edge{i, j} as a symmetric pairof directed edges (ij) and (ji). A walk w is a se-quence of adjacent vertices (w0. . . wL) (wt V fort= 0, . . . , Land{wt, wt+1} G for t = 1, . . . , L 1)where|w| L is the length of the walk. A walkmay be equivalently specified as a sequence of stepsw= ((w0w1)(w1w2) . . . (wL1wL)) such that each stepis a (directed) edge of the graph that ends where thefollowing step begins. A walk may visit the same nodeor cross the same edge multiple times and may alsobacktrackthat is, it may step back to the precedingvertex. A walk is closed if it begins and ends at thesame node w0 = wL. A closed walk is primitive if itis not a multiple of some shorter walk (e.g., the walk

    (1231) is primitive but (1231231) is not). We define anorbit = [w] to be an equivalence class of closed prim-itive walks, where two walks are considered equivalent[w] = [w] if one is a cyclic shift of the other (i.e., ifwt = wt+s(mod ) for some s andt = 1, . . . , L). Hence,there is a one-to-one correspondence between orbitsand (non-terminating) cyclic walks.

    The following classification of walks and orbits playsan essential role in our analysis: A walk (or orbit) issaid to bereducibleif it contains abacktracking pairofconsecutive steps . . . (ij)(ji) . . . , otherwise the walk isirreducible. By repeatedly deleting backtracking pairsuntil none remain one obtains the (unique) irreduciblecore= (w) of the walk w. For closed walks it mayhappen that =, where () denotes the trivial(empty) walk. We then say that the walk is totallyreducible. We say that a walk is non-trivialif it is nottotally reducible. Totally reducible walks have beencalled backtracking [7], although totally backtrackingis perhaps a better description. Irreducible walks havebeen called backtrackless (or non-backtracking) else-where in the literature.

    Notation. L denotes the set of all orbits of G, (L)denotes the irreducible (backtrackless) orbits, and wepartition

    L into (disjoint) equivalence classes

    L

    { L|() = } for (L). In particular,Ldenotes the class of totally backtracking orbits, whichplays a special role in our interpretation of GaBP. Wewill use to denote a generic orbit and reserve todenote irreducible orbits.

    Example. Orbits [1231], [1231451], [123421561] arebacktrackless; [1234321], [1232421], [1231321] are to-tally backtracking; [123241] is both reducible and non-

    trivial (neither backtrackless nor totally backtracking).

    2.2. Gaussian Belief Propagation

    A Gaussian graphical model is a probability distribu-tion

    p(x) =

    Z1 exp

    12

    xTJx+hTx (1)of random variables x Rn where Jis a sparse, sym-metric, positive-definite matrix. The fill-pattern ofJdefines a graph G with vertices V ={1, . . . , n} andedges (ij) for all Jij = 0. The partition functionis defined byZ(h, J) exp{12xTJx+ hTx}dx =

    (2)n det J11/2

    e12h

    TJ1h so as to normalize the dis-tribution. Given such a model, we may compute themean vector

    p(x)xdx = J1h and covariance

    matrix K

    p(x)(x)(x)Tdx = J1. Thisgenerally requiresO(n3) computation in dense graphsusing Gaussian elimination. If G is sparse and onlycertain elements ofKare required (the diagonal and

    edge-wise covariances), then the complexity of Gaus-sian elimination may be substantially reduced (e.g.,O(n3/2) in planar graphs using nested dissection) butstill generally has complexity growing asO(w3) in thetree-widthw of the graph.

    Gaussian belief propagation (GaBP) is a simple, dis-tributed, iterative message-passing algorithm to es-timate the marginal distribution p(xi) of each vari-able, which is specified by its mean i and varianceKii. GaBP is parameterized by a set of messages

    mij(xj) = e12ijx

    2j+ijxj defined on each directed edge

    (ij) of the graph (mij is regarded as a message being

    passed from i to j ). The GaBP equations are:

    mij(xj)

    i(xi)

    ki\j

    mki(xi)ij(xi, xj)dxi

    where i = e12Jiix

    2i+hixi , ij = eJijxixj and i de-

    notes the set of neighbors of i in G. This reduces tothe following rules for computing (, )-messages:

    ij = J2ij(Jii i\j)1

    ij = Jij(Jii i\j)1(hi+i\j) (2)where i\j ki\jki and i\j ki\jki.These equations are solved by iteratively recomputingeach message from the other messages until conver-gence. The marginal distribution is then estimated as

    pbp(xi) i(xi)

    kimki(xi), which gives variance

    estimatesKbpi = (Jii

    kki)1 and mean estimates

    bpi = Kbpi (hi +

    kki). In trees, this method is

    equivalent to Gaussian elimination, terminates aftera finite number of steps and then provides the cor-rect marginals. In loopy graphs, it may be viewed as

  • 8/10/2019 0904.3769v5(1)

    3/8

    Orbit-Products & Gaussian BP

    performing Gaussian elimination in the computationtree (universal cover) of the graph [10, 7] (obtained byunrolling loops) and may therefore fail to convergedue to the infinite extent of the computation tree. Ifit does converge, the mean estimates are still correctbut the variances are only approximate. We also ob-tain an estimate of the pairwise covariance matrix onedges{i, j} G:

    Kbp(ij) =

    Jii i\j Jij

    Jij Jjj j\i

    1

    In this paper, we are concerned with the GaBP esti-mate of the determinant Z= det K= det J1 (whichis closely linked to computation of the partition func-tionZ). We obtain an estimate ofZ from the GaBPsolution as:

    Zbp =iV

    Zbpi

    {i,j}G

    Zbpij

    Zbpi Zbpj

    (3)

    where Zbpi = Kbpi and Z

    bpij = det K

    bp(ij). The mo-

    tivation for this form of estimate is that it becomesexact ifG is a tree. In loopy graphs, there may gen-erally be no stable solution to GaBP (or multiple un-stable solutions). The main objective of this paper isto interpret the estimate Zbp in the context of walk-summable models (described below), for which there isa well-defined stable solution, and to suggest methodsto correct this estimate. Note that the variance andcovariance estimates (and hence the determinant esti-mate Zbp) are independent ofh and the -messagesin (2), they are determined solely by the -messages

    determined byJ. Since GaBP correctly computes themeans in walk-summable models, we are mainly con-cerned with how to correct Zbp (and hence its deriva-tives, which correspond to the GaBP estimates of vari-ances/covariances).

    Walk-Sum Interpretation Our approach in thispaper may be considered as as extension of the walk-sum interpretation of GaBP [7]. LetJ be normalizedto have unit-diagonal, such that J = IR with Rhaving zeros along its diagonal. The walk-sum idea isbased on the series K = (I R)1 =kRk, whichconverges if(R)< 1 where (R) denotes the spectral

    radiusof the matrixR (the maximum modulus of theeigenvalues ofR). This allows us to interpret Kij asa sum over all walks in the graph G which begin atnodei and end at nodej where the weight of a walk is

    defined asRw =

    (ij)wrnij(w)ij andnij(w) is a count

    of how many times step (ij) occurs in the walk. Wewrite this walk-sum as Kij =

    w:ijR

    w. However,in order for the walk-sum to be well-defined, it mustconverge to the same value regardless of the order in

    which we add the walks. This is equivalent to requir-ing that it converges absolutely. Thus, we say thatR is walk-summable if

    w:ij |Rw| converges for all

    i, j V. This is equivalent to the spectral conditionthat (|R|) < 1 where|R| (|rij |) is the element-wise absolute-value matrix ofR. A number of otherequivalent or sufficient conditions are given in [7].

    In walk-summable models it then holds that variancescorrespond to closed walk-sums Kii=

    w:ii R

    w andmeans correspond to a (reweighted) walk-sum over allwalks which end at a specific node i =

    w:i hR

    w

    (here denotes the arbitrary starting point of thewalk). Moreover, we may interpret the GaBP messageparameters (, ) as recursively computing walk-sumswithin the computation tree [7]. This implies thatGaBP converges in walk-summable models and con-verges to the same walk-sum solution independentof the order in which we update messages. This inter-pretation also shows that GaBP computes the correct

    walk-sums for the means but only computes a subsetof the closed walks needed for the variances. Specifi-cally, Kbpi only includes totally backtracking walks atnodei. This is seen as a walk is totally backtracking ifand only if it can be embedded as a closed walk in thecomputation tree of the graph and it is these closedwalks of the computation tree which GaBP capturesin its variance estimates.

    3. Orbit-Product Interpretation of

    Gaussian BP

    3.1. Determinant Z as Orbit-Product

    Let Z(R) det(IR)1. In walk-summable mod-els, we may give this determinant another graphicalinterpretation as a product over orbits of a graph, oneclosely related to the so-calledzeta functionof a graph[12].

    Theorem 1 If(|R|) < 1 then it holds that Z(R) =(1 R)1

    Z where the product is taken over

    all orbits ofG andR =

    (ij)rnij()ij wherenij() is

    the number of times step (ij) occurs in orbit.

    Proof. log det(I R)1 = tr log(I R)1 = trk Rkk=

    closedwRw

    |w| =

    primitivew

    m=1

    (Rw)m

    m|w| =primitivew

    1|w|log(1 Rw)1 =

    orbits log(1

    R)1 = log

    (1R)1. We have used the iden-tity log det A = trlog A and the series expansion

    log(I A)1 = k=1 Akk . Each closed walk is ex-pressed as a multiple of a primitive walk. Every prim-itive walkw has exactly|w| distinct cyclic shifts.

  • 8/10/2019 0904.3769v5(1)

    4/8

    Orbit-Products & Gaussian BP

    We emphasize that (|R|) < 1 is necessary to in-sure that the the orbit-products we consider are well-defined. This condition is assumed throughout the re-mainder of the paper.

    3.2.Zbp as Totally Backtracking Orbits

    Totally backtracking walks play an important role inthe walk-sum interpretation of the GaBP variance es-timates. We now derive an analogous interpretation ofZbp defined by (3):

    Theorem 2 Zbp =

    LZ where the product is

    taken over the set of totally backtracking orbits ofG.

    Although this result seems intuitive in view of priorwork, its proof is non-trivial involving arguments notused previously. To prove the theorem, we first sum-marize some useful lemmas. Consider a block ma-trix A = (A11A12; A21A22). The Schur complementof block A11 is A22 A22 A21A111A21. It holdsthat det A = det A11det A22 and (A

    22)

    1 = (A1)22.Using these well-known identities, it follows:

    Lemma 1 LetR = (R11R12; R21R22) andK = (IR)1 = (K11K12; K21K22). Thendet K11 =

    Z(R)Z(R22)

    .

    For walk-summable models, we then have

    det K11 =

    G ZG2

    Z=

    G| intersectsG1

    Z

    where the final orbit-product is taken over all orbitsofG which include any node of subgraph G1 (corre-sponding to submatrix R11). Next, using this resultand the interpretation of GaBP as inference on thecomputation tree, we are led to the following inter-pretation of the quantities Zbpi and Z

    bpij appearing in

    (3). LetTi denote the computation tree of the graphG with one copy of node i marked. Let Tij denotethe computation tree with one copy of edge{i, j} Gmarked. Then,

    Lemma 2 Zbpi =

    Ti|i

    Z where the product isover all orbits of Ti that include the marked node i.Zbpij =

    Tij |i or j

    Z where the product is over allorbits ofTij that include either endpoint of the markededge{i, j}.

    Proof of Theorem 2. Using Lemma 2 and the cor-respondence between orbits of the computation treeand totally backtracking orbits ofG, we may expand(3) to express Zbp entirely as a product over totallybacktracking orbits Zbp =

    L

    ZN where N is

    the count of how many times appears in the orbit-productthe number of times it appears in the numer-ator of (3) minus the number of times in appears inthe denominator. It remains to show that N = 1 foreach totally backtracking orbit. This may be seen byconsidering the subtree T of the computation tree Ttraced out by orbit. Letv ande respectively denotethe number of nodes and edges ofT(hence,e = v 1)and letc denote the number of edges ofTwith exactlyone endpoint in T. First, we count how many powersofZ appear in the orbit product

    i Z

    bpi . For each

    vertex i T we may pick this as the marked nodein the computation tree and this shows one way that can be embedded in Ti so as to include its markednode. Thus, v gives the total number of multiplesofZ in

    i Z

    bpi . Similarly, we could mark any edge

    {i, j} T with one or both endpoints in T and thisgives one way to embedintoTij so as to intersect the

    marked edge. Thus, the productij

    Zbpij contributes

    e+ c powers ofZ. Lastly, the product

    ijZbpi Zbpjcontains 2e + cpowers ofZ. This represents the num-ber of ways we may pick a directededge (ij) ofT suchthat at least one endpoint is in T. The total count isthenN= v+ (e+c) (2e+c) = v e= 1.Combining Theorems 1 and 2, we obtain the followingorbit-product correction to Zbp:

    Corollary 1 Z=Zbp L Z.This formula includes a correction for every missingorbit, that is, for every non-trivial orbit. This impliesthat Z = Zbp for trees since all orbits of trees are

    totally backtracking.

    3.3.Zbp Error Bound

    One useful consequence of the orbit-product interpre-tation ofZbp is that it provides a simple error boundon GaBP. Let g denote the girthof the graph G, de-fined as the length of the shortest cycle ofG. We notethat the missing orbits L must all have lengthgreater than or equal to g . Then,

    Theorem 3 1n

    log Z

    bp

    Z

    (|R|)gg(1(|R|)) .

    Proof. We derive the chain of inequalities:log ZZbp (a)= L log Z

    ||g | log Z|(b)

    ||glog(1 |R|)1 (c)= tr

    k=g

    |R|k

    k n

    k=gk

    k ng

    g

    k=0

    k = ng

    g(1) . (a) Corollary 1. (b)

    | log Z| =k=1 (R)kk

    k=1 (|R|)kk = log(1|R|)1. (c) The proof of Theorem 1 showsthat tr

    k1

    Rk

    k =

    log(1 R)1. Similarly,

  • 8/10/2019 0904.3769v5(1)

    5/8

    Orbit-Products & Gaussian BP

    tr

    kg|R|k

    k =

    ||glog(1 |R|)1.This is consistent with the usual intuition that beliefpropagation is most accurate in large girth graphs withweak interactions.

    4. Backtrackless Orbit Correction

    In this section we show that the set of orbits omittedin the GaBP estimate can be grouped into equivalenceclasses corresponding to backtrackless orbits and thatthe orbit-product over each such equivalence class issimply computed with the aid of the GaBP solution:

    Theorem 4 Z= Zbp=Zwhere the product isover allbacktrackless orbits ofG and we define

    Z= (1

    (ij)

    (rij)nij())1

    where r

    ij

    rij

    1i\j and i\j =

    ki\jki is com-puted from the solution of GaBP.

    In comparison to Corollary 1, here the correction fac-tor is expressed as an orbit-product over just the back-trackless orbits (whereas Corollary 1 uses a separatecorrection for each non-trivial orbit). However, all or-bits are still correctly accounted for because we modifythe edge weights of the graph so as to include a fac-tor (1i\j)1 (computed by GaBP) which servesto factor in totally-backtracking excursions at eachpoint along the backtrackless orbit, thereby generatingall non-trivial orbits.

    The basic idea underlying this construction is depictedin Figure 1. For each backtrackless orbit we de-fine an associated computation graph G as follows.First, we start with a single directed cycle based on = [12 L] (any duplicated nodes of the orbitmap to distinct nodes in this directed cycle). Then,for each nodek of this graph, we attach a copy of thecomputation treeTk\k+1 , obtained by taking the fullcomputation tree Tk rooted at node k and deletingthe branch (k, k+1) incident to the root. This con-struction is illustrated in Figure 1(a,b) for the graphG = K4 and orbit = [(12)(23)(31)]. The cycle has

    one-way directed edges whereas each computationtree has two-way undirected edges. This is under-stood to mean that walks are allowed to backtrackwithin the computation tree but not within the cycle.The importance of this graph is based on the followinglemma (the proof is omitted):

    Lemma 3 Let be a backtrackless orbit ofG. Then,there is a one-to-one correspondence between the class

    of orbitsL ofG and the non-trivial orbits ofG.

    (a)

    24

    r23r34

    r14

    1

    3

    G = K4

    r24

    r13

    r12

    (b)

    1

    1

    2 3

    3 4 1 4 2 4

    2 4 2 3 3 4 1 3 4 1 2

    T1\2 T2\3 T3\1

    r14

    r31

    r12 r23r13

    G[(12)(23)(31)]

    (c)

    1 3

    r12 r23

    r31

    2

    1\2 =31 + 41 3\1 = 23 + 432\3 = 12 + 42

    (d)

    1 2 3

    r31 = r3113\1

    r23 = r2312\3r12 =

    r1211\2

    Figure 1. Illustration of construction to combine equivalentorbits. (a) The graph G = K4. (b) The computation graphG for = [(12)(23)(31)]. (c) Finite graph with self-loopsat each node to capture totally backtracking walks. (d)Equivalent graph with modified edge weights to capturetotally backtracking walks.

    Next, we demonstrate how to compute all of the orbitswithin an equivalence class as a simple determinantcalculation based on the backtrackless orbit and theGaBP solution. LetRbe defined as the edge-weight

    matrix of a simple single-loop graph based on withedge-weights defined by rk,k+1 =

    rk,k+11k\k+1

    . This

    construction is illustrated in Figure 1(d). Then,

    Lemma 4 Z= det(I R)1 =

    LZ.

    Proof. Using Lemma 3, we see that the orbit-productL

    Z is equal to the product over all non-trivialorbits of the graphG, that is, the product over all or-bits inGwhich intersect the subgraph correspondingto . Using Lemma 1, this is equivalent to comput-ing the determinant of the corresponding submatrixofKG = (I

    RG)

    1 where RG is the edge-weightmatrix of the computation graph. This is equivalentto first eliminating each computation tree (by Gaus-sian elimination/GaBP) attached to each node of to obtain a reduced graphical model I Rand thencomputing det(I R)1. Using the GaBP solution,the effect of eliminating each computation tree is toadd a self-loop (diagonal element) to Rwith edge-weight k\k+1 =

    v=k+1

    v,k , obtained by sum-ming the incoming messages to node k from each

  • 8/10/2019 0904.3769v5(1)

    6/8

  • 8/10/2019 0904.3769v5(1)

    7/8

    Orbit-Products & Gaussian BP

    6. Block-Resummation Method

    Next, we consider an efficient method to approxi-mateZ(A) = det(I A)1 for walk-summable models(|A|) < 1. This method can be used to either di-rectly approximate Z(R) (A = R) or to approximatethe GaBP-correctionZ (A= R).

    Given a graphG based on verticesV, we specify a setof blocksB = (Bk V, k = 1, . . . , |B|) chosen suchthat: (1) Every short orbit||< L is covered by someblock B B, and (2) IfB, B B then B B B.We also define block weights wB as follows: wB = 1for maximal blocks (not contained by another block)and wB = 1

    BBwB for non-maximal blocks

    (these weights may be negative). This insures thatBBwB = 1 for each B B. Then, we define our

    estimate

    ZB B

    ZwBB B

    (det(I AB)1)wB (5)

    whereAB denotes the|B| |B|principle submatrix ofA corresponding to B .

    This approximation method is similar in spirit to ap-proximations used elsewhere (e.g., Kikuchi approxima-tions to free-energy [14]). However, the new insightsoffered by the orbit-product view allows us to give ourestimate a precise interpretation in walk-summableGaussian models:

    Theorem 6 ZB =

    LBZ whereLB BBLB

    andLB is the set of all orbits covered byB.

    Proof. ZB =

    B

    BZ

    wB =

    LB

    ZP

    BwB =

    LBZ where

    BwB = 1 follows from the defi-

    nition of the block weights. Moreover, we can then bound the error of the estimate.Noting thatLB includes all short orbits of the graph,we can derive the following result by a similar proof asfor Theorem 3:

    Corollary 2 1n

    log ZBZ (|A|)LL(1(|A|)) .Thus, for the class of models with

  • 8/10/2019 0904.3769v5(1)

    8/8

    Orbit-Products & Gaussian BP

    of r) and block sizes L = 2, 4, 8, 16, 32. The resultsare shown in Figure 3. As expected, accuracy rapidlyimproves with increasing L in both methods and theGaBP-correction approach is more accurate.

    7. Conclusion and Future Work

    We have demonstrated an orbit-product representa-tion of the determinant (the partition function of theGaussian model) and interpreted the estimate ob-tained by GaBP as corresponding to totally back-tracking orbits. Furthermore, we have shown how tocorrect the GaBP estimate in various ways which in-volve incorporating backtrackless orbits (e.g. cycles)of the graph. In particular, we demonstrated an ef-ficient block-resummation method to compute trun-cated orbit-products in sparse graphs (demonstratedfor grids). These methods also extend to address es-timation of the matrix inverse (the covariance matrix

    of the Gaussian model), which may in turn be usedas an efficient preconditioner for iterative solution oflinear systems. We leave these extensions for a longerreport.

    In future work, we plan to extend the method of con-structing an efficient set of blocks to other classesof sparse graphs. It may also be fruitful to extendour analysis to generalized belief propagation [14] inGaussian models. In a related direction, we intendto explore methods to bootstrap GaBP using the

    factorizationZ(R) =

    k=0Z(R2k

    ))1

    , which fol-

    lows from the formula (I

    R)1 =k(I+ R2

    k

    ). By

    computing Zbp(R2k) for small values of k we maycapture short backtrackless orbits of the graph. An-other direction is to investigate generalization of theformula Z = ZbpZ to non-walksummable models,perhaps using methods of [3]. A related idea is toapproximate a non-walksummable model by a walk-summable one and then correct estimates obtainedfrom the walk-summable model to better approximatethe non-walksummable model.

    References

    [1] D. Bickson, D. Dolev, and E. Yom-Tov. AGaussian belief propagation solver for large scaleSVMs. In 5th Europ. Conf. Complex Systems,2008.

    [2] D. Bickson, O. Shental, P. Siegel, J. Wolf, andD. Dolev. Gaussian belief propagation based mul-tiuser detection. InIEEE Int. Symp. on Inform.Th., pages 18781882, 2008.

    [3] V. Chernyak and M. Chertkov. Fermions andloops on graphs I: Loop calculus for determi-nant. J. Statistical Mechanics: Theory and Ex-periments, 2008.

    [4] M. Chertkov and V. Chernyak. Loop series fordiscrete statistical models on graphs. J. Statistical

    Mechanics: Theory and Experiments, 2006.

    [5] B. Cseke and T. Heskes. Bounds on the Bethe freeenergy for Gaussian networks. In Uncertainty inArtificial Intelligence, pages 97104, 2008.

    [6] J. Johnson, D. Malioutov, and A. Willsky. Walk-sum interpretation and analysis of Gaussian beliefpropagation. InAdv. in Neural Inform. Process-ing Systems 18, pages 579586, 2006.

    [7] D. Malioutov, J. Johnson, and A. Willsky. Walk-sums and belief propagation in Gaussian graph-ical models. J. of Machine Learning Research,

    7:20312064, 2006.

    [8] C. Moallemi and B. Van Roy. Consensus propaga-tion. IEEE Trans. on Inform. Th., 52:47534766,2006.

    [9] C. Moallemi and B. Van Roy. Convergence ofmin-sum message passing for quadratic optimiza-tion. IEEE Trans. on Inform. Th., 55:24132423,May 2009.

    [10] K. Plarre and P. Kumar. Extended messagepassing algorithm for inference in loopy Gaussiangraphical models. Ad Hoc Networks, 2:153169,2004.

    [11] P. Rusmevichientong and B. Van Roy. An anal-ysis of belief propagation on the turbo decodinggraph with Gaussian densities. IEEE Trans. onInform. Th., 47:745765, 2001.

    [12] H. Stark and A. Terras. Zeta functions of finitegraphs and coverings. Adv. in Math., 121:124165, 1996.

    [13] Y. Weiss and W. Freeman. Correctness of beliefpropagation in Gaussian graphical models of ar-

    bitrary topology. Neural Computation, 13:21732200, 2001.

    [14] J. Yedidia, W. Freeman, and Y. Weiss. Construct-ing free-energy approximations and generalizedbelief propagation algorithms. IEEE Trans. In-

    form. Th., 51:22822312, 2005.