Convergent Message-Passing Algorithms for Inference over General Graphs with Convex Free Energies
Tamir Hazan, Amnon Shashua
School of Computer Science and Engineering, Hebrew University of Jerusalem
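4
Graphical Models - Background
Input: $p(x_1, \ldots, x_n) \propto \prod_\alpha \psi_\alpha(x_\alpha)$
Output: $p(x_i) = ?$
Example factor graph: factors {1,2}, {2,3,4}, {2,4,5} over the variables $x_1, x_2, x_3, x_4, x_5$
Belief Propagation:
$n_{i \to \alpha}(x_i) := \prod_{c \in N(i) \setminus \alpha} m_{c \to i}(x_i)$
$m_{\alpha \to i}(x_i) := \sum_{x_\alpha \setminus x_i} \psi_\alpha(x_\alpha) \prod_{j \in N(\alpha) \setminus i} n_{j \to \alpha}(x_j)$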
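The two belief propagation updates can be tried on a small factor graph. The sketch below is only an illustration: the function names, the dictionary layout for the potentials, the flooding schedule and the message normalization are all assumptions, not something specified on the slides. It runs sum-product on a tree, where BP is exact, and compares against brute-force marginals.

```python
import itertools
import numpy as np

def belief_propagation(factors, n_vars, n_states=2, n_iters=20):
    """Sum-product BP. `factors` maps a tuple of variable indices (alpha)
    to a potential table psi_alpha with one axis per variable (assumed layout)."""
    # m[(alpha, i)]: factor-to-variable message, n[(i, alpha)]: variable-to-factor
    m = {(a, i): np.ones(n_states) for a in factors for i in a}
    n = {(i, a): np.ones(n_states) for a in factors for i in a}
    for _ in range(n_iters):
        # n_{i->alpha}(x_i) = prod over c in N(i) \ alpha of m_{c->i}(x_i)
        for (i, a) in n:
            msg = np.ones(n_states)
            for c in factors:
                if i in c and c != a:
                    msg = msg * m[(c, i)]
            n[(i, a)] = msg / msg.sum()
        # m_{alpha->i}(x_i) = sum over x_alpha \ x_i of psi_alpha * prod of n_{j->alpha}
        for (a, i) in m:
            table = factors[a].copy()
            for axis, j in enumerate(a):
                if j != i:
                    shape = [1] * len(a)
                    shape[axis] = n_states
                    table = table * n[(j, a)].reshape(shape)
            other_axes = tuple(ax for ax, j in enumerate(a) if j != i)
            msg = table.sum(axis=other_axes)
            m[(a, i)] = msg / msg.sum()
    beliefs = []
    for i in range(n_vars):
        b = np.ones(n_states)
        for a in factors:
            if i in a:
                b = b * m[(a, i)]
        beliefs.append(b / b.sum())
    return np.array(beliefs)

def brute_force_marginals(factors, n_vars, n_states=2):
    """Exact single-variable marginals by enumerating every joint state."""
    p = np.zeros((n_states,) * n_vars)
    for x in itertools.product(range(n_states), repeat=n_vars):
        p[x] = np.prod([psi[tuple(x[j] for j in a)] for a, psi in factors.items()])
    p /= p.sum()
    return np.array([p.sum(axis=tuple(k for k in range(n_vars) if k != i))
                     for i in range(n_vars)])

# A tree-structured example (0-indexed variables): factors {0,1} and {1,2,3}.
rng = np.random.default_rng(0)
tree = {(0, 1): rng.uniform(0.5, 2.0, (2, 2)),
        (1, 2, 3): rng.uniform(0.5, 2.0, (2, 2, 2))}
print(np.abs(belief_propagation(tree, 4) - brute_force_marginals(tree, 4)).max())
```

Passing the slide's graph instead, with factors `(0, 1)`, `(1, 2, 3)` and `(1, 3, 4)`, gives a graph with a cycle, so the beliefs are then only approximations of the marginals.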
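6
BP - Variational Methods
Bethe free energy: approximating the marginals $b_\alpha(x_\alpha)$, $b_i(x_i)$
$\min_{b_\alpha, b_i} \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha H(b_\alpha) - \sum_i (1 - d_i) H(b_i)$
subject to the marginalization constraints $\sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) = b_i(x_i)$
Here $d_i$ is the number of factors that contain $x_i$; in the example factor graph, $d_2 = 3$ for $x_2$.
Stationary points of the Bethe free energy = BP fixed points (Yedidia, Freeman, Weiss ’01)
When the factor graph has no cycles, the Bethe free energy is convex over the marginalization constraints and BP is exact.
When the factor graph has cycles, the Bethe free energy is non-convex and BP might not converge.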
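As a small worked example of the counting numbers in the Bethe free energy, the sketch below computes $d_i$ and the entropy coefficient $1 - d_i$ for the example factor graph with factors {1,2}, {2,3,4}, {2,4,5}. The variable names are illustrative only.

```python
from collections import Counter

# Factors of the example factor graph, as tuples of variable indices.
factors = [(1, 2), (2, 3, 4), (2, 4, 5)]

# d_i: number of factors that contain variable i.
degree = Counter(i for alpha in factors for i in alpha)

# Bethe coefficient of the variable entropy H(b_i): 1 - d_i.
bethe_c = {i: 1 - degree[i] for i in sorted(degree)}

print(dict(sorted(degree.items())))  # {1: 1, 2: 3, 3: 1, 4: 2, 5: 1}
print(bethe_c)                       # {1: 0, 2: -2, 3: 0, 4: -1, 5: 0}
```

The negative coefficients on $H(b_2)$ and $H(b_4)$ are what can make the Bethe free energy non-convex once the graph has cycles.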
7
BP - Variational Methods
TRW free energy (Wainwright, Jaakkola, Willsky ’02):
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha c_\alpha H(b_\alpha) - \sum_i c_i H(b_i)$
$c_\alpha$: weighted number of spanning trees through an edge α
$c_i = 1 - \sum_{\alpha \in N(i)} c_\alpha$
TRW free energy is convex over the marginalization constraints.
Convergent message-passing algorithm for cliques of size 2 (Globerson & Jaakkola ’07)
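To make the TRW weights concrete, the sketch below enumerates the spanning trees of a small pairwise graph and, assuming a uniform weighting over trees (an assumption made here for illustration), computes for each edge the fraction of spanning trees that pass through it. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def is_spanning_tree(n_nodes, edges):
    """True if `edges` (exactly n_nodes - 1 of them) contain no cycle,
    which makes them a spanning tree."""
    parent = list(range(n_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True

def edge_appearance(n_nodes, edges):
    """c_alpha under uniform tree weights: share of spanning trees using each edge."""
    trees = [t for t in combinations(edges, n_nodes - 1)
             if is_spanning_tree(n_nodes, t)]
    return {e: Fraction(sum(e in t for t in trees), len(trees)) for e in edges}

triangle = [(0, 1), (1, 2), (0, 2)]
c_alpha = edge_appearance(3, triangle)
# c_i = 1 - sum of c_alpha over the edges that touch i
c_i = {i: 1 - sum(c for e, c in c_alpha.items() if i in e) for i in range(3)}
print(c_alpha)  # every edge of the triangle gets 2/3
print(c_i)      # every node gets -1/3
```

Brute-force enumeration is only practical for very small graphs; it is used here because it keeps the definition visible.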
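9
Convex Free Energies
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha c_\alpha H(b_\alpha) - \sum_i c_i H(b_i)$
Claim (Pakzad and Anantharam ’02, Heskes ’04, Weiss et al. ’07): the “convex free energy” is convex (over the marginalization constraints) for all factor graphs if there exist $\bar{c}_\alpha, c_{i\alpha}, \bar{c}_i \ge 0$ such that
$c_\alpha = \bar{c}_\alpha + \sum_{i \in N(\alpha)} c_{i\alpha}$ and $c_i = \bar{c}_i - \sum_{\alpha \in N(i)} c_{i\alpha}$
With these parameters the free energy reads
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha \bar{c}_\alpha H(b_\alpha) - \sum_i \bar{c}_i H(b_i) - \sum_i \sum_{\alpha \in N(i)} c_{i\alpha} \big( H(b_\alpha) - H(b_i) \big)$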
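The claim gives a sufficient condition, so a candidate set of nonnegative numbers $c_{i\alpha}$ can serve as a certificate of convexity. The sketch below checks such a certificate; the names `cbar_alpha` and `cbar_i` stand for the two nonnegative remainders in the decomposition and, like the function names, are labels chosen here rather than notation taken from the slides. The example is the Bethe free energy ($c_\alpha = 1$, $c_i = 1 - d_i$) on a single cycle of three pairwise factors.

```python
from fractions import Fraction

def remainders(factors, c_alpha, c_i, c_ia):
    """Remainders implied by a candidate c_{i,alpha}:
    cbar_alpha = c_alpha - sum_{i in alpha} c_{i,alpha}
    cbar_i     = c_i     + sum_{alpha containing i} c_{i,alpha}"""
    cbar_alpha = {a: c_alpha[a] - sum(c_ia[(i, a)] for i in a) for a in factors}
    cbar_i = {i: c_i[i] + sum(c_ia[(i, a)] for a in factors if i in a) for i in c_i}
    return cbar_alpha, cbar_i

def certifies_convexity(factors, c_alpha, c_i, c_ia):
    """True if every c_{i,alpha} and both sets of remainders are nonnegative."""
    cbar_alpha, cbar_i = remainders(factors, c_alpha, c_i, c_ia)
    values = list(c_ia.values()) + list(cbar_alpha.values()) + list(cbar_i.values())
    return all(v >= 0 for v in values)

# Bethe free energy on a 3-cycle: c_alpha = 1, c_i = 1 - d_i = -1.
cycle = [(0, 1), (1, 2), (0, 2)]
c_alpha = {a: Fraction(1) for a in cycle}
c_i = {i: Fraction(-1) for i in range(3)}

half = {(i, a): Fraction(1, 2) for a in cycle for i in a}
quarter = {(i, a): Fraction(1, 4) for a in cycle for i in a}
print(certifies_convexity(cycle, c_alpha, c_i, half))     # True
print(certifies_convexity(cycle, c_alpha, c_i, quarter))  # False
```

A `False` result only says that this particular candidate fails; it does not by itself show that the free energy is non-convex.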
10
Background - Summary
Bethe free energy: non-convex for general graphs. Algorithm: belief propagation.
TRW free energy: specific convexification of the free energy. Algorithm: Globerson & Jaakkola ’07 message passing, for |α| = 2.
Convex free energy: general form of convexification with parameters $\bar{c}_\alpha, c_{i\alpha}, \bar{c}_i \ge 0$. Algorithm: ? (message passing for |α| ≥ 2).
Contributions:
1) Convergent message-passing algorithm for convex free energies.
2) Heuristic for choosing a good convex free energy.
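13
Convex Belief Propagation
$\min_{b \in \mathbb{R}^m} f(b) + \sum_{i=1}^n h_i(b)$
f: strictly convex & differentiable
$h_i$: strictly convex & proper ($h_i(b) < \infty$ for some b)
The convex free energy
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha \bar{c}_\alpha H(b_\alpha) - \sum_i \bar{c}_i H(b_i) - \sum_i \sum_{\alpha \in N(i)} c_{i\alpha} \big( H(b_\alpha) - H(b_i) \big)$
fits this form. Each $h_i$ collects the terms indexed by i:
$h_i(b) = -\bar{c}_i H(b_i) - \sum_{\alpha \in N(i)} c_{i\alpha} \big( H(b_\alpha) - H(b_i) \big)$ whenever the marginals of $x_i$ agree, and $h_i(b) = \infty$ otherwise.
The remaining terms form f:
$f(b) = -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha \bar{c}_\alpha H(b_\alpha)$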
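15
Convex Message Passing
$\min_b f(b) + \sum_i h_i(b)$
For t = 1, 2, ...
  For i = 1, 2, ..., n
    $\mu_i \leftarrow \sum_{j \ne i} \lambda_j$
    $b^* \leftarrow \arg\min_{b \in \mathrm{domain}(h_i)} f(b) + b^T \mu_i + h_i(b)$
    $\lambda_i \leftarrow -\mu_i - \nabla f(b^*)$
Output: $b^*$
* We also have a similar parallel message passing algorithm
[Figure: bipartite graph linking the functions $h_1, \ldots, h_i, \ldots, h_n$ to the coordinates $b_1, \ldots, b_\alpha, \ldots, b_m$; the edges carry the messages.]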
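16
Primer on Fenchel Duality
Convex conjugate of h(b): $h^*(\lambda) = \max_b \; \lambda^T b - h(b)$
Concave conjugate of g(b): $g^*(\lambda) = \min_b \; \lambda^T b - g(b)$
Primal: $\min_b \; h(b) - g(b)$
Dual: $\max_\lambda \; g^*(\lambda) - h^*(\lambda)$
[Figure: the curves h(b) and g(b) with tangent lines of slope λ; labels $h^*(\lambda)$, $g^*(\lambda)$ and the optimum $b^*$.]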
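17
Primer on Fenchel Duality
$\min_b \; h(b) - g(b) = \max_\lambda \; g^*(\lambda) - h^*(\lambda)$, and $\lambda^* = \nabla g(b^*)$
More conveniently: $\min_b \; g(b) + h(b) = \max_\lambda \min_b \; g(b) + b^T \lambda - h^*(\lambda)$, with $\lambda^* = -\nabla g(b^*)$
[Figure: the curves h(b) and g(b) with tangents of slope $\lambda^* = \nabla g(b^*)$ at $b^*$; labels $h^*(\lambda^*)$ and $g^*(\lambda^*)$.]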
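A one-dimensional worked example may help check the "more convenient" form of the duality. The quadratics below are chosen here purely for illustration, with a fixed scalar $a$; they are not taken from the slides.

```latex
\begin{aligned}
&g(b) = \tfrac12 (b-a)^2, \qquad h(b) = \tfrac12 b^2 \\[4pt]
&\text{Primal: } \min_b \; g(b) + h(b)
   \;\Rightarrow\; (b-a) + b = 0
   \;\Rightarrow\; b^* = \tfrac{a}{2}, \quad \text{value } \tfrac{a^2}{4} \\[4pt]
&h^*(\lambda) = \max_b \; \lambda b - \tfrac12 b^2 = \tfrac12 \lambda^2 \\[4pt]
&\min_b \; g(b) + b\lambda
   \;\Rightarrow\; b = a - \lambda, \quad \text{value } a\lambda - \tfrac12 \lambda^2 \\[4pt]
&\text{Dual: } \max_\lambda \; a\lambda - \tfrac12\lambda^2 - \tfrac12\lambda^2
   = \max_\lambda \; a\lambda - \lambda^2
   \;\Rightarrow\; \lambda^* = \tfrac{a}{2}, \quad \text{value } \tfrac{a^2}{4} \\[4pt]
&\text{Check: } -\nabla g(b^*) = -(b^* - a) = \tfrac{a}{2} = \lambda^*
\end{aligned}
```

The primal and dual values agree, and the optimal dual variable is the negative gradient of $g$ at the primal optimum, as the theorem states.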
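23
Sequential message-passing
Generalized Fenchel duality theorem:
$\min_b \; f(b) + \sum_{i=1}^n h_i(b) = \max_{\lambda_1, \ldots, \lambda_n} \min_b \; f(b) + b^T \sum_{i=1}^n \lambda_i - \sum_{i=1}^n h_i^*(\lambda_i)$
Block update approach: optimize $\lambda_i$ while the other $\lambda_j$ stay fixed. Set $\mu_i = \sum_{j \ne i} \lambda_j$:
$\max_{\lambda_i} \min_b \; f(b) + b^T \mu_i + b^T \lambda_i - h_i^*(\lambda_i) = \min_b \; f(b) + b^T \mu_i + h_i(b)$
Fenchel duality theorem:
$\min_b \; g(b) + h(b) = \max_\lambda \min_b \; g(b) + b^T \lambda - h^*(\lambda)$, with $\lambda^* = -\nabla g(b^*)$
Applying it with $g(b) = f(b) + b^T \mu_i$ and $h = h_i$ gives $\lambda_i = -\mu_i - \nabla f(b^*)$
Algorithm: repeat until convergence
Output: $b^*$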
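24
Sequential message-passing
Bregman’s successive projection algorithm (Hildreth, Dykstra, Csiszar, Tseng)
$h_i(b) = 0$ if $b \in C_i$, and $h_i(b) = \infty$ otherwise
$\min_b \; f(b) + \sum_i h_i(b) = \min_{b \in \cap_i C_i} f(b)$
For t = 1, 2, ...
  For i = 1, 2, ..., n
    $\mu_i \leftarrow \sum_{j \ne i} \lambda_j$
    $b^* \leftarrow \arg\min_{b \in \mathrm{domain}(h_i)} f(b) + b^T \mu_i + h_i(b)$
    $\lambda_i \leftarrow -\mu_i - \nabla f(b^*)$
Output: $b^*$
[Figure: bipartite graph linking the functions $h_1, \ldots, h_i, \ldots, h_n$ to the coordinates $b_1, \ldots, b_\alpha, \ldots, b_m$; the edges carry the messages.]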
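The successive-projection special case is easy to run. The sketch below is one possible instance, under assumptions chosen here: $f(b) = \tfrac12 \|b - a\|^2$, each $C_i$ a half-space, and the multipliers $\lambda_i$ initialized to zero. With this $f$, the block update $\arg\min_{b \in C_i} f(b) + b^T \mu_i$ is the Euclidean projection of $a - \mu_i$ onto $C_i$, and $\nabla f(b) = b - a$. Function names are hypothetical.

```python
import numpy as np

def project_halfspace(y, w, c):
    """Euclidean projection of y onto the half-space {b : w.b <= c}."""
    excess = w @ y - c
    return y if excess <= 0 else y - (excess / (w @ w)) * w

def sequential_message_passing(a, halfspaces, n_sweeps=100):
    """Minimize 0.5 * ||b - a||^2 over the intersection of the half-spaces."""
    lam = [np.zeros_like(a) for _ in halfspaces]  # assumed initialization
    b_star = a.copy()
    for _ in range(n_sweeps):
        for i, (w, c) in enumerate(halfspaces):
            # mu_i <- sum of the other multipliers
            mu_i = sum(lam[j] for j in range(len(lam)) if j != i)
            # b* <- argmin over C_i of f(b) + b^T mu_i = projection of a - mu_i
            b_star = project_halfspace(a - mu_i, w, c)
            # lambda_i <- -mu_i - grad f(b*), with grad f(b) = b - a
            lam[i] = -mu_i - (b_star - a)
    return b_star

a = np.array([1.0, 3.0])
halfspaces = [(np.array([0.0, 1.0]), 0.0),   # C_1: b_2 <= 0
              (np.array([1.0, 1.0]), 0.0)]   # C_2: b_1 + b_2 <= 0
b = sequential_message_passing(a, halfspaces)

# Plain alternating projection (no multipliers), for comparison.
naive = project_halfspace(project_halfspace(a, *halfspaces[0]), *halfspaces[1])
print(b, naive)
```

In this example the closest point of the intersection to `a` is the origin. Alternating projections without the multipliers stop at a feasible point that is not the closest one, which is what the $\lambda_i$ bookkeeping corrects.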
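25
Convex Belief Propagation
$\min_{b \in \mathbb{R}^m} f(b) + \sum_{i=1}^n h_i(b)$
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha \bar{c}_\alpha H(b_\alpha) - \sum_i \bar{c}_i H(b_i) - \sum_i \sum_{\alpha \in N(i)} c_{i\alpha} \big( H(b_\alpha) - H(b_i) \big)$
(the terms indexed by i form $h_i$, which is finite where the marginals of $x_i$ agree)
For t = 1, 2, ...
  For i = 1, 2, ..., n
    $\mu_i \leftarrow \sum_{j \ne i} \lambda_j$
    $b^* \leftarrow \arg\min_{b \in \mathrm{domain}(h_i)} f(b) + b^T \mu_i + h_i(b)$
    $\lambda_i \leftarrow -\mu_i - \nabla f(b^*)$
Output: $b^*$
[Figure: bipartite graph linking the functions $h_1, \ldots, h_i, \ldots, h_n$ to the coordinates $b_1, \ldots, b_\alpha, \ldots, b_m$; the edges carry the messages.]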
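26
Convex Belief Propagation
Closed-form solution for $b^* \leftarrow \arg\min_{b \in \mathrm{domain}(h_i)} f(b) + b^T \mu_i + h_i(b)$
Our algorithm:
[Equations: updates for $m_{\alpha \to i}(x_i)$, $b_i(x_i)$ and $n_{i \to \alpha}(x_i)$ with the same structure as the belief propagation updates below, carrying additional exponents built from hatted constants $\hat{c}$ (such as $1/\hat{c}$); the exact exponents are not recoverable from this transcript.]
Belief propagation:
$b_i(x_i) \propto \prod_{\alpha \in N(i)} m_{\alpha \to i}(x_i)$
$n_{i \to \alpha}(x_i) \propto b_i(x_i) / m_{\alpha \to i}(x_i)$
$m_{\alpha \to i}(x_i) := \sum_{x_\alpha \setminus x_i} \psi_\alpha(x_\alpha) \prod_{j \in N(\alpha) \setminus i} n_{j \to \alpha}(x_j)$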
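28
Convex Belief Propagation
$\min_b f(b) + \sum_i h_i(b)$
Interesting Notes:
1) When f(), $h_i$() are non-convex: if the algorithm converges, then it converges to a stationary point.
2) Set $c_\alpha = 1$, $c_i = 1 - d_i$, $c_{i\alpha} = 0$ (Bethe free energy, non-convex!): our message-passing algorithm = belief propagation.
3) For general $c_\alpha$, with $c_i = 1 - \sum_{\alpha \in N(i)} c_\alpha$ and $c_{i\alpha} = 0$ (non-convex!), our algorithm takes the form
[Equations: updates for $n_{i \to \alpha}(x_i)$ and $m_{\alpha \to i}(x_i)$ of the belief propagation form, with an exponent $1/c_\alpha$ in the $m_{\alpha \to i}$ update; the exact placement of the exponent is not recoverable from this transcript.]
Open question: is this the same as “Fractional BP” (Wiegerinck, Heskes ’03)?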
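29
Determine Convex Energy
$\min_b \; -\sum_{\alpha, x_\alpha} b_\alpha(x_\alpha) \ln \psi_\alpha(x_\alpha) - \sum_\alpha c_\alpha H(b_\alpha) - \sum_i c_i H(b_i)$
Heuristic: find a convex energy in which $c_\alpha$ is as close to 1 as possible, with $c_i = 1 - \sum_{\alpha \in N(i)} c_\alpha$
Motivation: the Bethe approximation has $c_\alpha = 1$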
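30
Experiments – Ising Model
n × n grid, $n^2$ variables. Every variable has two values: $x_i \in \{-1, 1\}$
$p(x_1, \ldots, x_n) \propto \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \prod_i \psi_i(x_i)$
$\psi_{ij}(x_i, x_j) = \exp(\theta_{ij} x_i x_j)$ (interaction), $\psi_i(x_i) = \exp(\theta_i x_i)$ (field)
$\theta_{ij} > 0$: attractive
$\theta_{ij}$ of either sign: mixed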
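For very small grids the exact marginals of this Ising model can be computed by enumeration, which is the kind of reference needed to measure a marginal error. The sketch below assumes the potentials $\exp(\theta_{ij} x_i x_j)$ and $\exp(\theta_i x_i)$ with $x_i \in \{-1, 1\}$; the function names, the row-major variable indexing and the parameter values are illustrative assumptions.

```python
import itertools
import numpy as np

def grid_edges(n):
    """Edges of an n x n grid; variable (r, c) has index r * n + c."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append((r * n + c, r * n + c + 1))
            if r + 1 < n:
                edges.append((r * n + c, (r + 1) * n + c))
    return edges

def exact_marginals(n, theta_i, theta_ij):
    """P(x_i = +1) for every variable, by enumerating all 2^(n*n) states."""
    edges = grid_edges(n)
    states = np.array(list(itertools.product([-1, 1], repeat=n * n)))
    # log of the unnormalized probability: field terms plus interaction terms
    log_p = states @ theta_i
    for (i, j), t in zip(edges, theta_ij):
        log_p = log_p + t * states[:, i] * states[:, j]
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    return ((states == 1) * p[:, None]).sum(axis=0)

n = 3
rng = np.random.default_rng(0)
attractive = rng.uniform(0.0, 1.0, len(grid_edges(n)))  # theta_ij >= 0
no_field = exact_marginals(n, np.zeros(n * n), attractive)
with_field = exact_marginals(n, np.full(n * n, 0.1), attractive)
print(no_field)
print(with_field)
```

With zero field the model is symmetric under flipping every spin, so each marginal equals 0.5; a positive field together with attractive interactions pushes every marginal above 0.5.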
31
Experiments – Efficiency
[Plot: running time in seconds versus grid size.]
32
Approximating Marginals – Ising model
[Plot: marginal error versus interaction strength, convex free energies vs. Bethe free energy.]
34
Approximating Marginals – Random Graph
10 vertices. Edge density 30% (almost tree), 50% (far from tree)
TRW free energy vs. L2-convex free energy
35
Summary
• Novel perspective of message passing as primal-dual optimization for the general class of $\min_b f(b) + \sum_i h_i(b)$
• Convex free energy mapped to $\min_b f(b) + \sum_i h_i(b)$
• Heuristic for setting the convex free energy by “convexifying” the Bethe free energy
• Algorithm applies to general clique sizes; BP is a particular case of our algorithm
• Region graph energies (Kikuchi)
• Future work: theoretical foundation for our heuristic for setting the convex free energy.