ICCV2009: MAP Inference in Discrete Models: Part 3
Transcript of ICCV2009: MAP Inference in Discrete Models: Part 3
MAP Inference in Discrete Models
M. Pawan Kumar, Stanford University
The Problem
E(x) = ∑i fi(xi) + ∑ij gij(xi, xj) + ∑c hc(xc)
Unary: fi        Pairwise: gij        Higher order: hc
Minimize E(x) ….. Done !!!
Problems worthy of attack
Prove their worth by fighting back
Energy Minimization Problems
Space of Problems
[Diagram: the space of energy minimization problems, containing CSP and MAX-CUT; most of the space is NP-hard]
The Issues
• Which functions are exactly solvable? Boros & Hammer [1965], Kolmogorov & Zabih [ECCV 2002, PAMI 2004], Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli, Kumar & Torr [CVPR 2007, PAMI 2008], Ramalingam, Kohli, Alahari & Torr [CVPR 2008], Kohli, Ladicky & Torr [CVPR 2008, IJCV 2009], Zivny & Jeavons [CP 2008]
• Approximate solutions of NP-hard problems: Schlesinger [1976], Kleinberg & Tardos [FOCS 1999], Chekuri et al. [2001], Boykov et al. [PAMI 2001], Wainwright et al. [NIPS 2001], Werner [PAMI 2007], Komodakis et al. [PAMI 2005, 2007], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag & Jaakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]
• Scalability and efficiency: Kohli & Torr [ICCV 2005, PAMI 2007], Juan & Boykov [CVPR 2006], Alahari, Kohli & Torr [CVPR 2008], Delong & Boykov [CVPR 2008]
Popular Inference Methods
• Classical move-making algorithms: Iterated Conditional Modes (ICM), Simulated Annealing
• Message passing: Dynamic Programming (DP), Belief Propagation (BP), Tree-Reweighted message passing (TRW), Diffusion
• Combinatorial algorithms: Graph Cut (GC), Branch & Bound
• Convex optimization: relaxation methods (linear programming, ...)
Road Map
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan
Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet
Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet
Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:30-17.30 Recent Advances: Dual-decomposition, higher-order,
etc. (Carsten Rother + Pawan Kumar)
Notation
Labels - li, lj, …
Labeling - f : {a, b, …} → {i, j, …}
Random variables - Va, Vb, …
Unary potential - θa;f(a)
Pairwise potential - θab;f(a)f(b)
MAP Estimation
f* = argminf Q(f; θ)
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
Min-marginals
qa;i = minf Q(f; θ) s.t. f(a) = i
Energy Function
Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
Reparameterization
[Figure: Va-Vb with unary potentials θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4 and pairwise potentials θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0]

f(a) f(b) Q(f; θ)
0    0    7
0    1    10
1    0    5
1    1    6

Add a constant (here 2) to all θa;i
Subtract that constant from all θb;k
Every labeling's energy changes by + 2 − 2, so Q(f; θ') = Q(f; θ)
Reparameterization
[Figure: the same Va-Vb model]
Add a constant (here 3) to one θb;k (here θb;1)
Subtract that constant from θab;ik for all 'i'
The table rows with f(b) = 1 change by − 3 + 3; the others are untouched, so again Q(f; θ') = Q(f; θ)
Reparameterization
[Figure: further examples with constants 2, 1 and 4, all leaving Q(f; θ) unchanged]
In general, for "messages" Mab;k and Mba;i:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik − Mab;k − Mba;i
Q(f; θ') = Q(f; θ)
Reparameterization
θ' is a reparameterization of θ, written θ' ≡ θ, iff
Q(f; θ') = Q(f; θ), for all f
Equivalently, for some messages Mab;k and Mba;i:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik − Mab;k − Mba;i
Kolmogorov, PAMI, 2006
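The invariance Q(f; θ') = Q(f; θ) is easy to check numerically. The sketch below (plain Python, using the two-variable example from these slides; the message values are arbitrary, chosen only for illustration) applies the message-based reparameterization and confirms that every labeling keeps its energy:

```python
import itertools

# Toy model from the slides: two variables, two labels.
theta_a = [5.0, 2.0]
theta_b = [2.0, 4.0]
theta_ab = [[0.0, 1.0], [1.0, 0.0]]

def energy(fa, fb, ta, tb, tab):
    return ta[fa] + tb[fb] + tab[fa][fb]

# Arbitrary messages (assumed values, for illustration only).
M_ab = [3.0, 2.0]   # message from a to b, indexed by the label of b
M_ba = [1.0, -0.5]  # message from b to a, indexed by the label of a

# Reparameterize: move the messages between pairwise and unary terms.
ta2 = [theta_a[i] + M_ba[i] for i in range(2)]
tb2 = [theta_b[k] + M_ab[k] for k in range(2)]
tab2 = [[theta_ab[i][k] - M_ab[k] - M_ba[i] for k in range(2)] for i in range(2)]

# Q(f; theta') = Q(f; theta) for every labeling f.
for fa, fb in itertools.product(range(2), range(2)):
    assert energy(fa, fb, ta2, tb2, tab2) == energy(fa, fb, theta_a, theta_b, theta_ab)
```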
Recap
MAP Estimation
f* = argminf Q(f; θ)
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
Min-marginals
qa;i = minf Q(f; θ) s.t. f(a) = i
Q(f; θ') = Q(f; θ), for all f  ⟺  θ' ≡ θ
Reparameterization
Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Belief Propagation
• Some MAP problems are easy: belief propagation gives the exact MAP for chains and trees
• The trick: a clever reparameterization
Two Variables
[Figure: Va-Vb with θa = (5, 2), θb = (2, 4) and the pairwise potentials above]
Choose the right constant, so that θ'b;k = qb;k
Mab;0 = min { θa;0 + θab;00, θa;1 + θab;10 } = min { 5 + 0, 2 + 1 } = 3
Two Variables
After reparameterizing with Mab;0 = 3: θ'b;0 = 2 + 3 = 5 = qb;0, attained by f(a) = 1
Potentials along the red path add up to 0
Mab;1 = min { θa;0 + θab;01, θa;1 + θab;11 } = min { 5 + 1, 2 + 0 } = 2
Two Variables
Similarly θ'b;1 = 4 + 2 = 6 = qb;1, attained by f(a) = 1
We get all the min-marginals of Vb
Minimum of min-marginals = MAP estimate: f*(b) = 0, then backtrack to f*(a) = 1
Recap
We only need to know two sets of equations
General form of reparameterization:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik − Mab;k − Mba;i
Reparameterization of (a,b) in belief propagation:
Mab;k = mini { θa;i + θab;ik }
Mba;i = 0
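A quick numerical check of this choice of message (plain Python, two-variable example from these slides; brute-force min-marginals are compared against the reparameterized unaries):

```python
import itertools

theta_a = [5.0, 2.0]
theta_b = [2.0, 4.0]
theta_ab = [[0.0, 1.0], [1.0, 0.0]]

def Q(fa, fb):
    return theta_a[fa] + theta_b[fb] + theta_ab[fa][fb]

# BP message on edge (a,b): M_ab;k = min_i { theta_a;i + theta_ab;ik }
M_ab = [min(theta_a[i] + theta_ab[i][k] for i in range(2)) for k in range(2)]

# Reparameterized unaries of V_b ...
theta_b_new = [theta_b[k] + M_ab[k] for k in range(2)]

# ... equal the min-marginals q_b;k = min over all f with f(b) = k
q_b = [min(Q(fa, k) for fa in range(2)) for k in range(2)]
assert theta_b_new == q_b  # both are [5.0, 6.0]
```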
Three Variables
[Figure: chain Va-Vb-Vc, labels l0 and l1, with unary and pairwise potentials]
Reparameterize the edge (a,b) as before, book-keeping the minimizing label f(a) for each label of Vb
Potentials along the red path add up to 0
Reparameterize the edge (b,c) as before, book-keeping the minimizing label f(b) for each label of Vc
The new unary potentials of Vc are its min-marginals qc;0 and qc;1
Backtrack from the best label of Vc: f*(c) = 0, f*(b) = 0, f*(a) = 1
Generalizes to any length chain
Only Dynamic Programming
Why Dynamic Programming?
3 variables = 2 variables + book-keeping
n variables = (n−1) variables + book-keeping
Start from the left, go to the right
Reparameterize the current edge (a,b):
Mab;k = mini { θa;i + θab;ik }
θ'ab;ik = θab;ik − Mab;k        θ'b;k = θb;k + Mab;k
Repeat
The reparameterization constants Mab;k are called messages, and the procedure is message passing
Why stop at dynamic programming?
Three Variables
[Figure: the chain after the forward pass]
Reparameterize the edge (c,b) as before: now θ'b;i = qb;i
Reparameterize the edge (b,a) as before: now θ'a;i = qa;i
Forward pass + backward pass: all min-marginals are computed
Belief Propagation on Chains
Start from left, go to right
Reparameterize current edge (a,b)
Mab;k = mini { θa;i + θab;ik }
θ'ab;ik = θab;ik − Mab;k        θ'b;k = θb;k + Mab;k
Repeat till the end of the chain
Start from right, go to left
Repeat till the end of the chain
Belief Propagation on Chains
• A way of computing reparam constants
• Generalizes to chains of any length
• Forward Pass - Start to End
• MAP estimate
• Min-marginals of final variable
• Backward Pass - End to start
• All other min-marginals
Won’t need this .. But good to know
Computational Complexity
• Each constant takes O(|L|)
• Number of constants: O(|E||L|)
• Total time: O(|E||L|²)
• Memory required: O(|E||L|)
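The forward-backward procedure on a chain fits in a few lines. A sketch in plain Python (the three-variable chain below uses the potentials of the slides' running example as far as they can be reconstructed; treat them as assumed values), checked against brute force:

```python
import itertools

def chain_min_marginals(unary, pairwise):
    """unary[a][i]; pairwise[a][i][k] is the potential between
    variables a and a+1 taking labels i and k."""
    n, L = len(unary), len(unary[0])
    fwd = [row[:] for row in unary]   # forward-reparameterized unaries
    bwd = [row[:] for row in unary]
    for a in range(n - 1):            # forward pass: left to right
        for k in range(L):
            fwd[a + 1][k] += min(fwd[a][i] + pairwise[a][i][k] for i in range(L))
    for a in range(n - 1, 0, -1):     # backward pass: right to left
        for i in range(L):
            bwd[a - 1][i] += min(bwd[a][k] + pairwise[a - 1][i][k] for k in range(L))
    # q_a;i = forward + backward contributions, counting unary[a] only once
    return [[fwd[a][i] + bwd[a][i] - unary[a][i] for i in range(L)] for a in range(n)]

# Three-variable toy chain (assumed Potts-like pairwise tables)
unary = [[5, 2], [2, 4], [4, 6]]
pairwise = [[[0, 1], [1, 0]], [[0, 1], [1, 0]]]
q = chain_min_marginals(unary, pairwise)

# Check every min-marginal against brute-force enumeration
for a in range(3):
    for i in range(2):
        best = min(sum(unary[v][f[v]] for v in range(3))
                   + sum(pairwise[v][f[v]][f[v + 1]] for v in range(2))
                   for f in itertools.product(range(2), repeat=3) if f[a] == i)
        assert q[a][i] == best
```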
Belief Propagation on Trees
[Figure: tree rooted at Va with leaves Vd, Ve, Vg, Vh]
Forward pass: leaf → root
Backward pass: root → leaf
All min-marginals are computed
Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Belief Propagation on Cycles
[Figure: cycle Va-Vb-Vc-Vd with unary potentials θa;i, θb;i, θc;i, θd;i]
Where do we start? Arbitrarily
Reparameterize (a,b), then (b,c), then (c,d), then (d,a)
At each step, potentials along the red path add up to 0
But after going around the cycle the unaries of Va have changed (by − θa;0, − θa;1): we did not obtain min-marginals
Reparameterize (a,b) again
But doesn't this overcount some potentials?
Yes. But we will do it anyway
Keep reparameterizing edges in some order
No convergence guarantees
Belief Propagation
• Generalizes to any arbitrary random field
• Complexity per iteration: O(|E||L|²)
• Memory required: O(|E||L|)
Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Computational Issues of BP
Complexity per iteration: O(|E||L|²)
Special pairwise potentials: θab;ik = wab d(|i − k|)
[Plots of d(|i − k|): Potts, truncated linear, truncated quadratic]
For these, each message can be computed in O(|L|), giving O(|E||L|) per iteration (Felzenszwalb & Huttenlocher, 2004)
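For the Potts case the speed-up is easy to see: the message is either the target label's own entry or the global minimum plus the penalty. A sketch in plain Python (the unaries and the weight w are assumed values), computing the message in O(|L|) instead of O(|L|²):

```python
def potts_message(theta_a, w):
    """M_ab;k = min_i { theta_a;i + w * [i != k] }, computed in O(|L|)."""
    m = min(theta_a)  # best label of V_a, which pays the Potts penalty w
    return [min(theta_a[k], m + w) for k in range(len(theta_a))]

theta_a = [5.0, 2.0, 7.0, 1.0]  # assumed unaries
w = 2.0                          # assumed Potts weight

msg = potts_message(theta_a, w)

# Same result as the naive O(|L|^2) computation
naive = [min(theta_a[i] + (w if i != k else 0.0) for i in range(len(theta_a)))
         for k in range(len(theta_a))]
assert msg == naive  # [3.0, 2.0, 3.0, 1.0]
```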
Computational Issues of BP
Memory requirements: O(|E||L|)
Half of original BP (Kolmogorov, 2006)
Some approximations exist, but memory still remains an issue
Yu, Lin, Super & Tan, 2007; Lasserre, Kannan & Winn, 2007
Computational Issues of BP
Order of reparameterization:
• Randomly
• In some fixed order
• The one that results in maximum change: residual belief propagation (Elidan et al., 2006)
Summary of BP
Exact for chains
Exact for trees
Approximate MAP for general cases
Convergence not guaranteed
So can we do something better?
Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
Integer Programming Formulation
[Figure: Va-Vb, label l0 (top) and label l1 (bottom), with the potentials above]
Unary potentials: θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4
Labeling: f(a) = 1, f(b) = 0
ya;0 = 0, ya;1 = 1, yb;0 = 1, yb;1 = 0
Any f(.) has equivalent boolean variables ya;i
Integer Programming Formulation
Find the optimal variables ya;i
Integer Programming Formulation
Sum of unary potentials: ∑a ∑i θa;i ya;i
ya;i ∈ {0,1}, for all Va, li
∑i ya;i = 1, for all Va
Integer Programming Formulation
Pairwise potentials: θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0
Sum of pairwise potentials: ∑(a,b) ∑ik θab;ik ya;i yb;k
Linearize with new variables yab;ik = ya;i yb;k, giving ∑(a,b) ∑ik θab;ik yab;ik
Integer Programming Formulation
min ∑a ∑i θa;i ya;i + ∑(a,b) ∑ik θab;ik yab;ik
s.t. ya;i ∈ {0,1}, ∑i ya;i = 1, yab;ik = ya;i yb;k
In vector form: min θᵀy, with
θ = [ … θa;i … ; … θab;ik … ]
y = [ … ya;i … ; … yab;ik … ]
Solve to obtain the MAP labeling y*
But we can't solve it in general
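On the toy two-variable model the integer program can be solved by enumeration, which also checks that θᵀy reproduces Q(f; θ) for the boolean point induced by each labeling. A sketch in plain Python:

```python
import itertools

# Toy model from the slides
theta_u = {('a', 0): 5, ('a', 1): 2, ('b', 0): 2, ('b', 1): 4}
theta_p = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def objective(fa, fb):
    # theta^T y for the integer point y induced by the labeling (fa, fb):
    # y_a;i = [f(a) = i], y_ab;ik = y_a;i * y_b;k, so theta^T y = Q(f; theta)
    return theta_u[('a', fa)] + theta_u[('b', fb)] + theta_p[(fa, fb)]

# Enumerate all feasible integer points (exactly one label per variable)
best = min(itertools.product(range(2), range(2)), key=lambda f: objective(*f))
print(best, objective(*best))  # (1, 0) with energy 5: the MAP labeling
```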
Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
Linear Programming Relaxation
min θᵀy
s.t. ya;i ∈ {0,1}, ∑i ya;i = 1, yab;ik = ya;i yb;k
Two reasons why we can't solve this: the integer constraint and the bilinear constraint
Relax ya;i ∈ {0,1} to ya;i ∈ [0,1]: one reason remains
Replace the bilinear constraint by its marginalization (using ∑k yb;k = 1):
∑k yab;ik = ∑k ya;i yb;k = ya;i ∑k yb;k = ya;i
Linear Programming Relaxation:
min θᵀy
s.t. ya;i ∈ [0,1], ∑i ya;i = 1, ∑k yab;ik = ya;i
No reason why we can't solve this*
*memory requirements, time complexity
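For the two-variable toy model the relaxation is small enough to hand to an off-the-shelf LP solver. A sketch using scipy.optimize.linprog (the variable ordering, the use of scipy, and the symmetric marginalization constraints are my own choices; on a single edge the relaxation is tight, so the LP optimum equals the MAP energy):

```python
from scipy.optimize import linprog

# Variable order: [ya;0, ya;1, yb;0, yb;1, yab;00, yab;01, yab;10, yab;11]
c = [5, 2, 2, 4, 0, 1, 1, 0]  # theta for the toy model from the slides

A_eq = [
    [1, 1, 0, 0, 0, 0, 0, 0],    # sum_i ya;i = 1
    [0, 0, 1, 1, 0, 0, 0, 0],    # sum_k yb;k = 1
    [-1, 0, 0, 0, 1, 1, 0, 0],   # sum_k yab;0k = ya;0
    [0, -1, 0, 0, 0, 0, 1, 1],   # sum_k yab;1k = ya;1
    [0, 0, -1, 0, 1, 0, 1, 0],   # symmetric constraint: sum_i yab;i0 = yb;0
    [0, 0, 0, -1, 0, 1, 0, 1],   # symmetric constraint: sum_i yab;i1 = yb;1
]
b_eq = [1, 1, 0, 0, 0, 0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
print(res.fun)  # 5.0: for a single edge the relaxation is tight (the MAP energy)
```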
Dual of the LP Relaxation (Wainwright et al., 2001)
min θᵀy, s.t. ya;i ∈ [0,1], ∑i ya;i = 1, ∑k yab;ik = ya;i
[Figure: 3×3 grid Va…Vi decomposed into row trees θ1, θ2, θ3 and column trees θ4, θ5, θ6 with weights ρ1, …, ρ6]
∑i ρi θi = θ,  ρi ≥ 0
Each tree has optimal energy q*(θi)
Dual of the LP: max ∑i ρi q*(θi), s.t. ∑i ρi θi ≡ θ (a reparameterization of θ), ρi ≥ 0
I can easily compute q*(θi)
I can easily maintain the reparameterization constraint
So can I easily solve the dual?
Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
TRW Message Passing (Kolmogorov, 2006)
[Figure: grid decomposed into row and column trees; dual objective ∑i ρi q*(θi) with ∑i ρi θi ≡ θ]
Pick a variable Va, shared here by tree 1 (the chain Vc-Vb-Va, unaries θ1a;i) and tree 4 (the chain Va-Vd-Vg, unaries θ4a;i)
Dual value: ρ1 q*(θ1) + ρ4 q*(θ4) + K, where K collects the remaining trees
Reparameterize both chains (one pass of belief propagation each), so that the unaries of Va become min-marginals: θ'1a;i and θ'4a;i
The values q*(·) remain the same, and
ρ1 q*(θ'1) + ρ4 q*(θ'4) + K = ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K
Compute the weighted average of the min-marginals of Va:
θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)
θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)
Substituting θ''a;i in both trees gives the new dual value
(ρ1 + ρ4) min{θ''a;0, θ''a;1} + K
Since min{p1 + p2, q1 + q2} ≥ min{p1, q1} + min{p2, q2},
the objective function increases or remains constant
TRW Message Passing
Initialize θi. Take care of the reparameterization constraint
Choose a random variable Va
Compute the min-marginals of Va for all trees
Node-average the min-marginals
REPEAT
Kolmogorov, 2006
Can also do edge-averaging
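The node-averaging step and the reason the dual cannot decrease are both visible in a few lines. A sketch in plain Python (the min-marginal vectors and the weights ρ are assumed example values):

```python
# Min-marginals of V_a in two trees that share it (assumed example values),
# and the tree weights rho.
m1 = [7.0, 3.0]        # theta'_1a;i, min-marginals of V_a in tree 1
m4 = [2.0, 6.0]        # theta'_4a;i, min-marginals of V_a in tree 4
rho1, rho4 = 1.0, 1.0

# Node-averaging: both trees get the same unaries for V_a
avg = [(rho1 * m1[i] + rho4 * m4[i]) / (rho1 + rho4) for i in range(2)]

# Dual contribution before: rho1 * min m1 + rho4 * min m4
before = rho1 * min(m1) + rho4 * min(m4)
# Dual contribution after: (rho1 + rho4) * min of the averaged unaries
after = (rho1 + rho4) * min(avg)

# min{p1+p2, q1+q2} >= min{p1,q1} + min{p2,q2}: the dual never decreases
assert after >= before  # here 9.0 >= 5.0
```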
Example 1
[Figure: three trees (Va,Vb), (Vb,Vc), (Vc,Va) with weights ρ1 = ρ2 = ρ3 = 1; labels l0, l1; initial tree energies q*(θi) = 5, 6, 7]
Pick variable Va. Reparameterize, then average the min-marginals of Va: tree energies become 7, 6, 7
Pick variable Vb. Reparameterize, then average the min-marginals of Vb: tree energies become 6.5, 6.5, 7
Value of the dual does not increase
Maybe it will increase for Vc: NO
All trees agree on one labeling: f1(a) = 0, f1(b) = 0, f2(b) = 0, f2(c) = 0, f3(c) = 0, f3(a) = 0
Strong Tree Agreement
Exact MAP Estimate
Example 2
[Figure: the same decomposition with different potentials; initial tree energies q*(θi) = 4, 0, 4]
Pick variable Va. Reparameterize, then average the min-marginals of Va: tree energies stay 4, 0, 4
Value of the dual does not increase
Maybe it will increase for Vb or Vc: NO
Trees 1 and 3 prefer f1(a) = 1, f1(b) = 1, f3(c) = 1, f3(a) = 1, while tree 2 has two optimal labelings: f2(b) = 1, f2(c) = 0 and f2(b) = 0, f2(c) = 1
Weak Tree Agreement
Not the exact MAP estimate
Weak tree agreement is the convergence point of TRW
Obtaining the Labeling
TRW only solves the dual. Primal solutions?
θ' = ∑i ρi θi
[Figure: 3×3 grid Va…Vi]
Fix the label of Va, then of Vb, and continue in some fixed order
Meltzer et al., 2006
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
Computational Issues of TRW
• Speed-ups for some pairwise potentials
Basic Component is Belief Propagation
Felzenszwalb & Huttenlocher, 2004
• Memory requirements cut down by half
Kolmogorov, 2006
• Further speed-ups using monotonic chains
Kolmogorov, 2006
Theoretical Properties of TRW
• Always converges, unlike BP
Kolmogorov, 2006
• Strong tree agreement implies exact MAP
Wainwright et al., 2001
• Optimal MAP for two-label submodular problems
Kolmogorov and Wainwright, 2005
θab;00 + θab;11 ≤ θab;01 + θab;10
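This condition is cheap to check for every edge before choosing a solver. A sketch in plain Python (the second pairwise table is an assumed counter-example; the first is the toy table from these slides):

```python
def is_submodular(theta_ab):
    """Check theta_ab;00 + theta_ab;11 <= theta_ab;01 + theta_ab;10
    for a two-label pairwise potential given as a 2x2 table."""
    return theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]

assert is_submodular([[0, 1], [1, 0]])       # Potts-like table: 0 + 0 <= 1 + 1
assert not is_submodular([[2, 0], [0, 2]])   # favors disagreement: 4 > 0
```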
Summary
• Trees can be solved exactly - BP
• No guarantee of convergence otherwise - BP
• Strong Tree Agreement - TRW-S
• Submodular energies solved exactly - TRW-S
• TRW-S solves an LP relaxation of MAP estimation