A Study of Lagrangean Decompositions and Dual Ascent ... · 1 CRF: dualize constraints between...

1
A Study of Lagrangean Decompositions and Dual Ascent Solvers for Graph Matching Paul Swoboda 1 Carsten Rother 2 Hassan Abu Alhaija 2 Dagmar Kainmueller 3 Bogdan Savchynskyy 2 1 IST Austria 2 TU Dresden, Germany 3 MPI CBG, Germany e-mail: [email protected] Goal and contribution Goal: We study efficient Lagrangean decomposition based algorithms for graph matching and propose new state-of-the-art ones. Contribution: Several Lagrangean decompositions of the graph matching problems: 1 as a CRF 2 with a network flow subproblem 3 with additional label factors Associated dual block coordinate ascent algorithms. Extensive numerical comparison against leading Lagrangean decomposition based solvers [4, 5]. Practical impact: Test data and C++ implementation available at https://github.com/pawelswoboda/LP_MP. Theoretical foundations of used algorithms [3]. Graph matching problem Graph G =(V, E). Discrete universe of labels L. Discrete label space X = Q u V X u with X u ⊆L. Unary costs θ u : X u R for u V. Pairwise costs θ uv : X u × X v R for uv E. min x X X u V θ u (x u )+ X uv E θ uv (x uv ) | {z } CRF s.t. x u 6= x v u 6= v | {z } uniqueness constraints 1) Transformation to CRF Modify pairwise potentials: Penalize same label for different nodes. For each u , v V with X u X v 6= : Set θ uv (x , x )= x X u X v . Example: uv E, X u = {1, 2, 3, 4} = X v . θ uv = θ uv (1, 2) θ uv (1, 3) θ uv (1, 4) θ uv (2, 1) θ uv (2, 3) θ uv (2, 4) θ uv (3, 1) θ uv (3, 2) θ uv (3, 4) θ uv (4, 1) θ uv (4, 2) θ uv (4, 3) LP-relaxation for CRF Optimization over local polytope: Simplex Δ n = {x R n + : n i =1 x i = 1}. min μL G hθ, μi := X u V hθ u u i + X uv E hθ uv uv i L G = μ u Δ |X u | , u V, μ uv Δ |X uv | , uv E, x v X v μ uv (x u , x v )= μ u (x u ), uv E, x u X u . Example: constraints between unary and pairwise variables. x u μ u (x u ) = 1 x u μ u (x u ) = 1 μ u (1) = x v μ uv (1, x v ) x u μ uv (x u , 3) = μ v (3) 2) With underlying network flow subproblem Model uniqueness constraint x u 6= x v u 6= V as network flow problem: s M = min μL G ∩M hθ, μi x v x u x w l 2 l 1 l 3 t μ u (l 1 )+μ u (l 2 )+μ u (l 3 )=1 v (l 1 )+μ v (l 2 )+μ v (l 3 )=1 μ w (l 1 )+μ w (l 2 )+μ w (l 3 )=1 μ u (l 1 ) μ u (l 2 ) μ u (l 3 ) μ v (l 1 ) μ v (l 2 ) μ v (l 3 ) μ w (l 1 ) μ w (l 2 ) μ w (l 3 ) μ u (l 1 )+μ v (l 1 )+μ w (l 1 )=1 μ u (l 2 )+μ v (l 2 )+μ w (l 2 )=1 μ u (l 3 )+μ v (l 3 )+μ w (l 3 )=1 3) With label factors μ u (l 1 ) μ u (l 2 ) μ u (l 3 ) = 1 μ u , u V μ v (l 1 ) μ v (l 2 ) μ v (l 3 ) = 1 μ w (l 1 ) μ w (l 2 ) μ w (l 3 ) = 1 ˜ μ l 1 (u ) ˜ μ l 1 (v ) ˜ μ l 1 (w ) = 1 ˜ μ l , l ∈L ˜ μ l 2 (u ) ˜ μ l 2 (v ) ˜ μ l 2 (w ) = 1 ˜ μ l 3 (u ) ˜ μ l 3 (v ) ˜ μ l 3 (w ) = 1 = = = = = = = = = Label factor for each label l ∈L. Each label factor forces label to be taken at most once. Optimization problem: min μL G , ˜ μ hθ, μi s.t. μ L G ˜ μ s conv(X s ), s ∈L μ u (s ) = ˜ μ s (u ), s X u . Optimization Lagrangean decomposition, constraint dualization: 1 CRF: dualize constraints between unary/pairwise variables. 2 Network flow: additionally dualize constraints between unary/network flow variables. 3 Label factor: additionally dualize constraints between unary/label factors. Optimize with dual block coordinate ascent [3]. 1 CRF: dual block coordinate ascent TRWS [1]. 2 Network flow: augment TRWS [1] by passing messages to network flow factor M G . 3 Label factor: augment TRWS [1] by passing messages to and from label factors. Round primal solution 1 CRF: via dynamic programming. 2 Network flow: use matching from network flow subproblem. 3 Label factor: via dynamic programming. Higher order extension Triplets T V 3 . Ternary costs θ uvw : X u × X v × X w R for uvw T. Higher order graph matching problem: min x X X u V θ u (x u )+ X uv E θ uv (x uv )+ X uvw T θ uvw (x uvw ) s.t. x u 6= x v u 6= v Relaxations and optimzers can be easily adapted to higher order. We use triplets for tightening relaxation [2]. Experimental results: averaged log(gap) vs runtime plots 10 -2 10 -1 10 0 10 1 10 -6 10 -4 10 -2 10 0 10 2 10 4 runtime(s) house log(energy - lower bound) AMP-C AMCF-C GM-O HBP-C DD 10 -1 10 0 10 1 10 -6 10 -4 10 -2 10 0 10 2 runtime(s) hotel log(energy - lower bound) AMP-C AMCF-C GM-O HBP-C DD 10 0 10 1 10 2 10 0 10 1 10 2 runtime(s) car log(energy - lower bound) AMP-C AMCF-C GM-O HBP-C DD 10 0 10 1 10 2 10 -1 10 0 10 1 10 2 runtime(s) motor log(energy - lower bound) AMP-C AMCF-C GM-O HBP-C DD 10 -1 10 0 10 1 10 0 10 1 10 2 10 3 runtime(s) graph flow log(energy - lower bound) AMP-O AMCF-O GM-O HBP-O DD 10 -1 10 0 10 1 10 2 10 1 10 2 10 3 10 4 10 5 runtime(s) worms log(energy - lower bound) AMP-O AMCF-O GM-O HBP-O DD Our solvers: 1 CRF: GM [1]. 2 Network flow: AMCF. 3 Label factors: AMP. Competing solvers: DD [4], HBP [5] Experimental results: averaged dual lower bound, primal energy, runtime Dataset/Algorithm AMP AMCF GM [1] HBP [5] DD [4] #I 105 LB -4293.00 -4293.00 -4293.94 -4294.98 -4293.00 #V 30 UB -4293.00 -4293.00 -4290.12 -4287.00 -4293.00 hotel #L 30 time(s) 3.07 4.31 14.91 16.76 0.42 #I 105 LB -3778.13 -3778.13 -3778.13 -3778.13 -3778.13 #V 30 UB -3778.13 -3778.13 -3778.13 -3778.13 -3778.13 house #L 30 time(s) 2.81 3.02 1.63 14.76 2.36 #I 6 LB -2849.96 -2853.95 -2865.20 -2888.88 -2840.29 #V 126 UB -2836.57 -2829.49 -2798.11 -2181.96 -2840.00 graph flow #L 126 time(s) 218.22 231.53 216.39 194.43 17.92 #I 30 LB -69.56 -69.57 -69.78 -69.77 -74.17 #V 49 UB -69.27 -68.58 -68.32 -69.23 -57.11 car #L 49 time(s) 656.86 698.74 776.58 572.41 83.71 #I 20 LB -62.97 -62.98 -63.02 -62.98 -64.25 #V 52 UB -62.95 -62.83 -62.71 -62.93 -59.60 motor #L 52 time(s) 70.82 98.57 285.19 317.61 69.22 #I 30 LB -48471.21 -48491.81 -48517.50 -48495.92 -62934.54 #V 605 UB -48453.90 -48305.90 -3316.69 -48288.65 -5254.08 worms #L 1500 time(s) 213.89 614.80 229.21 295.67 1471.95 References [1] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006. [2] D. Sontag, D. K. Choe, and Y. Li. Efficiently searching for frustrated cycles in map inference. In UAI, pages 795–804. AUAI Press, 2012. [3] P. Swoboda, J. Kuske, and B. Savchynskyy. A dual ascent framework for Lagrangean decomposition of combinatorial problems. In CVPR, 2017. [4] L. Torresani, V. Kolmogorov, and C. Rother. A dual decomposition approach to feature correspondence. IEEE Trans. Pattern Anal. Mach. Intell., 35(2):259–271, 2013. [5] Z. Zhang, Q. Shi, J. McAuley, W. Wei, Y. Zhang, and A. van den Hengel. Pairwise matching through max-weight bipartite belief propagation. In CVPR, 2016. Acknowledgements. The first author was supported by the European Research Council under the European Unions Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 616160. The three last authors were supported by the European Research Council under the European Union ˆ as Horizon 2020 research and innovation programme (grant agreement No 647769) and the last author by DFG Grant ”ERBI” SA 2640/1-1.

Transcript of A Study of Lagrangean Decompositions and Dual Ascent ... · 1 CRF: dualize constraints between...

A Study of Lagrangean Decompositions and Dual Ascent Solvers for Graph MatchingPaul Swoboda1 Carsten Rother2 Hassan Abu Alhaija2 Dagmar Kainmueller3 Bogdan Savchynskyy2

1 IST Austria 2 TU Dresden, Germany 3 MPI CBG, Germanye-mail: [email protected]

Goal and contributionGoal: We study efficient Lagrangean decomposition based algorithms for graph matchingand propose new state-of-the-art ones.Contribution:

Several Lagrangean decompositions of the graph matching problems:1 as a CRF2 with a network flow subproblem3 with additional label factorsAssociated dual block coordinate ascent algorithms.Extensive numerical comparison against leading Lagrangean decomposition basedsolvers [4, 5].

Practical impact:Test data and C++ implementation available athttps://github.com/pawelswoboda/LP_MP.

Theoretical foundations of used algorithms [3].

Graph matching problemGraph G = (V,E).Discrete universe of labels L.Discrete label space X =

∏u∈V Xu with Xu ⊆ L.

Unary costs θu : Xu → R for u ∈ V.Pairwise costs θuv : Xu × Xv → R for uv ∈ E.

minx∈X

∑u∈V

θu(xu) +∑uv∈E

θuv(xuv)︸ ︷︷ ︸CRF

s.t. xu 6= xv ∀u 6= v︸ ︷︷ ︸uniqueness constraints

1) Transformation to CRFModify pairwise potentials:

Penalize same label for different nodes.For each u, v ∈ V with Xu ∩ Xv 6= ∅:

Set θuv(x , x) =∞ ∀x ∈ Xu ∩ Xv .

Example: uv ∈ E, Xu = {1,2,3,4} = Xv . θuv =

∞ θuv(1,2) θuv(1,3) θuv(1,4)θuv(2,1) ∞ θuv(2,3) θuv(2,4)θuv(3,1) θuv(3,2) ∞ θuv(3,4)θuv(4,1) θuv(4,2) θuv(4,3) ∞

LP-relaxation for CRFOptimization over local polytope: Simplex ∆n = {x ∈ Rn

+ :∑n

i=1 xi = 1}.

minµ∈LG

〈θ, µ〉 :=∑u∈V

〈θu, µu〉 +∑uv∈E

〈θuv , µuv〉

LG =

µu ∈ ∆|Xu|, u ∈ V,µuv ∈ ∆|Xuv |, uv ∈ E,∑xv∈Xv

µuv(xu, xv) = µu(xu), uv ∈ E, xu ∈ Xu .

Example: constraints between unary and pairwise variables.

∑ x uµ

u(x

u)=

1 ∑xu

µu (x

u )=

1

µu(1) =∑

xvµuv(1, xv)

∑xuµuv(xu,3) = µv(3)

2) With underlying network flow subproblemModel uniqueness constraint xu 6= xv ∀u 6= V as network flow problem:

sM =

minµ∈LG∩M 〈θ, µ〉

xv

xu

xw

l2

l1

l3

tµu(l1)+µu(l2)+µu(l3)=1

v(l 1)+µ v(

l 2)+µ v(

l 3)=1

µw (l1 )+

µw (l2 )+

µw (l3 )=1

µu(l1)µu (l2 )µ

u (l3 )

µ v(l 1)

µv(l2)µv (l3 )

µ w(l 1)

µw(l 2)

µw(l3)

µu (l1 )+

µv (l1 )+

µw (l1 )=1

µu(l2)+µv(l2)+µw(l2)=1

µ u(l 3)+µ v(

l 3)+µw(l 3)=

1

3) With label factors

µu(l1)µu(l2)µu(l3)∑ =

1

µu, u ∈ V

µv(l1)µv(l2)µv(l3)∑ =

1µw(l1)µw(l2)µw(l3)∑ =

1

µl1(u)µl1(v)µl1(w)

∑=

1

µl, l ∈ L

µl2(u)µl2(v)µl2(w)

∑=

1

µl3(u)µl3(v)µl3(w)

∑=

1

=

=

=

=

=

=

=

=

=

Label factor for each label l ∈ L.Each label factor forces label to betaken at most once.

Optimization problem:

minµ∈LG,µ 〈θ, µ〉

s.t.µ ∈ LGµs ∈ conv(Xs), s ∈ Lµu(s) = µs(u), s ∈ Xu .

OptimizationLagrangean decomposition, constraint dualization:1 CRF: dualize constraints between unary/pairwise variables.2 Network flow: additionally dualize constraints between unary/network flow variables.3 Label factor: additionally dualize constraints between unary/label factors.

Optimize with dual block coordinate ascent [3].1 CRF: dual block coordinate ascent TRWS [1].2 Network flow: augment TRWS [1] by passing messages to network flow factorMG.3 Label factor: augment TRWS [1] by passing messages to and from label factors.

Round primal solution1 CRF: via dynamic programming.2 Network flow: use matching from network flow subproblem.3 Label factor: via dynamic programming.

Higher order extension

Triplets T ⊂(

V3

).

Ternary costs θuvw : Xu × Xv × Xw → R for uvw ∈ T.

Higher order graph matching problem:

minx∈X

∑u∈V

θu(xu) +∑uv∈E

θuv(xuv) +∑

uvw∈T

θuvw(xuvw) s.t. xu 6= xv ∀u 6= v

Relaxations and optimzers can be easily adapted to higher order.We use triplets for tightening relaxation [2].

Experimental results: averaged log(gap) vs runtime plots

10−2 10−1 100 101

10−6

10−4

10−2

100

102

104

runtime(s)house

log(energy-lower

bound)

AMP-CAMCF-CGM-OHBP-CDD

10−1 100 101

10−6

10−4

10−2

100

102

runtime(s)hotel

log(energy-lower

bound)

AMP-CAMCF-CGM-OHBP-CDD

100 101 102

100

101

102

runtime(s)car

log(energy-lower

bound)

AMP-CAMCF-CGM-OHBP-CDD

100 101 102

10−1

100

101

102

runtime(s)motor

log(en

ergy

-lower

bou

nd)

AMP-CAMCF-CGM-OHBP-CDD

10−1 100 101

100

101

102

103

runtime(s)graph flow

log(en

ergy

-lower

bou

nd)

AMP-OAMCF-OGM-OHBP-ODD

10−1 100 101 102101

102

103

104

105

runtime(s)worms

log(en

ergy

-lower

bou

nd)

AMP-OAMCF-OGM-OHBP-ODD

Our solvers:1 CRF: GM [1].2 Network flow: AMCF.3 Label factors: AMP.

Competing solvers: DD [4], HBP [5]

Experimental results: averaged dual lower bound, primal energy, runtimeDataset/Algorithm AMP AMCF GM [1] HBP [5] DD [4]

#I 105 LB -4293.00 -4293.00 -4293.94 -4294.98 -4293.00#V 30 UB -4293.00 -4293.00 -4290.12 -4287.00 -4293.00

hotel

#L 30 time(s) 3.07 4.31 14.91 16.76 0.42#I 105 LB -3778.13 -3778.13 -3778.13 -3778.13 -3778.13#V 30 UB -3778.13 -3778.13 -3778.13 -3778.13 -3778.13

house

#L 30 time(s) 2.81 3.02 1.63 14.76 2.36#I 6 LB -2849.96 -2853.95 -2865.20 -2888.88 -2840.29#V ≤ 126 UB -2836.57 -2829.49 -2798.11 -2181.96 -2840.00

graph

flow

#L ≤ 126 time(s) 218.22 231.53 216.39 194.43 17.92#I 30 LB -69.56 -69.57 -69.78 -69.77 -74.17#V ≤ 49 UB -69.27 -68.58 -68.32 -69.23 -57.11c

ar

#L ≤ 49 time(s) 656.86 698.74 776.58 572.41 83.71#I 20 LB -62.97 -62.98 -63.02 -62.98 -64.25#V ≤ 52 UB -62.95 -62.83 -62.71 -62.93 -59.60

motor

#L ≤ 52 time(s) 70.82 98.57 285.19 317.61 69.22#I 30 LB -48471.21 -48491.81 -48517.50 -48495.92 -62934.54#V ≤ 605 UB -48453.90 -48305.90 -3316.69 -48288.65 -5254.08

worms

#L ≤ 1500 time(s) 213.89 614.80 229.21 295.67 1471.95

References[1] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans.

Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006.

[2] D. Sontag, D. K. Choe, and Y. Li. Efficiently searching for frustrated cycles in map inference. In UAI,pages 795–804. AUAI Press, 2012.

[3] P. Swoboda, J. Kuske, and B. Savchynskyy. A dual ascent framework for Lagrangean decompositionof combinatorial problems. In CVPR, 2017.

[4] L. Torresani, V. Kolmogorov, and C. Rother. A dual decomposition approach to featurecorrespondence. IEEE Trans. Pattern Anal. Mach. Intell., 35(2):259–271, 2013.

[5] Z. Zhang, Q. Shi, J. McAuley, W. Wei, Y. Zhang, and A. van den Hengel. Pairwise matching throughmax-weight bipartite belief propagation. In CVPR, 2016.

Acknowledgements. The first author was supported by the European Research Council under the European Unions Seventh Framework Programme(FP7/2007-2013)/ERC grant agreement no 616160. The three last authors were supported by the European Research Council under the European Unionas Horizon

2020 research and innovation programme (grant agreement No 647769) and the last author by DFG Grant ”ERBI” SA 2640/1-1.