ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504:...

21
ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Dhruv Batra Virginia Tech Topics Markov Random Fields: MAP Inference Integer Programming, LP formulation Dual Decomposition Readings: KF 13.1-5, Barber 5.1,28.9

Transcript of ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504:...

Page 1: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

ECE 6504: Advanced Topics in Machine Learning

Probabilistic Graphical Models and Large-Scale Learning

Dhruv Batra Virginia Tech

Topics –  Markov Random Fields: MAP Inference

–  Integer Programming, LP formulation –  Dual Decomposition

Readings: KF 13.1-5, Barber 5.1,28.9

Page 2: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Administrativia •  HW1

–  Solutions and grades released

•  HW2 –  Solutions released –  Grades next week

•  Project Presentations –  When: April 22, 24 –  Where: in class –  5 min talk

•  Main results •  Semester completion 2 weeks out from that point so nearly finished

results expected •  Slides due: April 21 11:55pm

(C) Dhruv Batra 2

Page 3: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Recap of Last Time

(C) Dhruv Batra 3

Page 4: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP Inference

(C) Dhruv Batra 4

MAP

Inference

Most Likely Assignment

y1

y2

yn

Person

Table Plate

The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then open

S(y) =�

i∈Vθi(yi) +

(i,j)∈E

θij(yi, yj)

P (y) =1

Z eS(y)

kx1 1 1 10 0

kxk

10

10 10

10

0

0

Node Scores / Local Rewards

Edge Scores / Distributed Prior

P (y)

y

Page 5: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP Inference •  Why is MAP difficult?

•  What if we independently maximize the terms?

(C) Dhruv Batra 5

Page 6: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs

•  Over-Complete Representation

6 (C) Dhruv Batra

G = (V, E)

X1

X2

Xn Xi kxk

… …

θij(1, 1) θij(1, k)

θij(k, 1) θij(k, k)

θ1(1)

θ1(k)

kx1

x1 (x1, x2) (xn−1, xn)

θn(1)

θn(k)

… …

θij(1, 1) θij(1, k)

θij(k, 1) θij(k, k)

… xn…

kx1

µi(s)1

1

0

0

0

0

0

0

θ = θ1(1) . . . θ1(k) θn(1) . . . θn(k) θn−1,n(1, 1) . . . θn−1,n(k, k)θ12(1, 1) . . . θ12(k, k)

1

0

0

0

xi = 1

0

1

0

0

xi = 2

µ1(1) . . . µ1(k) µn(1) . . . µn(k)

xi

Page 7: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs

•  Over-Complete Representation

7 (C) Dhruv Batra

G = (V, E)

x1

x2

xn Xi

x1 (x1, x2) (xn−1, xn)… xn…

θ = θ1(1) . . . θ1(k) θn(1) . . . θn(k) θn−1,n(1, 1) . . . θn−1,n(k, k)θ12(1, 1) . . . θ12(k, k)

k2x1

µij(s, t)

xi

xj

1 0 0 0 0 0 0 0 0 0 0 0

xi = 1xj = 1

0 1 0 0 0 0 0 0 0 0 0 0

xj = 2xi = 1

µ1(1) . . . µ1(k) µn(1) . . . µn(k) µ12(1, 1) . . . µ12(k, k) µn−1,n(1, 1) . . . µn−1,n(k, k)µx =

S(x) = θ · µx

Page 8: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  Integer Program

(C) Dhruv Batra 8

s

µij(s, t) = µj(t)

s,t

µi,j(s, t) = 1

s

µi(s) = 1

µij(s, t) ∈ {0, 1}

µi(s) ∈ {0, 1}Indicator Variables

Unique Label

Consistent Assignments

maxµ

θTµ

Page 9: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  MAP Integer Program

(C) Dhruv Batra 9

µx5

µx4µx3

µx2

µx1

maxµ

θTµ

s.t. Aµ = b

µ(·) ∈ {0, 1}

Page 10: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  MAP Linear Program

(C) Dhruv Batra 10

µx5

µx4µx3

µx2

µx1

A = O(|E|)

O(|E|)

Off-the-shelf solvers CPLEX Mosek

etc

maxµ

θTµ

s.t. Aµ = b

µ(·) ∈ [0, 1]

Page 11: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Plan for today •  MRF Inference

–  (Specialized) MAP Inference •  Integer Programming Formulation •  Linear Programming Relaxation

–  Understanding the LP better –  When is it tight? –  When is it not?

•  Dual Decomposition –  Algorithm for solving this LP

(C) Dhruv Batra 11

Page 12: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  MAP Integer Program

(C) Dhruv Batra 12

µx5

µx4µx3

µx2

µx1

maxµ

θTµ

s.t. Aµ = b

µ(·) ∈ {0, 1}

Page 13: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Marginal Polytope

(C) Dhruv Batra 13

•  a

Figure Credit: David Sontag

Page 14: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  MAP Linear Program

•  Properties –  If LP-opt is integral, MAP is found –  LP always integral for trees –  Efficient message-passing schemes for solving LP

(C) Dhruv Batra 14

µx5

µx4µx3

µx2

µx1

maxµ

θTµ

s.t. Aµ = b

µ(·) ∈ [0, 1]

Page 15: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

LP Relaxation •  Block Co-ordinate / Sub-gradient Descent on Dual

(C) Dhruv Batra 15

A = O(|E|)

O(|E|)

λij→j

λji→i

Page 16: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

LP Relaxation •  Block Co-ordinate / Sub-gradient Descent on Dual

(C) Dhruv Batra 16

A = O(|E|)

O(|E|)

λ(t)ji→i

λ(t)ij→j

Page 17: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

LP Relaxation •  Block Co-ordinate / Sub-gradient Descent on Dual

(C) Dhruv Batra 17

A = O(|E|)

O(|E|)

λ(t+1)ij→j

λ(t+1)ji→i

Distributed Message-Passing

Still inefficient!

Page 18: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Linear Programming Duality

(C) Dhruv Batra 18 Figure Credit: David Sontag

Page 19: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

Dual Decomposition •  For MAP Inference

–  On board

(C) Dhruv Batra 19

Page 20: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP in Pairwise MRFs •  MAP Integer Program

(C) Dhruv Batra 20

maxµ

θTµ

s.t. Aµ = b

µ(·) ∈ {0, 1}

µx5

µx4µx3

µx2

µx1

Page 21: ECE 6504: Advanced Topics in Machine Learnings14ece6504/slides/L17_mrf_inference… · ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale

MAP LP •  Lagrangian Relaxation

(C) Dhruv Batra 21

Dual

Convex (Non-smooth)

Upper-Bound on MAP

f(λ) =

Subgradient Descent

∇f(λ(0))

λ(0)

∇f(λ(1))

λ(1)

maxµ∈C

i

θi · µi +�

(i,j)

θij · µij

s.t. µi(·),µij(·) ∈ {0, 1}

minλ≥0

f(λ)

λ

MAP score

f(λ)

−λ · (Aµ− b)