Gaussian Mixture Models and Expectation-Maximization Algorithm.


2

The RGB Domain

A regular image

3

The RGB Domain

Image pixels in RGB space

4

Pixel Clusters

Suppose we cluster the points into 2 clusters

5

Pixel Clusters

The result in image space

6

Normal Distribution (1D Gaussian)

$$f(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

$\mu$ = mean, $\sigma$ = standard deviation ($\sigma^2$ = variance)
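As an illustration, a minimal NumPy sketch of this density (the function name `gaussian_1d` is ours, not from the slides):

```python
import numpy as np

def gaussian_1d(x, mu, sigma):
    """Density of N(mu, sigma^2) at x (scalar or array)."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)
```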

7

2D Gaussians

d = 2, x = random data point (2D vector), $\mu$ = mean value (2D vector), $\Sigma$ = covariance matrix (2×2 matrix)

$$f(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma)}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$

The same equation holds for a 3D Gaussian
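A hedged NumPy sketch of this d-dimensional density; it works unchanged for d = 2 or d = 3 (`gaussian_nd` is our name):

```python
import numpy as np

def gaussian_nd(x, mu, cov):
    """Density of a d-dimensional Gaussian N(mu, cov) at point x (all arrays)."""
    d = mu.shape[0]
    diff = x - mu
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} diff without forming the inverse.
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm
```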


9

Exploring Covariance Matrix

$x_i$ = random vector $(w_i, h_i)^T$

$$\Sigma = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)(x_i - \mu)^T = \begin{pmatrix} \sigma_w^2 & \mathrm{cov}(w,h) \\ \mathrm{cov}(h,w) & \sigma_h^2 \end{pmatrix}$$

$\Sigma$ is symmetric, so it has an eigendecomposition (SVD): $\Sigma = V D V^T$, with eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0$.
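A short NumPy sketch of these two steps, estimating $\Sigma$ from samples and taking its eigendecomposition (the synthetic data matrix `X` is a made-up stand-in):

```python
import numpy as np

# X: N x 2 matrix of (w, h) samples -- synthetic data for illustration.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])

mu = X.mean(axis=0)
Sigma = (X - mu).T @ (X - mu) / X.shape[0]   # sample covariance (1/N convention)

# Sigma is symmetric, so eigh gives a real eigendecomposition Sigma = V D V^T.
lam, V = np.linalg.eigh(Sigma)               # eigenvalues in ascending order
lam, V = lam[::-1], V[:, ::-1]               # reorder so lambda_1 >= lambda_2

# Geometrically (next slide): the iso-density ellipse axes point along the
# columns of V, with half-lengths proportional to sqrt(lam).
```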

10

Covariance Matrix Geometry

$$\Sigma = V D V^T, \qquad a = \sqrt{\lambda_1}\, v_1, \quad b = \sqrt{\lambda_2}\, v_2$$

(Figure: the iso-density contour of the Gaussian is an ellipse whose axes $a$ and $b$ point along the eigenvectors $v_1$, $v_2$, with lengths proportional to $\sqrt{\lambda_1}$, $\sqrt{\lambda_2}$.)

11

3D Gaussians

$x_i$ = random vector $(r_i, g_i, b_i)^T$

$$\Sigma = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)(x_i - \mu)^T = \begin{pmatrix} \sigma_r^2 & \mathrm{cov}(g,r) & \mathrm{cov}(b,r) \\ \mathrm{cov}(r,g) & \sigma_g^2 & \mathrm{cov}(b,g) \\ \mathrm{cov}(r,b) & \mathrm{cov}(g,b) & \sigma_b^2 \end{pmatrix}$$

12

GMMs – Gaussian Mixture Models

Suppose we have 1000 data points in 2D space (w, h).

(Figure: the points scattered in the (W, H) plane.)

13

GMMs – Gaussian Mixture Models

Assume each data point is normally distributed. Obviously, there are 5 sets of underlying Gaussians.

(Figure: the same (W, H) scatter plot, grouped into 5 visible clusters.)

14

The GMM assumption

There are K components (Gaussians). Each component j is specified by three parameters: a weight, a mean, and a covariance matrix. The total density function is:

$$f_\Theta(x) = \sum_{j=1}^{K} \alpha_j \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_j)}} \exp\left(-\frac{1}{2}(x - \mu_j)^T \Sigma_j^{-1} (x - \mu_j)\right)$$

$$\Theta = \{\alpha_j, \mu_j, \Sigma_j\}_{j=1}^{K}, \qquad 0 \le \alpha_j \le 1, \qquad \sum_{j=1}^{K} \alpha_j = 1$$

$\alpha_j$ = weight of component j
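A minimal sketch of this total density, reusing the `gaussian_nd` helper sketched earlier (all names are ours, not from the slides):

```python
def gmm_density(x, alphas, mus, covs):
    """Total GMM density at point x.

    alphas: (K,) weights summing to 1; mus: (K, d) means; covs: (K, d, d).
    """
    return sum(a * gaussian_nd(x, m, S) for a, m, S in zip(alphas, mus, covs))
```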

15

The EM algorithm (Dempster, Laird and Rubin, 1977)

(Figure: raw data, the fitted GMM components (K = 6), and the total density function.)

16

EM Basics

Objective: given N data points, find the maximum-likelihood estimate of $\Theta$:

$$\Theta^* = \arg\max_\Theta f(x_1, \dots, x_N \mid \Theta)$$

Algorithm:

1. Guess an initial $\Theta$.

2. Perform the E step (expectation): based on $\Theta$, associate each data point with a specific Gaussian.

3. Perform the M step (maximization): based on the clustering of the data points, re-estimate $\Theta$ to maximize the likelihood.

4. Repeat steps 2-3 until convergence (typically tens of iterations). A sketch of the objective being maximized follows below.
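The objective is easiest to monitor in log form; a small sketch, assuming the `gmm_density` helper from above:

```python
import numpy as np

def log_likelihood(X, alphas, mus, covs):
    """Log-likelihood of the data X (N x d) under the GMM.

    EM is guaranteed not to decrease this quantity between iterations.
    """
    return float(sum(np.log(gmm_density(x, alphas, mus, covs)) for x in X))
```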

17

EM Details

E-step (estimate the probability that point t is associated with Gaussian j):

$$w_{t,j} = \frac{\alpha_j f(x_t \mid \mu_j, \Sigma_j)}{\sum_{i=1}^{K} \alpha_i f(x_t \mid \mu_i, \Sigma_i)}, \qquad j = 1,\dots,K, \quad t = 1,\dots,N$$

M-step (estimate new parameters):

$$\alpha_j^{new} = \frac{1}{N}\sum_{t=1}^{N} w_{t,j}$$

$$\mu_j^{new} = \frac{\sum_{t=1}^{N} w_{t,j}\, x_t}{\sum_{t=1}^{N} w_{t,j}}$$

$$\Sigma_j^{new} = \frac{\sum_{t=1}^{N} w_{t,j}\,(x_t - \mu_j^{new})(x_t - \mu_j^{new})^T}{\sum_{t=1}^{N} w_{t,j}}$$
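Putting the two steps together, a plain NumPy sketch of one EM iteration (it reuses the `gaussian_nd` helper from earlier; a production implementation would work in log-space and regularize the covariances):

```python
import numpy as np

def em_step(X, alphas, mus, covs):
    """One EM iteration for a GMM. X: (N, d) data, K components."""
    N, d = X.shape
    K = len(alphas)

    # E-step: responsibilities w[t, j] (each row sums to 1).
    w = np.empty((N, K))
    for j in range(K):
        w[:, j] = alphas[j] * np.array([gaussian_nd(x, mus[j], covs[j]) for x in X])
    w /= w.sum(axis=1, keepdims=True)

    # M-step: re-estimate the parameters from the responsibilities.
    Nj = w.sum(axis=0)                # effective number of points per component
    new_alphas = Nj / N
    new_mus = (w.T @ X) / Nj[:, None]
    new_covs = np.empty((K, d, d))
    for j in range(K):
        diff = X - new_mus[j]
        new_covs[j] = (w[:, j, None] * diff).T @ diff / Nj[j]
    return new_alphas, new_mus, new_covs, w
```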

18

EM Example

(Figure: a data point t and a Gaussian j; the blue shading marks the responsibility $w_{t,j}$.)

19–25

EM Example

(Figures only.)

26

Back to Clustering

We want to label “close” pixels with the same label. Proposed metric: label pixels from the same Gaussian with the same label, i.e. label according to the maximum probability:

$$\mathrm{label}(t) = \arg\max_j \, w_{t,j}$$

Number of labels = K
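With the responsibilities arranged in an N × K array `w`, as in the `em_step` sketch above, this labeling is one line of NumPy:

```python
labels = w.argmax(axis=1)   # label(t) = argmax_j w[t, j]
```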

Graph-Cut Optimization

28

Motivation for Graph-Cuts

Let’s recall the car example

29

Motivation for Graph-Cuts

Suppose we have two clusters in color space. Each pixel is colored by its associated Gaussian index.

30

Motivation for Graph-Cuts

A Problem: Noise

Why? Pixel labeling is done independently for each pixel, ignoring the spatial relationships between pixels!

31

Formalizing a New Labeling Problem

Previous model for labeling:

$$f(p) = \mathrm{label}(p) = \arg\max_j \, w_{p,j}, \qquad j = 1, \dots, K \ \text{(Gaussians)}, \quad p \in \text{image pixels}$$

A new model for labeling – minimize E:

$$E(f) = E_{data}(f) + \lambda \, E_{smooth}(f)$$

f = labeling function, assigning a label $f_p$ to each pixel p; $E_{data}$ = data term; $E_{smooth}$ = smoothness term; $\lambda$ is a free parameter.

32

The Energy Function

$$E(f) = E_{data}(f) + \lambda \, E_{smooth}(f)$$

Labels set: { j = 1, …, K }

$E_{data}$ penalizes disagreement between a pixel and the GMM:

$$E_{data}(f) = \sum_{p \in \text{pixels}} D_p(f_p), \qquad D_p(f_p) = 1 - w_{p, f_p}$$

$E_{smooth}$ penalizes disagreement between two neighboring pixels, unless it is a natural edge in the image:

$$E_{smooth}(f) = \sum_{\substack{(p,q) \\ \text{neighbors}}} V_{p,q}(f_p, f_q), \qquad V_{p,q}(f_p, f_q) = \begin{cases} 0 & f_p = f_q \\ \frac{1}{\mathrm{dist}(p,q)} & \text{otherwise} \end{cases}$$

dist(p, q) = normalized color distance between p and q
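A hedged sketch of evaluating this energy on a 4-connected grid; the labeling `f` is an integer image, and the small epsilon in the color distance is our addition to avoid division by zero:

```python
import numpy as np

def energy(f, w, img, lam, eps=1e-6):
    """E(f) = E_data(f) + lam * E_smooth(f) on a 4-connected pixel grid.

    f: (H, W) integer labels; w: (H, W, K) per-pixel responsibilities;
    img: (H, W, 3) float image used for the color distance dist(p, q).
    """
    # Data term: D_p(f_p) = 1 - w[p, f_p], summed over all pixels.
    e_data = np.sum(1.0 - np.take_along_axis(w, f[..., None], axis=2))

    # Smoothness term: 1 / dist(p, q) for each differently-labeled neighbor pair,
    # over vertical and horizontal neighbor pairs.
    e_smooth = 0.0
    for differ, dist in (
        (f[1:, :] != f[:-1, :], np.linalg.norm(img[1:, :] - img[:-1, :], axis=2)),
        (f[:, 1:] != f[:, :-1], np.linalg.norm(img[:, 1:] - img[:, :-1], axis=2)),
    ):
        e_smooth += np.sum(differ / (dist + eps))

    return e_data + lam * e_smooth
```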

33

Minimizing the Energy

Solving min(E) is NP-hard. It is possible to approximate the solution using iterative methods. Graph-Cuts based methods approximate the global solution (up to a constant factor) in polynomial time.

Read: “Fast Approximate Energy Minimization via Graph Cuts”, Y. Boykov, O. Veksler and R. Zabih, PAMI 2001.

34

α-expansion moves

When using iterative methods, some of the pixels change their labeling in each iteration.

Given a label α, a move from partition P (labeling f) to a new partition P' (labeling f') is called an α-expansion move if:

$$P_\alpha \subset P'_\alpha \quad \text{and} \quad P'_l \subset P_l \ \ \text{for every label } l \neq \alpha$$

(Figure: a current labeling compared with a one-pixel move, an α-β-swap move, and an α-expansion move.)

35

Algorithm for Minimizing E(f)

1. Start with an arbitrary labeling f.

2. Set success = 0.

3. For each label α:
3.1 Find f' = argmin E(f') among all f' within one α-expansion of f.
3.2 If E(f') < E(f), set f = f' and success = 1.

4. If success == 1, go to step 2.

5. Return f.

How do we find argmin E(f')? (The outer loop itself is sketched below.)
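Before answering that, here is a sketch of the outer loop above; `expansion_move` is a hypothetical helper standing in for the min-cut computation developed in the following slides:

```python
def minimize_energy(f, labels, expansion_move, energy):
    """Alpha-expansion outer loop (steps 1-5 above).

    expansion_move(f, alpha) is assumed to return the best labeling
    within one alpha-expansion of f (found via min-cut in practice).
    """
    success = True
    while success:                            # step 4: repeat while any move helped
        success = False
        for alpha in labels:                  # step 3: try expanding each label
            f_new = expansion_move(f, alpha)  # step 3.1
            if energy(f_new) < energy(f):     # step 3.2
                f, success = f_new, True
    return f                                  # step 5
```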

36

A Reminder: min-cut / max-flow

Given two terminal nodes α and β in G = (V, E), a cut is a set of edges C ⊆ E that separates α from β in G' = (V, E∖C). Moreover, no proper subset of C separates α from β in G'.

The cost of a cut is defined as the sum of all the edge weights in the cut. The minimum cut of G is the cut C with the lowest cost.

The minimum-cut problem is solvable in practically linear time.
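As an illustration only, a toy min-cut between two terminals using networkx (this is not the max-flow package referenced at the end of these slides, and the capacities are made up):

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("alpha", "p", capacity=3.0)
G.add_edge("alpha", "q", capacity=1.0)
G.add_edge("p", "q", capacity=1.0)
G.add_edge("p", "beta", capacity=1.0)
G.add_edge("q", "beta", capacity=3.0)

# Cost of the minimum cut and the two sides of the partition it induces.
cut_value, (alpha_side, beta_side) = nx.minimum_cut(G, "alpha", "beta")
print(cut_value, alpha_side, beta_side)
```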

37

Finding the Optimal Expansion Move

Problem: find f' = argmin E(f') among all f' within one α-expansion of f.

Solution: translate the problem into a min-cut problem on an appropriately defined graph.

38

Graph Structure for Optimal Expansion Move

(Figure: each pixel node p is connected to terminal α by an edge $t_p^{\alpha}$ and to terminal $\overline{\alpha}$ by an edge $t_p^{\overline{\alpha}}$; any cut C severs exactly one of the two.)

The cut C induces the labeling

$$f'_p = \begin{cases} \alpha & \text{if } t_p^{\overline{\alpha}} \in C \\ f_p & \text{if } t_p^{\alpha} \in C \end{cases}$$

There is a 1-1 correspondence between cuts and labelings, and the minimum cut yields the labeling that minimizes E(f)!

39

A Closer Look

Each pixel gets a node.

(Figure: pixel nodes P1, P2, Pα.)

40

A Closer Look

Add auxiliary nodes between pixels with different labels.

(Figure: pixel nodes P1, P2, Pα with auxiliary nodes between differently-labeled neighbors.)

41

A Closer Look

Add two terminal nodes for α and not(α).

(Figure: the graph with both terminals added.)

42–46

A Closer Look

(Figures only.)

47

Implementation Notes

The neighborhood system can be 4-connected pixels, 8-connected, or even larger.

λ determines the ratio between the data term and the smoothness term.

Solving min(E) is simpler, and possible in polynomial time, when only two labels are involved (see “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images”, Y. Boykov and M.-P. Jolly, 2001).

There is a ready-to-use package for solving max-flow (see http://www.cs.cornell.edu/People/vnk/software/maxflow-v2.2.src.tar.gz).

Final Project: Optimized Color Transfer

www.cs.tau.ac.il/~gamliela/color_transfer_project/color_transfer_project.htm