Learning Inhomogeneous Gibbs Models


Ce Liu (celiu@microsoft.com)

How to Describe the Virtual World

Histogram

• Histogram: marginal distribution of image variances
• Non-Gaussian distributed

Texture Synthesis (Heeger et al., 95)

• Image decomposition by steerable filters
• Histogram matching

FRAME (Zhu et al., 97)

• Homogeneous Markov random field (MRF)
• Minimax entropy principle to learn a homogeneous Gibbs distribution
• Gibbs sampling and feature selection

Our Problem

To learn the distribution of structural signals

Challenges
• How to learn non-Gaussian distributions in high dimensions from a small number of observations?
• How to capture the sophisticated properties of the distribution?
• How to optimize parameters with global convergence?

Inhomogeneous Gibbs Models (IGM)

A framework to learn arbitrary high-dimensional distributions
• 1D histograms on linear features to describe the high-dimensional distribution
• Maximum entropy principle: Gibbs distribution
• Minimum entropy principle: feature pursuit
• Markov chain Monte Carlo in parameter optimization
• Kullback-Leibler Feature (KLF)

1D Observation: Histograms

Feature $\varphi(x): R^d \rightarrow R$
• Linear feature: $\varphi(x) = \varphi^T x$
• Kernel distance: $\varphi(x) = \|\varphi - x\|$

Marginal distribution of $f(x)$ along feature $\varphi$ (histogram):
$$h_{\varphi}(z) = \int \delta\big(z - \varphi^T x\big)\, f(x)\, dx$$

Empirical histogram from samples $\{x_i\}_{i=1}^{N}$:
$$H_{\varphi} = \frac{1}{N}\sum_{i=1}^{N}\delta\big(\varphi^T x_i\big), \qquad \delta\big(\varphi^T x_i\big) = (0,\ldots,0,1,0,\ldots,0)$$
where $\delta(\varphi^T x_i)$ is the one-hot indicator of the histogram bin containing $\varphi^T x_i$.
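As a concrete illustration (not the authors' code), a minimal numpy sketch of the empirical histogram $H_\varphi$ for a linear feature; the bin count and range are arbitrary choices:

import numpy as np

def feature_histogram(X, phi, n_bins=32, z_range=(-3.0, 3.0)):
    """Empirical histogram H_phi of the projections phi^T x_i.

    X   : (N, d) array of samples
    phi : (d,) linear feature vector
    Returns a length-n_bins probability vector (sums to 1).
    """
    z = X @ phi                                   # project every sample onto phi
    counts, _ = np.histogram(z, bins=n_bins, range=z_range)
    return counts / counts.sum()                  # (1/N) * sum of one-hot bin indicators

# usage: compare observed and synthesized histograms along the same feature
# H_obs = feature_histogram(X_obs, phi)
# H_syn = feature_histogram(X_syn, phi)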

Intuition

[Figure: a distribution $f(x)$ projected onto two linear features $\varphi_1$ and $\varphi_2$, producing the marginal histograms $H_{\varphi_1}$ and $H_{\varphi_2}$.]

Learning Descriptive Models

[Figure: learning matches the synthesized histograms $H^{syn}_{\varphi_1}, H^{syn}_{\varphi_2}$ of the model to the observed histograms $H^{obs}_{\varphi_1}, H^{obs}_{\varphi_2}$ of $f(x)$ along the features $\varphi_1, \varphi_2$; when all histograms agree, the model $p(x)$ equals $f(x)$.]

Learning Descriptive Models

• With sufficient features, the learnt model converges to the underlying distribution
• Linear features and histograms are robust compared with other higher-order statistics
• Descriptive models:
$$\Omega = \big\{\, p(x) \;\big|\; h^{p}_{\varphi_i}(z) = h^{f}_{\varphi_i}(z),\ i = 1,\ldots,m \,\big\}$$

Maximum Entropy Principle

Maximum Entropy Model
• To generalize the statistical properties of the observed data
• To make the learnt model carry no more information than what is available

Mathematical formulation
$$p^*(x) = \arg\max_{p}\ \text{entropy}\big(p(x)\big) = \arg\max_{p}\Big\{-\int p(x)\log p(x)\,dx\Big\}$$
$$\text{subject to: } H^{p}_{\varphi_i} = H^{f}_{\varphi_i}, \quad i = 1,\ldots,m$$
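To connect this constrained problem with the Gibbs form on the next slide, the standard Lagrange multiplier argument can be sketched as follows (the multipliers $\lambda_i$ match the notation used later; the derivation itself is not spelled out in the slides):

$$\mathcal{L}\big(p,\{\lambda_i\},\gamma\big) = -\int p(x)\log p(x)\,dx \;-\; \sum_{i=1}^{m}\Big\langle \lambda_i,\ \int \delta(\varphi_i^T x)\,p(x)\,dx - H^{f}_{\varphi_i}\Big\rangle \;-\; \gamma\Big(\int p(x)\,dx - 1\Big)$$

Setting the functional derivative with respect to $p(x)$ to zero, $-\log p(x) - 1 - \sum_i \langle \lambda_i, \delta(\varphi_i^T x)\rangle - \gamma = 0$, gives the exponential (Gibbs) form
$$p^*(x) = \frac{1}{Z(\Lambda)} \exp\Big\{-\sum_{i=1}^{m} \big\langle \lambda_i,\, \delta(\varphi_i^T x)\big\rangle\Big\}.$$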

Intuition of Maximum Entropy Principle

$$\Omega_1 = \big\{\, p(x) \;\big|\; H^{p}_{\varphi_1}(z) = H^{f}_{\varphi_1}(z) \,\big\}$$

[Figure: among all distributions $p(x)$ whose histogram along $\varphi_1$ matches that of $f(x)$, the maximum entropy principle selects $p^*(x)$; its synthesized histogram $H^{syn}_{\varphi_1}$ satisfies the constraint.]

Solution form of maximum entropy model

$$p(x;\Lambda) = \frac{1}{Z(\Lambda)} \exp\Big\{-\sum_{i=1}^{m} \big\langle \lambda_i,\, \delta(\varphi_i^T x)\big\rangle\Big\}$$

Parameter: $\Lambda = \{\lambda_i\}$

Inhomogeneous Gibbs Distribution
• $\lambda_i(z)$: Gibbs potential of feature $\varphi_i$
• $\varphi_i^T x$: projection of $x$ onto feature $\varphi_i$
• $\big\langle \lambda_i,\, \delta(\varphi_i^T x)\big\rangle = \lambda_i(\varphi_i^T x)$
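As an illustration only, here is a minimal sketch of how the unnormalized Gibbs density could be evaluated from a bank of linear features and tabulated potentials; bin_index mirrors the histogram binning above and is an assumption, not the implementation used in the talk:

import numpy as np

def bin_index(z, n_bins=32, z_range=(-3.0, 3.0)):
    """Map a projected value z to its histogram bin (one-hot index)."""
    lo, hi = z_range
    idx = int((z - lo) / (hi - lo) * n_bins)
    return np.clip(idx, 0, n_bins - 1)

def gibbs_energy(x, features, potentials):
    """Energy U(x) = sum_i lambda_i(phi_i^T x) of the inhomogeneous Gibbs model.

    features   : list of (d,) vectors phi_i
    potentials : list of length-n_bins arrays, potentials[i][k] = lambda_i at bin k
    """
    return sum(lam[bin_index(phi @ x)] for phi, lam in zip(features, potentials))

def unnormalized_density(x, features, potentials):
    """p(x; Lambda) up to the partition function Z(Lambda)."""
    return np.exp(-gibbs_energy(x, features, potentials))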

Estimating Potential Function

Distribution form
$$p(x;\Lambda) = \frac{1}{Z(\Lambda)} \exp\Big\{-\sum_{i=1}^{m} \big\langle \lambda_i,\, \delta(\varphi_i^T x)\big\rangle\Big\}$$

Normalization
$$Z(\Lambda) = \int \exp\Big\{-\sum_{i=1}^{m} \big\langle \lambda_i,\, \delta(\varphi_i^T x)\big\rangle\Big\}\, dx$$

Maximum Likelihood Estimation (MLE)
$$\text{Let } L(\Lambda) = \frac{1}{N}\sum_{j=1}^{N} \log p(x_j;\Lambda), \qquad \Lambda^* = \arg\max_{\Lambda} L(\Lambda)$$

1st and 2nd order derivatives
$$\frac{\partial L}{\partial \lambda_i} = -\frac{1}{Z}\frac{\partial Z}{\partial \lambda_i} - H^{obs}_{\varphi_i} = E_{p(x;\Lambda)}\big[\delta(\varphi_i^T x)\big] - H^{obs}_{\varphi_i}$$

Parameter Learning

Monte Carlo integration
$$E_{p(x;\Lambda)}\big[\delta(\varphi_i^T x)\big] \approx H^{syn}_{\varphi_i} \;\Longrightarrow\; \frac{\partial L}{\partial \lambda_i} \approx H^{syn}_{\varphi_i} - H^{obs}_{\varphi_i}$$

Algorithm
Input: features $\{\varphi_i\}$ and observed histograms $\{H^{obs}_{\varphi_i}(z)\}$
Initialize: $\{\lambda_i\}$ and step size $s$
Loop:
  Sampling: $\{x_j\} \sim p(x;\Lambda)$
  Compute histograms: $H^{syn}_{\varphi_i}$, $i = 1{:}m$
  Update parameters: $\lambda_i \leftarrow \lambda_i + s\,\big(H^{syn}_{\varphi_i} - H^{obs}_{\varphi_i}\big)$, $i = 1{:}m$
  Histogram divergence: $D = \sum_{i=1}^{m} KL\big(H^{obs}_{\varphi_i},\, H^{syn}_{\varphi_i}\big)$
  Reduce $s$
Until $D$ falls below a threshold
Output: $\Lambda$, $\{x_j\}$
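A compact sketch of this update loop (illustrative, not the original implementation). It reuses feature_histogram from the earlier sketch and assumes a sampler draw_samples(features, potentials, n), for instance the Gibbs sampler sketched after the next slide; the step schedule and stopping threshold are arbitrary choices:

import numpy as np

def learn_potentials(X_obs, features, draw_samples, n_bins=32,
                     step=0.1, eps=1e-3, max_iters=200):
    """Iteratively match synthesized to observed feature histograms.

    X_obs        : (N, d) observed samples
    features     : list of (d,) feature vectors phi_i
    draw_samples : callable(features, potentials, n) -> (n, d) samples from p(x; Lambda)
    """
    H_obs = [feature_histogram(X_obs, phi, n_bins) for phi in features]
    potentials = [np.zeros(n_bins) for _ in features]       # initialize lambda_i = 0

    for _ in range(max_iters):
        X_syn = draw_samples(features, potentials, len(X_obs))
        H_syn = [feature_histogram(X_syn, phi, n_bins) for phi in features]

        # gradient ascent on the log-likelihood: dL/dlambda_i = H_syn - H_obs
        for lam, hs, ho in zip(potentials, H_syn, H_obs):
            lam += step * (hs - ho)

        # monitor the histogram divergence D = sum_i KL(H_obs || H_syn)
        D = sum(np.sum(ho * np.log((ho + 1e-8) / (hs + 1e-8)))
                for ho, hs in zip(H_obs, H_syn))
        step *= 0.99                                         # slowly reduce the step size
        if D < eps:
            break
    return potentials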

Gibbs Sampling

$$x_1^{(t+1)} \sim \pi\big(x_1 \,\big|\, x_2^{(t)}, x_3^{(t)}, \ldots, x_K^{(t)}\big)$$
$$x_2^{(t+1)} \sim \pi\big(x_2 \,\big|\, x_1^{(t+1)}, x_3^{(t)}, \ldots, x_K^{(t)}\big)$$
$$\vdots$$
$$x_K^{(t+1)} \sim \pi\big(x_K \,\big|\, x_1^{(t+1)}, \ldots, x_{K-1}^{(t+1)}\big)$$
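The slides do not say how the 1D conditionals are drawn. One simple stand-in is a "griddy" Gibbs sweep that evaluates the model's unnormalized density on a grid along each coordinate and samples from the resulting discrete distribution; this sketch reuses unnormalized_density from above and is only an approximation for illustration:

import numpy as np

def gibbs_sweep(x, features, potentials, grid=np.linspace(-3.0, 3.0, 101), rng=None):
    """One Gibbs sweep: resample every coordinate of x from its (discretized) conditional."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    for k in range(len(x)):
        # evaluate the unnormalized conditional p(x_k | x_-k) on a 1D grid
        probs = np.array([
            unnormalized_density(np.concatenate([x[:k], [v], x[k+1:]]),
                                 features, potentials)
            for v in grid
        ])
        probs /= probs.sum()
        x[k] = rng.choice(grid, p=probs)         # draw x_k from the discretized conditional
    return x

def draw_samples(features, potentials, n, d=None, n_sweeps=20):
    """Draw n samples by running independent Gibbs chains for a few sweeps."""
    d = len(features[0]) if d is None else d
    rng = np.random.default_rng()
    samples = []
    for _ in range(n):
        x = rng.standard_normal(d)
        for _ in range(n_sweeps):
            x = gibbs_sweep(x, features, potentials, rng=rng)
        samples.append(x)
    return np.array(samples)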

Minimum Entropy Principle

Minimum entropy principle
• To make the learnt distribution close to the observed one

Feature selection
$$KL\big(f,\, p(\,\cdot\,;\Lambda^*)\big) = \int f(x)\,\log\frac{f(x)}{p(x;\Lambda^*)}\, dx = E_f\big[\log f(x)\big] - E_f\big[\log p(x;\Lambda^*)\big] = \text{entropy}\big(p(x;\Lambda^*)\big) - \text{entropy}\big(f(x)\big)$$

Choose the feature set $\{\varphi_i\}$ that minimizes the entropy of the learnt model:
$$\{\varphi_i\}^* = \arg\min_{\{\varphi_i\}}\ \text{entropy}\big(p^*(x;\{\varphi_i\})\big)$$

Model with one additional feature $\varphi$ (potential $\lambda$) versus the current model:
$$p(x;\Lambda,\lambda,\varphi) = \frac{1}{Z_+}\exp\Big\{-\sum_{i=1}^{m}\big\langle \lambda_i,\,\delta(\varphi_i^T x)\big\rangle - \big\langle \lambda,\,\delta(\varphi^T x)\big\rangle\Big\}$$
$$p(x;\Lambda) = \frac{1}{Z}\exp\Big\{-\sum_{i=1}^{m}\big\langle \lambda_i,\,\delta(\varphi_i^T x)\big\rangle\Big\}$$

Feature Pursuit

A greedy procedure to learn the feature set $\{\varphi_i\}_{i=1}^{K}$
• Reference model: the current model $p^{ref}(x;\Lambda)$
• Candidate model: $p(x;\Lambda,\lambda,\varphi)$, which adds one feature $(\varphi,\lambda)$
• The new feature is chosen to maximize the information gain $d(\varphi)$

Approximate information gain
$$d(\varphi) = KL\big(f(x),\, p^{ref}(x;\Lambda)\big) - KL\big(f(x),\, p(x;\Lambda,\lambda,\varphi)\big)$$

Proposition
The approximate information gain for a new feature is
$$d(\varphi) \approx KL\big(H^{obs}_{\varphi},\, H^{p}_{\varphi}\big)$$
and the optimal energy function (potential) for this feature is
$$\lambda_{\varphi} = \log\frac{H^{p}_{\varphi}}{H^{obs}_{\varphi}}$$
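In code, the proposition gives both a score for a candidate feature and a closed-form initialization of its potential. A small sketch under the same conventions as the earlier snippets (h_obs and h_p are the observed and model histograms along the candidate feature):

import numpy as np

def info_gain_and_potential(h_obs, h_p, eps=1e-8):
    """Approximate information gain d(phi) = KL(H_obs || H_p) and the
    initial potential lambda_phi = log(H_p / H_obs) for a candidate feature."""
    gain = np.sum(h_obs * np.log((h_obs + eps) / (h_p + eps)))
    lam = np.log((h_p + eps) / (h_obs + eps))
    return gain, lam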

Kullback-Leibler Feature

Kullback-Leibler Feature (KLF):
$$\varphi_{KL} = \arg\max_{\varphi}\ KL\big(H^{obs}_{\varphi},\, H^{syn}_{\varphi}\big) = \arg\max_{\varphi} \sum_{z} H^{obs}_{\varphi}(z) \log\frac{H^{obs}_{\varphi}(z)}{H^{syn}_{\varphi}(z)}$$

Pursue the feature by
• Hybrid Monte Carlo
• Sequential 1D optimization
• Feature selection
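A sketch of how candidate linear features could be scored by this criterion. For simplicity the candidates here are random unit directions rather than the hybrid Monte Carlo / sequential 1D optimization used in the talk; feature_histogram comes from the earlier sketch:

import numpy as np

def kl_feature(X_obs, X_syn, n_candidates=500, n_bins=32, rng=None):
    """Pick the linear feature with the largest KL(H_obs || H_syn).

    X_obs, X_syn : (N, d) observed and synthesized samples
    Returns (best feature vector, its KL score).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X_obs.shape[1]
    best_phi, best_score = None, -np.inf
    for _ in range(n_candidates):
        phi = rng.standard_normal(d)
        phi /= np.linalg.norm(phi)                      # random unit direction
        h_obs = feature_histogram(X_obs, phi, n_bins)
        h_syn = feature_histogram(X_syn, phi, n_bins)
        score = np.sum(h_obs * np.log((h_obs + 1e-8) / (h_syn + 1e-8)))
        if score > best_score:
            best_phi, best_score = phi, score
    return best_phi, best_score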

Acceleration by Importance Sampling

Gibbs sampling is too slow… Importance sampling by the reference model:
$$p^{ref}(x;\Lambda^{ref}) = \frac{1}{Z^{ref}}\exp\Big\{-\sum_{i=1}^{m}\big\langle \lambda^{ref}_i,\,\delta(\varphi_i^T x)\big\rangle\Big\}$$
$$x^{ref}_j \sim p^{ref}(x;\Lambda^{ref}), \qquad w_j \propto \exp\Big\{-\sum_{i=1}^{m}\big\langle \lambda_i - \lambda^{ref}_i,\,\delta\big(\varphi_i^T x^{ref}_j\big)\big\rangle\Big\}$$
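A sketch of the reweighting step: samples drawn once from the reference model are reused to estimate the current model's feature histograms through self-normalized importance weights. The helpers (bin_index, the histogram range) follow the earlier sketches and are assumptions:

import numpy as np

def importance_weights(X_ref, features, potentials, potentials_ref):
    """Self-normalized weights w_j ~ exp{-sum_i <lambda_i - lambda_i_ref, delta(phi_i^T x_j)>}."""
    log_w = np.zeros(len(X_ref))
    for phi, lam, lam_ref in zip(features, potentials, potentials_ref):
        bins = np.array([bin_index(phi @ x) for x in X_ref])
        log_w -= (lam - lam_ref)[bins]
    w = np.exp(log_w - log_w.max())          # subtract the max for numerical stability
    return w / w.sum()

def weighted_histogram(X_ref, w, phi, n_bins=32, z_range=(-3.0, 3.0)):
    """Estimate H_phi under the current model from reference samples and weights."""
    z = X_ref @ phi
    counts, _ = np.histogram(z, bins=n_bins, range=z_range, weights=w)
    return counts / counts.sum()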

Flowchart of IGM

[Flowchart: Obs Samples → Obs Histograms → IGM parameter learning (MCMC) → Syn Samples → Feature Pursuit → KL Feature; if KL < threshold (Y), output the model, otherwise (N) add the feature and loop back.]
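Tying the sketches together, the flowchart roughly corresponds to the following outer loop (illustrative only; the thresholds and the white-noise initialization are assumptions):

import numpy as np

def learn_igm(X_obs, draw_samples, max_features=50, kl_threshold=0.05, n_bins=32):
    """Outer IGM loop: pursue a KL feature, relearn potentials, resynthesize, repeat."""
    rng = np.random.default_rng()
    features, potentials = [], []
    X_syn = rng.standard_normal(X_obs.shape)        # initial "model": white-noise samples
    while len(features) < max_features:
        phi, score = kl_feature(X_obs, X_syn, n_bins=n_bins)
        if score < kl_threshold:                    # KL below threshold: stop and output
            break
        features.append(phi)                        # add the pursued feature
        potentials = learn_potentials(X_obs, features, draw_samples, n_bins=n_bins)
        X_syn = draw_samples(features, potentials, len(X_obs))
    return features, potentials, X_syn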

Toy Problems (1)

[Figure: two 2D toy distributions, a mixture of two Gaussians and a circle, showing feature pursuit, the learnt Gibbs potentials, synthesized samples, and the observed and synthesized histograms.]

Toy Problems (2)

Swiss Roll

Applied to High Dimensions

In high-dimensional space
• Too many features are needed to constrain every dimension
• MCMC sampling is extremely slow

Solution: dimension reduction by PCA

Application: learning a face prior model
• 83 landmarks defined to represent a face (166 dimensions)
• 524 samples
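A sketch of the PCA step for the stated setup (524 samples of 166-dimensional landmark vectors); the number of retained components is an illustrative assumption:

import numpy as np

def pca_reduce(X, n_components=20):
    """Project landmark vectors onto their top principal components.

    X : (N, D) data matrix, e.g. 524 x 166 face landmark vectors
    Returns (low-dimensional coefficients, mean, basis) so that
    X is approximately coeffs @ basis + mean.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:n_components]
    coeffs = Xc @ basis.T
    return coeffs, mean, basis

The IGM would then presumably be learnt on the low-dimensional coefficients, with synthesized coefficients mapped back through coeffs @ basis + mean.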

Face Prior Learning (1)

• Observed face examples
• Synthesized face samples without any features

Face Prior Learning (2)

• Synthesized with 10 features
• Synthesized with 20 features

Face Prior Learning (3)

• Synthesized with 30 features
• Synthesized with 50 features

Observed Histograms

Synthesized Histograms

Gibbs Potential Functions

Learning Caricature Exaggeration

Synthesis Results

Learning 2D Gibbs Process

[Figure: observed pattern, its triangulation, and a random pattern.]

[Result figures: observed histograms (1) to (4) with the corresponding synthesized histograms and synthesized patterns (1) to (4).]

Thank you!

celiu@csail.mit.edu

CSAIL