
Recursive Cavity Modeling for

Estimation of Gaussian MRFs∗

Stochastic Systems Group

Jason K. Johnson

October 9, 2002

∗/mit/jasonj/Public/SSG-OCT9-02


Overview

• Background

– Graphical Models (MRFs)

– Exponential Families

– Gaussian MRFs

– Information Geometry and Projections

• Model-Thinning Projections

– Model Selection by greedy edge-removal procedure.

– Parameters optimized by Iterative Scaling.

• Recursive Cavity Modeling

– Nested Dissection

– Cavity Modeling

– Blanket Modeling

– Examples


Graphical Models∗

An undirected graph G = (V, E) consists of vertices V and edges E (unordered pairs of vertices). Random variables x = (xi, i ∈ V) are said to be Markov w.r.t. G when

p(xA, xB|xS) = p(xA|xS)p(xB|xS)

for all A, B, S ⊂ V where S separates A from B.

Hammersley-Clifford, 71.† x is Markov w.r.t. G if and only if p(x) factors according to G as

p(x) = (1/Z(ψ)) ∏c∈C ψc(xc)

with positive potential functions ψc and normalization constant Z(ψ).

The Markov structure of a random process x allows for compact specification of p(x) as a graphical model.

∗Lauritzen, 96; Jordan, 99. †Grimmett, 73.


Example MRF

(Figure: 4-cycle graph on vertices x1, x2, x3, x4.)

Graph Factorization

p(x) ∝ ψ1(x1)ψ2(x2)ψ3(x3)ψ4(x4) ψ1,2(x1, x2)ψ2,3(x2, x3) ψ3,4(x3, x4)ψ4,1(x4, x1)

Conditional Independence

p(x1,3|x2,4) = p(x1|x2,4)p(x3|x2,4)

p(x2,4|x1,3) = p(x2|x1,3)p(x4|x1,3)


Exponential Families∗

Specified by a base measure q(x) > 0 and a set of sufficient statistics t(x), both defined over some specified state-space X. We take X = Rn, so that the model is specified by a pdf of the form

f(x; θ) = q(x) exp{θ · t(x) − ϕ(θ)}

where the cumulant function ϕ(θ) is the normalization constant

ϕ(θ) = log ∫ q(x) exp{θ · t(x)} dx

We only consider admissible parameters Θ s.t. the pdf is normalizable, ϕ(θ) < ∞. The family is regular if Θ has non-empty interior. The statistics are minimal if the components of t(x) are linearly independent. Then a dual parameterization is provided by the moment coordinates η = Eθ{t(x)} over the set of achievable moments η(Θ).

∗Chentsov, 66; Barndorff-Nielsen, 78.


Gaussian Markov Random Fields

Consider a Gaussian process x ∼ N(µ, Σ) with mean vector µ = E{x} and covariance matrix Σ = E{xx′} − µµ′.

Information Filter Form. Say that x ∼ N−1(h, J) if

h = Σ−1µ
J = Σ−1

s.t. the density function is parameterized as

p(x) = exp{−(1/2) x′Jx + h′x − ϕ(h, J)}

where

ϕ(h, J) = (1/2){h′J−1h − log |J| + n log 2π}.

This is an exponential family model with

θ = (h,−J/2)

t(x) = (x, xx′)

η = (µ,Σ + µµ′)

ϕ(θ) = ϕ(h, J)
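To make the information-filter form concrete, the following numpy sketch (the 3-d covariance and test point are arbitrary illustrative choices, not from the slides) converts (µ, Σ) to (h, J), evaluates ϕ(h, J), and checks that the resulting log-density agrees with the usual moment-form expression.

```python
import numpy as np

# Arbitrary 3-dimensional Gaussian in moment form (mu, Sigma).
mu = np.array([1.0, -0.5, 2.0])
Sigma = np.array([[2.0, 0.6, 0.1],
                  [0.6, 1.5, 0.4],
                  [0.1, 0.4, 1.0]])
n = len(mu)

# Information (filter) form: J = Sigma^{-1}, h = Sigma^{-1} mu.
J = np.linalg.inv(Sigma)
h = J @ mu

# Cumulant function phi(h, J) = (1/2){h'J^{-1}h - log|J| + n log 2pi}.
phi = 0.5 * (h @ np.linalg.solve(J, h)
             - np.linalg.slogdet(J)[1]
             + n * np.log(2 * np.pi))

# Log-density at a test point, computed both ways.
x = np.array([0.3, 0.7, 1.2])
logp_info = -0.5 * x @ J @ x + h @ x - phi
logp_moment = (-0.5 * (x - mu) @ np.linalg.solve(Sigma, x - mu)
               - 0.5 * np.linalg.slogdet(Sigma)[1]
               - 0.5 * n * np.log(2 * np.pi))
assert np.isclose(logp_info, logp_moment)
```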


Example GMRF

(Figure: 3-node chain with node potentials ψ1, ψ2, ψ3 and edge potentials ψ1,2, ψ2,3.)

p(x) ∝ ψ1(x1)ψ2(x2)ψ3(x3)ψ1,2(x1, x2)ψ2,3(x2, x3)

ψ1(x1) = exp{−(1/2) x′1J1,1x1 + h′1x1}
ψ2(x2) = exp{−(1/2) x′2J2,2x2 + h′2x2}
ψ3(x3) = exp{−(1/2) x′3J3,3x3 + h′3x3}
ψ1,2(x1, x2) = exp{−x′1J1,2x2}
ψ2,3(x2, x3) = exp{−x′2J2,3x3}

h = [ h1
      h2
      h3 ],    J = [ J1,1   J1,2    0
                     J′1,2  J2,2   J2,3
                      0     J′2,3  J3,3 ]
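A quick numpy check of this structure (illustrative numbers, not from the slides): the zero in the (1,3) block of J should translate into conditional independence of x1 and x3 given x2, i.e. a diagonal conditional covariance.

```python
import numpy as np

# Scalar 3-node chain: J is tridiagonal, J[0,2] = J[2,0] = 0.
J = np.array([[ 1.0, -0.4,  0.0],
              [-0.4,  1.2, -0.3],
              [ 0.0, -0.3,  0.9]])
h = np.array([0.5, -0.2, 0.1])

Sigma = np.linalg.inv(J)
mu = Sigma @ h

# Conditional covariance of (x1, x3) given x2 via the Schur complement.
idx, cond = [0, 2], [1]
S = (Sigma[np.ix_(idx, idx)]
     - Sigma[np.ix_(idx, cond)]
       @ np.linalg.inv(Sigma[np.ix_(cond, cond)])
       @ Sigma[np.ix_(cond, idx)])

# Off-diagonal vanishes: x1 and x3 are conditionally independent given x2.
assert np.isclose(S[0, 1], 0.0)
# Equivalently, the conditional information matrix is J restricted to {1,3}.
assert np.allclose(np.linalg.inv(S), J[np.ix_(idx, idx)])
```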


Information Geometry∗

Based upon the Kullback-Leibler divergence†, a measure of contrast between probability distributions:

D(p‖q) = Ep{log p(x)/q(x)}

Bregman distance in θ based upon ϕ(θ):

D(θ∗‖θ) = ϕ(θ) − ϕ(θ∗) − ∇ϕ(θ∗) · (θ − θ∗)

Legendre transform ϕ∗(η) of ϕ(θ):

ϕ∗(η) = θ(η) · η − ϕ(θ(η))

“Slope transform”

η(θ) = ∂ϕ(θ)/∂θ,   θ(η) = ∂ϕ∗(η)/∂η

Convex bifunction in (η(p), θ(q)),

D(η‖θ) = ϕ∗(η) + ϕ(θ)− η · θ

∗Chentsov, 72; Csiszar, 75; Efron, 78; Amari, 01.†Kullback and Leibler, 51.
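As a sanity check on these definitions, the following numpy sketch (the 2-d Gaussians are my own illustrative choices, not from the slides) evaluates D(p‖q) both from the bifunction form ϕ∗(η) + ϕ(θ) − η · θ, using θ = (h, −J/2) and η = (µ, Σ + µµ′) from the previous slides, and from the familiar closed-form Gaussian KL divergence; the two agree.

```python
import numpy as np

def phi(h, J):
    # Cumulant function of the Gaussian information form.
    n = len(h)
    return 0.5 * (h @ np.linalg.solve(J, h)
                  - np.linalg.slogdet(J)[1]
                  + n * np.log(2 * np.pi))

def phi_star(mu, Sigma):
    # Legendre dual phi*(eta) = theta(eta).eta - phi(theta(eta)) = negative entropy.
    n = len(mu)
    return -0.5 * (n * np.log(2 * np.pi) + np.linalg.slogdet(Sigma)[1] + n)

# Two arbitrary 2-d Gaussians p and q.
mu_p, Sig_p = np.array([0.0, 1.0]), np.array([[1.0, 0.3], [0.3, 0.8]])
mu_q, Sig_q = np.array([0.5, 0.0]), np.array([[1.5, -0.2], [-0.2, 1.0]])
J_q = np.linalg.inv(Sig_q); h_q = J_q @ mu_q

# Bifunction form D(eta_p || theta_q) = phi*(eta_p) + phi(theta_q) - eta_p . theta_q,
# with theta = (h, -J/2) and eta = (mu, Sigma + mu mu').
eta_dot_theta = h_q @ mu_p - 0.5 * np.trace(J_q @ (Sig_p + np.outer(mu_p, mu_p)))
D_bifunction = phi_star(mu_p, Sig_p) + phi(h_q, J_q) - eta_dot_theta

# Closed-form Gaussian KL divergence D(p || q).
d = mu_q - mu_p
D_closed = 0.5 * (np.trace(np.linalg.solve(Sig_q, Sig_p))
                  + d @ np.linalg.solve(Sig_q, d)
                  - len(mu_p)
                  + np.linalg.slogdet(Sig_q)[1]
                  - np.linalg.slogdet(Sig_p)[1])
assert np.isclose(D_bifunction, D_closed)
```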


Bregman distance∗

(Figure: the Bregman distance D(θ0‖θ) is the gap at θ between ϕ(θ) and its tangent ϕ(θ; θ0) taken at θ0.)

∗Bregman, 67.


Triangle Relation

(Figure: ϕ(θ) at θ0, θ1, θ2 with the divergences D(θ0‖θ1), D(θ1‖θ2), D(θ0‖θ2) and the correction term ∆ · (θ2 − θ1).)

D(θ0‖θ2) = D(θ0‖θ1) + D(θ1‖θ2) + (η1 − η0) · (θ2 − θ1)


Information Projections

Let F be a regular exponential family with minimal statistics t(x), exponential coordinates Θ, and moment coordinates η(Θ).

M-projection. Let p ∈ F and let H ⊂ F be an e-flat submanifold. There exists a unique q∗ ∈ H satisfying the following equivalent conditions:

(i) D(p‖q∗) = infq∈HD(p‖q)

(ii) ∀q ∈ H : (η(p)−η(q∗)) ·(θ(q)−θ(q∗)) = 0

(iii) ∀q ∈ H : D(p‖q) = D(p‖q∗) +D(q∗‖q)

We call q∗ = arg minq∈H D(p‖q) the m-projection of p to H.
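A numeric check of the Pythagorean relation (iii) under stated assumptions (numpy; H taken to be the fully factorized Gaussians, i.e. diagonal J, so the m-projection simply matches the means and marginal variances of p; the matrices are arbitrary illustrative choices, not from the slides):

```python
import numpy as np

def gauss_kl(mu_p, Sig_p, mu_q, Sig_q):
    # D(N(mu_p, Sig_p) || N(mu_q, Sig_q)).
    n = len(mu_p); d = mu_q - mu_p
    return 0.5 * (np.trace(np.linalg.solve(Sig_q, Sig_p))
                  + d @ np.linalg.solve(Sig_q, d) - n
                  + np.linalg.slogdet(Sig_q)[1] - np.linalg.slogdet(Sig_p)[1])

# p: a correlated Gaussian outside H.
mu_p = np.array([1.0, -1.0, 0.5])
Sig_p = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.2, 0.5],
                  [0.2, 0.5, 0.9]])

# H: fully factorized Gaussians (diagonal J, hence diagonal Sigma).
# The m-projection q* matches the moments of the retained statistics,
# i.e. the means and marginal variances of p.
mu_qs, Sig_qs = mu_p.copy(), np.diag(np.diag(Sig_p))

# Any other member q of H.
mu_q, Sig_q = np.zeros(3), np.diag([2.0, 0.7, 1.3])

# Pythagorean relation (iii): D(p||q) = D(p||q*) + D(q*||q).
lhs = gauss_kl(mu_p, Sig_p, mu_q, Sig_q)
rhs = gauss_kl(mu_p, Sig_p, mu_qs, Sig_qs) + gauss_kl(mu_qs, Sig_qs, mu_q, Sig_q)
assert np.isclose(lhs, rhs)
```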


M-projection

(Figure: the m-projection q∗ of p onto H, shown in both the Θ and η(Θ) coordinate systems, with D(p‖q) decomposing as D(p‖q∗) + D(q∗‖q).)

∂θ(q)D(p‖q) = η(q) − η(p)


Dual E-projection

E-projection. Let q ∈ F and let H′ ⊂ F be an m-flat submanifold. There exists a unique p∗ ∈ H′ satisfying the following equivalent conditions:

(i) D(p∗‖q) = infp∈H′D(p‖q)

(ii) ∀p ∈ H′ : (η(p)−η(p∗))·(θ(q)−θ(p∗)) = 0

(iii) ∀p ∈ H′ : D(p‖q) = D(p‖p∗) +D(p∗‖q)

We call p∗ = arg minp∈H′ D(p‖q) the e-projection of q to H′.

Duality. Let H and H′ be I-orthogonal submanifolds such that there exists r in their intersection with

∀p ∈ H′, q ∈ H : (η(p) − η(r)) · (θ(q) − θ(r)) = 0

Then r is both the m-projection of p ∈ H′ to H and the e-projection of q ∈ H to H′.


E-projection

(Figure: the e-projection p∗ of q onto H′, shown in both the Θ and η(Θ) coordinate systems, with D(p‖q) decomposing as D(p‖p∗) + D(p∗‖q).)

∂η(p)D(p‖q) = θ(p) − θ(q)


Model Thinning

Let t(x) = (tH(x), t′H(x)), θ = (θH, θ′H) and

η = (ηH, η′H).

Objective. M-project p ∈ F to lower-order

exponential family,

H = {q ∈ F | θ′H(q) = 0}

Dual Problem. E-project q ∈ H to the m-flat submanifold

H′(p) = {r ∈ F | ηH(r) = ηH(p)}

The latter e-projection problem may be solved by iterative scaling techniques which adjust parameters θH(q) until ηH(q) = ηH(p) (moment matching).

For a GMRF x ∼ N−1(h, J), we impose sparsity on J. Moment-matching gives the classical covariance selection problem (Dempster, 72).
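To make the thinning objective concrete, here is a small numpy sketch (my own toy construction, not from the slides): m-project a fully connected 3-node Gaussian onto the family with J1,3 = 0. Because the target structure is a tree (the chain 1-2-3), the projection has the closed form q∗(x) = p(x1, x2) p(x2, x3) / p(x2), so we can build it directly and check both defining properties: the pruned entry of J∗ is zero, and the retained moments match.

```python
import numpy as np

# Target p: a fully connected 3-node Gaussian (illustrative numbers).
mu = np.array([0.5, -0.3, 1.0])
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.5, 0.6],
                  [0.3, 0.6, 1.2]])
n = 3

# H: GMRFs on the chain 1-2-3, i.e. J[0,2] = 0.  For this tree-structured H
# the m-projection is q*(x) = p(x1,x2) p(x2,x3) / p(x2), so J* and h* are
# sums of padded marginal information parameters.
J_star = np.zeros((n, n)); h_star = np.zeros(n)
for C, sign in [([0, 1], +1), ([1, 2], +1), ([1], -1)]:
    J_C = np.linalg.inv(Sigma[np.ix_(C, C)])
    J_star[np.ix_(C, C)] += sign * J_C
    h_star[C] += sign * (J_C @ mu[C])

# (a) q* lies in H: the pruned interaction is exactly zero.
assert np.isclose(J_star[0, 2], 0.0)

# (b) Moment matching on the retained statistics: means, variances,
#     and the covariances on the retained edges {1,2} and {2,3}.
Sig_star = np.linalg.inv(J_star); mu_star = Sig_star @ h_star
assert np.allclose(mu_star, mu)
assert np.allclose(np.diag(Sig_star), np.diag(Sigma))
assert np.isclose(Sig_star[0, 1], Sigma[0, 1])
assert np.isclose(Sig_star[1, 2], Sigma[1, 2])
```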


Iterative Scaling

Alternating e-projections onto a set of m-flat submanifolds converges to the e-projection onto their intersection (Csiszar, 75). This is a special case of the method of alternating Bregman projections (Bregman, 67).

Iterative Proportional Fitting.∗ The m-flat submanifolds impose marginal moment constraints specifying a marginal distribution p∗(xC):

ψ(xC) ← ψ(xC) × p∗(xC)/p(xC)

Covariance Selection.† Updates the exponential parameters (hC, JC) to impose the moment constraints (µ∗C, Σ∗C):

JC ← JC + (J∗C − ĴC)
hC ← hC + (h∗C − ĥC)

where (h∗C, J∗C) = ((Σ∗C)−1µ∗C, (Σ∗C)−1) and (ĥC, ĴC) = (Σ−1C µC, Σ−1C) are the target and current marginal information models.

∗Ireland and Kullback, 68. †Speed and Kiiveri, 86.
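A minimal numpy sketch of the update above (the random fully connected 4-node target, the 4-cycle as the retained structure, the initialization, and the sweep count are my own illustrative choices, not from the slides): cycling the covariance-selection update over the retained edges drives the edge marginals to the target moments while the pruned interactions stay at zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Target p: a random fully connected Gaussian.
A = rng.standard_normal((n, n))
Sigma_p = A @ A.T + n * np.eye(n)
mu_p = rng.standard_normal(n)

# Retained cliques: the edges of the 4-cycle 1-2-3-4-1.
cliques = [[0, 1], [1, 2], [2, 3], [3, 0]]

# Initialize q in H (the disconnected model with unit variances).
J = np.eye(n)
h = np.zeros(n)

for sweep in range(300):
    for C in cliques:
        Sigma_q = np.linalg.inv(J)
        mu_q = Sigma_q @ h
        # Current and target marginal information models on clique C.
        J_cur = np.linalg.inv(Sigma_q[np.ix_(C, C)])
        h_cur = J_cur @ mu_q[C]
        J_tgt = np.linalg.inv(Sigma_p[np.ix_(C, C)])
        h_tgt = J_tgt @ mu_p[C]
        # Covariance-selection update: J_C <- J_C + (J*_C - Jhat_C), likewise for h_C.
        J[np.ix_(C, C)] += J_tgt - J_cur
        h[C] += h_tgt - h_cur

# After convergence: marginal moments on every retained edge match the target,
# and the pruned interactions (the two chords of the 4-cycle) remain zero.
Sigma_q = np.linalg.inv(J); mu_q = Sigma_q @ h
for C in cliques:
    assert np.allclose(Sigma_q[np.ix_(C, C)], Sigma_p[np.ix_(C, C)], atol=1e-6)
    assert np.allclose(mu_q[C], mu_p[C], atol=1e-6)
assert np.isclose(J[0, 2], 0.0) and np.isclose(J[1, 3], 0.0)
```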


Greedy Edge-Removal

Prunes edges from the graphical model by forcing selected off-diagonal entries of J to zero (m-projections implemented by iterative scaling techniques).

Selects weak interactions to prune according to the conditional mutual information

I(xi; xj | x\ij) = −(1/2) log( 1 − (det Ji,j)² / (det Ji,i det Jj,j) )

which gives a tractable lower-bound estimate of the KL divergence under m-projection.

Selects a batch K ⊂ E of weakest edges to prune, each satisfying

Ii;j < δ/|K|

Continues thinning until no more weak interactions remain relative to δ. Related to the Akaike information criterion (Akaike, 74).
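A numpy sketch of the edge scores and the batch rule for a scalar GMRF (the J matrix, the value of δ, and the greedy reading of the batch criterion are my own illustrative choices): for scalar nodes det Ji,j is just the entry Ji,j, so the score reduces to −(1/2) log(1 − J²i,j/(Ji,i Jj,j)).

```python
import numpy as np

# Illustrative scalar GMRF on 4 nodes (a chain plus one weak extra edge 1-4).
J = np.array([[ 1.00, -0.30,  0.00, -0.02],
              [-0.30,  1.20, -0.25,  0.00],
              [ 0.00, -0.25,  1.10, -0.20],
              [-0.02,  0.00, -0.20,  0.90]])

edges = [(i, j) for i in range(4) for j in range(i + 1, 4) if J[i, j] != 0.0]

def cond_mutual_info(J, i, j):
    # Scalar form of the edge score: I(xi; xj | rest) = -1/2 log(1 - Jij^2/(Jii Jjj)).
    return -0.5 * np.log(1.0 - J[i, j] ** 2 / (J[i, i] * J[j, j]))

scores = {e: cond_mutual_info(J, *e) for e in edges}

# Greedy batch selection: keep adding the weakest edges while each stays below delta/|K|.
delta = 1e-3
K = []
for e in sorted(scores, key=scores.get):
    if scores[e] < delta / (len(K) + 1):
        K.append(e)

print(sorted(scores.items(), key=lambda kv: kv[1]))
print("edges selected for pruning:", K)   # only the weak 1-4 edge is pruned here
```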


Nested Dissection

(1) vertical cut.

(2) horizontal cut.

(3) vertical cut.

(4) horizontal cut.
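A small recursive-bisection sketch of this dissection on a square grid (the index conventions, the alternation of cut directions, and the stopping size are my own illustrative choices, not from the slides):

```python
def nested_dissection(rows, cols, vertical=True, min_size=2):
    """Recursively dissect a block of grid indices.

    Returns (separator, children); for a leaf block the whole block is returned
    as the 'separator' and the list of children is empty."""
    if len(rows) <= min_size or len(cols) <= min_size:
        return [(r, c) for r in rows for c in cols], []
    if vertical:                       # vertical cut: a column of vertices separates left/right
        mid = len(cols) // 2
        sep = [(r, cols[mid]) for r in rows]
        left = nested_dissection(rows, cols[:mid], False, min_size)
        right = nested_dissection(rows, cols[mid + 1:], False, min_size)
    else:                              # horizontal cut: a row of vertices separates top/bottom
        mid = len(rows) // 2
        sep = [(rows[mid], c) for c in cols]
        left = nested_dissection(rows[:mid], cols, True, min_size)
        right = nested_dissection(rows[mid + 1:], cols, True, min_size)
    return sep, [left, right]

# Dissect an 8 x 8 grid: first a vertical cut, then horizontal cuts, and so on.
tree = nested_dissection(list(range(8)), list(range(8)))
print(len(tree[0]), "vertices in the top-level separator")   # 8 (one full column)
```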


Variable Elimination

Integrate over a subset Λ ⊂ V of random variables:

p(x\Λ) = ∫ p(x) dxΛ

Local parameter update in the (h, J) representation:

h∂Λ ← h∂Λ − J∂Λ,Λ J−1Λ,Λ hΛ
J∂Λ ← J∂Λ − J∂Λ,Λ J−1Λ,Λ JΛ,∂Λ

Eliminates vertices in the graphical model but adds “fill” edges between their neighbors. Only updates the local parameters and structure of the “boundary” ∂Λ of the subfield.
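A numpy check of this update (illustrative 4-node model of my own choosing): eliminating a subset via the Schur complement reproduces exactly the marginal obtained in moment form.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
J = A @ A.T + n * np.eye(n)          # a valid information matrix
h = rng.standard_normal(n)

Lam = [0, 1]                          # variables to eliminate
keep = [2, 3]                         # the rest (contains the boundary of Lam)

# Schur-complement update of the remaining parameters.
J_LL = J[np.ix_(Lam, Lam)]
h_keep = h[keep] - J[np.ix_(keep, Lam)] @ np.linalg.solve(J_LL, h[Lam])
J_keep = (J[np.ix_(keep, keep)]
          - J[np.ix_(keep, Lam)] @ np.linalg.solve(J_LL, J[np.ix_(Lam, keep)]))

# Compare with marginalizing in moment form: the marginal of x_keep is
# N(mu[keep], Sigma[keep, keep]).
Sigma = np.linalg.inv(J); mu = Sigma @ h
assert np.allclose(np.linalg.inv(J_keep), Sigma[np.ix_(keep, keep)])
assert np.allclose(np.linalg.solve(J_keep, h_keep), mu[keep])
```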


Cavity Models (Initialization)

(1) Partial model of subfield (zero boundary).

(2) Elimination gives model of surface.

(3) Model thinning gives “cavity model”.


“Upwards” Cavity Modeling

(1) Initialization.

(2) Merge. (3) Eliminate.

(4) Thin.
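The slides convey merge/eliminate/thin pictorially; the following is only a schematic sketch of one upward step under my own assumptions. The helper names eliminate and thin, the (h, J, index-list) container for a cavity model, and the J_cross argument carrying the original model's interactions across the dissection cut are hypothetical placeholders, not the authors' code; a real thin step would apply the model-thinning projection (greedy edge removal plus iterative scaling) described earlier.

```python
import numpy as np

def eliminate(h, J, keep):
    """Marginalize out the variables not in `keep` via the Schur-complement update above."""
    n = len(h)
    drop = [i for i in range(n) if i not in keep]
    J_dd = J[np.ix_(drop, drop)]
    h_new = h[keep] - J[np.ix_(keep, drop)] @ np.linalg.solve(J_dd, h[drop])
    J_new = (J[np.ix_(keep, keep)]
             - J[np.ix_(keep, drop)] @ np.linalg.solve(J_dd, J[np.ix_(drop, keep)]))
    return h_new, J_new

def thin(h, J, delta):
    """Placeholder for the model-thinning projection; returns the model unchanged here."""
    return h, J

def merge_eliminate_thin(cavity_left, cavity_right, J_cross, boundary, delta=1e-3):
    """One schematic 'upwards' RCM step on two child cavity models:
    (2) merge, (3) eliminate the interior of the merged surface, (4) thin."""
    (h_l, J_l, idx_l), (h_r, J_r, idx_r) = cavity_left, cavity_right
    idx = idx_l + idx_r                   # disjoint surface index sets of the two children
    h = np.concatenate([h_l, h_r])
    # Merge: information parameters add; J_cross holds the original model's
    # interactions coupling the two children across the dissection cut.
    J = np.block([[J_l, J_cross], [J_cross.T, J_r]])
    # Eliminate: integrate out surface variables interior to the merged region.
    keep = [idx.index(v) for v in boundary]
    h, J = eliminate(h, J, keep)
    # Thin: project onto a sparser cavity model for the new boundary.
    h, J = thin(h, J, delta)
    return h, J, boundary
```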


“Downwards” Blanket Modeling

(1) Initialization.

(2) Merge. (3) Eliminate.

(4) Thin.


Conclusion

RCM appears to provide a powerful and flexible framework for tractable yet near-optimal computation in MRFs.

Much work remains to better characterize performance and explore promising extensions:

• Develop information geometry of RCM.

• Consider more general families of graphical

models.

• Employ alternative modeling techniques.

• Applications

– Model Identification

– Image Processing

– Data Compression and Coding

– Monte-Carlo Simulation


References

Akaike, 74. A new look at the statistical model identification. IEEE Trans. Auto. Control, AC-19:716-723.

Amari, 01. Information geometry of hierarchy of probability distributions. IEEE Trans. Inf. Theory, 47(5):1701-1711.

Chentsov, 66. A systematic theory of exponential families. Theory of Prob. and Appl., 11.

Chentsov, 72. Statistical decision rules and optimal inference. AMS Trans. Math. Mono., v.53 (reprint 82).

Barndorff-Nielsen, 78. Information and Exponential Families. John Wiley.

Bregman, 67. The relaxation method of finding the common point of convex sets. USSR Comp. Math. and Physics, 7:200-217.

Csiszar, 75. I-divergence geometry of probability distributions and minimization problems. Annals of Prob., 3(1):146-158.

Dempster, 72. Covariance selection. Biometrics, 28(1):157-175.

Efron, 78. The geometry of exponential families. Annals of Stat., 6(2):362-376.

Grimmett, 73. A theorem about random fields. Bull. of London Math. Soc., 5:81-84.


Ireland and Kullback, 68. Contingency tables with given marginals. Biometrika, 55:179-188.

Jordan (editor), 99. Learning in Graphical Models. MIT Press.

Kullback and Leibler, 51. On information and sufficiency. Annals of Math. Stat., 22(1):79-86.

Lauritzen, 96. Graphical Models. Oxford University Press.

Speed and Kiiveri, 86. Gaussian Markov distributions over finite graphs. Annals of Stat., 14(1):138-150.