Slide 1
Graphical Models in Data Assimilation Problems
Alexander Ihler
UC Irvine
Collaborators: Sergey Kirshner Andrew Robertson Padhraic Smyth
Slide 2

Outline
• Graphical models
  – A convenient description of structure among random variables
• Use this structure to
  – Organize inference computations
    • Finding optimal (ML, etc.) estimates
    • Calculating the data likelihood
    • Simulation / drawing samples
  – Suggest sub-optimal (approximate) inference computations
    • e.g., when the optimal computations are too expensive
• Some examples from data assimilation
  – Markov chains, Kalman filtering
  – Rainfall models
    • Mixtures of trees
    • Loopy graphs
  – Image analysis (de-noising, smoothing, etc.)
Slide 3

Graphical Models
• An undirected graph G = (V, E) is defined by
  – V: a set of nodes
  – E: a set of edges connecting the nodes
• Nodes are associated with random variables
• Graph separation ↔ conditional independence
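The separation property can be stated compactly; a minimal statement in standard notation (the set symbols A, B, C are our choice):

```latex
% Global Markov property of an undirected graphical model G = (V, E):
% if the node set C separates A from B in G (every path from A to B
% passes through C), then the corresponding variables are conditionally
% independent given x_C.
\[
  C \text{ separates } A \text{ and } B \text{ in } G
  \quad\Longrightarrow\quad
  x_A \perp\!\!\!\perp x_B \mid x_C .
\]
```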
Slide 4

Graphical Models: Factorization
• Sufficient condition
  – The distribution factors into a product of "potential functions" defined on the cliques of G
  – The condition is also necessary if the distribution is strictly positive
• Examples
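Written out, the clique factorization takes the familiar Hammersley–Clifford form (notation is ours):

```latex
% Clique factorization: each psi_C is a nonnegative potential function on a
% clique C of G, and Z is the normalizing constant (partition function).
\[
  p(x_1, \dots, x_n) \;=\; \frac{1}{Z} \prod_{C \in \mathrm{cliques}(G)} \psi_C(x_C),
  \qquad
  Z \;=\; \sum_x \prod_{C \in \mathrm{cliques}(G)} \psi_C(x_C).
\]
% Example: for the chain x_1 - x_2 - x_3 the cliques are the two edges, so
%   p(x) \propto \psi_{12}(x_1, x_2) \, \psi_{23}(x_2, x_3).
```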
Slide 5

Graphical Models: Inference
• Many possible inference goals
  – Given a few observed RVs, compute:
    • Marginal distributions
    • Joint or maximum a-posteriori (MAP) values
    • The data likelihood of the observed variables
    • Samples from the posterior
• Use the graph structure to do these computations efficiently
  – Example: compute the posterior marginal p(x2 | x5 = x̄5), as sketched below
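If, for illustration, we assume the graph is the chain x1 – x2 – x3 – x4 – x5, the sums distribute over the factorization and each piece stays small:

```latex
% Posterior marginal on a 5-node chain: a naive sum over (x1, x3, x4) costs
% exponential time, but pushing each sum inside the product makes the cost
% linear in the chain length -- each bracketed term is a BP message.
\[
  p(x_2 \mid \bar{x}_5) \;\propto\;
  \Bigl[\sum_{x_1} \psi_{12}(x_1, x_2)\Bigr]
  \sum_{x_3} \psi_{23}(x_2, x_3)
  \sum_{x_4} \psi_{34}(x_3, x_4)\,\psi_{45}(x_4, \bar{x}_5).
\]
```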
Slide 6

Finding Marginals via Belief Propagation
(aka sum-product; other goals have similar algorithms)
• Combine the observations from all nodes in the graph through a series of local message-passing operations
• Γ(s): the neighborhood of node s (its adjacent nodes)
• m_ts(x_s): the message sent from node t to node s (a "sufficient statistic" of t's knowledge about s)
Slide 7

BP Message Updates
I. Message Product: Multiply the incoming messages (from all nodes but s) with the local observation to form a distribution over x_t
II. Message Propagation: Transform that distribution from node t to node s using the pairwise interaction potential; integrate over x_t to form a distribution summarizing node t's knowledge about x_s
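The two steps combine into the standard sum-product update (notation as on the previous slide; writing the local observation potential as ψ_t is our assumption about the slide's symbols):

```latex
% Sum-product (BP) message from node t to node s:
%  - the product collects messages from every neighbor of t except s,
%  - psi_t(x_t) is the local observation potential at node t,
%  - psi_st(x_s, x_t) is the pairwise interaction potential, and
%  - the integral marginalizes out x_t.
\[
  m_{ts}(x_s) \;=\; \int \psi_{st}(x_s, x_t)\,\psi_t(x_t)
  \prod_{u \in \Gamma(t) \setminus s} m_{ut}(x_t)\, dx_t .
\]
% The posterior marginal at node s is then
%   p(x_s \mid \text{observations}) \propto \psi_s(x_s) \prod_{t \in \Gamma(s)} m_{ts}(x_s).
```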
Slide 8

Example: Sequential Estimation
• Well-known example
  – Markov chain
  – Jointly Gaussian uncertainty
    • Gives the integrals a simple, closed form
  – Optimal inference (in many senses) is given by the Kalman filter (a code sketch follows this list)
  – Converts one large (length-T) problem into a collection of smaller ones
  – "Exact" non-Gaussian analogues: particle and ensemble filtering and their extensions
  – The same general results hold for any tree-structured graph
    • Partial elimination ordering of the nodes
    • Complexity limited by the dimension of each variable
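As a concrete sketch, the Kalman filter recursion is exactly sequential BP on a linear-Gaussian chain. A minimal version, assuming the model x_t = A x_{t-1} + w_t, y_t = C x_t + v_t; the matrices and toy data below are illustrative, not from the talk:

```python
import numpy as np

def kalman_filter(ys, A, C, Q, R, mu0, P0):
    """Filtered means/covariances p(x_t | y_1..t) for a linear-Gaussian chain:
    x_t = A x_{t-1} + w_t, w_t ~ N(0, Q);  y_t = C x_t + v_t, v_t ~ N(0, R)."""
    mu, P = mu0, P0
    means, covs = [], []
    for y in ys:
        # "Message propagation": push the belief through the dynamics.
        mu_pred = A @ mu
        P_pred = A @ P @ A.T + Q
        # "Message product": fold in the local observation.
        S = C @ P_pred @ C.T + R               # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)    # Kalman gain
        mu = mu_pred + K @ (y - C @ mu_pred)
        P = (np.eye(len(mu)) - K @ C) @ P_pred
        means.append(mu)
        covs.append(P)
    return means, covs

# Toy 1-D random walk observed in noise (illustrative values only).
A = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[0.5]])
ys = [np.array([0.9]), np.array([1.2]), np.array([0.7])]
means, covs = kalman_filter(ys, A, C, Q, R, np.array([0.0]), np.array([[1.0]]))
```

Each step costs O(d³) in the state dimension d, which is the "complexity limited by the dimension of each variable" point above.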
Slide 9

Exact Estimation in Non-Trees
• Often our variables aren't so well behaved
  – We may be able to convert the problem using variable augmentation
• This is often the case in Bayesian parameter estimation
  – Treat the parameters as variables and include them in the graph
  – (This increases the nonlinearities!)
• But there is a dimensionality problem
  – Computation increases (maybe a lot!)
    • Jointly Gaussian: O(d³)
    • Otherwise, often exponential in d
  – We can trade off graph complexity against dimensionality…
Slide 10

Example: Rainfall Data
• 41 stations in India
• Rainfall occurrence and amounts for ~30 years
• Some stations/days missing
• Tasks
  – Impute missing entries
  – Simulate realistic rainfall
  – Short-term predictions
  – …
• Can't deal with the joint distribution directly – too large to even manipulate
• Conditional independence structure?
  – Unlikely to be tree-structured
Slide 11

Example: Rainfall Data
• "True" relationships
  – Not tree-like at all
  – High tree-width
• Need some approximations
  – Approximate model, exact inference
  – Correct model, approximate inference
• Even harder:
  – May get multiple observation modalities (satellite data, etc.)
  – These have their own statistical structure and relationships to the stations
Slide 12

Example: Rainfall Data
• Consider a single time-slice
• Option 1: mixtures of trees (a fitting sketch follows this list)
  – Add a "hidden" variable indicating which of several trees is in effect
  – (Generally) marginalize over this variable
• Option 2: use a loopy graph and ignore the loops during inference
  – Utility depends on the task:
    • Works well for filling in missing data
    • Perhaps less well for other tasks
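For the tree-based option, here is a rough sketch of fitting a single tree over the stations via the Chow–Liu construction (maximum-weight spanning tree under pairwise mutual information); a mixture of trees adds a hidden selector over several such trees. Function names, the missing-data encoding, and the toy data are all illustrative:

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    """Empirical MI between two binary station records (entries < 0 = missing)."""
    m = (x >= 0) & (y >= 0)
    x, y = x[m], y[m]
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_edges(data):
    """Max-weight spanning tree over stations (Kruskal on MI edge weights)."""
    n = data.shape[1]
    weights = sorted(((mutual_information(data[:, i], data[:, j]), i, j)
                      for i, j in combinations(range(n), 2)), reverse=True)
    parent = list(range(n))
    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    edges = []
    for w, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:                  # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges

# Toy data: 100 days x 5 stations of 0/1 rainfall occurrence.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(100, 5))
print(chow_liu_edges(data))
```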
Slide 13

Multi-Scale Models
• Another example of graph structure
• Efficient computation if tree-structured
• Again, we don't really believe any particular tree
  – Perhaps average over (use a mixture of) several
• (See e.g. Willsky 2002)
• (Also works with loops, similar to multi-grid)
Slide 14

Summary
• Explicit structure among variables
  – Prior knowledge / learned from data
  – Structure organizes computation and suggests approximations
  – Can provide computational efficiency
  – (Often the naïve joint distribution is too large to represent / estimate)
• Offers some choices
  – Where to put the complexity?
  – Simple graph structure with high-dimensional variables
    • Approximate structure, exact computations
  – Complex graph structure with more manageable variables
    • Improved structures, approximate computations