Statistical Inverse Problems, Model Reduction and Inverse Crimes

Erkki Somersalo, Helsinki University of Technology, Finland

Firenze, March 22–26, 2004


CONTENTS OF THE LECTURES

1. Statistical inverse problems: A brief review

2. Model reduction, discretization invariance

3. Inverse crimes

Material based on the forthcoming book

Jari Kaipio and Erkki Somersalo: Computational and Statistical Inverse Problems. Springer-Verlag (2004).


STATISTICAL INVERSE PROBLEMS

Bayesian paradigm, or “subjective probability”:

1. All variables are random variables

2. The randomness reflects the subject’s uncertainty about the actual values

3. The uncertainty is encoded into probability distributions of the variables

Notation: Random variables X, Y, E, etc.

Realizations: If X : Ω → Rn, we denote

X(ω) = x ∈ Rn.

Probability densities:

P{X ∈ B} = ∫_B πX(x) dx = ∫_B π(x) dx.


Hierarchy of the variables:

1. Unobservable variables of primary interest, X

2. Unobservable variables of secondary interest, E

3. Observable variables, Y

Example: Linear inverse problem with additive noise,

y = Ax + e, A ∈ Rm×n.

Stochastic extension: Y = AX + E.


Conditioning: Joint probability density of X and Y:

P{X ∈ A, Y ∈ B} = ∫_{A×B} π(x, y) dx dy.

Marginal densities:

P{X ∈ A} = P{X ∈ A, Y ∈ Rm} = ∫_{A×Rm} π(x, y) dx dy,

in other words,

π(x) = ∫_Rm π(x, y) dy.

Conditional probability:

P{X ∈ A | Y ∈ B} = ∫_{A×B} π(x, y) dx dy / ∫_B π(y) dy.


Shrink B into a single point y:

P{X ∈ A | Y = y} = ∫_A π(x, y)/π(y) dx = ∫_A π(x | y) dx,

where

π(x | y) = π(x, y)/π(y),   or   π(x, y) = π(x | y) π(y).

Bayesian solution of an inverse problem: Given a measurement y = yobserved of the observable variable Y, find the posterior density of X,

πpost(x) = π(x | yobserved).


The prior density πpr(x) expresses all prior information independent of the measurement.

The likelihood density π(y | x) gives the likelihood of the measurement outcome y given x.

Bayes’ formula:

π(x | y) = πpr(x) π(y | x) / π(y).

Three steps of Bayesian inversion:

1. Construct the prior density

2. Construct the likelihood density

3. Extract useful information from the posterior density


Example: Linear model with additive noise,

Y = AX + E,

where the density πnoise is known. Fixing X = x yields

π(y | x) = πnoise(y − Ax),

and so π(x | y) ∝ πpr(x) πnoise(y − Ax).

Assume that X and E are mutually independent and Gaussian,

X ∼ N (x0,Γpr), E ∼ N (0,Γe),

where Γpr ∈ Rn×n and Γe ∈ Rm×m are symmetric positive (semi)definite.


πpr(x) ∝ exp(−(1/2)(x − x0)T Γpr⁻¹ (x − x0)),

π(y | x) ∝ exp(−(1/2)(y − Ax)T Γe⁻¹ (y − Ax)).

From Bayes’ formula, the posterior is Gaussian,

π(x | y) ∼ N(x∗, Γpost),

where

x∗ = x0 + ΓprAT(AΓprAT + Γe)⁻¹(y − Ax0),

Γpost = Γpr − ΓprAT(AΓprAT + Γe)⁻¹AΓpr.
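As a minimal numerical sketch, these two formulas translate directly into code; the helper name and calling convention below are illustrative, not from the lectures.

```python
import numpy as np

def gaussian_posterior(A, y, x0, Gamma_pr, Gamma_e):
    """Posterior mean and covariance for Y = AX + E with Gaussian prior and noise."""
    S = A @ Gamma_pr @ A.T + Gamma_e          # A Γpr AT + Γe
    K = Gamma_pr @ A.T @ np.linalg.inv(S)     # Γpr AT (A Γpr AT + Γe)^-1
    x_star = x0 + K @ (y - A @ x0)            # posterior mean x*
    Gamma_post = Gamma_pr - K @ A @ Gamma_pr  # posterior covariance
    return x_star, Gamma_post
```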


Special case: Assume that

x0 = 0,   Γpr = γ²I,   Γe = σ²I.

In this case,

x∗ = AT(AAT + α²I)⁻¹y,   α = σ/γ,

known as the Wiener filtered solution (an m × m problem), or, equivalently,

x∗ = (ATA + α²I)⁻¹ATy,

which is the Tikhonov regularized solution (an n × n problem).

Engineering rule of thumb: If n < m, use Tikhonov, if m < n use Wiener.

(In practice, ATA or AAT should often not be calculated.)
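A quick sanity check of this equivalence on a random test problem (the dimensions and noise level are arbitrary illustrative choices); note that neither ATA nor AAT is inverted explicitly, only linear systems are solved.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 50
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
gamma, sigma = 1.0, 0.1
alpha2 = (sigma / gamma) ** 2

# Wiener filtered solution: an m x m linear system
x_wiener = A.T @ np.linalg.solve(A @ A.T + alpha2 * np.eye(m), y)
# Tikhonov regularized solution: an n x n linear system
x_tikhonov = np.linalg.solve(A.T @ A + alpha2 * np.eye(n), A.T @ y)

assert np.allclose(x_wiener, x_tikhonov)
```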


Frequently asked question: How do you determine α?

Bayesian paradigm: Either

1. You know γ and σ; then α = σ/γ,

or

2. You don’t know them; make them part of the estimation problem.

This is the empirical Bayes approach.

Example: If γ in the previous example is unknown, write

πpr(x | γ) ∝ (1/γ^n) exp(−‖x‖²/(2γ²)),

and write

πpr(x, γ) = πpr(x | γ) πh(γ),

where πh is a hyperprior or hierarchical prior.

Determine π(x, γ | y).


BAYESIAN ESTIMATION

Classical inversion methods produce estimates of the unknown.

In contrast, the Bayesian approach produces a probability density that can be used

• to produce estimates,

• to assess the quality of estimates (statistical and classical).

Example: Conditional mean (CM) and maximum a posteriori (MAP) estimates:

xCM = ∫_Rn x π(x | y) dx,

xMAP = arg max π(x | y).


Calculating the MAP estimate is an optimization problem; calculating the CM estimate is an integration problem.

Monte Carlo integration: If n is large, quadrature methods are not feasible.

MC methods: Assume that we have a sample

S = {x1, x2, . . . , xN},   xj ∈ Rn.

Write

xCM = ∫_Rn x π(x | y) dx ≈ Σ_{j=1}^N wj xj,

where wj ∝ π(xj | y), normalized so that Σj wj = 1.

Importance sampling: Generate the sample S randomly.

Simple but inefficient (in particular when n is large).
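A minimal self-normalized importance-sampling sketch of this CM approximation, assuming a toy two-dimensional unnormalized posterior and a uniform proposal over a box (all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def post(x):
    """Unnormalized toy posterior on R^2 (illustrative stand-in for pi(x | y))."""
    return np.exp(-0.5 * np.sum((x / np.array([1.0, 0.3])) ** 2, axis=-1))

N = 100_000
samples = rng.uniform(-3.0, 3.0, size=(N, 2))   # uniform proposal over [-3, 3]^2

w = post(samples)
w /= w.sum()                                    # self-normalized weights, sum to 1
x_cm = (w[:, None] * samples).sum(axis=0)       # approximate conditional mean
```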


A better idea: Generate the sample using the density π(x | y).

Ideal case: The points xj are distributed according to the density π(x | y), and

xCM = ∫_Rn x π(x | y) dx ≈ (1/N) Σ_{j=1}^N xj.

Markov chain Monte Carlo (MCMC) methods: Generate the sample sequentially,

x0 → x1 → . . . → xj → xj+1 → . . . → xN.

Idea: Define a transition probability P(xj, Bj+1),

P(xj, Bj+1) = P{Xj+1 ∈ Bj+1 | Xj = xj}.


Assuming that Xj has probability density πj(xj),

P{Xj+1 ∈ Bj+1} = ∫_Rn P(xj, Bj+1) πj(xj) dxj = πj+1(Bj+1).

Choose the transition kernel so that π(x | y) is an invariant measure:

∫_B π(x | y) dx = ∫_Rn P(x′, B) π(x′ | y) dx′.

Then all the variables Xj are distributed according to π(x | y).

Best known algorithms:

Metropolis-Hastings, Gibbs sampler.
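A minimal random-walk Metropolis–Hastings sketch that targets an unnormalized density via its logarithm; the step size and the example target below are illustrative assumptions, not from the lectures.

```python
import numpy as np

def metropolis_hastings(log_post, x0, n_steps, step=0.5, rng=None):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = rng or np.random.default_rng()
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_steps, x.size))
    for j in range(n_steps):
        x_prop = x + step * rng.standard_normal(x.size)   # propose a move
        lp_prop = log_post(x_prop)
        if np.log(rng.uniform()) < lp_prop - lp:          # accept with prob. min(1, ratio)
            x, lp = x_prop, lp_prop
        chain[j] = x
    return chain

# Example: a chain targeting a standard Gaussian in two dimensions
chain = metropolis_hastings(lambda x: -0.5 * x @ x, np.zeros(2), 5000)
```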


Gibbs sampler: Update one component at a time as follows.

Given x^j = [x^j_1, x^j_2, . . . , x^j_n]:

draw x^{j+1}_1 from t ↦ π(t, x^j_2, . . . , x^j_n | y),

draw x^{j+1}_2 from t ↦ π(x^{j+1}_1, t, x^j_3, . . . , x^j_n | y),

...

draw x^{j+1}_n from t ↦ π(x^{j+1}_1, x^{j+1}_2, . . . , x^{j+1}_{n−1}, t | y).
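A minimal Gibbs-sampler sketch for a case where the full conditionals are available in closed form, here a zero-mean bivariate Gaussian with correlation ρ (an illustrative choice):

```python
import numpy as np

def gibbs_gaussian(n_steps, rho=0.9, rng=None):
    """Gibbs sampling of a zero-mean bivariate Gaussian with correlation rho.
    The full conditionals are x1 | x2 ~ N(rho*x2, 1-rho^2) and vice versa."""
    rng = rng or np.random.default_rng()
    x = np.zeros(2)
    chain = np.empty((n_steps, 2))
    s = np.sqrt(1.0 - rho ** 2)
    for j in range(n_steps):
        x[0] = rho * x[1] + s * rng.standard_normal()   # draw x1 from pi(x1 | x2)
        x[1] = rho * x[0] + s * rng.standard_normal()   # draw x2 from pi(x2 | x1)
        chain[j] = x
    return chain
```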


Define a cost function Ψ : Rn × Rn → R.

The Bayes cost of an estimator x̂ = x̂(y) is defined as

B(x̂) = E{Ψ(X, x̂(Y))} = ∫∫ Ψ(x, x̂(y)) π(x, y) dx dy.

Further, we can write

B(x̂) = ∫ [ ∫ Ψ(x, x̂) π(y | x) dy ] πpr(x) dx = ∫ B(x̂ | x) πpr(x) dx = E{B(x̂ | x)},

where

B(x̂ | x) = ∫ Ψ(x, x̂) π(y | x) dy

is the conditional Bayes cost.


The Bayes cost method: Fix Ψ and define the estimator x̂B so that

B(x̂B) ≤ B(x̂)

for all estimators x̂ of x.

By Bayes’ formula,

B(x̂) = ∫ [ ∫ Ψ(x, x̂) π(x | y) dx ] π(y) dy.

Since π(y) ≥ 0 and x̂(y) depends only on y,

x̂B(y) = arg min_x̂ ∫ Ψ(x, x̂) π(x | y) dx = arg min_x̂ E{Ψ(X, x̂) | y}.


Mean square error criterion: Choose Ψ(x, x̂) = ‖x − x̂‖², giving

B(x̂) = E{‖X − X̂‖²} = trace(corr(X − X̂)),

where X̂ = x̂(Y), and

corr(X − X̂) = E{(X − X̂)(X − X̂)T} ∈ Rn×n.

This Bayes estimator is called the mean square estimator xMS. We have

xMS = ∫ x π(x | y) dx = xCM.


We have

E{‖X − x̂‖² | y} = E{‖X‖² | y} − 2 E{X | y}T x̂ + ‖x̂‖²

= E{‖X‖² | y} − ‖E{X | y}‖² + ‖E{X | y} − x̂‖²

≥ E{‖X‖² | y} − ‖E{X | y}‖²,

and equality holds only if

x̂(y) = E{X | y} = xCM.

Furthermore,

E{X − xCM} = E{X − E{X | y}} = 0.


Question: xCM is optimal, but is it informative?

[Figure: two one-dimensional posterior densities, panels (a) and (b), with the CM and MAP estimates marked.]

No estimate is foolproof. Optimality is subjective.


DISCRETIZED MODELS

Consider a linear model with additive noise,

y = Af + e, f ∈ H, y, e ∈ Rm.

Discretization, e.g. by collocation,

xn = [f(p1); f(p2); . . . ; f(pn)] ∈ Rn,

Af ≈ Anxn, An ∈ Rm×n.

Assume that the discretization scheme is convergent,

lim_{n→∞} ‖Af − Anxn‖ = 0.

Accurate discrete model:

y = ANxN + e,   ‖ANxN − Af‖ < tol.


Stochastic extension: Y = ANXN + E,

where Y , XN and E are random variables.

Passing to a coarse mesh. Possible reasons:

1. 2D and 3D applications, problems too large

2. Real time applications

3. Inverse modelling based on prescribed meshing

Coarse mesh model with n < N ,

Af ≈ Anxn, ‖Anxn −Af‖ > tol .

Stochastic extension of the simple reduced model is

Y = AnXn + E.


Inverse crime:

• Write

Y = AnXn + E,   (1)

and develop the inversion scheme based on this model,

• generate data with the simple reduced model and test the inversion method with these data.

Usually, inverse crime results are overly optimistic.

Questions:

1. How to model the discretization error?

2. How to model the prior information?

3. Is the inverse crime always significant?


PRIOR MODELLING

Assume a Gaussian model,

XN ∼ N(xN0, ΓN),

i.e., the prior density is

πpr(xN) ∝ exp(−(1/2)(xN − xN0)T(ΓN)⁻¹(xN − xN0)).

Projection (e.g. interpolation, averaging or downsampling),

P : RN → Rn,   XN ↦ Xn.

Then,

E{Xn} = E{PXN} = P E{XN} = PxN0,

E{Xn(Xn)T} = E{PXN(XN)TPT} = P E{XN(XN)T} PT,


and therefore,

Xn ∼ N(xn0, Γn) = N(PxN0, PΓNPT).

However, this is not what we normally do!

Example: H = continuous functions on [0, 1].

Discretization by multiresolution bases. Let

ϕ(t) = 1 for 0 ≤ t < 1, and ϕ(t) = 0 otherwise (t < 0 or t ≥ 1).

Define V^j, 0 ≤ j < ∞, with V^j ⊂ V^(j+1),

V^j = span{ϕ^j_k | 1 ≤ k ≤ 2^j},

where

ϕ^j_k(t) = 2^(j/2) ϕ(2^j t − k + 1).


Discrete representation,

f^j(t) = Σ_{k=1}^{2^j} x^j_k ϕ^j_k(t) ∈ V^j.

Projector P : x^j ↦ x^(j−1),

P = (1/√2) (I_{2^(j−1)} ⊗ [1 1]) = (1/√2) ×
    [ 1 1 0 0 . . . 0 0
      0 0 1 1 . . . 0 0
      ...
      0 0 0 0 . . . 1 1 ]   ∈ R^(2^(j−1) × 2^j).
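A small sketch of this projector, using numpy's Kronecker product (the helper name is hypothetical):

```python
import numpy as np

def haar_projector(j):
    """Projector P mapping level-j coefficients x^j to level-(j-1) coefficients."""
    return np.kron(np.eye(2 ** (j - 1)), np.array([[1.0, 1.0]])) / np.sqrt(2.0)

P = haar_projector(3)   # shape (4, 8): pairwise sums scaled by 1/sqrt(2)
```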


Assume the prior information f ∈ C^2_0([0, 1]).

Second order smoothness prior for XN, N = 2^j:

πpr(xN) ∝ exp(−(1/2)α‖LNxN‖²) = exp(−(1/2)(xN)T[α(LN)TLN]xN),

where

LN = 2^(2j) ×
    [ −2   1   0  . . .   0
       1  −2   1
       0   1  −2
       ...            .   1
       0   . . .      1  −2 ]   ∈ RN×N.

The prior covariance is

ΓN = [α(LN)TLN]⁻¹.


Passing to level n = 2^(j−1) = N/2:

Ln = 2^(2(j−1)) ×
    [ −2   1   0  . . .   0
       1  −2   1
       0   1  −2
       ...            .   1
       0   . . .      1  −2 ] = PLNPT ∈ Rn×n.

A natural candidate for the smoothness prior for Xn is

πpr(xn) ∝ exp(−(1/2)α‖Lnxn‖²) = exp(−(1/2)(xn)T[α(Ln)TLn]xn).

But this is inconsistent, since

Γ̃n = [α(Ln)TLn]⁻¹ ≠ P[α(LN)TLN]⁻¹PT = Γn.
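A short numerical check of this inconsistency, under illustrative assumptions (the boundary rows of the second-difference matrices are treated exactly as displayed above, and the projector is the pairwise averaging matrix from the earlier sketch):

```python
import numpy as np

def second_diff(n, scale):
    """Scaled 1D second-difference matrix, as in the smoothness prior."""
    L = np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return scale * L

j, alpha = 6, 1.0
N, n = 2 ** j, 2 ** (j - 1)
LN = second_diff(N, 2.0 ** (2 * j))
Ln = second_diff(n, 2.0 ** (2 * (j - 1)))
P = np.kron(np.eye(n), [[1.0, 1.0]]) / np.sqrt(2.0)

Gamma_N = np.linalg.inv(alpha * LN.T @ LN)
Gamma_n = P @ Gamma_N @ P.T                       # projected fine-mesh covariance
Gamma_tilde = np.linalg.inv(alpha * Ln.T @ Ln)    # naive coarse-mesh covariance

# Relative difference is clearly nonzero: the two coarse-mesh priors disagree.
print(np.linalg.norm(Gamma_tilde - Gamma_n) / np.linalg.norm(Gamma_n))
```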


Numerical example:

Af(t) = ∫_0^1 K(t − s) f(s) ds,   K(s) = exp(−κs²),

where κ = 15. Sampling:

yj = Af(tj) + ej,   tj = (j − 1/2)/50,   1 ≤ j ≤ 50,

and

E ∼ N(0, σ²I),   σ = 2% of max(Af(tj)).

Smoothness prior

πpr(xN) ∝ exp(−(1/2)α‖LNxN‖²),   N = 512.

Reduced model with n = 8.
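A sketch of how this test problem might be discretized (midpoint quadrature for the convolution and an illustrative true profile; these specific choices are assumptions, not taken from the lectures):

```python
import numpy as np

kappa, m, N = 15.0, 50, 512
t = (np.arange(1, m + 1) - 0.5) / m          # measurement points t_j
s = (np.arange(1, N + 1) - 0.5) / N          # quadrature nodes on [0, 1]

# A_N: midpoint rule for (Af)(t_j) = int_0^1 K(t_j - s) f(s) ds
AN = np.exp(-kappa * (t[:, None] - s[None, :]) ** 2) / N

rng = np.random.default_rng(0)
f_true = np.exp(-60.0 * (s - 0.4) ** 2)      # illustrative true profile
y_clean = AN @ f_true
sigma = 0.02 * np.max(np.abs(y_clean))       # 2% of max |Af(t_j)|
y = y_clean + sigma * rng.standard_normal(m)
```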


Figure 1: MAP estimates with N = 512, n = 8, computed with the two coarse-mesh prior covariances Γn and Γ̃n (black and red dots).


DISCRETIZATION ERROR

From fine mesh to coarse mesh: Complete error model

Y = ANXN + E   (2)
  = AnXn + (AN − AnP)XN + E
  = AnXn + Ediscr + E.

Error covariance: Assume that E and XN are mutually independent,

E ∼ N(0, Γe),   XN ∼ N(xN0, ΓN).

The complete error Ẽ = Ediscr + E is Gaussian,

Ẽ ∼ N(e0, Γ̃e),

where

e0 = (AN − AnP)xN0,

Γ̃e = (AN − AnP)ΓN(AN − AnP)T + Γe.
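A small sketch of these complete-error statistics, assuming AN, An, P, ΓN, Γe and xN0 have already been assembled (e.g., as in the earlier sketches):

```python
import numpy as np

def complete_error_stats(AN, An, P, x0_N, Gamma_N, Gamma_e):
    """Mean and covariance of the total error E_discr + E in Y = An Xn + E_discr + E."""
    D = AN - An @ P                              # discretization error operator AN - An P
    e0 = D @ x0_N
    Gamma_tilde_e = D @ Gamma_N @ D.T + Gamma_e
    return e0, Gamma_tilde_e
```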


Error variance:

var(Ẽ) = E{‖Ẽ − e0‖²} = E{‖Ediscr − e0‖²} + E{‖E‖²}

= trace((AN − AnP)ΓN(AN − AnP)T) + trace(Γe)

= var(Ediscr) + var(E).

The complete error model is noise dominated if

var(Ediscr) < var(E),

and modelling error dominated if

var(Ediscr) > var(E).


Enhanced error model: Use the likelihood and prior

π(y | xn) ∝ exp(−(1/2)(y − Anxn − y0)T Γ̃e⁻¹ (y − Anxn − y0)),

πpr(xn) ∝ exp(−(1/2)(xn − xn0)T (Γnpr)⁻¹ (xn − xn0)),

where

y0 = E{Y} = An E{Xn} + e0 = AnPxN0 + (AN − AnP)xN0 = ANxN0.


The MAP estimate, denoted xneem, is

xneem = arg min { ‖Lnpr(xn − xn0)‖² + ‖Le(Anxn − (y − y0))‖² }

= arg min ‖ [Lnpr; LeAn] xn − [Lnpr xn0; Le(y − y0)] ‖²,

where

Lnpr = chol((Γnpr)⁻¹),   Le = chol(Γ̃e⁻¹).

This leads to a normal equation of size n × n.

Note: The enhanced error model is not the complete error model, because Xn is correlated with the complete error Ẽ through XN.
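A sketch of the stacked least-squares computation above (helper names are illustrative; the Cholesky factors are arranged so that LTL equals the corresponding inverse covariance, and the stacked system is solved without forming the normal equations):

```python
import numpy as np
from numpy.linalg import cholesky, inv, lstsq

def map_enhanced(An, y, y0, x0_n, Gamma_pr_n, Gamma_tilde_e):
    """MAP estimate for the enhanced error model via a stacked least-squares problem."""
    L_pr = cholesky(inv(Gamma_pr_n)).T      # L_pr^T L_pr = Gamma_pr_n^{-1}
    L_e = cholesky(inv(Gamma_tilde_e)).T    # L_e^T  L_e  = Gamma_tilde_e^{-1}
    K = np.vstack([L_pr, L_e @ An])
    r = np.concatenate([L_pr @ x0_n, L_e @ (y - y0)])
    return lstsq(K, r, rcond=None)[0]
```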


Complete error model: Assume, for a while, that Xn and Y are zero mean. We have

Xn = PXN,   Y = ANXN + E.

The variable Z = [Xn; Y] is Gaussian, with covariance

E{ZZT} = [ E{Xn(Xn)T}   E{Xn YT}
           E{Y(Xn)T}    E{Y YT} ]
       = [ PΓNPT          PΓN(AN)T
           ANΓNPT         ANΓN(AN)T + Γe ].

From this, calculate the conditional density π(xn | y).


π(xn | y) ∼ N(xncem, Γncem),

where

xncem = PxN0 + PΓNpr(AN)T[ANΓNpr(AN)T + Γe]⁻¹(y − ANxN0),

and

Γncem = PΓNprPT − PΓNpr(AN)T[ANΓNpr(AN)T + Γe]⁻¹ANΓNprPT.

Note: The computation of xncem requires solving an m × m system, independently of n. (Compare to xneem.)
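A sketch of these formulas (the helper name is illustrative); note that only an m × m system is solved, regardless of n:

```python
import numpy as np

def cm_complete_error(AN, P, y, x0_N, Gamma_N, Gamma_e):
    """Conditional mean and covariance of Xn = P XN given Y = AN XN + E."""
    S = AN @ Gamma_N @ AN.T + Gamma_e                  # m x m system matrix
    C = P @ Gamma_N @ AN.T                             # cov(Xn, Y)
    x_cem = P @ x0_N + C @ np.linalg.solve(S, y - AN @ x0_N)
    Gamma_cem = P @ Gamma_N @ P.T - C @ np.linalg.solve(S, C.T)
    return x_cem, Gamma_cem
```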


Example: Full angle tomography.

Figure 2: True object and the discretized model, with the X-ray source and detector indicated.


Intensity decrease along a line segment dℓ:

dI = −Iµ dℓ,

where µ = µ(p) ≥ 0, p ∈ Ω, is the mass absorption.

Let I0 be the intensity of the transmitted X-ray. The received intensity I satisfies

log(I/I0) = ∫_{I0}^{I} dI/I = −∫_ℓ µ(p) dℓ(p).

Inverse problem of X-ray tomography: Estimate µ : Ω → R+ from the values of its integrals along a set of straight lines passing through Ω.
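A crude sketch of how such line integrals could be discretized on a pixel grid by sampling points along each ray; a careful implementation would use exact ray–pixel intersection lengths, so this is only illustrative.

```python
import numpy as np

def ray_matrix(rays, n_pix, n_samples=200):
    """Approximate line-integral matrix: row i integrates mu over ray i.
    Each ray is ((x0, y0), (x1, y1)) with endpoints in the unit square [0, 1]^2."""
    A = np.zeros((len(rays), n_pix * n_pix))
    for i, (p0, p1) in enumerate(rays):
        p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
        length = np.linalg.norm(p1 - p0)
        ts = (np.arange(n_samples) + 0.5) / n_samples
        pts = p0 + ts[:, None] * (p1 - p0)                     # sample points on the ray
        ix = np.clip((pts[:, 0] * n_pix).astype(int), 0, n_pix - 1)
        iy = np.clip((pts[:, 1] * n_pix).astype(int), 0, n_pix - 1)
        np.add.at(A[i], iy * n_pix + ix, length / n_samples)   # accumulate d-ell per pixel
    return A
```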


Figure 3: Sinogram data.


Gaussian structural smoothness prior: three weakly correlated subregions; inside each region, the pixels are mutually correlated.

Figure 4: Prior geometry.


Construction of the prior: Pixel centers pj, 1 ≤ j ≤ N.

Divide the pixels into cliques C1, C2 and C3. In medical imaging, this is called image segmentation.

Define the neighbourhood system 𝒩 = {Ni | 1 ≤ i ≤ N}, Ni ⊂ {1, 2, . . . , N}, where

j ∈ Ni if and only if pixels pi and pj are neighbours and in the same clique.

Define the density of a Markov random field X as

πMRF(x) ∝ exp(−(1/2)α Σ_{j=1}^N |xj − cj Σ_{i∈Nj} xi|²) = exp(−(1/2)α xTBx),

where the coupling constant cj depends on the clique.


The matrix B is singular.

Remedy: Select a few points {pj | j ∈ I′′}, where I′′ ⊂ I = {1, 2, . . . , N}. Let I′ = I \ I′′.

Denote x = [x′; x′′].

The conditional density πMRF(x′ | x′′) (i.e., x′′ fixed) is a proper measure with respect to x′.

Define

πpr(x) = πMRF(x′ | x′′) π0(x′′),

where π0 is Gaussian, e.g.,

π0 ∼ N(0, γ0²I).


Figure 5: Four random draws from the prior density.


Data are generated on an N = 84 × 84 mesh; inverse solutions are computed on an n = 42 × 42 mesh.

Proper data y ∈ Rm and inverse crime data yic ∈ Rm:

y = ANxNtrue + e,   yic = AnPxNtrue + e,

where xNtrue is drawn from the prior density and e is a realization of

E ∼ N(0, σ²I),

where

σ² = κ m⁻¹ trace((AN − AnP)ΓN(AN − AnP)T),   0.1 ≤ κ ≤ 10.

In other words,

0.1 ≤ κ = (noise variance)/(discretization error variance) ≤ 10.
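A sketch of this data-generation step, assuming a zero-mean prior and the matrices from the earlier sketches (all names are illustrative):

```python
import numpy as np

def make_data(AN, An, P, Gamma_N, kappa, rng):
    """Proper data y and inverse-crime data y_ic for one draw from the prior."""
    N = Gamma_N.shape[0]
    m = AN.shape[0]
    x_true = np.linalg.cholesky(Gamma_N) @ rng.standard_normal(N)   # draw from N(0, Gamma_N)
    D = AN - An @ P
    sigma2 = kappa * np.trace(D @ Gamma_N @ D.T) / m
    e = np.sqrt(sigma2) * rng.standard_normal(m)
    y = AN @ x_true + e             # proper data: fine-mesh forward model
    y_ic = An @ (P @ x_true) + e    # inverse-crime data: reduced model generates the data
    return x_true, y, y_ic
```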


What is the structure of the discretization error? Can we approximate it by Gaussian white noise?

Figure 6: The diagonal ΓA(k, k) and the first off-diagonal ΓA(k, k+1) of the discretization error covariance, plotted against the projection number.


Error analysis:

1. Draw a sample xN1, xN2, . . . , xNS, S = 500, from the prior density.

2. Choose the noise level σ = σ(κ) and generate data y1(κ), y2(κ), . . . , yS(κ), both the proper and the inverse crime versions.

3. Calculate the estimates x̂(y1(κ)), x̂(y2(κ)), . . . , x̂(yS(κ)).

4. Estimate the estimation error,

E{‖X − X̂(κ)‖²} ≈ (1/S) Σ_{j=1}^S ‖x̂(yj(κ)) − xj‖².

Estimators: CM, CM with the enhanced error model, and truncated CGNR with the Morozov discrepancy principle, discrepancy

δ² = τ E{‖E‖²} = τ m σ(κ)²,   τ = 1.1.


Figure 7: Estimation errors ‖x̂ − x‖² versus noise level for the estimators CG, CG IC, CM and CM Corr. The dashed line is var(Ediscr).


[Figure: panels labelled by error level: 0.0029247, 0.0047491, 0.0060516, 0.0077115 and 0.11093.]


Example: Estimation error. If x̂ = x̂(y) is an estimator, define the relative estimation error as

D(x̂) = E{‖X − X̂‖²} / E{‖X‖²}.

Observe:

D(0) = 1,

D(xCM) ≤ D(x̂) for any estimator x̂.

Test case: Limited angle tomography; reconstructions with the truncated singular value decomposition (TSVD) versus the CM estimate.

Calculate D(xTSVD) and D(xCM) by ensemble averaging (S = 500).


TSVD estimate:

y = Ax + e.

Singular value decomposition: A = UDVT, where

U = [u1, u2, . . . , um] ∈ Rm×m,   V = [v1, v2, . . . , vn] ∈ Rn×n,

and

D = diag(d1, d2, . . . , dmin(n,m)) ∈ Rm×n,   d1 ≥ d2 ≥ . . . ≥ dmin(n,m) ≥ 0.

xTSVD(y, r) = Σ_{j=1}^r (1/dj)(uTj y) vj,

and the truncation parameter r is chosen, e.g., by the Morozov discrepancy principle,

‖y − AxTSVD(y, r)‖² ≤ τ E{‖E‖²} < ‖y − AxTSVD(y, r − 1)‖².
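A sketch of the TSVD estimate with the truncation index chosen by this discrepancy criterion (the function name and interface are illustrative):

```python
import numpy as np

def tsvd_morozov(A, y, expected_noise_sq, tau=1.1):
    """TSVD reconstruction; r is the smallest index with ||y - A x_r||^2 <= tau * E||E||^2."""
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    delta2 = tau * expected_noise_sq
    x = np.zeros(A.shape[1])
    for j in range(len(d)):
        if np.sum((y - A @ x) ** 2) <= delta2 or d[j] == 0.0:   # discrepancy reached: stop
            break
        x = x + (U[:, j] @ y) / d[j] * Vt[j]                    # add the j-th singular component
    return x
```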


[Figure: empirical densities of the estimation error ‖x̂ − x‖² for the CM and TSVD estimates.]


[Figure: a second pair of empirical densities of ‖x̂ − x‖² for the CM and TSVD estimates.]


CONCLUSIONS

• The Bayesian approach is useful for incorporating complex prior information into inverse solvers.

• It is not a method for producing a single estimator, although it can be used as a tool for that, too.

• It facilitates error analysis of discretization, modelling and estimation by deterministic methods.

• Working with ensembles makes it possible to analyze non-linear problems as well (e.g. EIT, OAST).