Numerical Methods for Optimization-based Model Validation · 2016-09-02 · Numerical Methods for...

Numerical Methods for Optimization-based Model Validation

Ekaterina A. Kostina

Institute for Applied Mathematics (IAM)

Interdisciplinary Center for Scientific Computing (IWR)

Heidelberg Collaboratory for Industrial Optimization (HCO)

SAMSI Optimization Workshop

August 29 - September 2, 2016, Research Triangle Park, USA

What Models Are We Looking For?

Why Do We Need These Models?

What Models Are We Looking For?

These? ... No!


Maybe this? ... Also no!

Photo by Manfred Werner-Tsui

Barbara Meier, who studied mathematics, winner of the second cycle of Germany’s Next Topmodel


... these also no!

Möbius band

Klein bottle

But The Models That Allow Simulation

From Merriam-Webster’s Dictionary

Why Do We Need Simulation Models?

E.g. the concept of “digital twins”(GE, Siemens, ...)

GE Report Oct 4, 2015

Every new product or physical or economic process has a digital twin which

• includes a collection of models and algorithms;
• accompanies the process from its “origin” (modeling) through its lifetime, “growing” together with the process;
• is used, among other things, to analyze data, predict malfunctions and perform optimal operation.

“The Wall Street Journal”, Jan 15, 2016: GE Digital CEO Says “Digital Twins” Will Optimize Machinery, Human Health

Another example: development and admission of drugs

Necessary precondition: validated models with reliable parameter estimates

Life Cycle of the Mathematical Model

Outline

1 Parameter Estimation

2 Sensitivity Analysis of Parameter Estimates

3 Design of Optimal Experiments

4 Applications

First Step in Model Validation: Parameter Estimation

Constrained Parameter Estimation Problem

Determine parameters p and states y to minimize the deviation of the model response M[y; p] from the measurements η in a suitable norm (e.g. weighted l2, l1, hybrid)

$$\min_{y,p} \; \|\eta - M[y; p]\|,$$

such that the model equations

$$F[y; p] = 0$$

and additional “experimental constraints” are satisfied

$$C[y; p] \ge 0$$

Mathematical Model Classes for Dynamical Processes F[y; p] = 0

Differential-Algebraic Equations (DAE)

$$B(x, z, p)\,\dot{x} = f(x, z, p, q, u)$$

$$0 = g(x, z, p, q, u)$$

x: differential variables

z: algebraic variables, y = (x, z)

p: unknown parameters (to be estimated by PE)

q: design parameters (to be chosen in OED)

u: controls (to be chosen in OED)

Partial-Differential Equations (PDE)

$$u_t - \nabla \cdot (K \nabla u) = f(u, p)$$

→ Semidiscretization in space or Rothe method

initial and boundary conditions!

Observation Model and Data

$$\eta_i = M(t_i, y_{\mathrm{true}}(t_i), p_{\mathrm{true}}) + \varepsilon_i$$

at time points $t_i$, $i = 1, \dots, m_1$

Data from multiple experiments under varying conditions

M: nonlinear function of states (differential and algebraic)

$\varepsilon_i$: measurement errors

independent; in this talk: $\varepsilon_i \sim N(0, \sigma_i^2)$

Numerical Methods for Parameter Estimation in ODE/DAE/PDE

Bock et al 1981, ..., K. et al 1997, ...

Direct “all-at-once” multiple shooting method

Constrained Gauss Newton method

Structure exploitation (multiple experiments, parameterization in time, spatial discretization of PDE)

Efficient line search strategies

Use of inexact Jacobians

Robust parameter estimation based on l1 or Huber

After (Multiple Shooting) Discretization: Large-Scale l2 Parameter Estimation Problem

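$$\min_x \; \tfrac{1}{2}\|F_1(x)\|_2^2, \quad \text{s.t. } F_2(x) = 0$$

Optimization variables: $x^T := (p^T, s^T) \in \mathbb{R}^n$, with discretization variables $s$

Cost function:

$$F_1(x) := \Sigma^{-1} \begin{pmatrix} \eta_1 - M(t_1, y(t_1; p, s), p) \\ \vdots \\ \eta_{m_1} - M(t_{m_1}, y(t_{m_1}; p, s), p) \end{pmatrix} \in \mathbb{R}^{m_1}$$

$$\Sigma^{-1} = \operatorname{diag}\left(\frac{w_i}{\sigma_i},\; i = 1, \dots, m_1\right), \quad w_i \in \{0, 1\}$$

Constraints:

$$F_2(x) := \begin{pmatrix} C(p, s) \\ \text{Discretized BVP} \end{pmatrix} \in \mathbb{R}^{m_2}$$

Jacobians: $J_1(x) := \frac{\partial F_1(x)}{\partial x}$ and $J_2(x) := \frac{\partial F_2(x)}{\partial x}$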
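As a minimal sketch of how the weighted residual vector F1 and the sampling weights w_i interact, the following Python fragment evaluates F1 = Σ^{-1}(η − M) and the objective for made-up numbers. The data values, the function names `weighted_residuals` and `objective`, and the choice of three candidate measurements are assumptions for illustration only and do not come from the talk.

```python
def weighted_residuals(eta, model, sigma, w):
    """Return F1 = Sigma^{-1} (eta - model) with Sigma^{-1} = diag(w_i / sigma_i)."""
    return [wi / si * (ei - mi) for ei, mi, si, wi in zip(eta, model, sigma, w)]


def objective(f1):
    """Return 0.5 * ||F1||_2^2."""
    return 0.5 * sum(r * r for r in f1)


# Hypothetical data: three candidate measurements, the second one switched off (w = 0).
eta = [1.10, 0.70, 0.35]      # measurements
model = [1.00, 0.60, 0.40]    # model responses at the same time points
sigma = [0.1, 0.1, 0.05]      # standard deviations of the measurement errors
w = [1, 0, 1]                 # sampling weights in {0, 1}

f1 = weighted_residuals(eta, model, sigma, w)
print(f1, objective(f1))
```

A weight w_i = 0 removes a candidate measurement from the fit without changing the dimension of F1, which is what makes the weights usable as design variables later on.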

Solution Approach: Generalized Gauss-Newton

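Solve iteratively, given an initial guess $x_0$:

$$x_{k+1} = x_k + [t_k]\,\Delta x_k$$

$\Delta x_k$ solves the linearized problem

$$\min_{\Delta x \in \Omega_k} \; \tfrac{1}{2}\|F_1(x_k) + J_1(x_k)\Delta x\|_2^2 \quad \text{s.t. } F_2(x_k) + J_2(x_k)\Delta x = 0$$

$$\Delta x_k = -J^+(x_k) F(x_k)$$

$J^+$ is a generalized inverse, $J^+ J J^+ = J^+$, with

$$J = \begin{pmatrix} J_1 \\ J_2 \end{pmatrix}, \quad F = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}$$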
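The iteration above can be sketched on a toy problem. The Python fragment below is only an illustration under stated assumptions: the model a·exp(−k t), the noise-free synthetic data, the equality constraint a = 2, the full step t_k = 1, and all function names are hypothetical and not taken from the talk. The increment is computed from the KKT system of the linearized problem, which is one possible way to apply the generalized inverse; structure-exploiting solvers as used in practice work differently.

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


T = [0.0, 1.0, 2.0, 3.0, 4.0]
ETA = [2.0 * math.exp(-0.5 * t) for t in T]  # synthetic, noise-free data


def F1(x):  # residuals eta_i - a * exp(-k * t_i), all sigma_i = 1
    a, k = x
    return [eta - a * math.exp(-k * t) for eta, t in zip(ETA, T)]


def J1(x):  # derivative of F1 with respect to (a, k)
    a, k = x
    return [[-math.exp(-k * t), a * t * math.exp(-k * t)] for t in T]


def F2(x):  # hypothetical equality constraint a = 2
    return [x[0] - 2.0]


def J2(x):
    return [[1.0, 0.0]]


def gauss_newton(x, iters=20):
    """Full-step constrained Gauss-Newton iteration via the KKT system."""
    for _ in range(iters):
        r1, j1, r2, j2 = F1(x), J1(x), F2(x), J2(x)
        n, m2 = len(x), len(r2)
        K = [[0.0] * (n + m2) for _ in range(n + m2)]
        rhs = [0.0] * (n + m2)
        for i in range(n):
            for j in range(n):
                K[i][j] = sum(row[i] * row[j] for row in j1)      # J1^T J1
            rhs[i] = -sum(row[i] * ri for row, ri in zip(j1, r1))  # -J1^T F1
        for c in range(m2):
            for j in range(n):
                K[n + c][j] = K[j][n + c] = j2[c][j]               # J2 and J2^T
            rhs[n + c] = -r2[c]                                    # -F2
        dx = solve(K, rhs)[:n]
        x = [xi + di for xi, di in zip(x, dx)]
    return x


x_star = gauss_newton([1.0, 0.3])
print(x_star)
```

Because the data are noise-free, the residual vanishes at the solution and the iteration recovers a = 2 and k = 0.5; with noisy data a step-size strategy for t_k would normally be added.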

Parameter Estimation and Sensitivity Analysis

[Diagram: Dynamical Process; Measurements with Errors; Mathematical Model; Parameter Estimation Problem; Estimates x∗ (Quantitative Output); Qualitative Assessment of x∗ (Qualitative Output?)]

Likelihood Ratio Confidence Regions

$$D_{lr}(\alpha) = \{\, x \mid F_2(x) = 0,\; \|F_1(x)\|_2^2 - \|F_1(x^*)\|_2^2 \le \gamma_m^2(\alpha) \,\}$$

Beale 60, Draper and Smith 81, Pazman 93

$\gamma_m^2(\alpha)$ is the $(1-\alpha)$-quantile of the $\chi^2$-distribution with $m := n - m_2$ degrees of freedom

Advantage

High approximation quality

Disadvantage

Very difficult to compute

Linearized Confidence Regions

$$D_{lin}(\alpha) = \{\, x \mid J_2(x^*)(x - x^*) = 0,\; \|J_1(x^*)(x - x^*)\|_2^2 \le \gamma_m^2(\alpha) \,\}$$

Beale 60, Draper and Smith 81, Pazman 93

Advantage

Can be computed very efficiently

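Another View: Parameter Sensitivity w.r.t. Measurement Errors

Introduce an error weight $\tau \in [0, 1]$ and pass from the original problem

$$\min_x \|F_1(x)\|_2^2 \quad \text{s.t. } F_2(x) = 0$$

to the perturbed problem

$$\min_x \|F_1(x, \tau)\|_2^2 \quad \text{s.t. } F_2(x) = 0$$

where

$$F_1(x, \tau) = \Sigma^{-1} \begin{pmatrix} (M(t_1) + \tau\varepsilon_1) - M(t_1, y(t_1; s), p) \\ \vdots \\ (M(t_{m_1}) + \tau\varepsilon_{m_1}) - M(t_{m_1}, y(t_{m_1}; s), p) \end{pmatrix}$$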

Measurements $M$ ⇒ estimates $x(M)$, resp. $x(0)$

Measurements $M + \tau\varepsilon$ ⇒ estimates $x(M + \tau\varepsilon)$, resp. $x(\tau)$

Question: what is the distribution of x(τ)?

Lemma: Taylor series for the solution of the modified problem

• $x(\tau) = x + \tau J^+ \begin{pmatrix} \Sigma^{-1}\varepsilon \\ 0 \end{pmatrix} + O(\tau^2)$

Another View: Linearized Confidence Regions

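$$D_{lin}(\alpha) = \left\{\, x^* + \Delta x \;\middle|\; \Delta x = -J^+(x^*) \begin{pmatrix} \eta \\ 0 \end{pmatrix},\; \|\eta\|_2^2 \le \gamma_m^2(\alpha) \,\right\}$$

$$D_{lin}(\alpha) = \{\, x \mid J_2(x^*)(x - x^*) = 0,\; \|J_1(x^*)(x - x^*)\|_2^2 \le \gamma_m^2(\alpha) \,\}$$

Linear approximation of the covariance matrix

$$C = E(\Delta x\, \Delta x^T) = J^+(x^*) \begin{pmatrix} I_{m_1} & 0 \\ 0 & 0_{m_2} \end{pmatrix} J^{+T}(x^*)$$

$D_{lin}(\alpha)$ is enclosed by a (minimal) box

$$G_L(\alpha) \subset \mathop{\times}_{i=1}^{n} [x_i^* - \theta_i,\; x_i^* + \theta_i],$$

where $\theta_i = \sqrt{C_{ii}\,\gamma_m^2(\alpha)}$ and $C_{ii}$ are the diagonal elements of the covariance matrix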

Bock 1987
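To make the covariance matrix and the enclosing box concrete, here is a small Python sketch. It obtains C as the upper-left n×n block of the inverse of the KKT matrix built from J1ᵀJ1 and J2, which is the equivalent representation given on the cost-functional slide. The Jacobians `J1`, `J2` and the helper names are hypothetical toy values; the quantile 3.841 is the 0.95-quantile of the χ² distribution with one degree of freedom (m = n − m2 = 1 here).

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def covariance(J1, J2):
    """Upper-left n x n block of the inverse KKT matrix [[J1^T J1, J2^T], [J2, 0]]."""
    n, m2 = len(J1[0]), len(J2)
    size = n + m2
    K = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            K[i][j] = sum(row[i] * row[j] for row in J1)
    for c in range(m2):
        for j in range(n):
            K[n + c][j] = K[j][n + c] = J2[c][j]
    cols = []
    for i in range(n):
        unit = [1.0 if r == i else 0.0 for r in range(size)]
        cols.append(solve(K, unit)[:n])
    return cols  # C is symmetric, so the list of columns equals the list of rows


# Hypothetical toy Jacobians: two weighted measurements, one constraint x1 = x2.
J1 = [[1.0, 0.0], [0.0, 2.0]]
J2 = [[1.0, -1.0]]
GAMMA2 = 3.841  # 0.95-quantile of chi^2 with m = n - m2 = 1 degree of freedom

C = covariance(J1, J2)
theta = [math.sqrt(C[i][i] * GAMMA2) for i in range(len(C))]
print(C, theta)
```

For these numbers the constraint forces both components to move together, so all entries of C equal 0.2 and both half-widths θ_i coincide.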

Next Step in Model Validation: Design of Optimal Experiments

Design of Optimal Experiments

[Figure: confidence regions for Experimental Setup 1 and Experimental Setup 2, labeled “bad” and “good”]

Aim: choose optimal experimental conditions (design variables) to gain information about the parameter estimates (to reduce parameter uncertainty)

Design variables

$\xi^T := (q^T, w^T) \in \Xi$

• $q \in \mathbb{R}^{n_q}$: (discretized) control functions and control parameters
• $w \in \{0, 1\}^K$: sampling

Design of Optimal Experiments: Problem Formulation

$$\min_{\xi} \; \phi(\xi; x)$$

$$\text{s.t. } m_l \le \psi(t, x, \xi) \le m_u, \quad t \in T$$

$$0 = \chi(t, x, \xi), \quad t \in T$$

$$w \in \{0, 1\}^M$$

$\xi^T = (q^T, w^T)$ are the design variables

$x$ is an estimate of the true parameter and (discretized) state values $x_{\mathrm{true}}$, satisfying the underlying parameter estimation problem

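Design of Optimal Experiments: Cost Functionals

Typical cost functionals

φ is a functional of the covariance matrix

$$C = J^+(x) \begin{pmatrix} I_{m_1} & 0 \\ 0 & 0_{m_2} \end{pmatrix} J^{+T}(x) = \begin{pmatrix} I_n & 0 \end{pmatrix} \begin{pmatrix} J_1^T J_1 & J_2^T \\ J_2 & 0 \end{pmatrix}^{-1} \begin{pmatrix} I_n \\ 0 \end{pmatrix}$$

A: $\phi_A(C) = \frac{1}{n}\,\mathrm{Trace}(C)$

D: $\phi_D(C) = \det(K^T C K)^{\frac{1}{n}}$ with $K \in \mathbb{R}^{n \times n_k}$

E: $\phi_E(C) = \max\{\lambda \mid \lambda \text{ is an eigenvalue of } C\}$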
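A quick way to get a feeling for the three criteria is to evaluate them for a small covariance matrix. The Python sketch below does this for a symmetric 2×2 matrix, assuming K is the identity in the D-criterion; the function name `criteria_2x2` and the example matrix are illustrative choices, not part of the talk.

```python
import math


def criteria_2x2(C):
    """Return (phi_A, phi_D, phi_E) for a symmetric 2x2 covariance matrix C.

    Assumes K = identity in the D-criterion.
    """
    n = 2
    tr = C[0][0] + C[1][1]
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    phi_A = tr / n                # average variance
    phi_D = det ** (1.0 / n)      # n-th root of the determinant
    # Largest eigenvalue of a symmetric 2x2 matrix: tr/2 + sqrt((tr/2)^2 - det)
    phi_E = tr / 2.0 + math.sqrt(max((tr / 2.0) ** 2 - det, 0.0))
    return phi_A, phi_D, phi_E


phi_A, phi_D, phi_E = criteria_2x2([[2.0, 1.0], [1.0, 2.0]])
print(phi_A, phi_D, phi_E)
```

The A-criterion looks only at the diagonal, while D and E also react to the correlation between the two parameters.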

The OED problem is a complex non-standard nonlinear mixed-integer optimal control problem

• The cost function already involves 1st order derivatives of the PE problem and is implicitly defined by a generalized inverse
• Integer decisions: sampling design

Optimal Experimental Design is a Complex Non-Standard Nonlinear Mixed-Integer Optimal Control Problem

Theory originally developed for linear models, e.g. Kiefer, Wolfowitz 50’s, Box, Draper 1987, Pukelsheim 1993

Algorithmic developments: Lohmann 1990, Körkel 1998, ..., Bauer, Körkel et al 1998, ..., K. et al 2004, ...

Software VPLAN

• structure exploiting, direct, all-at-once optimization algorithms for DE models, based on multiple shooting
• efficient evaluation of higher order derivatives by structure exploiting “internal” and “algorithmic” differentiation
• optimal integer controls by exact lower bounds and intelligent approximation strategies
• robustification of optimization against uncertainties
• experimental design for model discrimination

What Improvements Can We Expect From Optimal Experimental Designs?

Application: Transport and Degradation of Xenobiotics in Soil

Transport and Degradation of Xenobiotics in Soil

Dieses, Bock, Schlöder, Richter 2002

• investigation of the fate of pesticides in soil needs expensive lysimeter experiments for registration
• replacement by computer experiments requires validated models!
• here: optimal mini-lysimeter experiments
  - optimal irrigation
  - optimal solute application

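Typical Column Outflow Experiment

Water transport: ⟹ 3 unknown parameters n, α, Ks

$$C(\psi)\frac{\partial \psi}{\partial t} = \frac{\partial}{\partial z}\left(K(\psi)\frac{\partial}{\partial z}(\psi - z)\right)$$

$$K(\psi) = K_s \frac{\left(1 - (\alpha|\psi|)^{n-1}\left(1 + (\alpha|\psi|)^n\right)^{1/n-1}\right)^2}{\left(1 + (\alpha|\psi|)^n\right)^{\frac{1-1/n}{2}}}$$

$$C(\psi) = \alpha(n-1)(\theta_s - \theta_r)(\alpha|\psi|)^{n-1}\left(1 + (\alpha|\psi|)^n\right)^{1/n-2}$$

Initial condition: $\psi(0, z) = 670$, $z \ge 0$

Upper boundary condition (control: water flux):

$$q(t, 0) = -K(\psi(t, 0))\left(\frac{\partial \psi(t, 0)}{\partial z} - 1\right)$$

Lower boundary condition: $\frac{\partial \psi(t, L_e)}{\partial z} = 0$, $t \ge 0$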
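The two constitutive functions K(ψ) and C(ψ) of the water transport model can be written down directly. The Python sketch below does so and checks C(ψ) against a finite difference of the standard van Genuchten retention curve θ(ψ) = θr + (θs − θr)(1 + (α|ψ|)^n)^(1/n − 1). That retention curve is not shown on the slide and is added here as an assumption, as are the function names and the parameter values used in the example.

```python
def conductivity(psi, Ks, alpha, n):
    """Hydraulic conductivity K(psi) as given on the slide."""
    h = alpha * abs(psi)
    m = 1.0 - 1.0 / n
    numerator = (1.0 - h ** (n - 1.0) * (1.0 + h ** n) ** (-m)) ** 2
    return Ks * numerator / (1.0 + h ** n) ** (m / 2.0)


def capacity(psi, alpha, n, theta_s, theta_r):
    """Water capacity C(psi) as given on the slide."""
    h = alpha * abs(psi)
    return (alpha * (n - 1.0) * (theta_s - theta_r)
            * h ** (n - 1.0) * (1.0 + h ** n) ** (1.0 / n - 2.0))


def water_content(psi, alpha, n, theta_s, theta_r):
    """Standard van Genuchten retention curve (assumption, not on the slide)."""
    h = alpha * abs(psi)
    return theta_r + (theta_s - theta_r) * (1.0 + h ** n) ** (1.0 / n - 1.0)


# Hypothetical parameter values, chosen only to exercise the formulas.
ALPHA, N, KS, THETA_S, THETA_R = 0.01, 2.0, 1.0, 0.40, 0.05

k_saturated = conductivity(0.0, KS, ALPHA, N)  # equals Ks at psi = 0

# C(psi) should match minus the slope of theta with respect to |psi|.
psi, d = 100.0, 1e-3
slope = (water_content(psi + d, ALPHA, N, THETA_S, THETA_R)
         - water_content(psi - d, ALPHA, N, THETA_S, THETA_R)) / (2.0 * d)
print(k_saturated, capacity(psi, ALPHA, N, THETA_S, THETA_R), -slope)
```

Both functions depend on all three unknown parameters n, α, Ks only through these closed-form expressions, which is why their sensitivities can be evaluated cheaply.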

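Typical Column Outflow Experiment

Solute transport: ⟹ 3 unknown parameters k, b, Dm

$$\frac{\partial(\theta c)}{\partial t} = \frac{\partial}{\partial z}\left(\theta D_h(\theta)\frac{\partial c}{\partial z}\right) - \frac{\partial(q c)}{\partial z} - k c$$

$$D_h(\theta) = \frac{0.0046\, e^{b\theta}}{\theta} + D_m$$

Initial condition: $c(0, z) = 0$, $z \ge 0$

Upper boundary condition (control: solute input):

$$-D_h(\theta(t, 0))\frac{\partial c(t, 0)}{\partial z} + \frac{q(t, 0)}{\theta(t, 0)}\, c(t, 0) = \frac{q(t, 0)}{\theta(t, 0)}\, c_0(t, 0)$$

Lower boundary condition: $\frac{\partial c(t, L_e)}{\partial z} = 0$, $t \ge 0$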

Typical Column Outflow Experiment

=⇒ Coupled water and solute transport: 6 unknown parameters

Optimization of Experimental Conditions

2 boundary controls, piecewise constant

• water flux q(t, 0): changeable only every 24 h, 0 ≤ q(t, 0) ≤ 0.6
• solute input c0(t, 0): changeable only every 24 h, 50 ≤ c0(t, 0) ≤ 200

Sampling scheme: fixed measurement times

• 24 outflow measurements: 12 days, every 12 h (σ = 0.01)
• 6 profile measurements (σ = 0.1): only at the end of the experiment, evaluation of solute concentrations at 6 locations

Parameterized Expert Designs

Input water flux q(t, 0)

Substance input concentration c0(t, 0)

Quality of Expert Designs: Criterion Values and Standard Deviations

Expert Design A

A-Criterion = 3.8859

Scaled standard deviations:

n     1.0 ± 0.1604
α     1.0 ± 1.5968
Ks    1.0 ± 3.3752
k     1.0 ± 0.0124
b     1.0 ± 3.0460
Dm    1.0 ± 0.2644

Expert Design B

A-Criterion = 0.75

Scaled standard deviations:

n     1.0 ± 0.0800
α     1.0 ± 0.8600
Ks    1.0 ± 1.6736
k     1.0 ± 0.0083
b     1.0 ± 0.9712
Dm    1.0 ± 0.1033

Optimal Experimental Conditions

Input water flux q(t, 0)

Substance input concentration c0(t, 0)

Comparison: Optimal Design vs Expert Designs A and B

              Optimal design    Expert design A    Expert design B
A-Criterion   0.0472            3.8859             0.750
n             1.0 ± 0.0191      ± 0.1604           ± 0.0800
α             1.0 ± 0.2438      ± 1.5968           ± 0.8600
Ks            1.0 ± 0.4348      ± 3.3751           ± 1.6735
k             1.0 ± 0.0086      ± 0.0124           ± 0.0083
b             1.0 ± 0.1852      ± 3.0460           ± 0.9712
Dm            1.0 ± 0.0108      ± 0.2644           ± 0.1032

All parameters can be simultaneously estimated!
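The size of the improvement can be read off by dividing the A-criterion values of the expert designs by that of the optimal design. The short Python fragment below does this arithmetic with the numbers from the comparison table; the variable names are illustrative.

```python
# A-criterion values copied from the comparison table.
a_criterion = {"optimal": 0.0472, "expert A": 3.8859, "expert B": 0.750}

# Factor by which the optimal design reduces the A-criterion.
improvement = {
    name: value / a_criterion["optimal"]
    for name, value in a_criterion.items()
    if name != "optimal"
}

for name, factor in improvement.items():
    print(f"{name}: A-criterion larger by a factor of {factor:.1f}")
```

According to these numbers the optimal design reduces the A-criterion by a factor of roughly 82 relative to expert design A and roughly 16 relative to expert design B.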

Ongoing Developments

• Rich and well advanced theory of optimal experimental design for linear models
• Very high relevance in industrial applications
• BASF: “We save up to 80% costs with these methods and get much more precise results!”

“Bildzeitung”, 17.06.2008


Efficient numerical methods for

• robustification of OED against uncertainties in parameter estimates
• parabolic PDE models
• on-line design of optimal experiments

Robustification of OED Based on Second Order Sensitivity Analysis

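Consider again the parametric PE problem depending on an error weight $\tau \in [0, 1]$, passing from

$$\min_x \|F_1(x)\|_2^2 \quad \text{s.t. } F_2(x) = 0$$

to

$$\min_x \|F_1(x, \tau)\|_2^2 \quad \text{s.t. } F_2(x) = 0$$

where

$$F_1(x, \tau) = \Sigma^{-1} \begin{pmatrix} (M(t_1) + \tau\varepsilon_1) - M(t_1, y(t_1; s), p) \\ \vdots \\ (M(t_{m_1}) + \tau\varepsilon_{m_1}) - M(t_{m_1}, y(t_{m_1}; s), p) \end{pmatrix}$$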

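Lemma: Taylor series for the solution of the modified problem

• $x(\tau) = x + \tau J^+ \begin{pmatrix} \Sigma^{-1}\varepsilon \\ 0 \end{pmatrix} + O(\tau^2)$

• $x(\tau) = x + \tau J^+ \begin{pmatrix} \Sigma^{-1}\varepsilon \\ 0 \end{pmatrix} + \tau^2 \left( dJ^+ (I - JJ^+) - \tfrac{1}{2} J^+ (dJ) J^+ \right) \begin{pmatrix} \Sigma^{-1}\varepsilon \\ 0 \end{pmatrix} + O(\tau^3)$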

Properties of Taylor Series

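Consider the Taylor expansion at the parameter estimates, $x = x^* + \Delta x + \tfrac{1}{2}\widetilde{\Delta x}$, with first-order term $\Delta x$ and second-order term $\widetilde{\Delta x}$:

$$\Delta x = -J^+ \begin{pmatrix} \eta \\ 0 \end{pmatrix}, \quad \widetilde{\Delta x} = -2\left( dJ^+ (I - JJ^+) - \tfrac{1}{2} J^+ (dJ) J^+ \right) \begin{pmatrix} \eta \\ 0 \end{pmatrix}, \quad \|\eta\|_2^2 \le \gamma_m^2(\alpha)$$

Then:

$$\|\Delta x\|_2 \le \theta, \qquad \left\| \Delta x + \tfrac{1}{2}\widetilde{\Delta x} \right\|_2 \le \theta + \left( \kappa(x^*) + \tfrac{1}{2}\omega(x^*)\theta \right)\theta$$

where

$$\theta := \sqrt{\mathrm{Trace}(C(x^*))\,\gamma_m^2(\alpha)},$$

$$\omega(x^*) := \left\| J^+(x^*) \frac{\partial J(x^*)}{\partial x} \right\|, \quad \kappa(x^*) := \left\| C \left( \frac{\partial J_1^T(x^*)}{\partial x} \left( I \otimes F_1(x^*) \right) \right) \right\|$$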

Design of Optimal Experiments: “Q” Cost Functional

K., Nattermann 2015, 2016

New cost functional

$$\phi_Q(\cdot) = \mathrm{Trace}(C(\cdot)) + \left( \kappa(\cdot) + \frac{\omega(\cdot)}{2}\,\mathrm{Trace}(C(\cdot)) \right) \mathrm{Trace}(C(\cdot)),$$

• ω is a weighted Lipschitz constant of J (nonlinearity of the model functions)
• κ is a weighted Lipschitz constant of J+ (compatibility of the data and the model, asymptotic convergence rate of Gauss-Newton)

Numerical experiments show that

• Q-optimal designs are significantly less sensitive to uncertainties in the parameter estimates,
• the condition of the corresponding PE problem is improved

OED for PDE Models

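Optimal Experimental Design for Parabolic PDE Models

Choose design controls $\xi$ as the solution of the optimal control problem

$$\min_{\xi \in L^2(\Omega \times [0,T]),\; u \in W(0,T)} \; \operatorname{tr}(\mathrm{Cov}[u, p, \xi]) + c\,\|\xi\|^2_{L^2(\Omega \times [0,T])}$$

$$\text{s.t. } \partial_t u + a(u, p, \xi) = 0, \quad u(0) = u_0,$$

$$\mathrm{Cov} = (J_{red}^T J_{red})^{-1},$$

+ additional constraints on controls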

Optimal Experimental Design for Parabolic PDE Models: Numerical Framework

K., Kriwet 2016

• Parametrization of the underlying parameter estimation problem with a multiple shooting approach
• Condensing technique in function space and efficient computation of the reduced covariance matrix for the parameters
• Formulation of the OED problem as a multi-stage optimal control problem
• Efficient computation of reduced gradients by an adjoint approach which requires np + 1 PDE solves
• Efficient computation of the Hessian operator by Lagrange function and adjoint equations
• Positive definite approximation of the Hessian operator which requires the solution of decoupled np × np adjoint equations

Software on the basis of DEAL-II

Application: Catalytic Plug Flow Reactor

A fluid flows through a packed bed of a porous catalyst, where an exothermic reaction takes place

Aim: to calibrate the mathematical model for the reactor

Catalytic Plug Flow Reactor

Mathematical Model

Mass balance:

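$$\varepsilon \frac{\partial c_i}{\partial t} = -\frac{\partial (u_f c_i)}{\partial z} + D_{eff,z}\frac{\partial^2 c_i}{\partial z^2} + D_{eff,r}\left( \frac{\partial^2 c_i}{\partial r^2} + \frac{1}{r}\frac{\partial c_i}{\partial r} \right) + \frac{\rho_{s,bed}}{\rho_f} R_i$$

Energy balance:

$$\left( \varepsilon \rho_f c_{p,f} + (1 - \varepsilon)\rho_{s,bed}\, c_{p,s} \right) \frac{\partial T}{\partial t} = -c_{p,f}\frac{\partial (u_f \rho_f T)}{\partial z} + \lambda_{eff,z}\frac{\partial^2 T}{\partial z^2} + \lambda_{eff,r}\left( \frac{\partial^2 T}{\partial r^2} + \frac{1}{r}\frac{\partial T}{\partial r} \right) + \rho_{s,bed} \sum_{i=1}^{N} \Delta H_{R,i} R_i$$

Catalyst activity:

$$\frac{\partial a_{cat}}{\partial t} = c_A \left( k_{0,des} + A_{0,des} \exp\left( \frac{-E_{a,des}}{R_g T} \right) \right)$$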

The Optimal Experiment Design Problem

• 4 parameters to estimate: activation energies and frequency factors of the main and deactivation reaction
• 16 measurements of the temperature in the middle of the reactor at 5 different time points
• Boundary control: the temperature of the gas at the inlet of the reactor, Tent, and the concentration cent

$$\lambda_{eff,z}\frac{\partial T(t, 0, r)}{\partial z} = -u_f \rho_f c_{p,f}\left( T_{ent}(t, r) - T(t, 0, r) \right), \quad z = 0$$

$$c(t, 0, r) = c_{ent}(t, r), \quad z = 0$$

Intuitive vs Optimal Design

Parameter   True Values   Estimates            Uncertainty
k1          −1.76416      −1.79401 ± 0.949     52.9%
k2          −3.37131      −3.83068 ± 0.232     6.1%
k1,deac     −5.60809      −4.98277 ± 8.117     162.9%
k2,deac     −5.62849      −5.61164 ± 1.160     20.6%
trace       -             7.18573              -

Estimates Based on Intuitive Experimental Design


Estimates Based on the Optimal Experiment

Refinements: Online Optimal Experimental Design

Idea:

• new measurements gained in experiments can be used to improve the parameter estimates $p^*$
• re-optimize the optimum experimental design over the remaining experiment

$$\min_{\xi \in \Omega \times [t, t_N]} \; \varphi(C(\xi, p^*))$$

Challenges:

• real-time iteration and multi-level iteration for optimum experimental design
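The idea of re-optimizing the design after each new measurement can be illustrated on a deliberately tiny example. The Python sketch below assumes a one-parameter model y(t) = exp(−p t), for which the most informative next sampling time for a given estimate maximizes the squared sensitivity t² exp(−2 p t) and is therefore t = 1/p. The model, the noise-free simulated measurements, the log-linear estimator and all names are assumptions made for illustration; the talk's online method works with the full covariance-based criterion and real-time iterations instead.

```python
import math

P_TRUE = 0.5  # hypothetical "true" decay rate, used only to simulate measurements


def measure(t):
    """Simulated measurement of y(t) = exp(-p t), noise-free for reproducibility."""
    return math.exp(-P_TRUE * t)


def estimate(times, values):
    """Log-linear least squares estimate of p from ln y = -p t (a simplification)."""
    return (-sum(t * math.log(y) for t, y in zip(times, values))
            / sum(t * t for t in times))


def best_next_time(p_hat):
    """Sampling time maximizing the squared sensitivity t^2 exp(-2 p t)."""
    return 1.0 / p_hat


def online_design(p_guess, n_meas):
    """Alternate between designing the next measurement and re-estimating p."""
    p_hat, times, values = p_guess, [], []
    for _ in range(n_meas):
        t = best_next_time(p_hat)       # design based on the current estimate
        times.append(t)
        values.append(measure(t))       # run the (simulated) experiment
        p_hat = estimate(times, values)  # update the estimate with all data
    return times, p_hat


times, p_hat = online_design(p_guess=2.0, n_meas=3)
print(times, p_hat)
```

Starting from the poor guess p = 2, the first sampling time is 0.5; once the estimate has been corrected, the following sampling times move to 2.0, the optimum for the true parameter. An offline design computed once from the initial guess would have kept sampling at 0.5.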


Estimates Based on Offline Optimal Experimental Design


Estimates Based on Online Optimal Experimental Design

One More Application: Select Optimal Catalyst Geometry by Heat Transfer Coefficient – High Impact on the Value Chain

Estimation of Catalyst Properties Cooperation with

• silver catalyst on aluminum supports of different geometries
• ultimate aim: maximize selectivity of the catalytic production process, e.g., ethene to ethene oxide
• important criterion: high heat transfer coefficient to avoid hot spots or explosion!

Experimental setup Cooperation with

standard heat conduction test reactor

special features:

• non-stationary heating
• temperature range 20-350 °C
• temperature gradient over 200 °C

Experimental setup Cooperation with

2D nonstationary PDE

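$$\tau \frac{\partial T}{\partial t} = \frac{\partial}{\partial z}\left( \lambda_z \frac{\partial T}{\partial z} \right) + \frac{1}{r}\frac{\partial}{\partial r}\left( r \lambda_r \frac{\partial T}{\partial r} \right) - G c_p \frac{\partial T}{\partial z}$$

• ≈ 1000 spatial grid points
• 2 controls: boundary temperature TW, mass current density G
• 4 parameters to estimate:
  - λz, λr (heat conductivity)
  - τ, cp (heat capacity)
• measurements: outlet temperature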

Simulation of Heat Conduction Cooperation with

Optimal Experimental Design Cooperation with

Intuitive expert design

• experimental time 16 h
• four experiments with constant G = 0, 0.9, 1.8, 4 kg/(m² s)
• non-stationary heating with ramp TW = 20-350 °C

Optimal design

• experimental time 2.5 h
• uncertainty is halved
• one experiment with varying steps of G = 0, 0.49, 4 kg/(m² s)
• stationary heating with TW = 350 °C

Optimal Experimental Design Cooperation with

Results: total heat transfer coefficients for 3 candidate shapes including error bars

• high accuracy decision basis for reactor design
• market idea revised: shape 1 instead of shape 2 (original choice)

Impact on value chain: catalyst investment over 1 M€!

Conclusions

Numerical methods and software for parameter estimation and optimum experimental design

Complex nonlinear problems modelled by ODE/DAE/PDE can be treated

Optimal experimental design accelerates the modeling process, allows reducing e.g. development times for new products, and helps prevent wrong decisions based on uncalibrated models

Challenges:

• High potential, but practically no theory for nonlinear dynamical processes available
• Also more efficient numerical methods needed:
  - robustification of OED against uncertainties
  - more general problem classes, e.g., hybrid systems, PDE models
  - incorporation of OED in real time state and parameter estimation and feedback control

New application areas

THANK YOU!
