Numerical Methods for Optimization-based Model Validation · 2016-09-02 · Numerical Methods for...
Transcript of Numerical Methods for Optimization-based Model Validation · 2016-09-02 · Numerical Methods for...
Numerical Methods forOptimization-based Model Validation
Ekaterina A. KostinaInstitute for Aplplied Mathematics (IAM)
Interdisciplinary Center for Scientific Computing (IWR)
Heidelberg Collaboratory for Industrial Optimization (HCO)
SAMSI Optimization WorkshopAugust 29 - September 2, 2016, Research Triangle Park, USA
What Models We Are Looking For?Why Do We Need These Models?
What Models We Are Looking For?
These? ... No!
What Models We Are Looking For?
Maybe this? ...Also no!
Photo by Manfred Werner-Tsui
Barbara Meier, studied mathematics,the winner of the second cycle of Germany’s Next
Topmodel
What Models We Are Looking For?
... these also no!
Möbius band
Klein bottle
But The Models That Allow Simulation
From Merriam-Webster’s Dictionary
Why Do We Need Simulation Models?
E.g. the concept of “digital twins”(GE, Siemens, ...)
GE Report Oct 4, 2015
every new product or a physical economical process has a digital twin whichincludes a collection of models and algorithms;which accompanies the process from the “origin” (modeling) through its lifetime, “growing” together with the process;is used among others to analyze data, predict malfunctioning and performoptimal operation.“The Wall Street Journal”, Jan 15, 2016: GE Digital CEO Says “Digital Twins”Will Optimize Machinery, Human Health
Another example: development and admission of drugs
Necessary precondition: validated models with reliable parameterestimates
Life Cycle of the Mathematical Model
Outline
1 Parameter Estimation
2 Sensitivity Analysis of Parameter Estimates
3 Design of Optimal Experiments
4 Applications
First Step in Model Validation: Parameter Estimation
Constrained Parameter Estimation Problem
determine parameters p and states y to minimize deviation of the modelresponseM[y; p] from measurements η in a suitable norm (e.g. weighted l2,l1, hybrid)
miny,p
||η −M[y; p]|| ,
such that model equations
F [y; p] = 0
and additional “experimental constraints” are satisfied
C[y; p] ≥ 0
Mathematical Model Classes for DynamicalProcesses F [y; p] = 0
Differential-Algebraic Equations (DAE)
B(x, z, p)x = f (x, z, p, q, u)
0 = g(x, z, p, q, u)
x: differential variables
z: algebraic variables, y = (x, z)
p: unknown parameters (to be estimated by PE)
q: design parameters (to be chosen in OED)
u: controls (to be chosen OED)
Partial-Differential Equations (PDE)
ut −∇(K∇u) = f (u, p)
→ Semidiscretization in space or Rothe method
initial and boundary conditions!
Observation Model and Data
ηi = M(ti, ytrue(ti), ptrue) + εi
at time points ti, i = 1, ...,m1
Data from multiple experiments under varying conditions
M: nonlinear function of states (differential and algebraic)
εi: measurement errors
independentin this talk: εi ∼ N (0, σ2
i )
Numerical Methods for Parameter Estimation inODE/DAE/PDE
Bock et al 1981, ..., K. et al 1997, ...
Direct “all-at-once” multiple shooting method
Constrained Gauss Newton method
Structure exploitation (multiple experiments, parameterization in time,spatial discretization of PDE)
Efficient line search strategies
Use of inexact Jacobians
Robust parameter estimation based on l1 or Huber
After (Multiple Shooting) Discretization:Large-Scale l2 Parameter Estimation Problem
minx
12‖F1(x)‖2
2, s.t. F2(x) = 0
Optimization variables: xT := (pT , sT) ∈ Rn, with dicretization variables s
Cost function: F1(x) := Σ−1
η1 −M(t1, y(t1; p, s), p)...
ηm1 −M(tm1 , y(tm1 ; p, s), p)
∈ Rm1
Σ−1 = diag(
wiσi, i = 1, ...,m1
), wi ∈ 0, 1
Constraints: F2(x) :=
(C(p, s)
Discretized BVP
)∈ Rm2
Jacobians: J1(x) :=∂F1(x)∂x and J2(x) :=
∂F2(x)∂x
Solution Approach: Generalized Gauss-Newton
Solve iteratively, given initial guess x0
xk+1 = xk + [tk]∆xk
∆xk solves linearized problem
min∆x∈Ωk
12||F1(xk) + J1(xk)∆x||22
s.t. F2(xk) + J2(xk)∆x = 0
∆xk = −J+(xk)F(xk)
J+is a generalized inverse: J+JJ+ = J+
J =
(J1
J2
), F =
(F1
F2
)
Parameter Estimation and Sensitivity Analysis
Dynamical Process
Measurements with Errors Mathematical Model
Parameter Estimation Problem
Estimates x∗ Quantitative Output
Qualitative Assessment of x∗ Qualitative Output?
Likelihood Ratio Confidence Regions
Dlr(α) = x | F2(x) = 0, ‖F1(x)‖22 − ‖F1(x∗)‖2
2 ≤ γ2m(α)
Beale 60, Draper and Smith 81, Pazman 93
γ2m(α) is (1−α)-quantile of χ2-distribu-
tion with m := n − m2 degrees of free-dom
Advantage
High approximation quality
Disadvantage
Very difficult to compute
Linearized Confidence Regions
Dlin(α) = x | J2(x∗)(x− x∗) = 0, ‖J1(x∗)(x− x∗)‖22 ≤ γ2
m(α)
Beale 60, Draper and Smith 81, Pazman 93
Advantage
Can be computed very efficiently
Another View:Parameter Sensitivity w.r.t. Measurement Errors
Introduce an error weight τ ∈ [0, 1]:
minx‖F1(x)‖2
2
s.t. F2(x) = 0
minx‖F1(x, τ)‖2
2
s.t. F2(x) = 0
where
F1(x, τ) = Σ−1
(M(t1) + τε1)−M(t1, y(t1; s), p)...
(M(tm1 ) + τεm1 )−M(tm1 , y(tm1 ; s), p)
MeasurementsM ⇒ estimates x(M) resp. x(0)
MeasurementsM+ τε ⇒ estimates x(M+ τε) resp. x(τ)
Question: what is the distribution of x(τ)?
LemmaTaylor series for the solution of the modified problem
• x(τ) = x + τJ+
(Σ−1ε
0
)+O(τ 2)
Another View: Linearized Confidence Regions
Dlin(α) = x∗ + ∆x | ∆x = −J+(x∗)(η0
), ‖η‖2
2 ≤ γ2m(α)
Dlin(α) = x | J2(x∗)(x− x∗) = 0, ‖J1(x∗)(x− x∗)‖22 ≤ γ2
m(α)Linear approximation of the covariance matrix
C = E(∆x∆xT)
= J+(x∗)(Im1 00 0m2
)J+T(x∗)
Dlin(α) is enclosed by a (minimal) box
GL(α) ⊂n×
i=1
[x∗i − θi, x∗i + θi],
where θi =√Ciiγ2
m(α) and Cii are the diagonal elements of thecovariance matrix
Bock 1987
Next Step in Model Validation: Design of OptimalExperiments
Design of Optimal Experiments
BBBBBN
bad good
ExperimentalSetup 1
ExperimentalSetup 2
Confidence Regions
Aim: choose optimalexperimental conditions(design variables) to gaininformation about parameterestimates (to reduceparameter uncertainty)
Design variables
ξT :=(qT ,wT) ∈ Ξ
q ∈ Rnq (discretized)control functions andcontrol parametersw ∈ 0, 1K sampling
Design of Optimal Experiments:Problem Formulation
minξφ(ξ; x)
s. t. ml ≤ ψ(t, x, ξ) ≤ mu, t ∈ T
0 = χ(t, x, ξ), t ∈ T
w ∈ 0, 1M
ξT = (qT ,wT) are design variables
x is an estimate of true parameter and (discretized) states values xtrue
satisfying the underlying parameter estimation problem
Design of Optimal Experiments: Cost FunctionalsTypical cost functionals
φ is a functional of the covariance matrix
C = J+(x)
(Im1 00 0m2
)J+T(x) =
(Im1 0
)(JT1 J1 JT
2
J2 0
)−1(Im1
0
)
A: φA(C) = 1n Trace(C)
D: φD(C) = det(KTCK)1n mit K ∈ Rn×nk
E: φE(C) = maxλ| λ is an eigenvalue of C
OED problem is a complex non-standard nonlinear mixed-integeroptimal control problem
Cost function already involves 1st order derivatives of PEproblem and is implicitly defined by a generalized inverseInteger decisions: sampling design
Optimal Experimental Design is a ComplexNon-Standard Nonlinear Mixed-Integer OptimalControl Problem
Theory originally developed for linear modelse.g. Kiefer, Wolfowitz 50’s, Box, Draper 1987, Pukelsheim 1993
Algorithmic developments:Lohmann 1990, Körkel, 1998, ..., Bauer, Körkel et al, 1998, ..., K. et al 2004, ...,
Software VPLAN
structure exploiting, direct, all-at-once optimization algorithms forDE models, based on multiple shooting
efficient evaluation of higher order derivatives by structure exploiting“internal” and “algorithmic” differentiation
optimal integer controls by exact lower bounds and intelligentapproximation strategies
robustification of optimization against uncertainties
experimental design for model discrimination
What Improvements Can We Expect From OptimalExperimental Designs?
Application: Transport and Degradation ofXenobiotics in Soil
Transport and Degradation of Xenobiotics in Soil
Dieses, Bock, Schlöder, Richter 2002
investigation of fate ofpesticides in soil needsexpensive lysimeterexperiments for registration
replacement by computerexperiments requiresvalidated models!
here: optimal mini-lysimeterexperiments
optimal irrigationoptimal soluteapplication
Typical Column Outflow ExperimentWater transport: =⇒ 3 unknown parameters n, α, Ks
C(ψ)∂ψ
∂t=
∂
∂z
(K(ψ)
∂
∂z(ψ − z)
)
K(ψ) = Ks(1− (α|ψ|)n−1(1 + (α|ψ|n)1/n−1)2
(1 + (α|ψ|)n)1−1/n
2
C(ψ) = α(n− 1)(θs − θr)(α|ψ|)n−1(1 + (α|ψ|)n)1/n−2
Initial condition: ψ(0, z) = 670, z ≥ 0Upper boundary condition:
q(t, 0) = −K(ψ(t, 0))(∂ψ(t, 0)
∂z− 1
)control: water flux
Lower boundary condition: ∂ψ(t,Le)∂z = 0, t ≥ 0
Typical Column Outflow ExperimentSolute transport: =⇒ 3 unknown parameters k, b, Dm
∂(θc)∂t
=∂(θDh(θ)
∂c∂z
)∂z
− ∂(qc)∂z− kc
Dh(θ) =0.0046ebθ
θ+ Dm
Initial condition: c(0, z) = 0, z ≥ 0Upper boundary condition:
−Dh(θ(t, 0))∂c(t, 0)∂z
+q(t, 0)θ(t, 0)
c(t, 0) =q(t, 0)θ(t, 0)
c0(t, 0) control: solute input
Lower boundary condition: ∂c(t,Le)∂z = 0, t ≥ 0
Typical Column Outflow Experiment
=⇒ Coupled water and solute transport: 6 unknown parameters
Optimization of Experimental Conditions
2 boundary controls - piecewise constant
water flux q(t, 0)
changeble only every 24 h0 ≤ q(t, 0) ≤ 0.6
solute input c0(t, 0)
changeble only every 24 h50 ≤ c0(t, 0) ≤ 200
sampling scheme: fixed measurement times
24 outflow measurements - 12 days, every 12h (σ = 0.01)
6 profile measurements (σ = 0.1)
only at end of experimentevaluation of solute concentrations at 6 locations
Parameterized Expert Designs
Input water fluxq(t, 0)
Substance input concentrationc0(t, 0)
Quality of Expert Designs:Criterion Values and Standard Deviations
Expert Design A
A-Criterion = 3.8859
scaled standard deviationn 1.0 ± 0.1604α 1.0 ± 1.5968Ks 1.0 ± 3.3752k 1.0 ± 0.0124b 1.0 ± 3.0460
Dm 1.0 ± 0.2644
Expert Design B
A-Criterion = 0.75
scaled standard deviationn 1.0 ± 0.0800α 1.0 ± 0.8600Ks 1.0 ± 1.6736k 1.0 ± 0.0083b 1.0 ± 0.9712
Dm 1.0 ± 0.1033
Optimal Experimental Conditions
Input water fluxq(t, 0)
Substance input concentrationc0(t, 0)
Comparison:Optimal Design vs Expert Designs A and B
Optimal design Expert design A Expert design B
A-Criterion 0.0472 3.8859 0.750
n 1.0 ± 0.0191 ± 0.1604 ± 0.0800α 1.0 ± 0.2438 ± 1.5968 ± 0.8600Ks 1.0 ± 0.4348 ± 3.3751 ± 1.6735k 1.0 ± 0.0086 ± 0.0124 ± 0.0083b 1.0 ± 0.1852 ± 3.0460 ± 0.9712
Dm 1.0 ± 0.0108 ± 0.2644 ± 0.1032
All parameters can be simultaneously estimated!
Ongoing DevelopmentsRich and well advanced theory of optimal experimental design forlinear modelsVery high relevance in industrial applicationsBASF: “We save up to 80% costs with these methods and getmuch more precise results!”
“Bildzeitung”, 17.06.2008
Actual Developments
Rich and well advanced theory of optimal experimental design forlinear modelsVery high relevance in industrial applicationsBASF: “We save up to 80% costs with these methods and getmuch more precise results!”
Efficient numerical methods forrobustification of OED against uncertainties in parameter estimatesparabolic PDE modelson-line design of optimal experiments
Robustification of OED Based on Second OrderSensitivity Analysis
Robustification of OED Based on Second OrderSensitivity Analysis
Consider again parametric PE problem depending on an error weightτ ∈ [0, 1]:
minx‖F1(x)‖2
2
s.t. F2(x) = 0
minx‖F1(x, τ)‖2
2
s.t. F2(x) = 0
where
F1(x, τ) = Σ−1
(M(t1) + τε1)−M(t1, y(t1; s), p)
...(M(tm1 ) + τεm1 )−M(tm1 , y(tm1 ; s), p)
LemmaTaylor series for the solution of the modified problem
• x(τ) = x + τJ+
(Σ−1ε
0
)+O(τ 2)
• x(τ) = x + τJ+
(Σ−1ε
0
)+ τ 2 (dJ+(I− JJ+)− 1
2 J+(dJ)J+)(Σ−1ε
0
)+O(τ 3)
Properties of Taylor Series
Consider Taylor expansion at the parameter estimates x = x∗ + ∆x + 12 ∆x with
∆x = −J+
(η0
), ∆x = −2
(dJ+(I− JJ+)− 1
2 J+(dJ)J+)(η0
), ‖η‖2
2 ≤ γ2m(α)
Then:
‖∆x‖2 ≤ θ,∥∥∥∥∆x +
12
∆x∥∥∥∥
2
≤ θ +
(κ(x∗) +
12ω(x∗)θ
)θ
where
θ :=√
Trace(C(x∗))γ2m(α),
ω(x∗) :=
∥∥∥∥J+(x∗)∂J(x∗)∂x
∥∥∥∥ , κ(x∗) :=
∥∥∥∥C(∂JT1 (x∗)∂x
(I⊗ F1(x∗)))∥∥∥∥
Design of Optimal Experiments: “Q” Cost FunctionalK., Nattermann 2015, 2016
New cost functional
φQ(·) = Trace(C(·)) +(κ(·) + ω(·)
2Trace(C(·))
)Trace(C(·)),
ω is a weighted Lipschitz constant of J (nonlinearity of the modelfunctions)κ is a weighted Lipschitz constant of J+ (compatibility of the dataand the model, asymptotic convergence rate of Gauss-Newton)
Numerical experiments show thatQ-optimal designs are significantly insensitive to uncertainties inparameter estimates,the condition of the corresponding PE problem is improved
→
OED for PDE Models
Optimal Experimental Design for Parabolic PDEModels
Choose design controls ζ as solution of the optimal control problem
minξ∈L2(Ω×[0,T]),u∈W(0,T)
tr(Cov[u, p, ξ]) + c‖ζ‖2L2(Ω×[0,T])
s.t. ∂tu + a(u, p, ξ) = 0,u(0) = u0,
Cov = (JredTJred)
−1,
+ additional constraints on controls
Optimal Experimental Design for Parabolic PDEModels: Numerical Framework
K., Kriwet 2016
Parametrization of the underlying parameter estimation problem withmultiple shooting approach
Condensing technique in function space and efficient computation of thereduced ovariance matrix for parameters
Formulation of the OED problem as a multi-stage optimal control problem
Efficient computation of reduced gradients by an adjoint approach whichrequires np + 1 PDE solves
Efficient computation of Hesse operator by Lagrange function and adjointequations
Positive definite approximation of Hesse operator which requires solutionof decoupled np × np adjoint equations
Software on the basis of DEAL-II
Application: Catalytic Plug Flow Reactor
A fluid flux through a packed bed of a porous catalyst, where anexothermic reaction takes place
Aim: to calibrate the mathematical model for the reactor
Catalytic Plug Flow Reactor
Mathematical Model
Mass balance:
ε∂ci
∂t= −∂(uf ci)
∂z+ Deff ,z
∂2ci
∂z2 + Deff ,r
(∂2ci
∂r2 +1r∂ci
∂r
)+ρs,bed
ρfRi
Energy balance:
(ε ρf cp,f + (1− ε)ρs,bed cp,s)∂T∂t
= −cp,f∂(uf ρf T)
∂z+ λeff ,z
∂2T∂z2
+ λeff ,r
(∂2T∂r2
+1r∂T∂r
)
+ρs,bed
N∑i=1
∆HR,iRi
Catalyst activity:
∂acat
∂t= cA
(k0,des + A0,des exp
(−Ea,des
Rg T
))
The Optimal Experiment Design Problem
4 parameters to estimate: activation energies and frequencyfactors of the main and deactivation reaction16 measurements of the temperature in the middle of the reactorat 5 different time pointsBondary control: The temperature of the gas at the inlet of thereactor Tent, and the concentration cent
λeff ,z∂T(t, 0, r)
∂z= −uf ρf cp,f (Tent(t, r)− T(t, 0, r)) z = 0
c(t, 0, r) = cent(t, r) z = 0
Intuitive vs Optimal Design
Parameter True Values Estimates Uncertaintyk1 −1.76416 −1.79401± 0.949 52.9%k2 −3.37131 −3.83068± 0.232 6.1%k1,deac −5.60809 −4.98277± 8.117 162.9%k2,deac −5.62849 −5.61164± 1.160 20.6%trace - 7.18573 -
Estimates Based on Intuitive Experimental Design
Parameter True Values Estimates Uncertaintyk1 −1.76416 −1.77712± 0.628 35.4%k2 −3.37131 −3.84067± 0.228 5.9%k1,deac −5.60809 −5.57985± 4.914 88.1%k2,deac −5.62849 −5.61718± 0.977 17.4%trace - 2.692312 -
Estimates Based on the Optimal Experiment
Refinements: Online Optimal Experimental Design
Idea:
new measurements gained in experiments can be used toimprove the parameter estimates p∗
re-optimize optimum experimental design over remainingexperiment
minξ∈Ω×[t,tN ]
ϕ(C(ξ, p∗))
Challenges:
real-time iteration and multi-level iteration for optimumexperimental design
Parameter True Values Estimates Uncertaintyk1 −1.76416 −1.77712± 0.628 35.4%k2 −3.37131 −3.84067± 0.228 5.9%k1,deac −5.60809 −5.57985± 4.914 88.1%k2,deac −5.62849 −5.61718± 0.977 17.4%trace - 2.692312 -
Estimates Based on Offline Optimal Experimental Design
Parameter True Values Estimates Uncertaintyk1 −1.76416 −1.79392± 0.492 27.4%k2 −3.37131 −3.8335± 0.203 5.3%k1,deac −5.60809 −5.42928± 2.653 48.9%k2,deac −5.62849 −5.59868± 0.872 15.6%trace - 0.851882 -
Estimates Based on Online Optimal Experimental Design
One More Application:Select Optimal Catalyst Geometry by Heat TransferCoefficient – High Impact on the Value Chain
Estimation of Catalyst Properties Cooperation with
silver catalyst on aluminumsupports of different geometriesultimate aim: maximize selectivityof catalytic production process,e.g., ethene to ethene oxide
important criterion: high heat transfer coefficient to avoid hotspots or explosion!
Experimental setup Cooperation with
standard heat conductiontest reactor
special features:non-stationary heating,temperature range 20-350Ctemperature gradient over200C
Experimental setup Cooperation with
2D nonstationary PDE
τ∂T∂t
=∂
∂z
(λz∂T∂z
)+
1r∂
∂r
(rλr
∂T∂r
)− Gcp
∂T∂z
≈ 1000 spatial grid points2 controls: boundarytemperature TW , mass currentdensity G4 parameters to estimate:
λz, λr (heat conductivity)τ , cp (heat capacity)
measurements: outlettemperature
Simulation of Heat ConductionCooperation with
Optimal Experimental Design Cooperation with
intuitive expert de-sign
experimentaltime 16h
four experimentswith constant G = 0,0.9, 1.8, 4 kg/m2snon-stationaryheating with rampTW = 20− 350C
optimal designexperimentaltime 2.5huncertainty ishalved
one experimentwith varying steps ofG = 0, 0.49, 4 kg/m2sstationary heatingwith TW = 350C
Optimal Experimental Design Cooperation with
Results: total heat transfer coefficients for 3 candidate shapesincluding error bars
high accuracy decision basis for reactor designmarket idea revised: shape 1 instead of shape 2 (original choice)
Impact on value chain: catalyst investment over 1 Me!
ConclusionsNumerical methods and software for parameter estimation and optimumexperimental design
Complex nonlinear problems modelled by ODE/DAE/PDE can be treated
Optimal experimental design leads to acceleration of modeling process,allows to reduce e.g. development times for new products, prevents frommaking wrong decisions based on not calibrated models
Challenges:
High potential, but practically no theory for nonlinear dynamicalprocesses availableAlso more efficient numerical methods needed
robustification of OED against uncertaintiesmore general problem classes, e.g., hybrid systems, PDE modelsincorporation of OED in real time state and parameter estimation andfeedback control
New application areas
THANK YOU!