Guidelines for Design and Diagnostics of CO2 Inversions
Anna M. Michalak, Department of Civil and Environmental Engineering, and Department of Atmospheric, Oceanic and Space Sciences, The University of Michigan
A.M. Michalak ([email protected])
Ill-Conditioned Nature of Inverse Problems
Environmental Contamination with Unknown Sources
Source: http://www.marshfieldclinic.org/nfmc/lab/research_projects.stm#
Example – Available Measurements
Ownership of Site
Source Release Scenarios
Blue’s Source Release Scenario
Green’s Source Release Scenario
Red’s Source Release Scenario
Possible Scenarios
Propagation of Uncertainty
Bayesian Inference Applied to Inverse Modeling for Inferring Historical Forcing
$$p(s \mid y) = \frac{p(y \mid s)\, p(s)}{\int p(y \mid s)\, p(s)\, ds}$$
p(s|y) : posterior probability of historical forcing
p(s) : prior information about forcing
p(y|s) : likelihood of forcing given available measurements
p(y) = ∫ p(y|s) p(s) ds : probability of measurements
y : available observations (n×1)
s : discretized historical forcing (m×1)
Bayesian Formalism
Use data, y, prior flux estimates, s_p, and model (with Green’s function matrix H) to estimate fluxes, s
Estimate obtained by minimizing:
$$L = \frac{1}{2}(y - Hs)^T R^{-1} (y - Hs) + \frac{1}{2}(s - s_p)^T Q^{-1} (s - s_p)$$
Solution is
$$\hat{s} = s_p + QH^T \left(HQH^T + R\right)^{-1} (y - Hs_p)$$
Estimates, ŝ, have covariance
$$V_{\hat{s}} = Q - QH^T \left(HQH^T + R\right)^{-1} HQ$$
Residuals:
$$r_{1,s_p} = \hat{s} - s_p, \qquad r_{1,y} = y - H\hat{s}$$
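The solution and covariance formulas of the Bayesian formalism can be sketched numerically as follows. This is a minimal illustration, not the setup used in the presentation: the problem size, the random matrix standing in for H, and the diagonal Q and R are all assumptions made up for the example, and the function name `bayesian_inversion` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: n = 5 observations, m = 3 fluxes.
n, m = 5, 3
H = rng.normal(size=(n, m))            # stand-in for the Green's function matrix
s_p = np.zeros(m)                      # prior flux estimates
Q = np.diag(np.full(m, 2.0 ** 2))      # prior error covariance (assumed diagonal)
R = np.diag(np.full(n, 0.5 ** 2))      # model-data mismatch covariance (assumed diagonal)
s_true = rng.multivariate_normal(s_p, Q)
y = H @ s_true + rng.multivariate_normal(np.zeros(n), R)


def bayesian_inversion(y, H, s_p, Q, R):
    """Return the posterior mean s_hat and posterior covariance V_s."""
    HQ = H @ Q
    Psi = HQ @ H.T + R                     # H Q H^T + R
    gain = np.linalg.solve(Psi, HQ).T      # Q H^T (H Q H^T + R)^-1 (Q, Psi symmetric)
    s_hat = s_p + gain @ (y - H @ s_p)
    V_s = Q - gain @ HQ
    return s_hat, V_s


s_hat, V_s = bayesian_inversion(y, H, s_p, Q, R)

# Residuals from the fit, as defined on the slide.
r1_sp = s_hat - s_p
r1_y = y - H @ s_hat
```

Using `np.linalg.solve` instead of forming the inverse of HQH^T + R explicitly is a design choice for numerical stability; the result is the same estimate.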
Impact of Aggregation and Independence Assumptions
Modeling Tools for NA Carbon Cycle
[Figure: Actual flux history. Concentration vs. time, showing the actual release history and the prior guess.]
[Figure: Available data. Concentration vs. location downstream, showing the actual plume and the measurement locations and values.]
[Figure: ten panels of concentration vs. time, labeled Geostatistical and Bayesian, for 31 data with 11, 21, 41, 101, and 201 fluxes.]
Statistical Diagnostics of Inversions
(Ongoing work with Ian Enting)
Need for Diagnostics
Wide use of inversion studies
Large set of possible results due to differences in:
  transport models
  inversion methods (e.g. Bayesian, geostatistical, mass balance)
  data choices
  meteorological fields
  covariance and other parameter choices
TransCom experiments aim to assess variability and derive (relative) consensus
Moving toward operational inversions
Need objective method to evaluate inversions to determine (at a minimum) which inversions are self-consistent
Approaches to Inversion Validation
Cumulative plots of residuals (Enting et al., 1995)
Reporting of χ² statistics of residuals from priors and/or observations (e.g. Rayner et al., 1999; Gurney et al., 2002; Peylin et al., 2002; Rödenbeck et al., 2003)
Variance of observation residuals calculated using conditional realizations of a posteriori fluxes (Michalak et al., 2004)
Maximum likelihood approach leading to χ²_r = 1 (Michalak et al., 2005; Hirsch et al., 2006)
Statistical diagnostics project proposed during TransCom-Tsukuba
Testing Residuals
Want to test assumption of zero-mean, multivariate normal with covariance Q and R for
$$r_{0,s_p} = s_{\mathrm{true}} - s_p \quad \text{and} \quad r_{0,y} = y - Hs_{\mathrm{true}}$$
Unknown s_true in r_{0,sp} and r_{0,y}
Sum of squares of normalized residuals from fit, ŝ:
$$\frac{1}{n}\left[\sum_{i=1}^{n}\frac{r_{1,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{1,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
The bracketed sum should be distributed as χ² with n degrees of freedom, i.e. normalized r_{1,sp}, r_{1,y} are not n + m independent N(0,1) quantities
Testing Residuals
[Figure: flux vs. data, showing the prior s_p, the observation y, and the ‘model’ line.]
Residuals from fit, r_{1,sp} and r_{1,y}, are correlated because they represent departures from the ‘model’ line.
Testing Residuals
Conditional realizations can be generated:
$$s_{c,k} = \hat{s} + S\Lambda^{1/2} u_k$$
where
$$V_{\hat{s}} = S \Lambda S^{-1}$$
and u_k is the k-th realization of a vector of N(0,1) values.
Residuals from s_{c,k}:
$$r_{2,s_p} = s_{c,k} - s_p, \qquad r_{2,y} = y - Hs_{c,k}$$
should have covariances Q and R, respectively
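The generation of conditional realizations can be sketched as below. This is an illustrative sketch under stated assumptions: the small H, Q, R, and y are invented toy values, the function name `conditional_realizations` is hypothetical, and the decomposition of the posterior covariance is taken to be the symmetric eigendecomposition (so that the inverse of S is its transpose).

```python
import numpy as np


def conditional_realizations(s_hat, V_s, n_real, rng):
    """Draw n_real samples s_c = s_hat + S Lambda^(1/2) u, with u ~ N(0, I)."""
    lam, S = np.linalg.eigh(V_s)          # V_s = S diag(lam) S^T
    lam = np.clip(lam, 0.0, None)         # guard against tiny negative round-off
    u = rng.standard_normal((len(s_hat), n_real))
    # (S * sqrt(lam)) scales column j of S by sqrt(lam[j]), i.e. S Lambda^(1/2).
    return s_hat[:, None] + (S * np.sqrt(lam)) @ u


# Hypothetical toy inversion: n = 3 observations, m = 2 fluxes.
H = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
s_p = np.zeros(2)
Q = np.diag([4.0, 4.0])
R = np.diag([0.25, 0.25, 0.25])
y = np.array([1.1, 0.2, -0.4])
gain = Q @ H.T @ np.linalg.inv(H @ Q @ H.T + R)
s_hat = s_p + gain @ (y - H @ s_p)
V_s = Q - gain @ H @ Q

rng = np.random.default_rng(1)
s_c = conditional_realizations(s_hat, V_s, 200_000, rng)   # one realization per column

# Residuals of each realization, as defined on the slide.
r2_sp = s_c - s_p[:, None]
r2_y = y[:, None] - H @ s_c

# The sample covariance of the realizations should approach V_s.
sample_cov = np.cov(s_c)
```

A Cholesky factor of the posterior covariance would serve equally well for sampling; the eigendecomposition is used here only because it matches the S and Λ notation of the slide.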
[Figure: flux vs. data, showing the prior s_p, the observation y, the ‘model’ line, and the perturbed realizations.]
Adding perturbations u leads to residuals r_{2,sp} and r_{2,y} that are independent.
Simple Tests
Do residuals have the specified covariance structure?
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
Are the residuals unbiased?
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}}{\sqrt{R_{ii}}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}}{\sqrt{Q_{ii}}}\right] \approx 0$$
Are the residuals normally distributed?
  Perform normality tests such as Kolmogorov-Smirnov goodness-of-fit hypothesis test, Lilliefors hypothesis test of composite normality, Filliben normality test, etc.
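The three simple tests can be sketched for diagonal Q and R as follows. This is a hedged illustration: the function name `simple_residual_tests`, the sample sizes, and the variances are assumptions for the example, the residuals are synthetic draws rather than output of a real inversion, and the bias statistic uses residuals normalized by their standard deviations. The Kolmogorov-Smirnov statistic is computed by hand against the standard normal CDF; 1.36/√N is the usual large-sample approximation to its 5% critical value.

```python
import math

import numpy as np


def simple_residual_tests(r_y, r_s, R_diag, Q_diag):
    """Return (chi2_norm, bias, ks_stat) for one set of residuals.

    chi2_norm should be near 1, bias near 0, and ks_stat small when the
    normalized residuals behave like independent N(0, 1) values.
    """
    z = np.concatenate([r_y / np.sqrt(R_diag), r_s / np.sqrt(Q_diag)])
    chi2_norm = np.mean(z ** 2)
    bias = np.mean(z)
    # One-sample Kolmogorov-Smirnov statistic against the standard normal CDF.
    z_sorted = np.sort(z)
    N = z_sorted.size
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z_sorted])
    d_plus = np.max(np.arange(1, N + 1) / N - cdf)
    d_minus = np.max(cdf - np.arange(0, N) / N)
    return chi2_norm, bias, max(d_plus, d_minus)


# Synthetic residuals drawn with the specified variances (hypothetical values).
rng = np.random.default_rng(4)
n, m = 4000, 1000
R_diag = np.full(n, 0.5 ** 2)
Q_diag = np.full(m, 2.0 ** 2)
r_y = np.sqrt(R_diag) * rng.standard_normal(n)
r_s = np.sqrt(Q_diag) * rng.standard_normal(m)

chi2_norm, bias, ks_stat = simple_residual_tests(r_y, r_s, R_diag, Q_diag)
ks_critical = 1.36 / math.sqrt(n + m)    # approximate 5% level for large samples
```

In practice the same function would be applied to the residuals of each conditional realization, and the fraction of realizations failing each test would be reported.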
Test Setup
Initial scoping study with 1995 CSIRO setup:
  GISS model (8° × 10°)
  Cyclo-stationary inversion: constant + fixed seasonal cycle
  12 ocean regions, 12 land regions with (separate) mean plus pre-specified seasonality, 4 regions of (season-dependent) deforestation, fossil source plus explicit CO oxidation
  Observations expressed as mean + Fourier components of seasonal cycle
  No use of O2 or 13C data
Examined cases: Reference case, Biased priors, Biased non-fossil priors, Loose data
Results
[Figure: six panels (x-axis 0 to 1, y-axis -3 to 3) with panel titles: Fails KS normality test; Fails χ² test; Fails unbiasedness test; Fails χ² and unbiasedness tests; Fails normality and unbiasedness tests; Passes all tests.]
Results – Reference Case
[Figure: two panels, x-axis 0 to 200. Upper panel: Normalized χ² - 1% fail. Lower panel: unlabeled, y-axis -0.4 to 0.4.]
Results – Biased Priors
[Figure: two panels, x-axis 0 to 200. Upper panel: Normalized χ² - 4% fail. Lower panel: unlabeled, y-axis -0.4 to 0.4.]
Results – Biased Non-fossil Priors
[Figure: two panels, x-axis 0 to 200. Upper panel: Normalized χ² - 4% fail. Lower panel: unlabeled, y-axis -0.4 to 0.4.]
Results – Loose Data
[Figure: two panels, x-axis 0 to 200. Upper panel: Normalized χ² - 8% fail. Lower panel: unlabeled, y-axis -0.4 to 0.4.]
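What if Residuals are Correlated?
Q and R are no longer diagonal matrices, but we can use geostatistical methods to calculate equivalent statistics
Do residuals have the specified covariance structure?
$$\frac{1}{n_{\mathrm{eff}}+m_{\mathrm{eff}}}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
$$m_{\mathrm{eff}} = \sigma_Q^{2}\left(X^T Q^{-1} X\right), \qquad n_{\mathrm{eff}} = \sigma_R^{2}\left(X^T R^{-1} X\right)$$
Are the residuals unbiased?
$$\mu_{r_{2,y}} = \left(X^T R^{-1} X\right)^{-1} X^T R^{-1}\left(y - Hs_{c,k}\right) \approx 0$$
$$\mu_{r_{2,s_p}} = \left(X^T Q^{-1} X\right)^{-1} X^T Q^{-1}\left(s_{c,k} - s_p\right) \approx 0$$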
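The correlated-residual statistics can be illustrated with a short sketch. Several things here are assumptions rather than content of the slides: X is taken to be a single column of ones (a constant-mean model), the covariance is an exponential function of separation between equally spaced points, and the function names `gls_mean` and `effective_count` are hypothetical.

```python
import numpy as np


def gls_mean(r, C, X):
    """Generalized least-squares mean (X^T C^-1 X)^-1 X^T C^-1 r."""
    Ci_X = np.linalg.solve(C, X)                  # C^-1 X
    return np.linalg.solve(X.T @ Ci_X, Ci_X.T @ r)


def effective_count(C, X, sigma2):
    """Effective number of residuals, sigma^2 X^T C^-1 X (scalar for one-column X)."""
    return (sigma2 * (X.T @ np.linalg.solve(C, X))).item()


# Hypothetical setup: 50 equally spaced residuals with exponential covariance.
n, sigma2, corr_len = 50, 1.5, 5.0
x = np.arange(n, dtype=float)
C = sigma2 * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
X = np.ones((n, 1))                               # assumed constant-mean model

rng = np.random.default_rng(3)
r = np.linalg.cholesky(C) @ rng.standard_normal(n)    # zero-mean correlated residuals

mu = gls_mean(r, C, X)                            # bias estimate for these residuals
n_eff = effective_count(C, X, sigma2)             # smaller than n because of correlation
n_eff_diag = effective_count(sigma2 * np.eye(n), X, sigma2)   # equals n when uncorrelated
```

With positively correlated residuals the effective count drops well below the nominal count, which is why the normalization of the χ² statistic changes in the correlated case.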
Conclusions
Residual analysis should be a standard step in validating inversion results
Conditional realizations allow for simple residual tests and tests on subsets of residuals
Diagnostics will not detect errors due to mis-interpretation (CO2 flux ≠ carbon flux ≠ carbon storage rate)
Geostatistics provides a set of tools for dealing with spatially and/or temporally correlated errors and parameters
Many cases suggest that previous studies have used cautious assignments of uncertainty, motivated by risk of unknown correlated errors.
Next Steps
Develop additional tests
Analyze residuals from individual stations / regions
Investigate use of loose priors for “reluctant” Bayesians
Analysis of large ensembles of conditional realizations
Application to existing TransCom inversions
Maximum Likelihood Estimation of Covariance Parameters
Covariance parameters determine:
  Relative weight assigned to data and prior information
  Posterior covariance / uncertainty estimate
Appropriate estimation of covariance parameters is essential to flux estimation
Lack of objective methods for estimating these parameters:
  Described as “greatest single weakness” in some studies (Rayner et al., 1999)
Maximum likelihood approach provides estimates based on available data
Model-data Mismatch v. Prior Error
$$L_{\theta} = \frac{1}{2}\ln\left|HQH^T + R\right| + \frac{1}{2}\left(z - Hs_p\right)^T \left(HQH^T + R\right)^{-1} \left(z - Hs_p\right)$$
Michalak et al. (JGR 2005)
Procedure
Define the marginal pdf of covariance parameters (w/out solving inverse problem). Its negative logarithm is:
$$L_{\theta} = \frac{1}{2}\ln\left|HQH^T + R\right| + \frac{1}{2}\left(z - Hs_p\right)^T \left(HQH^T + R\right)^{-1} \left(z - Hs_p\right)$$
Minimum of objective function determines best estimate of covariance parameters
Inverse of Hessian of objective function estimates uncertainty of covariance parameters
Currently applied to diagonal matrices, but can directly be used with matrices with off-diagonal terms (i.e. correlated residuals)
  Need to define (numerically or analytically) derivative w.r.t. covariance parameters
  Demonstrated in the geostatistical framework (Michalak et al., JGR 2004)
Michalak et al. (JGR 2005)
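The maximum likelihood procedure can be sketched on a synthetic problem. The sketch makes several assumptions that are not from the slides: z is treated as the observation vector, R and Q are each a single variance times the identity, the minimum is found by a coarse grid search rather than a gradient-based optimizer, and all sizes and true values are invented. The function name `neg_log_likelihood` is hypothetical.

```python
import numpy as np


def neg_log_likelihood(sigma_R, sigma_Q, z, H, s_p):
    """L_theta for R = sigma_R^2 I and Q = sigma_Q^2 I (assumed structure)."""
    n = H.shape[0]
    Psi = sigma_Q ** 2 * (H @ H.T) + sigma_R ** 2 * np.eye(n)   # H Q H^T + R
    d = z - H @ s_p
    _, logdet = np.linalg.slogdet(Psi)          # log-determinant without overflow
    return 0.5 * logdet + 0.5 * d @ np.linalg.solve(Psi, d)


# Synthetic problem with known covariance parameters (hypothetical values).
rng = np.random.default_rng(2)
n, m = 200, 40
H = rng.standard_normal((n, m))
s_p = np.zeros(m)
true_sigma_R, true_sigma_Q = 1.0, 2.0
s_true = s_p + true_sigma_Q * rng.standard_normal(m)
z = H @ s_true + true_sigma_R * rng.standard_normal(n)

# Coarse grid search over both standard deviations (step 0.25).
grid = np.linspace(0.25, 4.0, 16)
L = np.array([[neg_log_likelihood(sR, sQ, z, H, s_p) for sQ in grid] for sR in grid])
i, j = np.unravel_index(np.argmin(L), L.shape)
sigma_R_hat, sigma_Q_hat = grid[i], grid[j]
```

Note that the objective is evaluated without ever solving for the fluxes, which mirrors the point made on the slide. The curvature of `L` around its minimum could be used to approximate the uncertainty of the estimated parameters, as the slide describes for the inverse Hessian.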
Stations Used in ML Study
[Figure: world map of the stations used: alt, asc, ask, azr, bal, bme & bmw, brw, bsc, cba, cgo, chr, cmo, crz, eic, gmi, goz, hba, hun, ice, itn, izo, kco, key, kzd & kzm, lef, mbc, mhd, mid, mlo & kum, nmb, nwr, psa, pta, rpb, sey, shm, smo, spo, stm, sum, syo, tap, tdf, uta, uum, wis, wlg, zep; poc 1 (South) to poc 15 (North); scs 1 (South) to scs 7 (North).]
Michalak et al. (JGR 2005)
Constant Variances
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
σ_R = 1.63 ppm
σ_Q = 2.17 GtC/yr
Michalak et al. (JGR 2005)
Variances Based on Physical Attributes
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
σ_R, MBL = 0.71 ppm
σ_R, HI/DES = 1.49 ppm
σ_R, CONT = 3.16 ppm
σ_Q, LAND = 2.02 GtC/yr
σ_Q, OCEAN = 1.07 GtC/yr
Michalak et al. (JGR 2005)
Variances Based on Inversion Behavior
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
σ_R,1 = 0.58 ppm
σ_R,2 = 1.04 ppm
σ_R,3 = 1.19 ppm
σ_R,4 = 4.36 ppm
σ_R,5 = 7.64 ppm
σ_Q, LAND = 1.76 GtC/yr
σ_Q, OCEAN = 1.21 GtC/yr
Michalak et al. (JGR 2005)
Variance Based on Auxiliary Information
$$\frac{1}{n+m}\left[\sum_{i=1}^{n}\frac{r_{2,y,i}^{2}}{R_{ii}} + \sum_{i=1}^{m}\frac{r_{2,s_p,i}^{2}}{Q_{ii}}\right] \approx 1$$
σ_R = CS = 0.11 – 5.96 ppm
σ_Q, LAND = 2.08 GtC/yr
σ_Q, OCEAN = 0.89 GtC/yr
Michalak et al. (JGR 2005)
Land Fluxes
[Figure: land fluxes by region (BNAm, TNAm, TrAm, SoAm, NoAf, SoAf, BoAs, TeAs, TrAs, Aust, Euro) for Setups 1, 2, 3, 4, and 8.]
Michalak et al. (JGR 2005)
Ocean Fluxes
[Figure: ocean fluxes by region (NoPa, TWPa, TEPa, SoPa, NoOc, NoAt, TrAt, SoAt, SoOc, TrIn, SoIn) for Setups 1, 2, 3, 4, and 8.]
Michalak et al. (JGR 2005)
Land Uncertainty
[Figure: land flux uncertainty by region (BNAm, TNAm, TrAm, SoAm, NoAf, SoAf, BoAs, TeAs, TrAs, Aust, Euro) for Setups 1, 2, 3, 4, and 8.]
Michalak et al. (JGR 2005)
Ocean Uncertainty
[Figure: ocean flux uncertainty by region (NoPa, TWPa, TEPa, SoPa, NoOc, NoAt, TrAt, SoAt, SoOc, TrIn, SoIn) for Setups 1, 2, 3, 4, and 8.]
Michalak et al. (JGR 2005)
Conclusion from ML Method
Data themselves can provide information about model-data mismatch and prior error covariances (in the absence of external information regarding residual covariances)
Covariances R and Q reflect different patterns of residuals
Maximum likelihood approach produces covariance estimates consistent with physical understanding
ML can be applied to more complex covariance structures
Selecting an Appropriate Model
Which Model is Best?
[Figure: two panels (x-axis 0 to 8, y-axis -5 to 10). Legend: Available data; Real (unknown) deterministic component; Constant mean; Linear trend; Linear + Quadratic; Linear + Quadratic + Cubic.]
Selecting Auxiliary Variables
Variance ratio test quantifies the significance of adding additional variable(s) to the model of mean
Calculate measure of fit (WSS) for each model of mean using measurement data, transport model, covariance matrices, auxiliary variables
$$v = \frac{\left(WSS_0 - WSS_1\right)/q}{WSS_1/(n - p - q)} \quad \text{vs.} \quad F(v;\, q,\, n - p - q)$$
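The variance ratio can be illustrated with nested polynomial models of the mean. This sketch simplifies the setting of the slide: it uses ordinary least squares with unit weights in place of the covariance-weighted WSS and involves no transport model, and the data, noise level, and function names `wss` and `variance_ratio` are invented for the example. Here p is the number of terms in the simpler model and q the number of added terms.

```python
import numpy as np


def wss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X beta (unit weights)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)


def variance_ratio(X0, X1, y):
    """v = ((WSS0 - WSS1) / q) / (WSS1 / (n - p - q)) for nested models X0 within X1."""
    n, p = X0.shape
    q = X1.shape[1] - p
    wss0, wss1 = wss(X0, y), wss(X1, y)
    return ((wss0 - wss1) / q) / (wss1 / (n - p - q))


# Hypothetical data: a linear trend plus unit-variance noise.
rng = np.random.default_rng(5)
x = np.linspace(0.0, 8.0, 30)
y = 1.0 + 0.8 * x + rng.standard_normal(x.size)

X_const = np.ones((x.size, 1))
X_lin = np.column_stack([np.ones_like(x), x])
X_quad = np.column_stack([np.ones_like(x), x, x ** 2])

v_linear = variance_ratio(X_const, X_lin, y)     # adding the linear term
v_quadratic = variance_ratio(X_lin, X_quad, y)   # adding the quadratic term
```

Each ratio would then be compared with a quantile of the F distribution with q and n - p - q degrees of freedom; a large value supports keeping the added term, as is expected here for the linear term but not necessarily for the quadratic one.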
Summary / Additional Issues
Issues discussed in this presentation:
  Ill-posed / ill-conditioned nature of inverse problem
  Impact of aggregation
  Impact of independence/correlation assumptions
  Inversion diagnostics
  Statistical tools for inversion design
Additional issues not specifically addressed:
  Representation error
  Transport model error
Never Trust Anyone Who Is Not Skeptical of Their Own Results!