Applications of Information Theory in Ensemble Data Assimilation
Dusanka Zupanski1, Arthur Y. Hou2, Sara Q. Zhang2, Milija Zupanski1,
Christian D. Kummerow1, and Samson H. Cheung3
1Colorado State University, Fort Collins, Colorado
2NASA Goddard Space Flight Center, Greenbelt, Maryland
3University of California, Davis, California
Manuscript submitted to Quart. J. Roy. Meteor. Soc.
May 6, 2007
(2 tables, 7 figures)
Corresponding author address:
Dusanka Zupanski, Cooperative Institute for Research in the Atmosphere/Colorado State
University, Fort Collins, Colorado, 80523-1375; E-mail: [email protected]
SUMMARY
We apply information theory within an ensemble-based data assimilation approach and
define an information matrix in ensemble subspace. The information matrix in ensemble
subspace employs a flow-dependent forecast error covariance and has relatively small
dimensions (equal to the ensemble size). It can be directly linked to the information matrix
typically used in non-ensemble-based data assimilation methods, such as the Kalman Filter
(KF) and the 3-dimensional variational (3d-var) methods, which provides a framework for
consistent comparisons of information measures between different data assimilation methods.
We evaluate information measures, such as degrees of freedom for signal, within the
Maximum Likelihood Ensemble Filter (MLEF) data assimilation approach and compare them
with those obtained using the KF approach and the 3d-var approach. We assimilate model-
simulated observations and use the Goddard Earth Observing System Single Column Model
(GEOS-5 SCM) as a dynamical forecast model.
The experimental results demonstrate that the proposed framework is useful for
comparing information measures obtained in different data assimilation approaches. These
comparisons indicate that using a flow-dependent forecast error covariance matrix (e.g., as in the
KF and the MLEF experiments) is fundamentally important for adequately describing prior
knowledge about the true model state when calculating information measures of assimilated
observations. We also demonstrate that data assimilation results obtained using the KF and the
MLEF approach (with ensemble size larger than 10 members) are superior to the results of the
3d-var approach.
Keywords: Ensemble data assimilation, Information theory,
Maximum Likelihood Ensemble Filter, Kalman filter, 3d-var
1. INTRODUCTION
It has been recognized that information theory (e.g., Shannon and Weaver 1949; Rodgers
2000) and predictability are inherently related (e.g., Schneider and Griffies 1999; Kleeman 2002;
Roulston and Smith 2002; DelSole 2004; Abramov et al. 2005). Information theory has also
attracted attention in data assimilation, where it has been used to calculate the information
content of various observations (e.g., Wahba 1985; Purser and Huang 1993; Wahba et al. 1995;
Rodgers 2000; Rabier et al. 2002; Fisher 2003; Johnson 2003; Engelen and Stephens 2004;
L’Ecuyer et al. 2006). Information content of observations can potentially have many
applications, including planning measurement missions, designing observational systems and
defining targeted observations and data selection strategies. These applications have so far been
underutilized, having mainly been oriented towards defining data selection strategies (e.g.,
Rabier et al. 2002 and references therein). Nevertheless, progress in data assimilation methods
should foster applications of information theory in many different areas.
Ensemble-based data assimilation methods, often referred to as Ensemble Kalman Filter
(EnKF) methods, are novel data assimilation techniques that have progressed rapidly since the
pioneering work of Evensen (1994) appeared. As a result of this progress, many different
variants of the EnKF have evolved (e.g., Pham et al. 1997, 1998; Houtekamer and Mitchell
1998; Lermusiaux and Robinson 1999; Hamill and Snyder 2000; Keppenne 2000; Mitchell and
Houtekamer 2000; Anderson 2001; Bishop et al. 2001; van Leeuwen 2001; Pham 2001; Reichle
et al. 2002a,b; Whitaker and Hamill 2002; Hoteit et al. 2002, 2003; Tippett et al. 2003; Zhang et
al. 2004; Ott et al. 2005; Szunyogh et al. 2005; Peters et al. 2005; Zupanski 2005; Zupanski and
Zupanski 2006, just to mention some). While there are notable differences between the EnKF
variants, they are all closely related in data assimilation problems involving Gaussian
Probability Density Functions (PDFs) and linear dynamical forecast models. In such cases, the
EnKFs share the common property of being rank-reduced approximations to the theoretically
optimal, full-rank KF solution. Under more general conditions, involving highly non-linear
dynamical models and non-Gaussian PDFs, the differences between EnKF variants could be
more significant (e.g., Fletcher and Zupanski 2006).
Even though the EnKF methods have advanced considerably, this progress has not been
matched by applications of information theory within these methods. In fact, information theory
has primarily been applied within other data assimilation methods (e.g., variational, KF), while
its application to ensemble data assimilation has been rather limited so far. Some of the
pioneering studies in this area are as follows. Wang and Bishop (2003) examined the
eigenvalues and
eigenvectors of the Ensemble Transform Kalman Filter (ETKF, Bishop et al. 2001 and Wang and
Bishop 2003) transformation matrix and demonstrated that these eigenvalues and eigenvectors
define the amount and the direction of the maximum forecast error reduction due to information
from the observations. Patil et al. (2001), Oczkowski et al. (2005), and Wei et al. (2006) used the
eigenvalues of the ETKF transformation matrix to define measures of information, referred to as
“bred dimension”, “effective degrees of freedom”, and “E dimension”, respectively. These
studies have recognized that ensemble-based methods have the potential to improve measures of
information due to the use of a flow-dependent forecast error covariance matrix, especially in
applications to adaptive observations. A recent study by Uzunoglu et al. (2007) described a novel
application of information measures in ensemble data assimilation: for ensemble size reduction
or inflation.
Building upon the previous studies, and recognizing that there is similarity between the
ETKF and the MLEF approach, we link the MLEF transformation matrix with the so-called
information or observability matrix, defined in ensemble subspace. We also demonstrate how the
information matrix can be used to define standard measures of information theory, such as
Degrees of Freedom (DOF) for signal and Shannon entropy reduction (e.g., Rodgers 2000). Thus,
we propose a general framework to link together ensemble data assimilation and information
theory in a similar manner as in variational and KF methods. This framework can be used for
comparing information measures of different data assimilation approaches. Additionally, as
demonstrated in Zupanski et al. (2007), the information measures in ensemble subspace can be
employed to define a flow-dependent “distance” function for covariance localization. We
evaluate this framework within an ensemble-based data assimilation method, using a single
column precipitation model and simulated observations. We also evaluate the results of the KF
and the 3-dimensional variational (3d-var) approaches, defined as special applications of the
proposed framework.
The paper is organized as follows. In section 2 the general framework is described. The
experimental design is explained in section 3, and experimental results are presented in section 4.
Finally, in section 5, the conclusions are summarized and their relevance for future research is
discussed.
2. GENERAL FRAMEWORK
In this study we employ an ensemble data assimilation approach referred to as Maximum
Likelihood Ensemble Filter (MLEF, Zupanski 2005; Zupanski and Zupanski 2006; Zupanski et
al. 2006). Here we briefly describe the MLEF. The MLEF seeks a maximum likelihood state
solution employing an iterative minimization of a cost function. The solution for a state vector x
(also referred to as control variable), of dimension Nstate, is obtained by minimizing the cost
function J defined as
J(x) = \frac{1}{2} [x - x_b]^T P_f^{-1} [x - x_b] + \frac{1}{2} [y - H(x)]^T R^{-1} [y - H(x)] ,   (1)
where y is an observation vector of dimension equal to the number of observations (Nobs), and H
is, in general, a non-linear observation operator. Subscript b denotes a background (i.e., prior)
estimate of x, and superscript T denotes a transpose. The Nobs ×Nobs matrix R is a prescribed
observation error covariance, and it includes instrumental and representativeness errors (e.g.,
Cohn 1997). The matrix Pf of dimension Nstate×Nstate is the forecast error covariance. As in many
other ensemble-based methods, we do not use the full matrix Pf explicitly, but we employ the
rank-reduced square-root formulation P_f = P_f^{1/2} (P_f^{1/2})^T, where P_f^{1/2} is an
Nstate×Nens square-root matrix (Nens being the ensemble size).
Uncertainties of the optimal estimate of the state x are defined as square roots of the
analysis error covariance (P_a^{1/2}) and the forecast error covariance (P_f^{1/2}), both defined
in ensemble subspace. The square root of the analysis error covariance is obtained as (e.g.,
Zupanski 2005)

P_a^{1/2} = [p_a^1 \; p_a^2 \; \cdots \; p_a^{N_{ens}}] = P_f^{1/2} (I_{ens} + C)^{-1/2} ,   (2)
where I_ens is an identity matrix of dimension Nens×Nens, and p_a^i are column vectors
representing analysis perturbations in ensemble subspace. The square root in (2) is calculated via
eigenvalue decomposition of C. It is defined as a symmetric positive semi-definite square root,
and is therefore unique (e.g., Horn and Johnson 1985, Theorem 7.2.6).
Matrix C has dimensions Nens×Nens and is defined by

C = Z^T Z ; \quad z^i = R^{-1/2} H(x + p_f^i) - R^{-1/2} H(x) ,   (3)
where vectors z^i are the columns of the matrix Z of dimension Nobs×Nens. Note that, when
calculating z^i, the nonlinear operator H is applied to both perturbed and unperturbed states x.
Vectors p_f^i are columns of the square root of the background error covariance matrix and are
obtained via ensemble forecasting employing a non-linear forecast model M:
P_f^{1/2} = [p_f^1 \; p_f^2 \; \cdots \; p_f^{N_{ens}}] ; \quad p_f^i = M(x + p_a^i) - M(x) .   (4)
Equations (1)-(3) are solved iteratively in each data assimilation cycle, while Eq. (4) is used to
propagate in time the columns of the forecast error covariance square root P_f^{1/2}.
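Under the stated assumptions, the square-root analysis update of Eqs. (2)-(3) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; all names (`ensemble_subspace_update`, `h`, `R_inv_sqrt`) are hypothetical.

```python
import numpy as np

def ensemble_subspace_update(x, Pf_sqrt, h, R_inv_sqrt):
    """Sketch of the MLEF square-root analysis update, Eqs. (2)-(3).

    x          : state estimate, shape (Nstate,)
    Pf_sqrt    : forecast error covariance square root, shape (Nstate, Nens)
    h          : (possibly nonlinear) observation operator, state -> obs space
    R_inv_sqrt : R^{-1/2}, shape (Nobs, Nobs)
    Returns the analysis square root Pa_sqrt, shape (Nstate, Nens).
    """
    Nens = Pf_sqrt.shape[1]
    Hx = R_inv_sqrt @ h(x)
    # Columns z^i of Z: scaled observation-space perturbations, Eq. (3)
    Z = np.column_stack([R_inv_sqrt @ h(x + Pf_sqrt[:, i]) - Hx
                         for i in range(Nens)])
    C = Z.T @ Z  # information matrix in ensemble subspace
    # Symmetric positive semi-definite inverse square root of (I_ens + C)
    lam, U = np.linalg.eigh(np.eye(Nens) + C)
    inv_sqrt = U @ np.diag(lam ** -0.5) @ U.T
    return Pf_sqrt @ inv_sqrt  # Eq. (2)
```

For a linear observation operator and a full-rank ensemble (Nens = Nstate), the resulting Pa_sqrt Pa_sqrt^T reproduces the classical KF analysis error covariance.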
An information measure referred to as the DOF for signal is often used in information
theory (e.g., Rodgers 2000). In data assimilation applications, DOF for signal (here denoted d_s)
is commonly defined in terms of the analysis and forecast error covariances, P_a and P_f (e.g.,
Wahba 1985; Purser and Huang 1993; Wahba et al. 1995; Rodgers 2000; Rabier et al. 2002;
Fisher 2003; Johnson 2003; Engelen and Stephens 2004) as

d_s = tr [I_{state} - P_a P_f^{-1}] ,   (5a)
where tr denotes trace, and I_state is an identity matrix of dimension Nstate×Nstate. The quantity
d_s counts the number of new pieces of information brought to the analysis by the observations,
with respect to what was already known, as expressed by P_f. Being dependent on the ratio
between the analysis and forecast error covariance (P_a P_f^{-1}), d_s measures the forecast
error reduction due to new information from the observations. Wahba et al. (1995) define d_s in
terms of the so-called influence matrix A as

d_s = tr [R^{-1/2} H P_a H^T R^{-1/2}] = tr [A] ,   (5b)
which is equivalent to (5a), as pointed out by Fisher (2003).
Employing the definition of P_a in ensemble subspace, Eq. (2), and using
tr [x x^T] = tr [x^T x], we can write (5b) in ensemble subspace as

d_s = tr [(I_{ens} + C)^{-1} (P_f^{1/2})^T H^T (R^{-1/2})^T R^{-1/2} H P_f^{1/2}] .   (6)
Assuming that the linear operator H is the first derivative of a weakly non-linear operator H at
the point x, we can write the following approximate equation for the columns r^i of the matrix
R^{-1/2} H P_f^{1/2}:

r^i \approx R^{-1/2} H(x + p_f^i) - R^{-1/2} H(x) .   (7)
Finally, by combining (3), (6), and (7) we have

d_s = tr [(I_{ens} + C)^{-1} Z^T Z] = tr [(I_{ens} + C)^{-1} C] .   (8)
Definition (8) is essentially the same as Eq. (2.61) of Rodgers (2000). The only difference is that
the trace is obtained employing matrix C of dimension Nens×Nens, while in the formulation of
Rodgers (2000), the trace is obtained employing an information matrix of dimensions Nstate×Nstate
(the full-rank information matrix). We will denote matrix C as the information matrix in
ensemble subspace.
By introducing information matrix C, we have defined a link between information theory
and ensemble data assimilation. Having this link is of special importance for the following
reasons. When calculating information content measures such as ds, a flow-dependent
Pf obtained directly from ensemble data assimilation is used. In addition, eigen-decomposition
of C is easily accomplished due to the relatively small size of this matrix (Nens×Nens) compared to
the typical number of observations (Nobs) used in applications to complex forecast models with
large state vectors (of dimension Nstate). A possible disadvantage of this ensemble-based
approach, as of any ensemble-based approach, is that a small ensemble size might not be
sufficient to adequately describe the variability of the full-rank forecast error covariance matrix.
In such cases, the information measures would still measure the amount of information brought
by the observations with respect to what was already known, however, the quality of the analysis
could be poor. One of the main focuses of this study is to evaluate the impact of ensemble size
on the information measures.
Once the information matrix C is available, various information measures can be
calculated. It is especially useful to define these measures in terms of the eigenvalues
\lambda_i^2 of C. Thus, as in Rodgers (2000), we can express (8) in terms of \lambda_i^2 and
calculate d_s as

d_s = \sum_i \frac{\lambda_i^2}{1 + \lambda_i^2} .   (9)
Equations (3) and (7) indicate that the eigenvalues \lambda_i^2 depend on the ratio between the
forecast error covariance and the observation error covariance, both defined at the observation
locations. Thus, for forecast errors larger than the observation errors we have
\lambda_i^2 \ge 1 (signal), and for forecast errors smaller than the observation errors we have
\lambda_i^2 < 1 (noise). Using the eigenvalues \lambda_i^2 one can also calculate other
information measures, such as the Shannon information content, defined as the reduction of
entropy due to added information from the observations (Shannon and Weaver 1949; Rodgers
2000). Since this measure is quite similar to DOF for signal, it will not be examined in this
study.
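The trace form (8) and the eigenvalue form (9) of d_s are algebraically equivalent, which is easy to verify numerically. A minimal sketch (the function name is illustrative, not from the authors' code):

```python
import numpy as np

def dof_for_signal(Z):
    """DOF for signal from the ensemble-subspace information matrix
    C = Z^T Z (Eqs. 8 and 9); Z has shape (Nobs, Nens)."""
    Nens = Z.shape[1]
    C = Z.T @ Z
    # Trace form, Eq. (8): tr[(I + C)^{-1} C]
    ds_trace = np.trace(np.linalg.solve(np.eye(Nens) + C, C))
    # Eigenvalue form, Eq. (9): sum_i lambda_i^2 / (1 + lambda_i^2)
    lam2 = np.linalg.eigvalsh(C)
    ds_eig = np.sum(lam2 / (1.0 + lam2))
    return ds_trace, ds_eig
```

Since each term in (9) lies between 0 and 1, d_s is bounded above by Nens, consistent with the rank argument made later for covariance localization.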
An important characteristic of the MLEF approach is that it can be made identical to KF
or variational methods, under special conditions that are explained below. This provides an
opportunity to compare information measures obtained using different data assimilation
approaches.
(a) Connection to KF
A linear version of the full-rank MLEF is identical to the classical linear KF when using
Gaussian PDFs, linear models M, and linear observation operators H. The full-rank MLEF
solution is obtained by setting Nens=Nstate. Under these conditions, the solution that minimizes
(1) can be explicitly calculated using (e.g., Zupanski 2005, Appendix A, Eq. A7)

x = x_b + P_f H^T (H P_f H^T + R)^{-1} [y - H(x_b)] .   (10)
If both the KF and the MLEF are initialized using the same forecast error covariance, the MLEF
solution after the first data assimilation cycle (Eq. 10) will be identical to the KF solution,
because the minimization step-size α is equal to 1 for quadratic cost functions (Gill et al. 1981).
The MLEF solution will remain identical to the KF solution throughout all data assimilation
cycles, since the linear version of the forecast error covariance update (Eq. 4) is the same as the
KF update equation. Thus, we can conclude that the full-rank MLEF (Nens=Nstate) is identical to
the full-rank KF under the above assumptions. Under the same assumptions, the reduced-rank
MLEF (Nens<Nstate) can be interpreted as a variant of a reduced-rank KF, since the same
equations are solved in both approaches, only with a reduced-rank P_f. Since different variants
of the reduced-rank KF would produce different solutions due to different ways of defining a
reduced-rank P_f, the link between the reduced-rank MLEF and reduced-rank KFs is not
uniquely defined.
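The explicit solution (10) is the familiar linear KF analysis step. A minimal sketch, with all names illustrative; a useful check is that the gradient of the cost function (1) vanishes at the returned analysis:

```python
import numpy as np

def kf_analysis(xb, Pf, H, R, y):
    """Classical linear KF analysis step, Eq. (10):
    x = xb + Pf H^T (H Pf H^T + R)^{-1} [y - H xb]."""
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)  # Kalman gain
    return xb + K @ (y - H @ xb)
```

For linear H, this x is the exact minimizer of the quadratic cost function (1), which is why a single minimization iteration with step-size 1 suffices in that case.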
(b) Connection to 3d-var
As explained before, the solution obtained by the MLEF is a maximum likelihood one,
and, in general, a non-linear one. These characteristics are shared with variational methods, thus
there is a connection to these methods as well. The full-rank non-linear MLEF solution without
the update of the forecast error covariance [i.e., using a prescribed covariance instead of Eq. (4)]
is identical to the 3d-var solution, since the same cost function (1) is minimized. To obtain
identical results, one can employ the same minimization method with the same preconditioning
in both the MLEF and the 3d-var (e.g., Zupanski 2005). In this study we employ Hessian
preconditioning, which may not be always feasible in variational methods due to large
dimensions of the full-rank covariance matrices.
In summary, the general framework proposed here should be directly applicable not only
to EnKF methods, but also to KF and 3d-var methods, as long as it remains practical to evaluate
full-rank covariance matrices. There are, however, some restrictions to the proposed general
framework. For example, when deriving information measures (e.g., DOF for signal and entropy
reduction) we have assumed, as in Rodgers (2000), that all errors are Gaussian. Therefore, we
have implicitly assumed weak nonlinearity in M and H, even though ensemble-based and
variational methods do not necessarily require this assumption. Consequently, the information
measures obtained in highly non-linear data assimilation problems, and also for variables that are
typically non-Gaussian (e.g., humidity and cloud microphysical variables) could be incorrect, or
only approximately correct. A theoretical framework for information measures employing non-
Gaussian ensembles is proposed in Majda et al. (2002) and Abramov and Majda (2004). They
have employed a different approach, based on the moment constraint optimization, to estimate
the so-called “predictive utility”, which is an information measure derived from the Shannon
entropy. As shown in Abramov and Majda (2004), higher-order moments, up to the first four
moments, would be required for non-Gaussian information measures in typical atmospheric
applications. The framework proposed here could be further generalized following Majda et al.
(2002) and Abramov and Majda (2004). An extension of the MLEF to account for log-normally
distributed observations has already been developed by Fletcher and Zupanski (2006) and could
be used as a starting point for defining non-Gaussian information measures within the MLEF. As
indicated in Fletcher and Zupanski (2006), the cost function should include an additional term in
order to account for log-normally distributed observations.
3. EXPERIMENTAL DESIGN
(a) Forecast model
A single column version of the GEOS-5 Atmospheric General Circulation Model
(AGCM) is used in this study. We refer to this model as GEOS-5 SCM (Single Column Model).
Previous experience employing column versions of the GEOS-series within a 1-dimensional
variational data assimilation technique indicated that the 1-dimensional framework could
produce useful data assimilation results, especially in applications to rainfall assimilation (Hou et
al. 2000, 2001, 2004).
The GEOS-5 SCM consists of the model physics components of the GEOS-5 AGCM:
moist processes (Relaxed Arakawa-Schubert convection and prognostic large-scale cloud
condensation), turbulence, radiation, land surface, and chemistry. The dynamic advection is
driven by prescribed forcing time series. The column model is capable of updating all the
prognostic state variables and evaluating a suite of additional observable quantities such as
precipitation and cloud properties. The GEOS-5 SCM retains most of the non-linear complexity
and interaction between physical processes of the full AGCM. At the same time, it has the
advantage of reduced dimensionality when used in ensemble data assimilation research
experiments.
(b) Control variable, observations
In the applications of this paper, we focus on assimilating simulated observations of two
state variables: vertical profiles of temperature (T) and specific humidity (q). These are also the
control variables for data assimilation. In the experiments presented, 40 model levels are used;
thus, the dimension of the control vector is 80. The column model only updates temperature and
specific humidity during a data assimilation interval. The remaining state variables, along with
the advection forcing, are prescribed by the Atmospheric Radiation Measurement (ARM) data
time series. The Tropical Western Pacific site (130E, 15N) of the ARM observation program is
chosen for the application discussed in this paper. The assimilation experiments cover the period
from 7 May 1998 to 24 May 1998 (17 days).
A data assimilation interval of 6 hours is used in the experiments, and simulated
observations of temperature and specific humidity are assimilated at the end of each data
assimilation interval. Simulated observations are created by adding Gaussian white noise to the
"true" state, which is defined by a GEOS-5 SCM integration. Thus, the observation error
covariance matrix R is assumed diagonal and constant in time. We use the same version of
the model to perform data assimilation and to create observations, thus we assume that the model
is perfect. In experiments with real observations the perfect model assumption might not be
justified. In order to relax this assumption one can use some of the recently proposed model error
estimation approaches (e.g., Heemink et al. 2001; Mitchell et al. 2002; Reichle et al. 2002a;
Zupanski and Zupanski 2006).
Observations are created assuming an instrument error of 0.2 K for T at all model levels
(R_inst^{1/2} = 0.2 K). Instrument errors for q vary between R_inst^{1/2} = 6.1×10^{-8} and
R_inst^{1/2} = 7.9×10^{-4}; the errors are defined to decrease from the lowest to the highest
model level. The total observation errors are defined as R^{1/2} = α R_inst^{1/2}, where an
empirical parameter α > 1 is employed to
approximately account for representativeness errors. Here, the representativeness error
approximately accounts for the mismatch between the observed and modeled scales (which is a
common definition of representativeness error), and also for inadequate scales in the forecast
error covariance due to the small ensemble size. To approximately account for both parts of the
representativeness error we require that for the largest ensemble size the parameter α is greater
than 1, and we let it increase with decreasing ensemble size. The values of the parameter α are
tuned to the ensemble size to approximately satisfy the expected chi-square innovation statistic,
calculated for optimized innovations and normalized by the analysis error (e.g., Dee et al. 1995;
Menard et al. 2000; Zupanski 2005). Instrument errors and the values of the parameter α
used in data assimilation experiments of this study are listed in Table 1.
Initial conditions for T and q at the beginning of the first data assimilation cycle are taken
from ARM observations of T and q at 0000 UTC 07 May 1998, interpolated from the
observation levels to the model levels. With this configuration, errors in the initial conditions are
simulated by the difference between the ARM observations and the "true" states defined by the
model simulation (started from 1800 UTC 06 May 1998 and integrated for 6 hours to 0000 UTC
07 May 1998). This has resulted in Root Mean Square (RMS) errors of 0.46 K for T_b and
4.8×10^{-4} for q_b in the first data assimilation cycle (recall that subscript b denotes
background values). In all subsequent cycles, the 6-h forecast of T and q from the previous cycle
is used to define the background for the current cycle.
(c) Ensemble perturbations
The square-root forecast error covariance P_f^{1/2} is initialized in the first data assimilation
cycle using prescribed perturbations p_f^i (cold start); in the subsequent cycles the data
assimilation scheme updates p_f^i according to Eqs. (2)-(4). The cold-start ensemble
perturbations are defined using Gaussian white noise with a prescribed standard deviation of
magnitude comparable to the observation errors. A compactly supported second-order
correlation function
of Gaspari and Cohn (1999), with decorrelation length of 3 vertical layers, is applied to the
random perturbations to define a correlated random noise (e.g., Zupanski et al. 2006). The
decorrelation length of 3 vertical layers was determined empirically, based on the overall best
data assimilation performance.
(d) Minimization
A conjugate gradient minimization algorithm (e.g., Luenberger 1984), with the line-
search technique as in Navon et al. (1992) and with Hessian preconditioning as in Zupanski
(2005), is used in the experiments of this paper. In all data assimilation experiments, only a
single iteration of the minimization is performed, which is sufficient for linear observation
operators (Zupanski 2005). Note that non-linearity of the forecast model M, even though it
influences the final data assimilation results, does not influence the minimization results within a
filter formulation. This would be, however, different for a smoother application, since the non-
linear model would be included in the cost function.
(e) Covariance localization
Covariance localization is often used in ensemble data assimilation applications to better
constrain data assimilation problems with either insufficient observations or insufficient
ensemble size (e.g., Houtekamer and Mitchell 1998; Hamill et al. 2001; Whitaker and Hamill
2002). Localization was also found beneficial in full-rank KF applications, due to spurious loss
of variance in the discrete KF covariance evolution equation (e.g., Menard et al. 2000). Since
covariance localization is typically achieved by employing arbitrary covariance functions (e.g.,
Gaspari and Cohn 1999), it is important to evaluate how such localization impacts the
information measures.
We use a localization technique based on the Schur (element-wise) product of the
forecast error covariance matrix and a compactly supported covariance function (e.g.,
Houtekamer and Mitchell 1998; Hamill et al. 2001; Whitaker and Hamill 2002). Since the
dimensions of the full forecast error covariance are small (80×80), we evaluate the full
covariance Pf and multiply it, element-wise, with the localization function (the second-order
correlation function of Gaspari and Cohn 1999 with decorrelation length of 3 vertical layers). As
a result, we obtain a localized Pf . We then perform the eigenvalue decomposition of the
localized covariance and keep only the Nens leading eigenvalues and eigenvectors in data
assimilation. Note, however, that covariance localization could be achieved in a different (i.e.,
approximate) way in applications to large-size problems (e.g., Whitaker and Hamill 2002; Ott et
al. 2005).
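The localization procedure described above can be sketched as follows. The correlation function is the standard compactly supported fifth-order piecewise-rational function of Gaspari and Cohn (1999); the index-based vertical distance over a single stacked variable is a simplification of the 80-dimensional (T, q) state used here, and the function names are illustrative.

```python
import numpy as np

def gaspari_cohn(r):
    """Gaspari and Cohn (1999) compactly supported correlation function;
    r = separation distance / decorrelation length, zero for r > 2."""
    r = np.abs(np.asarray(r, dtype=float))
    f = np.zeros_like(r)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r <= 2.0)
    r1 = r[m1]
    f[m1] = (-0.25 * r1**5 + 0.5 * r1**4 + 0.625 * r1**3
             - (5.0 / 3.0) * r1**2 + 1.0)
    r2 = r[m2]
    f[m2] = (r2**5 / 12.0 - 0.5 * r2**4 + 0.625 * r2**3
             + (5.0 / 3.0) * r2**2 - 5.0 * r2 + 4.0 - 2.0 / (3.0 * r2))
    return f

def localize(Pf, decorr_len, Nens):
    """Schur (element-wise) localization of the full covariance followed by
    eigen-truncation to the Nens leading modes; returns a localized
    square root of shape (Nstate, Nens)."""
    n = Pf.shape[0]
    levels = np.arange(n)
    dist = np.abs(levels[:, None] - levels[None, :]) / decorr_len
    Pf_loc = Pf * gaspari_cohn(dist)          # element-wise product
    lam, V = np.linalg.eigh(Pf_loc)
    idx = np.argsort(lam)[::-1][:Nens]        # leading eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(lam[idx], 0.0))
```

By the Schur product theorem, the element-wise product of the covariance with a valid correlation matrix remains positive semi-definite, so the eigen-truncation is well defined.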
4. RESULTS
(a) Verification summary
Verifications of the data assimilation experiments listed in Table 1 are performed in terms of
analysis and background errors and the chi-square innovation statistic tests (e.g., Dee et al. 1995;
Menard et al. 2000; Zupanski 2005). The verification summary is given in Table 2. The
RMS errors of the analysis and the 6-h forecast (background) are calculated with respect to the
truth as mean values over 70 consecutive data assimilation cycles. The mean values and the
standard deviations of the chi-square statistic are calculated over 70 data assimilation cycles
from the chi-square statistic values obtained in the individual data assimilation cycles. Note that
the ergodic hypothesis was made when calculating the mean chi-square values: the sample mean
was replaced by the time mean, calculated over 70 data assimilation cycles.
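A compact sketch of an innovation-based chi-square statistic is given below. This follows the standard form of Dee et al. (1995), in which the expected value is 1 for consistent error covariances; the experiments here use a variant computed for optimized innovations and normalized by the analysis error, so the sketch is indicative rather than an exact reproduction, and all names are illustrative.

```python
import numpy as np

def chi_square_statistic(innov, H, Pf, R):
    """Innovation-based chi-square statistic (cf. Dee et al. 1995):
    chi2 = d^T (H Pf H^T + R)^{-1} d / Nobs, where d = y - H x_b.
    Expected value is 1 when Pf and R are consistent with the data."""
    S = H @ Pf @ H.T + R  # innovation covariance
    return float(innov @ np.linalg.solve(S, innov)) / innov.size
```

Values persistently larger (smaller) than 1 indicate an underestimated (overestimated) forecast error variance, which is the diagnostic used in Table 2.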
The results in Table 2 indicate superior performance of the KF approach, and also good
performance of the MLEF approach, with the RMS errors decreasing as the ensemble size
increases and as the number of observations increases, as expected. In comparison to the 3d-var
experiment, the MLEF errors are generally smaller for larger ensemble sizes (20 and 40
members) and larger for the smallest ensemble size (10 members). The analysis errors of the
MLEF experiments with 80 observations are within the estimated total observation errors (note
that the total observation errors also include empirical representativeness errors). The
analysis and background errors of all experiments are smaller than the errors of the experiment
without data assimilation (no_obs), thus indicating a positive impact of data assimilation. Table 2
also indicates that covariance localization, which is applied in the experiment with 10 ensemble
members and 40 observations, reduces analysis and background errors (compare experiments
10ens_40obs and 10ens_40obs_loc).
Mean values of the chi-square statistic indicate that the experiments are generally within
20% of the expected value of 1, with standard deviations of 15%-34%, with the
exception of the 3d-var experiment. In the 3d-var experiment there are larger fluctuations of the
chi-square statistic from one data assimilation cycle to another (the standard deviation is 78%)
which is a consequence of using a constant forecast error covariance in all data assimilation
cycles. Note that the chi-square values larger (smaller) than 1 indicate an underestimation
(overestimation) of the forecast error variance. One should, however, expect departures from the
expected chi-square statistic, since the Gaussian assumption is not strictly valid due to non-linearity
of the forecast model. The chi-square values calculated in individual data assimilation
cycles indicated no increasing or decreasing trends in time, meaning that all data assimilation
experiments had stable filter performance.
(b) Impact of ensemble size
Let us now examine the impact of ensemble size on DOF for signal. We calculate DOF
for signal ds in data assimilation experiments with 80 and 40 observations and plot it as a
function of data assimilation cycles in Figs. 1a and 1b, respectively. Results from the reduced-
rank MLEF experiments (with 10 and 40 ensemble members) are shown along with the full-rank
KF and 3d-var experiments. Recall that, since the observation number and the observation errors
did not change from one data assimilation cycle to another, the time variability of ds reflects the
time variability of the forecast error covariance matrix. Comparing the results obtained using the
same ensemble size in Figs. 1a and 1b, we can notice generally higher values of ds in the
experiments with 80 observations in the first few data assimilation cycles; the differences
between 80 and 40 observations diminish in the later cycles. This is an indication that the KF and
also the MLEF have learning capabilities, since they recognize that previously assimilated
observations had an impact on reducing the initially prescribed forecast errors; thus, additional
observations are less beneficial in the later cycles than in the earlier cycles. This learning
capability is not present in the 3d-var approach since the forecast error covariance is kept
constant at all times. Consequently, the 3d-var approach could not recognize that the previously
assimilated observations had an impact on reducing the forecast uncertainty.
As seen in Figs. 1a and 1b, the experiments with larger ensemble size typically have
larger values of ds, and vice versa. The smaller (larger) value of ds is a consequence of using the
forecast error covariance matrix of a smaller (larger) rank. The important observation is that the
KF experiment and all reduced-rank experiments show similar time variability of the information
measures. Assuming that the full-rank KF experiment produces the best analysis solution and the
best estimate of the flow-dependent forecast error covariance, these results indicate that the
forecast error covariance is also realistically described in the reduced-rank experiments. We will
examine this issue further in the section “Temporal evolution of the information measures”.
(c) Impact of covariance localization
In this subsection we evaluate the impact of covariance localization on the information measures, focusing on the experiment with the smallest ensemble size (10 ensemble members) and the smaller number of observations (40). In Fig. 2, DOF for signal, obtained in the
experiments with and without localization is plotted as a function of data assimilation cycles.
The figure indicates that the covariance localization generally increases the amount of
information. This is not surprising, since covariance localization introduces extra DOF to the
data assimilation system (e.g., Hamill et al. 2001), but the total number of DOF cannot exceed
the ensemble size (Nens), since the information matrix C can have up to Nens non-zero
eigenvalues. An important observation is that the localization does not change the essential
character of the information measures (the lines with and without covariance localization are
approximately parallel). There is, nevertheless, a notable departure between the two lines around cycle 56; note, however, that with a single-column model the maxima and minima can easily shift by a single point in time, even under similar experimental conditions.
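Covariance localization of the kind evaluated here is commonly implemented as a Schur (element-wise) product of the forecast error covariance with a compactly supported correlation function. The sketch below implements the fifth-order piecewise rational function of Gaspari and Cohn (1999), which the paper also uses to define the 3d-var covariance (Table 1); whether the MLEF localization uses exactly this form is our assumption.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn (1999) fifth-order compactly supported
    correlation function.  Returns localization weights in [0, 1]
    that decrease with distance and vanish beyond 2*c."""
    z = np.abs(np.asarray(dist, dtype=float)) / c
    w = np.zeros_like(z)
    near = z <= 1.0
    far = (z > 1.0) & (z < 2.0)
    zn, zf = z[near], z[far]
    w[near] = (-0.25 * zn**5 + 0.5 * zn**4 + 0.625 * zn**3
               - (5.0 / 3.0) * zn**2 + 1.0)
    w[far] = ((1.0 / 12.0) * zf**5 - 0.5 * zf**4 + 0.625 * zf**3
              + (5.0 / 3.0) * zf**2 - 5.0 * zf + 4.0
              - (2.0 / 3.0) / zf)
    return w
```

The localized covariance is then the Schur product of Pf with the matrix obtained by evaluating this function on the pairwise distances between model levels.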
(d) Temporal evolution of the information measures
As observed in Figs. 1 and 2, the information measures reach a maximum in the first data
assimilation cycle. There are also two pronounced local maxima around cycles 40 and 50 (the
exact locations of the maxima vary between different experiments). In the following text, we
examine if there is a correlation between the information measures in Figs. 1 and 2 and the true
model state evolution.
The true model state evolution is shown in Figs. 3a, b, c and d, where true temperature,
true specific humidity, observed temperature, and observed specific humidity are plotted as
functions of data assimilation cycles and model vertical levels. One can observe rapid, front-like,
time-tilted changes in both temperature and humidity around cycles 40 and 50. Figs. 1 and 2
indicate the two local maxima in the information measures around the same data assimilation
cycles. One can also observe correlations between additional, smaller local maxima in Figs. 1 and 2 and rapid changes in Fig. 3, although the rapid changes are more pronounced in the humidity field than in the temperature field. It is, therefore, evident that the time evolution of the
information measures is correlated with the true model state time evolution. Since the
information measures employ a flow-dependent forecast error covariance, this confirms that the
flow-dependency of the forecast error covariance is reasonably correct. Note, however, that a
“flow-dependent” forecast error covariance does not always imply that the information measures
are flow-dependent. For example, the experiments with an insufficient ensemble size would
commonly produce ds=Nens in all data assimilation cycles, thus indicating that ds is not sensitive
to the changes in the true model state, even though the forecast error covariance is “flow-dependent”. Thus, having a correct flow-dependent forecast error covariance matrix is of fundamental importance for describing the prior knowledge about the truth when calculating information measures.
One can also observe more variability in the observations than in the corresponding
“true” fields, especially for the specific humidity field (Figs. 3b and 3d). This is a manifestation
of representativeness error, introduced by randomly perturbing the model state variables when
creating simulated observations. Recall that we have approximately accounted for the impact of
the representativeness error through the empirical parameter α (Table 1).
(e) Trace of Pf
Let us now examine the magnitude of the forecast error covariance. As an example, we
present the trace of the forecast error covariance matrix as a function of data assimilation cycles
obtained in the KF and 3d-var experiments with 80 observations (Fig. 4). In Fig. 4a the
temperature component (of the total trace) is given and in Fig. 4b the specific humidity
component is shown. As expected, Fig. 4 indicates time varying magnitudes of Pf for the KF
experiment and constant magnitudes of Pf for the 3d-var experiment. The figure also implies that
the largest values of ds obtained in the first data assimilation cycle (Figs. 1 and 2) are not a
simple consequence of using large initial Pf, since much larger magnitudes of Pf are obtained in
later data assimilation cycles (e.g., in the KF experiment around cycles 40 and 50). Thus, we can conclude that the reason for the large values of ds in the first data assimilation cycle is an inadequate Pf, not necessarily a large Pf. The results in Fig. 4 also suggest that the large values of the
information measures obtained in the 3d-var experiment are a consequence of the shape of Pf,
not necessarily of the large magnitude of Pf. One can, however, appropriately tune the Pf used in
the 3d-var experiment in order to reduce or increase the information measures.
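The traces shown in Fig. 4 can be accumulated directly from the square-root (ensemble) form of the covariance, without ever forming the full Pf. Below is a sketch under the assumption Pf = X Xᵀ, with the columns of X being the (appropriately normalized) ensemble perturbations; normalization conventions differ between filters, and the index-set argument is ours.

```python
import numpy as np

def trace_pf(perturbations, idx):
    """Trace of Pf = X X^T restricted to the state components in
    idx (e.g. the temperature rows or the humidity rows).  Since
    (X X^T)_{ii} = sum_k X_{ik}^2, the partial trace is just the
    sum of squared perturbation entries over the selected rows."""
    X = np.asarray(perturbations)    # shape: state_dim x Nens
    return float(np.sum(X[idx, :] ** 2))
```

Selecting the temperature rows yields a quantity analogous to Fig. 4a, and the humidity rows a quantity analogous to Fig. 4b.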
(f) Temporal evolution of the analysis and forecast errors
Let us conclude this section by examining temporal evolution of the analysis and forecast
RMS errors, calculated with respect to the “truth”, of different data assimilation experiments. We
show the RMS errors obtained in the KF, the 3d-var and the experiment with 10 ensemble
members with covariance localization. The errors of the three experiments are shown as
functions of model vertical levels and data assimilation cycles for temperature (Fig. 5) and for
specific humidity (Fig. 6). For reference, we also show temporal evolution of the errors of
temperature and specific humidity obtained in the experiment without data assimilation (Figs. 7a
and 7b). Examination of Figs. 5, 6 and 7 indicates that the errors of the experiment without data
assimilation (no_obs, shown in Fig. 7) are the largest, and that they are at maximum around
cycles 40 and 50. Around the same cycles maxima in ds (Figs. 1 and 2) and the abrupt changes in
the “true” model state (Fig. 3) were also observed. These largest errors are reduced by the
greatest amount, but still not completely eliminated, in the KF experiment, as shown in Figs. 5
and 6. This is an expected result, which indicates a highly efficient use of the observed
information in the KF, owing to the use of the full-rank flow-dependent forecast error
covariance. The other two experiments, the 3d-var (Figs. 5c, 5d, 6c and 6d) and the 10-ensemble-member experiment (Figs. 5e, 5f, 6e and 6f), also indicate considerable, but much smaller, error
reductions with respect to the experiment without data assimilation. Comparison of the RMS errors of the 3d-var experiment, which uses a full-rank but constant forecast error covariance, with those of the MLEF experiment with 10 ensemble members, which uses a flow-dependent forecast error covariance of considerably reduced rank, indicates generally slightly better performance of the 3d-var experiment. As shown in Table 2, however, the analysis and background errors obtained using larger ensemble sizes (e.g., 20 and 40 ensemble members) are generally smaller than the 3d-var errors. Thus, there is a trade-off regarding the quality of the analysis, depending on how many ensemble members are feasible to employ.
5. CONCLUSIONS
In this study, we have applied information theory within an ensemble-based data assimilation approach and defined an information matrix in ensemble subspace. We have shown
that the information matrix in ensemble subspace can be directly linked to the information matrix
typically used in non-ensemble based data assimilation methods, such as the KF and the 3d-var
methods, which provides a framework for consistent comparisons of information measures
between different data assimilation methods.
We have evaluated this framework in application to the GEOS-5 SCM and simulated
observations, employing ARM observations as forcing. We have compared three different data
assimilation approaches, the KF, the MLEF and the 3d-var, focusing on the impact of ensemble
size, covariance localization, and the temporal evolution of the “true” model state on the
information measures.
Experimental results indicated that the essential character of the information measures was similar in all experiments that used a flow-dependent forecast error covariance matrix (the KF and the MLEF experiments with varying ensemble sizes), all of which showed similar trends of increase or decrease with time. The temporal evolution of the information measures was correlated with
the true model state evolution, which was an indication that the flow-dependent forecast error
covariance was reasonable. The 3d-var based information measures were insensitive to the
changes in the true model state, since the forecast error covariance was (inadequately)
prescribed. These results indicated that it is fundamentally important to use a flow-dependent
forecast error covariance in order to adequately describe the prior knowledge about the truth
when calculating information measures.
As expected, covariance localization improved the data assimilation results and increased the values of the information measures. The temporal evolution of the information measures remained sensitive to the major changes in the true model state, in a similar way as in the experiments without localization.
Comparisons of the three different data assimilation approaches indicated superior
performance of the KF approach, owing to the use of the full-rank flow-dependent forecast error
covariance matrix. Comparisons of the reduced-rank MLEF and the 3d-var approach indicated
superior MLEF results when the ensemble size was greater than 10, and comparable or slightly
worse MLEF results for smaller ensemble sizes (without covariance localization).
The results of this study indicated effectiveness of the proposed framework in
applications to different data assimilation approaches. Although the results were very
encouraging, further evaluations of the proposed framework are still necessary, especially in
applications to data assimilation problems with numerous observations and atmospheric models
with many degrees of freedom.
Acknowledgements
The first author would like to thank Graeme Stephens, Christine Johnson, Louie
Grasso, and Stephane Vannitsem for inspiring discussions regarding information content
measures. This research was supported by NASA grants: 621-15-45-78, NAG5-12105, and
NNG04GI25G. We also acknowledge computational resources provided by the Explore
computer system at NASA’s Goddard Space Flight Center.
References:
Abramov, R., and A. Majda, 2004: Quantifying uncertainty for non-Gaussian ensembles in
complex systems. SIAM J. Sci. Stat. Comp., 26, 411-447.
Abramov, R., A. Majda and R. Kleeman, 2005: Information theory and predictability for low-
frequency variability. J. Atmos. Sci., 62, 65–87.
Anderson, J. L., 2001: An ensemble adjustment filter for data assimilation. Mon. Wea. Rev., 129,
2884–2903.
Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.
Bishop, C. H., and Z. Toth, 1999: Ensemble transformation and adaptive observations. J. Atmos.
Sci., 56, 1748–1765.
Cohn, S. E., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75, 257–288.
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 457 pp.
Dee, D., 1995: On-line estimation of error covariance parameters for atmospheric data
assimilation. Mon. Wea. Rev., 123, 1128–1145.
DelSole, T., 2004: Predictability and information theory. Part I: Measures of predictability. J.
Atmos. Sci., 61, 2425–2440.
Engelen, R. J., and G. L. Stephens, 2004: Information Content of Infrared Satellite Sounding
Measurements with Respect to CO2. J. Appl. Meteor. 43, 373–378.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using
Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, (C5), 10143-
10162.
Fisher, M., 2003: Estimation of entropy reduction and degrees of freedom for signal for large
variational analysis systems. ECMWF Tech. Memo. No. 397. 18 pp.
Fletcher, S.J., and M. Zupanski, 2006: A data assimilation method for lognormally distributed
observational errors. Q. J. Roy. Meteor. Soc., 132, 2505-2520.
Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three
dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.
Gill, P. E., W. Murray, and M. H. Wright, 1981: Practical Optimization. Academic Press, 401
pp.
Golub, G. H., and C. F. van Loan, 1989: Matrix Computations. 2d ed. The Johns Hopkins
University Press, 642 pp.
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter/3D-variational analysis
scheme. Mon. Wea. Rev., 128, 2905–2919.
Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background
error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–
2790.
Heemink, A. W., M. Verlaan, and A. J. Segers, 2001: Variance reduced ensemble Kalman
filtering. Mon. Wea. Rev., 129, 1718–1728.
Horn, R. A., and C. R. Johnson, 1985: Matrix Analysis. Cambridge University Press, 561 pp.
Hoteit, I., D.-T. Pham, and J. Blum, 2002: A simplified reduced-order Kalman filtering and application to altimetric data assimilation in the tropical Pacific. J. Mar. Sys., 36, 101-127.
Hoteit, I., D.-T. Pham, and J. Blum, 2003: A semi-evolutive filter with partially local correction
basis for data assimilation in oceanography. Oceanologica Acta, 26, 511-524.
Hou, A. Y., S. Q. Zhang, A. da Silva and W. Olson, 2000: Improving assimilated global datasets
using TMI rainfall and columnar moisture observations. J. Climate., 13, 4180–4195.
Hou, A. Y., S. Q. Zhang, A. da Silva, W. Olson, C. Kummerow, and J. Simpson, 2001:
Improving global analysis and short-range forecast using rainfall and moisture
observations derived from TRMM and SSM/I passive microwave sensors. Bull. Amer.
Meteor. Soc., 81, 659–679.
Hou, A. Y., S. Q. Zhang, and O. Reale, 2004: Variational continuous assimilation of TMI and
SSM/I rain rates: Impact on GEOS-3 hurricane analyses and forecasts. Mon. Wea. Rev.,
132, 2094–2109.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter
technique. Mon. Wea. Rev., 126, 796–811.
Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for
atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376 pp.
Johnson, C., 2003: Information content of observations in variational data assimilation. Ph.D.
thesis, Department of Meteorology, University of Reading, 218 pp. [Available from
University of Reading, Whiteknights, P.O. Box 220, Reading, RG6 2AX, United
Kingdom.]
Keppenne, C., 2000: Data assimilation into a primitive-equation model with a parallel ensemble
Kalman filter. Mon. Wea. Rev., 128, 1971–1981.
Kleeman, R., 2002: Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci.,
59, 2057–2072.
Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical
estimation. Part I: Theory and schemes. Mon. Wea. Rev., 127, 1385–1407.
L’Ecuyer, T. S., P. Gabriel, K. Leesman, S. J. Cooper, and G. L. Stephens, 2006: Objective assessment of the information content of visible and infrared radiance measurements for cloud microphysical property retrievals over the global oceans. Part I: Liquid clouds. J. Appl. Meteor. Climat., 45, 20–41.
Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor.
Soc., 112, 1177–1194.
Luenberger, D. L., 1984: Linear and Non-linear Programming. 2d ed. Addison-Wesley, 491 pp.
Menard, R., S. E. Cohn, L.-P. Chang, and P. M. Lyster, 2000: Assimilation of stratospheric
chemical tracer observations using a Kalman filter. Part I: Formulation. Mon. Wea. Rev.,
128, 2654–2671.
Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea.
Rev., 128, 416–433.
Mitchell, H. L., P. L. Houtekamer, and G. Pellerin, 2002: Ensemble size, balance, and model-
error representation in an ensemble Kalman filter. Mon. Wea. Rev., 130, 2791–2808.
Navon, I. M., X. Zou, J. Derber, and J. Sela, 1992: Variational data assimilation with an
adiabatic version of the NMC spectral model. Mon. Wea. Rev., 120, 1433–1446.
Oczkowski, M., I. Szunyogh, and D. J. Patil, 2005: Mechanism for the development of locally
low-dimensional atmospheric dynamics. J. Atmos. Sci., 62, 1135-1156.
Ott, E., B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 273-277.
Pham D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear
systems. Mon. Wea. Rev, 129, 1194–1207.
Pham, D. T., J. Verron, and M. C. Roubaud, 1997: Singular evolutive Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.
Pham D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for
data assimilation in oceanography. J. Mar. Syst., 16, 323–340.
Patil, D. J., B. R. Hunt, E. Kalnay, J. A. Yorke, and E. Ott, 2001: Local low dimensionality of atmospheric dynamics. Phys. Rev. Lett., 86, 5878-5881.
Peters, W., J.B. Miller, J. Whitaker, A.S. Denning, A. Hirsch, M.C. Krol, D. Zupanski, L.
Bruhwiler, and P.P. Tans, 2005: An ensemble data assimilation system to estimate
CO2 surface fluxes from atmospheric trace gas observations. J. Geophys. Res. 110,
D24304, doi:10.1029/2005JD006157.
Purser, R.J., and H.-L. Huang, 1993: Estimating effective data density in a satellite retrieval or an
objective analysis. J. Appl. Meteorol., 32, 1092–1107.
Rabier F., N. Fourrie, C. Djalil, and P. Prunet, 2002: Channel selection methods for Infrared
Atmospheric Sounding Interferometer radiances. Quart. J. Roy. Meteor. Soc., 128, 1011–
1027.
Reichle, R. H., D. B. McLaughlin, D. Entekhabi, 2002a: Hydrologic data assimilation with the
ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.
Reichle, R.H., J.P. Walker, R.D. Koster, and P.R. Houser, 2002b: Extended versus ensemble
Kalman filtering for land data assimilation. J. Hydrometorology, 3, 728-740.
Rodgers, C. D., 2000: Inverse Methods for Atmospheric Sounding: Theory and Practice. World
Scientific, 238 pp.
Roulston, M., and L. Smith, 2002: Evaluating probabilistic forecasts using information theory.
Mon. Wea. Rev., 130, 1653–1660.
Schneider, T., and S. Griffies, 1999: A conceptual framework for predictability studies. J.
Climate., 12, 3133–3155.
Shannon, C. E., and W. Weaver, 1949: The Mathematical Theory of Communication. University
of Illinois Press, 144 pp.
Szunyogh, I., E. J. Kostelich, G. Gyarmati, D. J. Patil, B. R. Hunt, E. Kalnay, E. Ott, and J. A. Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the NCEP global model. Tellus, 57A, 528-545.
Tippett, M., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble
square-root filters. Mon. Wea. Rev., 131, 1485–1490.
Uzunoglu, B., S. J. Fletcher, M. Zupanski, and I. M. Navon, 2007: Adaptive ensemble member
size reduction and inflation. Quart. J. Roy. Meteor. Soc., (in press).
van Leeuwen, P. J., 2001: An ensemble smoother with error estimates. Mon. Wea. Rev., 129,
709–728.
Wahba, G., 1985: Design criteria and eigensequence plots for satellite-computed tomography. J.
Atmos. Oceanic Technol., 2, 125–132.
Wahba, G., D. R. Johnson, F. Gao, and J. Gong, 1995: Adaptive tuning of numerical weather
prediction models: Randomized GCV in three- and four-dimensional data assimilation.
Mon. Wea. Rev., 123, 3358–3370.
Wang, X., and C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman
filter ensemble forecast schemes. J. Atmos. Sci., 60, 1140–1158.
Wei, M., Z. Toth, R.Wobus, Y. Zhu, C.H. Bishop, and X. Wang, 2006: Ensemble Transform
Kalman Filter-based ensemble perturbations in an operational global prediction system at
NCEP, Tellus, 58A, 28-44.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed
observations. Mon. Wea. Rev., 130, 1913–1924.
Zhang, F., Z. Meng, and A. Aksoy, 2006: Tests of an ensemble Kalman filter for mesoscale and
regional-scale data assimilation. Part I: perfect model experiments. Mon. Wea. Rev., 134,
722–736.
Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253.
Zupanski, D., A. S. Denning, M. Uliasz, M. Zupanski, A. E. Schuh, P. J. Rayner, W. Peters and
K. D. Corbin, 2007: Carbon flux bias estimation employing Maximum Likelihood
Ensemble Filter (MLEF). J. Geophys. Res., (accepted with revisions).
Zupanski D. and M. Zupanski, 2006: Model error estimation employing an ensemble data
assimilation approach. Mon. Wea. Rev., 134, 1337-1354.
Zupanski, M., 2005: Maximum Likelihood Ensemble Filter: Theoretical Aspects. Mon. Wea.
Rev., 133, 1710–1726.
Zupanski, M., S. J. Fletcher, I. M. Navon, B. Uzunoglu, R. P. Heikes, D. A. Randall, T. D. Ringler, and D. Daescu, 2006: Initiation of ensemble data assimilation. Tellus, 58A, 159-170.
Table Captions List
Table 1. List of data assimilation experiments discussed in this paper. Nobs indicates the number
of observations per data assimilation cycle. The empirical parameter α, varying with ensemble
size, is employed to approximately account for an unknown representativeness error. Prefixes
“KF” and “3dv” indicate Kalman Filter and 3d-var experiments, respectively. Suffix “loc”
indicates that localization is applied to the forecast error covariance. Experiment denoted no_obs
is an experiment without data assimilation.
Table 2. Total RMS errors of the analysis and the background solutions, calculated with respect
to the truth over 70 data assimilation cycles, for the experiments listed in Table 1. The RMS
analysis and background errors are shown for temperature (denoted RMS Ta and RMS Tb) and
for specific humidity (denoted RMS qa and RMS qb). The RMS errors are smallest for the KF
experiment with 80 observations and are largest for the experiment without data assimilation
(no_obs). The smallest RMS errors are highlighted in bold, and the largest RMS errors are
highlighted in bold italic. Also shown are the mean values and standard deviations of the chi-
square statistic, calculated over 70 data assimilation cycles.
Table 1.

Experiment | Nens (T and q estimated) | Nobs (T and q observed) | Rinst^1/2 for T (degrees K) | Rinst^1/2 for q (kg/kg) (Min; Max errors) | Parameter α | Localization
10ens_80obs | 10 | 80 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 2.1 | NO
20ens_80obs | 20 | 80 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.7 | NO
40ens_80obs | 40 | 80 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.4 | NO
KF_80obs | 80 | 80 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.15 | NO
10ens_40obs | 10 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 2.1 | NO
20ens_40obs | 20 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.7 | NO
40ens_40obs | 40 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.4 | NO
KF_40obs | 80 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.15 | NO
10ens_40obs_loc | 10 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 2.1 | YES
3dv_40obs | 80 | 40 | 0.2 | 6.1*10^-8 ; 7.9*10^-4 | 1.15 | NO*
no_obs | - | 0 | - | - | - | -

*Covariance localization was not applied in the 3d-var experiments; however, the 3d-var covariance is localized by definition [defined using the Gaspari and Cohn (1999) correlation function].
Table 2.

Experiment | RMS Ta (K) | RMS Tb (K) | RMS qa (kg/kg) | RMS qb (kg/kg) | Chi-square (mean) | Chi-square (stddev)
10ens_80obs | 0.45 | 0.49 | 3.77*10^-4 | 3.97*10^-4 | 1.11 | 0.27
20ens_80obs | 0.28 | 0.35 | 2.65*10^-4 | 3.08*10^-4 | 0.95 | 0.20
40ens_80obs | 0.23 | 0.32 | 2.26*10^-4 | 2.91*10^-4 | 0.92 | 0.15
KF_80obs | 0.21 | 0.31 | 2.04*10^-4 | 2.57*10^-4 | 1.06 | 0.20
10ens_40obs | 0.64 | 0.68 | 4.93*10^-4 | 5.08*10^-4 | 1.16 | 0.31
20ens_40obs | 0.54 | 0.57 | 4.07*10^-4 | 4.27*10^-4 | 1.03 | 0.31
40ens_40obs | 0.51 | 0.55 | 3.74*10^-4 | 4.14*10^-4 | 0.84 | 0.22
KF_40obs | 0.38 | 0.40 | 3.38*10^-4 | 3.42*10^-4 | 0.81 | 0.20
10ens_40obs_loc | 0.57 | 0.58 | 4.35*10^-4 | 4.51*10^-4 | 1.21 | 0.34
3dv_40obs | 0.51 | 0.61 | 4.34*10^-4 | 4.43*10^-4 | 0.84 | 0.78
no_obs | 0.82 | 0.82 | 6.56*10^-4 | 6.56*10^-4 | - | -
Figure Captions List
Fig. 1. Degrees of Freedom (DOF) for signal (ds), obtained in the experiments with (a) 80
observations and (b) 40 observations per data assimilation cycle. Note that ds is constant in time
in the 3d-var experiment, which is a consequence of a constant forecast error covariance.
Fig. 2. Values of DOF for signal (ds), obtained in the experiments with 10 ensemble members
and 40 observations (with and without covariance localization), plotted as functions of data
assimilation cycles.
Fig. 3. (a) True temperature, (b) true specific humidity, (c) observed temperature, and (d) observed specific humidity, shown as functions of data assimilation cycles and model vertical levels. Observations defined at each grid point (80 observations) are used in panels (c) and (d). Units are K for temperature and g kg^-1 for specific humidity. Note rapid, time-tilted changes in both temperature and humidity around cycles 40 and 50.
Fig. 4. Trace of Pf, shown as a function of data assimilation cycles. Results from the KF and the 3d-var experiments with 80 observations are plotted. The temperature component of the total trace is given in (a) in units of K^2, and the specific humidity component of the total trace is given in (b) in units of kg^2 kg^-2. The trace of Pf is constant in all cycles for the 3d-var experiment; it is equal to 1.6 K^2 for temperature and 9.2*10^-5 kg^2 kg^-2 for specific humidity.
Fig. 5. Analysis and background errors of temperature obtained in three different data
assimilation experiments with 40 observations: KF_40obs, 3dv_40obs and 10ens_40obs_loc.
The errors are calculated with respect to the “truth” and are shown as functions of data
assimilation cycles and model vertical levels. The results from the KF experiment are shown in
(a), for the analysis, and in (b), for the background. The results of the 3d-var experiment are
shown in (c), for the analysis, and (d), for the background. The results of the experiment with 10
ensemble members, which also includes covariance localization, are given in (e), for the analysis
and in (f), for the background. The numbers in the upper right corners are total RMS errors from Table 2. The units are K for both the plots and the total RMS errors.
Fig. 6. As in Fig. 5, but for specific humidity in g kg^-1. The numbers in the upper right corners are total RMS errors from Table 2, given in kg kg^-1.
Fig. 7. Analysis errors of the experiment without data assimilation (no_obs), calculated with respect to the “truth”. The results are plotted in (a), for temperature in K, and in (b), for specific humidity in g kg^-1. The RMS errors in the upper right corners are in units of K, for temperature, and kg kg^-1, for specific humidity.
[Fig. 1: panels (a) and (b)]
[Fig. 2]
[Fig. 3: panels (a) T true, (b) q true, (c) T obs, (d) q obs]
[Fig. 4: panels (a) and (b)]
[Fig. 5: panels (a)-(f)]
[Fig. 6: panels (a)-(f)]
[Fig. 7: panels (a) and (b)]