
Introduction to ensemble forecasting

Eric J. Kostelich

SCHOOL OF MATHEMATICS AND STATISTICS

MSRI Climate Change Summer School, July 21, 2008

Introduction Data Mathematical Framework LETKF

Co-workers:

Istvan Szunyogh, Brian Hunt, Edward Ott,

Eugenia Kalnay, Jim Yorke

and many others!

Thanks to: Dave Kuhl

Papers, preprints, and codes:

http://www.weatherchaos.umd.edu

http://math.asu.edu/~eric

MSRI Lecture #2 E. Kostelich MATHEMATICS AND STATISTICS


Principal papers

Preprints: www.weatherchaos.umd.edu

Initial papers:

E. Ott et al., Tellus A 56 (2004), 415–428.

I. Szunyogh et al., Tellus A 57 (2005), 528–545.

Refined mathematical implementation: B. R. Hunt, E. K., I. Szunyogh, Physica D 230 (2007), 112–126.

Results with real data: I. Szunyogh, E. K. et al., Tellus A 60 (2008), 113–130.



Recap from last time

In a chaotic process, every point is sensitive

Uncertainties in initial conditions grow exponentially (at least for a while)

The weather is chaotic (as far as anyone can tell)

The uncertainty in the global weather vector roughly doubles every 2 days

Forecast horizon: about 2 weeks
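To get a feel for these numbers, here is a minimal sketch that assumes pure exponential growth with a fixed 2-day doubling time (an idealization; real error growth eventually saturates). The function name `error_growth_factor` is hypothetical.

```python
def error_growth_factor(days: float, doubling_time_days: float = 2.0) -> float:
    """Factor by which an initial uncertainty grows after `days`,
    assuming pure exponential growth with a fixed doubling time."""
    return 2.0 ** (days / doubling_time_days)


# Growth factors after 2, 6, and 14 days (14 days is the quoted forecast horizon)
for days in (2, 6, 14):
    print(f"after {days:2d} days: factor {error_growth_factor(days):g}")
```

Under this assumption, an initial error grows by more than two orders of magnitude over the two-week forecast horizon.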



Relevant U. S. organizations

The National Oceanic and Atmospheric Administration (NOAA) is a division of the Department of Commerce

The National Centers for Environmental Prediction (NCEP) is the division of NOAA responsible for developing and maintaining weather forecast models

Spectrum of models: Global Forecast System (GFS), Regional Spectral Model (RSM), etc.

Model data is distributed to local Weather Service offices, which generate public forecast products



Other important modeling efforts

NASA develops and maintains its own forecast models

International agreements to share forecasts and observations (NCEP, UK Met Office, ECMWF, Canada, Japan, Brazil, etc.)

Research community: Weather Research and Forecasting model (WRF)

NOAA and the U. S. Navy develop and maintain ocean models

Private sector efforts: AccuWeather, airlines, etc.



What do we want to predict?

The best long-term forecast is climatology (the mean is the maximum likelihood estimate)

Prior to the mid-1960s, the initial condition was climatology

The U. S. Weather Service defines “normal” as the 1971–2000 average

Example: in Phoenix, Arizona, tomorrow’s weather will be sunny with 96% probability

Exceptional weather often is of greatest interest



What is data assimilation?

The process by which empirical measurements are incorporated into a forecast model to refine an estimate of the initial condition

The distinction between variables and parameters is a matter of definition

Operational weather forecast centers perform data assimilation steps 4 times per day (0Z, 6Z, 12Z, 18Z)

Real-time constraints: NCEP allows 20 minutes



Measures of forecast quality

One objective measure of goodness:

⟨forecast − observations⟩

A 72-hour forecast today is as accurate as a 36-hour forecast in 1985

“Holy grail”: 7-day forecasts that are as accurate as 3-day forecasts are now



Many applications besides weather

Controls (e.g., airplane autopilots)

Ocean and climate models (obviously)

Biological models (e.g., Tim Sauer & Steve Schiff)

Parameter estimation



Some fundamental problems

Naive approach: direct insertion

Difficulty: there are usually many more grid points than available measurements

Does not account for errors in the measurement

Does not exploit correlations between nearby grid points

The variables in the model are not necessarily the ones that can be easily measured



Example: Global Forecast System

Principal variables in the GFS:

natural logarithm of surface pressure

virtual temperature

divergence and vorticity of the wind field

Principal measurements:

barometric pressure

sensible temperature

relative humidity

wind speed and direction

satellite radiances (complicated!)



Typical 6-hour land surface dataset: 31,310 locations



Typical 6-hour surface marine dataset: 2,642 locations



Typical 6-hour satellite dataset: 53,842 locations



The observation space

For these reasons, data assimilation is done in the observation space

Given a vector of observations $y$, interpolate the model state $x$ to the same locations

The interpolation operator is denoted $H$

The innovation is $y - H(x)$
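As a toy illustration of the innovation, the sketch below assumes a one-dimensional grid and takes the interpolation operator to be plain linear interpolation, written as a matrix. The helper `make_interp_operator`, the grid, and all numerical values are hypothetical; an operational interpolation operator is far more involved.

```python
import numpy as np


def make_interp_operator(grid: np.ndarray, obs_locs: np.ndarray) -> np.ndarray:
    """Build a p x n matrix H that linearly interpolates values on a
    strictly increasing 1-D grid to the observation locations."""
    n, p = grid.size, obs_locs.size
    H = np.zeros((p, n))
    # Index of the left end of the grid cell containing each observation
    idx = np.clip(np.searchsorted(grid, obs_locs) - 1, 0, n - 2)
    # Fractional position of the observation inside its cell
    w = (obs_locs - grid[idx]) / (grid[idx + 1] - grid[idx])
    H[np.arange(p), idx] = 1.0 - w
    H[np.arange(p), idx + 1] = w
    return H


grid = np.linspace(0.0, 4.0, 5)              # grid points 0, 1, 2, 3, 4
x = np.array([10.0, 12.0, 11.0, 9.0, 8.0])   # model state on the grid
obs_locs = np.array([0.5, 2.25])             # where the measurements were taken
y = np.array([11.4, 10.0])                   # hypothetical observations

H = make_interp_operator(grid, obs_locs)
innovation = y - H @ x                       # y - H(x), one entry per observation
print(innovation)
```

Note that the innovation has only as many entries as there are observations, regardless of how many grid points the model has.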



Basic idea: Weighted least squares

Observations: $y \in \mathbb{R}^p$, $y = H(x_t) + \varepsilon$

Observation errors: $E(\varepsilon) = 0$, $E(\varepsilon\varepsilon^T) = R$

Model forecast (“background”): $x \in \mathbb{R}^n$, $x_b = x_t + \eta$

Model errors: $E(\eta) = 0$, $E(\eta\eta^T) = P_b$

Goal: minimize the objective function

$$J(x) = [y - H(x)]^T R^{-1} [y - H(x)] + (x - x_b)^T P_b^{-1} (x - x_b)$$

Minimization produces an analysis $x_a$ with associated covariance $P_a$
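For orientation, the minimizer can be written in closed form in the special case where the observation operator is linear. The derivation below is a sketch under that assumption (H(x) = Hx with H a p × n matrix, and R, P_b symmetric positive definite); it is the standard weighted least-squares result, not a statement about how any operational system computes the analysis.

```latex
% Assumption: H is linear, H(x) = Hx; R and P_b are symmetric positive definite.
\begin{align*}
\nabla J(x) &= -2 H^T R^{-1} (y - Hx) + 2 P_b^{-1} (x - x_b) = 0 \\
\Rightarrow\quad (P_b^{-1} + H^T R^{-1} H)\, x_a &= P_b^{-1} x_b + H^T R^{-1} y \\
x_a &= x_b + P_a H^T R^{-1} (y - H x_b), \qquad
P_a = (P_b^{-1} + H^T R^{-1} H)^{-1}
\end{align*}
```

The analysis is the background plus a weighted multiple of the innovation.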



Simplest assumptions

The observation errors $\varepsilon$ are normally distributed with mean 0 and covariance $R$

Model errors similarly: $N(0, P_b)$

When the underlying model is linear, it can be shown that the minimizer $x_a$ of $J$ is unique, unbiased and has minimum variance among all linear estimators

Weather models are “linear enough” over 6-hour intervals, but there is no guarantee of optimality
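A small numerical check of the linear case may help. The sketch below assumes a linear observation operator given as a matrix and uses the standard gain-matrix form of the minimizer, K = P_b Hᵀ (H P_b Hᵀ + R)⁻¹; the function names and the two-variable example are hypothetical.

```python
import numpy as np


def analysis(xb, Pb, y, H, R):
    """Minimizer of J and its covariance, assuming a linear operator H (a matrix)."""
    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)   # gain matrix
    xa = xb + K @ (y - H @ xb)                       # background + weighted innovation
    Pa = (np.eye(len(xb)) - K @ H) @ Pb              # analysis covariance
    return xa, Pa


def J(x, xb, Pb, y, H, R):
    """Weighted least-squares objective function."""
    d = y - H @ x
    b = x - xb
    return d @ np.linalg.solve(R, d) + b @ np.linalg.solve(Pb, b)


xb = np.array([1.0, 2.0])                  # background state
Pb = np.array([[1.0, 0.5], [0.5, 1.0]])    # correlated background errors
H = np.array([[1.0, 0.0]])                 # only the first variable is observed
R = np.array([[1.0]])
y = np.array([2.0])

xa, Pa = analysis(xb, Pb, y, H, R)
print(xa)
print(Pa)
```

In this example the second variable is never observed, yet the analysis adjusts it because the background covariance couples it to the first variable.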



The dimensionality problem

Must evaluate

$$J(x) = [y - H(x)]^T R^{-1} [y - H(x)] + (x - x_b)^T P_b^{-1} (x - x_b)$$

where $y \in \mathbb{R}^p$, $x \in \mathbb{R}^n$

Current NCEP operations: $p \sim 1.75$ million and $n \sim 3$ billion

We need $R^{-1}$ ($p \times p$) and $P_b^{-1}$ ($n \times n$)
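A rough storage estimate shows why these sizes are a problem. The sketch assumes dense storage at 8 bytes per entry (double precision), which is an illustrative assumption rather than a description of how any operational system stores these matrices.

```python
def dense_matrix_bytes(dim: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to store a dense dim x dim matrix."""
    return dim * dim * bytes_per_entry


p = 1_750_000        # number of observations
n = 3_000_000_000    # number of model variables

print(f"R   (p x p): {dense_matrix_bytes(p) / 1e12:.1f} TB")
print(f"P_b (n x n): {dense_matrix_bytes(n) / 1e18:.0f} EB")
```

Even before any inversion is attempted, the background covariance cannot be stored explicitly.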



The computational complexity problem

Inversion of a $k \times k$ matrix is an $O(k^3)$ algorithm

If a $100 \times 100$ matrix takes ∼ 1 sec to invert, then a $10^9 \times 10^9$ matrix takes ∼ $10^{21}$ sec

$R$ is nearly diagonal if observation errors are mostly uncorrelated

$P_b$ is not diagonal

Computing $P_b(t + \Delta t)$ from $P_a(t)$ requires integration of the tangent linear model
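The cubic scaling can be turned into a quick back-of-the-envelope estimate. The sketch assumes pure O(k³) scaling from a reference matrix size and ignores memory limits and parallelism; the function name `inversion_time` is hypothetical.

```python
def inversion_time(k: float, k_ref: float = 100.0, t_ref: float = 1.0) -> float:
    """Estimated seconds to invert a k x k matrix, assuming pure O(k^3)
    scaling from a k_ref x k_ref matrix that takes t_ref seconds."""
    return t_ref * (k / k_ref) ** 3


seconds = inversion_time(1e9)
years = seconds / (365.25 * 24 * 3600)
print(f"{seconds:.1e} seconds, or about {years:.1e} years")
```

Whatever the exact constants, direct inversion at this size is far beyond any operational time window.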



Complexity reduction strategies

Localization: Try to do the minimization over smaller regions of the globe

Estimate and precompute $P_b^{-1}$: Assume that the forecast uncertainty is approximately constant from one day to the next. (Used in all current operational DA systems)

Thin the observations and use only the “most important” ones



Each strategy has drawbacks

Assuming $P_b \approx$ constant ignores the “errors of the day”, which is generally regarded as one of the key impediments to better forecasts

The result of sequential assimilation of observations depends on the order of processing

Must assure continuity at the boundaries of the smaller regions



The Local Ensemble Transform Kalman Filter (LETKF)

Addresses many of these problems

Exploits the “geometry of uncertainty” in chaotic processes to lower the dimension but still account for errors of the day

Assimilates all the data at once

Uses localization and sets of observations that vary slowly in space to help assure continuity

Permits efficient implementation on massively parallel computers



The geometry of forecast uncertainty

The size of a typical high- or low-pressure system is about 1000 km × 1000 km (≈ Texas)

The GFS, when run at medium (T62) resolution, contains about 3000 grid-point variables in Texas-sized regions

Suppose we run $k$ statistically equivalent forecasts

What are the singular values of the resulting $3000 \times k$ forecast matrix $X_F$?



Correlation and dimensionality

Over most Texas-sized regions, one solution looks much like another

The columns of $X_F$ tend to be highly correlated, so the SVD of $X_F$ yields a good rank-$r$ approximation even when $r \ll k$
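The effect of correlated columns on the singular values can be seen on synthetic data. The sketch below builds a hypothetical 3000 × 100 matrix whose columns are combinations of a handful of shared patterns plus small noise; the sizes, noise level, and variable names are illustrative assumptions, not data from a forecast model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, r_true = 3000, 100, 5     # illustrative sizes only

# Synthetic "ensemble": every column mixes the same few patterns, plus small noise
patterns = rng.standard_normal((n, r_true))
coeffs = rng.standard_normal((r_true, k))
XF = patterns @ coeffs + 0.01 * rng.standard_normal((n, k))

U, s, Vt = np.linalg.svd(XF, full_matrices=False)


def rank_r_approx(r: int) -> np.ndarray:
    """Best rank-r approximation of XF in the Frobenius norm (truncated SVD)."""
    return (U[:, :r] * s[:r]) @ Vt[:r, :]


rel_err = np.linalg.norm(XF - rank_r_approx(r_true)) / np.linalg.norm(XF)
print("leading singular values:", np.round(s[:8], 2))
print("relative error of the rank-5 approximation:", rel_err)
```

The singular values drop sharply after the first few, so a truncated SVD with far fewer than k terms reproduces the matrix almost exactly.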






Key empirical finding

This was a key finding by D. J. Patil et al., PRL 86 (2001), 5878–5881.

GFS at T62 resolution: ∼ 3000 grid variables over typical Texas-sized region

Typical ensemble of $100 \le k \le 200$ forecasts generates a $3000 \times k$ forecast matrix $X_F$ whose first $r$ singular vectors, $40 \le r \le 80$, yield an excellent approximation of the forecast uncertainty



The ensemble dimension

The ensemble dimension (E-dimension) of an $n \times k$ matrix with singular values $s_1, \dots, s_k$ is

$$E \equiv \frac{(s_1 + s_2 + \cdots + s_k)^2}{s_1^2 + s_2^2 + \cdots + s_k^2}$$

Measures the eccentricity of the “ellipse” of forecast uncertainty
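The definition is easy to evaluate directly. The sketch below is a minimal implementation of the ratio above; the function name `e_dimension` is hypothetical, and the inputs are the rounded singular values quoted in this lecture's three examples.

```python
import numpy as np


def e_dimension(singular_values) -> float:
    """E-dimension: (sum of singular values)^2 / (sum of squared singular values)."""
    s = np.asarray(singular_values, dtype=float)
    return float(s.sum() ** 2 / np.sum(s ** 2))


for s in ([3.78, 3.60], [19.24, 4.35], [83.65, 4.33]):
    print(s, "->", round(e_dimension(s), 3))
```

Equal singular values give an E-dimension equal to their number, while a single dominant singular value pushes it toward 1. With the rounded inputs above, the results agree with the quoted values 1.99, 1.43 and 1.10 to within about 0.01.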



Example: $s_1 = 3.78$, $s_2 = 3.60$, E-dimension = 1.99

[Figure: plot with both axes running from −1 to 1]



Example: $s_1 = 19.24$, $s_2 = 4.35$, E-dimension = 1.43

[Figure: plot with horizontal axis running from −5 to 5 and vertical axis from −4 to 4]



Example: $s_1 = 83.65$, $s_2 = 4.33$, E-dimension = 1.10

[Figure: plot with horizontal axis running from −15 to 15 and vertical axis from −10 to 10]



The key idea behind the LETKF

If the E-dimension is much less than the dimension of the overall space, then the distribution is “flat”

The ensemble forecast uncertainty over a typical synoptic region resembles a “pancake” (at least for short intervals)

Reduce the dimensionality of the problem by changing coordinates to the $r$-dimensional subspace containing most of the forecast uncertainty

The dynamics reduces the uncertainty in the remaining directions
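The change of coordinates can be illustrated with a generic SVD projection. This is only a sketch of the idea of working in an r-dimensional subspace, not the LETKF algorithm itself (whose details are deferred to the next lecture); the helper names and the synthetic ensemble are hypothetical.

```python
import numpy as np


def leading_subspace(XF: np.ndarray, r: int):
    """Ensemble mean and an orthonormal n x r basis for the r directions
    of largest ensemble spread (left singular vectors of the centered matrix)."""
    mean = XF.mean(axis=1)
    U, _, _ = np.linalg.svd(XF - mean[:, None], full_matrices=False)
    return mean, U[:, :r]


def to_reduced(x, mean, basis):
    """Coordinates of an n-vector in the r-dimensional subspace."""
    return basis.T @ (x - mean)


def from_reduced(c, mean, basis):
    """Map r reduced coordinates back to an n-vector."""
    return mean + basis @ c


rng = np.random.default_rng(1)
n, k, r = 200, 30, 3            # illustrative sizes only
XF = rng.standard_normal((n, r)) @ rng.standard_normal((r, k))
XF += 1e-3 * rng.standard_normal((n, k))     # small spread in the other directions

mean, basis = leading_subspace(XF, r)
x = XF[:, 0]                                  # one ensemble member
c = to_reduced(x, mean, basis)                # r numbers instead of n
x_rec = from_reduced(c, mean, basis)
rel_err = np.linalg.norm(x - x_rec) / np.linalg.norm(x - mean)
print(c.shape, rel_err)
```

When the ensemble is “flat”, r coordinates recover each member's deviation from the mean almost exactly, so subsequent linear algebra can be done with r × r rather than n × n matrices.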



Next lecture

Outline of the Kalman filter

Mathematical details of how we accomplish the dimension reduction

Results with operational models and real observations

