Low-order stochastic model and “past-noiseresearch.atmos.ucla.edu/tcd/PREPRINTS/grlmjo_rev_vf...38...

27
GEOPHYSICAL RESEARCH LETTERS, VOL. ???, XXXX, DOI:10.1029/, Low-order stochastic model and “past-noise 1 forecasting” of the Madden-Julian oscillation 2 D. Kondrashov, 1 M.D. Chekroun, 1 A. W. Robertson 3 , M. Ghil 1,2 Corresponding author: D. Kondrashov, Department of Atmospheric and Oceanic Sciences, 405 Hilgard Ave, Box 951565, 7127 Math Sciences Bldg. University of California, Los Angeles, CA 90095-1565, U.S.A. ([email protected]) 1 Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California, Los Angeles, U.S.A 2 Geosciences Department and Laboratoire de M´ et´ eorologie Dynamique (CNRS and IPSL), Ecole Normale Sup´ erieure, F-75231 Paris Cedex 05, FRANCE. 3 International Research Institute for Climate an Society, Columbia University, Palisades, NY 10964-8000, U.S.A. DRAFT September 24, 2013, 1:23pm DRAFT

Transcript of Low-order stochastic model and “past-noiseresearch.atmos.ucla.edu/tcd/PREPRINTS/grlmjo_rev_vf...38...

  • GEOPHYSICAL RESEARCH LETTERS, VOL. ???, XXXX, DOI:10.1029/,

    Low-order stochastic model and “past-noise1

    forecasting” of the Madden-Julian oscillation2

    D. Kondrashov,1 M.D. Chekroun,1 A. W. Robertson3, M. Ghil1,2

    Corresponding author: D. Kondrashov, Department of Atmospheric and Oceanic Sciences, 405

    Hilgard Ave, Box 951565, 7127 Math Sciences Bldg. University of California, Los Angeles, CA

    90095-1565, U.S.A. ([email protected])

    1Department of Atmospheric and Oceanic

    Sciences and Institute of Geophysics and

    Planetary Physics, University of California,

    Los Angeles, U.S.A

    2Geosciences Department and Laboratoire

    de Météorologie Dynamique (CNRS and

    IPSL), Ecole Normale Supérieure, F-75231

    Paris Cedex 05, FRANCE.

    3International Research Institute for

    Climate an Society, Columbia University,

    Palisades, NY 10964-8000, U.S.A.

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 2 KONDRASHOV ET AL.: LOW-ORDER MJO

    This paper presents a predictability study of the Madden-Julian Oscilla-3

    tion (MJO) that relies on combining empirical model reduction (EMR) with4

    the “past-noise forecasting” (PNF) method. EMR is a data-driven method-5

    ology for constructing stochastic low-dimensional models that account for6

    nonlinearity, seasonality, and serial correlation in the estimated noise, while7

    PNF constructs an ensemble of forecasts that accounts for interactions be-8

    tween (i) high-frequency variability (“noise”), estimated here by EMR; and9

    (ii) the low-frequency mode (LFM) of MJO, as captured by singular-spectrum10

    analysis (SSA). A key result is that — compared to an EMR ensemble driven11

    by generic white noise — PNF is able to considerably improve prediction of12

    MJO phase. When forecasts are initiated from weak MJO conditions, the13

    useful skill is of up to 30 days. PNF also significantly improves MJO pre-14

    diction skill for forecasts that start over the Indian Ocean.15

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 3

    1. Introduction

    The Madden–Julian oscillation (MJO) is the dominant mode of intraseasonal variability16

    across the global tropics. The MJO is a natural component of the coupled atmosphere–17

    ocean system and it a↵ects other important climate processes, in both the tropics and18

    extratropics; these processes include the seasonal evolution of temperature and precipita-19

    tion, tropical cyclone frequency, and weather extremes.20

    In this paper, we built on the extensive studies to improve prediction of the daily21

    indices of the Real-time Multivariate MJO Index, known as RMM1 and RMM2 [Wheeler22

    and Hendon, 2004]. These indices are dominated by intraseasonal fluctuations due to23

    MJO variability in the 40–50-day band. The predictive statistical models used so far can24

    be grouped roughly into (i) multiple lagged-regression–based models [Jiang et al., 2008;25

    Seo et al., 2009; Maharaj and Wheeler , 2005; Kang and Kim, 2010], and (ii) models based26

    on advanced time series analysis, such as singular-spectrum analysis (SSA) and wavelets27

    [Kang and Kim, 2010], neural networks [Love and Matthews , 2009], and analogues [Seo28

    et al., 2009]. The useful predictive skill for the MJO of such empirical models is typically29

    of about 15–20 days, and it is similar to the MJO forecast skill achieved by state-of-the-30

    art dynamical models [Vitart and Molteni , 2010; Zhang and Van den Dool , 2012; Zhang ,31

    2013].32

    We propose here to bring to bear on empirical MJO prediction a combination of novel33

    tools (i) for the construction of stochastic low-dimensional models (LDMs); and (ii) for the34

    application of the LDMs so constructed to prediction. The LDM construction is carried35

    out by data-driven empirical model reduction (EMR), which accounts for nonlinearity,36

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 4 KONDRASHOV ET AL.: LOW-ORDER MJO

    seasonality, and serial correlation in the noise estimate [Kravtsov et al., 2005, 2009]. The37

    prediction method is the pathwise past-noise forecasting (PNF) method [Chekroun et al.,38

    2011a], which improves predictions by accounting for the modulation of high-frequency39

    variability, or “noise,” by the MJO’s low-frequency mode (LFM), as captured by SSA.40

    The understanding and reliable description of LFMs is of the essence for the prediction of41

    the high-amplitude but irregularly occurring events in the frequency bands of interest [Ghil42

    and Robertson, 2002]. Furthermore, it is crucial to understand the interaction between43

    LFMs and the higher-frequency variability [Chekroun et al., 2011a].44

    This interaction can be better understood in the framework of random dynamical sys-45

    tems (RDS) theory, in which the stochastic dynamics is studied pathwise — i.e., for46

    individual realizations of the noise — rather than being merely sampled by an ensemble47

    of forward trajectories that aim to approximate the system’s probability density function48

    (PDF) [Chekroun et al., 2011b]. This pathwise view of the dynamics helps in applying49

    inverse stochastic nonlinear models to prediction, since such an inverse model “lives”50

    naturally on the estimated path of the noise, as derived from the data.51

    So far, though, it was not clear how to derive such pathwise models. Most of the52

    inherent di�culties have been overcome by using the EMR methodology [Kravtsov et al.,53

    2005, 2009] to yield nonlinear stochastic inverse models that successfully capture the54

    LFMs of the climatic phenomena of interest [Kondrashov et al., 2011, 2005], while also55

    estimating the path of the noise.56

    Chekroun et al. [2011a] extended the results of Kondrashov et al. [2005] beyond the57

    sampled-PDF approach, and developed PNF as a general pathwise method for nonlin-58

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 5

    ear stochastic systems that exhibit LFMs. This method exploits information on the59

    estimated path on which the inverse stochastic model evolves, as well as on the interac-60

    tion between the model’s deterministic, nonlinear components and the stochastic ones.61

    PNF has been shown to outperform classical ensemble prediction techniques on synthetic,62

    model-generated data, as well as on observational data for ENSO [Chekroun et al., 2011a].63

    2. Methods

    2.1. Empirical Model Reduction (EMR)

    The EMR reduction approach represents a generalization of linear inverse modeling64

    (LIM; [Penland and Sardeshmukh, 1995]). As an operational prediction methodology,65

    EMR constructs a low-order nonlinear system of prognostic equations driven by stochastic66

    forcing; the method estimates the model’s deterministic part, as well as the properties of67

    the driving noise from the observations or from a more detailed model’s simulation.68

    EMR relies on multi-level polynomial inverse modeling: it thus allows (a) the model to69

    be nonlinear; and (b) the noise terms to be correlated in space, and serially correlated70

    in time [Kravtsov et al., 2005]. Like other model-reduction methodologies, it adopts a71

    stochastic approach to describe a system of coupled, slow-and-fast variables. Kravtsov72

    et al. [2009] reviewed EMR and its applications, and compared it with other method-73

    ologies. They showed EMR to be particularly e↵ective (1) when variability of the faster74

    parts of the system is both modulated by and feeds back on the slower components of the75

    system; and (2) when there is no significant time scale separation between slow and fast76

    variables.77

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 6 KONDRASHOV ET AL.: LOW-ORDER MJO

    A multi-level EMR model can be formally written as the following set of M + 1 vector78

    equations:79

    ẋ = F�Ax+B(x,x) + r0t ,

    m�1t = L

    (m)(x, r0t , ..., rm�1t ) + r

    mt , 1 m M, (1)

    Here x = (xi; i = 1, ..., d) represents the resolved modes on the main level of the model,80

    while rmt at additional levels accounts for unresolved modes, which at the last level are81

    approximated as spatially correlated white noise rMt ⇡ ⌃ ˙W.82

    For our MJO application, d = 2 and x1

    and x2

    are the RMM1 and RMM2 indices. The83

    terms�Ax andB(x,x) represent the linear dissipation and the quadratic self-interactions,84

    respectively, while F accounts for the deterministic forcing [Kondrashov et al., 2011].85

    Memory e↵ects of the interactions between the resolved variables x and the unresolved86

    r

    mt arise as convolution terms by integrating recursively the “matrioshka” of levels from87

    the lowest level M to the main level [Kondrashov et al., 2013; Wouters and Lucarini ,88

    2013]. The coupling between the variable rmt and the variables (x, r0

    t , ..., rm�1t ) from the89

    previous levels is modeled by rectangular matrices L(m) of increasing size.90

    When deriving a time-discrete formulation of Eq. (1), instantaneous tendencies �x/�t91

    and �rmt /�t are computed numerically by Euler time-di↵erencing, and are used to estimate92

    by recursive least-squares the coe�cients F, A, B and L(m), as well as computing the93

    regression residuals rm+1t , to be modeled at the next level.The procedure is stopped when94

    the estimated ⇠t ⌘ rMt has an autocorrelation that vanishes at unit lag and its spatial95

    covariance matrix has converged to a constant matrix ⌃.96

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 7

    Standard forecasting by EMR relies on forward integration of the model from a given97

    initial state and driven at the last model level by a large ensemble of random realizations98

    of the estimated noise ⌃ ˙W. ENSO real-time prediction by EMR [Kondrashov et al.,99

    2005] is highly competitive among other dynamical and statistical forecasts in the ENSO100

    multi-model prediction plume of the IRI [Barnston et al., 2012].101

    For this study, we tested an energy-conserving EMR formulation to ensure that kxk =102dP

    i=1x

    2

    i < 1, see auxiliary materials. We have found that, in the case of MJO, there was103

    no noticeable di↵erence in prediction skill between strictly energy-conserving EMR and104

    its non-energy conserving implementation.105

    2.2. Past-Noise Forecasting (PNF) by EMR

    We start by building a nonlinear EMR model using past observations and we estimate106

    the path of the “weather” noise ⇠t (see Sec. 2.1) that drove this model over previous finite-107

    time windows. The PNF method is then applied in two steps. First, noise snippets —108

    obtained as copies of this estimated path — are selected from past time intervals during109

    which the LFM phase resembles that observed prior to a particular forecast. Second,110

    these noise snippets — which have the same length as the forecast lead time — are used111

    to drive the system into the future.112

    We rely on SSA to identify a nearly periodic LFM in the (RMM1, RMM2) data set. SSA113

    is a data-adaptive, nonparametric method for spectral estimation that extends classic PCA114

    into the time-lag domain [Vautard and Ghil , 1989]. In practice, the LFMs are described115

    by the reconstructed components (RCs) given by SSA [Ghil et al., 2002]. These RCs are116

    then used for the noise snippet selection here, by finding in the past record analogs of117

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 8 KONDRASHOV ET AL.: LOW-ORDER MJO

    the LFM that best match the LFM phase just preceding the start of the forecast. Such118

    a conditioning of the noise forcing on the LFM phase improves forecast skill, and so-119

    called linear pathwise response with respect to noise perturbations helps explain the PNF120

    method’s success, cf. Chekroun et al. [2011a].121

    3. Results and Discussion

    We developed a quadratic EMR model with three levels (main plus two, i.e. M = 2122

    in Section 2.1) to simulate and predict the (RMM1, RMM2) daily indices (RMM1-2123

    hereafter) for June 1974–January 20091. Following Kondrashov et al. [2005], we included124

    the seasonal-cycle e↵ects into the forcing and the linear part, i.e. F and A in Eq. (1), on125

    the model’s main level.126

    Figures 1a,b compare the autocorrelation functions (ACFs) of the RMM1-2 time series,127

    EMR-simulated vs. observed, when using a large ensemble of EMR stochastic realizations.128

    These two panels demonstrate that the EMR model captures very well the ACFs, and129

    hence the MJO power spectrum, as well as its 40–60 day LFM. We found that the use of130

    M > 1 improves the EMR-MJO model’s prediction skill at long lead times (not shown).131

    The EMR model was trained on the June 1974–July 2005 portion of the RMM1-2132

    record, and forecasts were validated for the independent time interval July 2005–January133

    2009. In this study, we modified the noise snippet selection in the PNF procedure of134

    Chekroun et al. [2011a] by adopting a suggestion of Feliks et al. [2010], namely to determine135

    an instantaneous phase 0 < � < 2⇡ of narrow-band LFM components via the Hilbert136

    transform applied to the SSA reconstruction of the latter, i.e. to the leading RC pair of137

    See http://cawcr.gov.au/staff/mwheeler/maproom/RMM/ for the data set.

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 9

    RMM1. This selection procedure results in a fixed size of 20 PNF ensemble members per138

    t

    ⇤, where t⇤ is the forecast start time within the validation interval, compared to EMR139

    ensemble driven by 200 independent realizations of white noise; see the auxiliary materials140

    for details.141

    In real-time prediction, the PNF algorithm for choosing noise snippets has to deal with142

    “end e↵ects” that bias the SSA reconstruction and the estimated value of the LFM phase143

    at the forecast starting time. For the predictability study presented here, we computed144

    the RC pair that captures the MJO’s LFM by using the entire data record available, in145

    order to present the optimal PNF skill as proof-of-concept. It is left for future research146

    to explore ways to minimize such end e↵ects for applications to operational forecasting.147

    For the purpose of defining MJO forecast skill, we use the commonly adopted bivari-148

    ate correlation (corr) and the root-mean-square error (rmse) between the observed and149

    forecast RMM indices [Gottschalck et al., 2010]. The skill of the PNF method is notably150

    better than the standard EMR: see the blue vs. red curves in Fig. 2a. Moreover, PNF151

    compares favorably with existing statistical and dynamical models for MJO prediction,152

    cf. Sec. 1.153

    Even though PNF is applied here to RMM1 only, prediction of RMM2 is also improved,154

    since both indices are physically coupled. The potential increase in predictability by PNF155

    (blue) vs. EMR (red) in the time domain coincides mostly with energetic phases of MJO’s156

    LFM; these phases are well captured by the leading RC pair with an SSA window of 50157

    days (cyan), cf. Fig. 1d. This PNF feature is due to the method’s intrinsic conditioning158

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 10 KONDRASHOV ET AL.: LOW-ORDER MJO

    of the driving noise on the LFM phase. On the other hand, EMR and PNF predictions159

    are similar when LFM is weak, i.e. during March-April 2006, as seen in Fig. 1d.160

    The time series of EMR forecasts are expected to have smaller variance than the PNF161

    ones since — at long lead times and when averaged over a large ensemble of driving-noise162

    realizations — they converge to the climatological mean. PNF, though, picks out certain163

    members of that ensemble that enhance LFM prediction. Moreover, averaging over the164

    PNF ensemble aims to follow the relatively smooth LFM trajectory and, therefore, PNF165

    improvement for rmse (amplitude) is less pronounced than in corr (phase); see auxiliary166

    materials.167

    Figures 2c–f compare prediction skill for the forecasts started for weak vs. strong MJO168

    events, defined as {(RMM1)2 + (RMM2)2}1/2 < 1 or � 1, respectively. The rmse barely169

    depends on the MJO strength for either EMR or PNF predictions (Fig. 2d,f). Correlation170

    skill corr when using EMR alone, however, is much worse for weak vs. strong MJO171

    variability (Fig. 2c), while PNF improves forecasts started during strong as well as weak172

    MJO events. It does so much more dramatically in terms of corr in the latter case, when173

    it exceeds the overall mean skill (Fig. 2e). PNF is thus able to better capture the growth174

    of strong MJO events, an observation that is also consistent with improvement in the time175

    domain (Figs. 1c,d).176

    Figures 3a–d present prediction skill conditioned on the eight MJO phases (as defined177

    by tan�1(RMM2/RMM1) in Wheeler and Hendon [2004], cf. Fig. 1c) at the start of178

    the forecast. Physically, these phases describe di↵erent geographical locations of MJO179

    convective activity.180

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 11

    MJO events started in phases 3 and 8, i.e. over the Indian Ocean and tropical Africa,181

    are best predicted; they result in forecasts with the largest corr and smallest rmse,182

    respectively. There is also general improvement in PNF for rmse around phases 2—5.183

    The decrease in corr for PNF predictions started during phases 4–5 corresponds to the184

    well-known di�culties of simulating [Vitart and Molteni , 2010] and predicting [Zhang and185

    Van den Dool , 2012] MJO when it crosses the Maritime Continent. These di�culties are186

    sometimes referred to as “the Maritime Continent prediction barrier,” reminiscent of the187

    “spring barrier” for ENSO prediction [Barnston et al., 2012].188

    Figures 3e–h show strong seasonality in the corr and the rmse. The correlation gets189

    higher in boreal summer and early fall, as well as during boreal winter, while the rmse190

    is smallest during late summer and early fall. This result agrees with the perception191

    that MJO is more active and better organized during certain seasons [Ghil and Mo, 1991;192

    Wheeler and Hendon, 2004; Zhang and Van den Dool , 2012]. Our results here show that193

    the EMR and PNF prediction methods can capture well the mostly eastward propagating194

    MJO in boreal winter, as well as the additional, northward propagating component in195

    boreal summer.196

    4. Conclusions

    We presented results of a predictability study in which the EMR and PNF methodologies197

    were applied to MJO, with the ultimate goal to use and test these methods in real-time198

    forecasting. The bivariate forecast skill obtained by PNF, in both phase and amplitude,199

    is useful up to 30 days and comparable with the skill demonstrated by a dynamical200

    multi-model ensemble [Zhang , 2013]. The PNF improvement over EMR alone is most201

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 12 KONDRASHOV ET AL.: LOW-ORDER MJO

    pronounced for predictions started during weak MJO events, over Africa and the Indian202

    Ocean, and during boreal summer and early fall.203

    Acknowledgments. It is a pleasure to thank Adam Sobel and Daehyun Kim for their204

    valuable input, and an anonymous reviewer for thoughtful and stimulating comments.205

    This study was supported by grants NSF DMS-0934426 and NSF OCE-1243175 from the206

    National Science Foundation, ONR-MURI N00014-12-1-0911 from the the O�ce of Naval207

    Research, and DOE-DE-F0A-0000411 from the Department of Energy.208

    References

    Barnston, A. G., M. K. Tippett, M. L. Heureux, S. Li, and D. G. DeWitt (2012), Skill209

    of real-time seasonal ENSO model predictions during 2002–2011 — is our capability210

    improving?, Bull. Am. Meteorol. Soc., 93(5), 631–651.211

    Chekroun, M., D. Kondrashov, and M. Ghil (2011a), Predicting stochastic systems by212

    noise sampling, and application to the El Niño-Southern Oscillation, Proc. Natl. Acad.213

    Sci. USA, 108 (29), 11,766–11,771.214

    Chekroun, M. D., E. Simonnet, and M. Ghil (2011b), Stochastic climate dynamics: Ran-215

    dom attractors and time-dependent invariant measures, Physica D, 240 (21), 1685–1700216

    Feliks, Y., M. Ghil, and A. W. Robertson (2010), Oscillatory climate modes in the East-217

    ern Mediterranean and their synchronization with the North Atlantic Oscillation, J.218

    Climate, 23, 4060–4079.219

    Ghil, M., and K.-C. Mo (1991), Intraseasonal oscillations in the global atmosphere. Part220

    I: Northern Hemisphere and tropics, J. Atmos. Sci, 48, 752–779.221

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 13

    Ghil, M., and A. W. Robertson (2002), “Waves” vs. “particles” in the atmosphere’s phase222

    space: A pathway to long-range forecasting?, Proc. Natl. Acad. Sci. USA, 99 (Suppl.223

    1), 2493–2500.224

    Ghil, M., M. R. Allen, M. D. Dettinger, K. Ide, D. Kondrashov, M. E. Mann, A. W.225

    Robertson, A. Saunders, Y. Tian, F. Varadi, and P. Yiou (2002), Advanced spectral226

    methods for climatic time series, Rev. Geophys., 40, 3.1–3.41.227

    Gottschalck, J., M. Wheeler, K. Weickmann, F. Vitart, N. Savage, H. Lin, H. Hendon,228

    D. Waliser, K. Sperber, M. Nakagawa, C. Prestrelo, M. Flatau, and W. Higgins (2010),229

    A framework for assessing operational Madden-Julian oscillation forecasts: A CLIVAR230

    MJO Working Group project, Bull. Am. Meteor. Soc., 91 (9), 1247–1258,231

    Jiang, X., D. E. Waliser, M. C. Wheeler, C. Jones, M.-N. Lee, and S. D. Schuert (2008),232

    Assessing the skill of an all-season statistical forecast model for the Madden-Julian233

    oscillation, Mon. Wea. Rev., 136 (6), 1940–1956234

    Kang, I.-S., and H.-M. Kim (2010), Assessment of MJO predictability for boreal winter235

    with various statistical and dynamical models, J. Climate, 23, 2368–2378.236

    Kondrashov, D., S. Kravtsov, A. W. Robertson, and M. Ghil (2005), A hierarchy of237

    data-based ENSO models, J. Climate, 18, 4425–4444.238

    Kondrashov, D., M. D. Chekroun, and M. Ghil (2013), Data-driven reduction by multi-239

    layered stochastic models with energy-preserving nonlinearities, Physica D., submitted.240

    Kondrashov, D., S. Kravtsov, and M. Ghil (2011), Signatures of nonlinear dynamics in241

    an idealized atmospheric model, J. Atmos. Sci., 68, 3–12.242

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 14 KONDRASHOV ET AL.: LOW-ORDER MJO

    Kravtsov, S., D. Kondrashov, and M. Ghil (2005), Multilevel regression modeling of non-243

    linear processes: Derivation and applications to climatic variability, J. Climate, 18,244

    4404–4424.245

    Kravtsov, S., D. Kondrashov, and M. Ghil (2009), Empirical model reduction and the246

    modeling hierarchy in climate dynamics, in Stochastic Physics and Climate Modeling,247

    edited by T. N. Palmer and P. Williams, pp. 35–72, Cambridge Univ. Press.248

    Love, B. S., and A. J. Matthews (2009), Real-time localised forecasting of the Madden-249

    Julian Oscillation using neural network models, Quart. J. Roy. Meteor. Soc., 135 (643,250

    Part b), 1471–1483251

    Maharaj, E., and M. Wheeler (2005), Forecasting an index of the Madden-Julian oscilla-252

    tion, Int. J. Climatol., 25 (12), 1611–1618253

    Penland, C., and P. D. Sardeshmukh (1995), The optimal growth of tropical sea surface254

    temperature anomalies, J. Climate, 8, 1999–2024.255

    Seo, K.-H., W. Wang, J. Gottschalck, Q. Zhang, J.-K. E. Schemm, W. R. Higgins, and256

    A. Kumar (2009), Evaluation of MJO forecast skill from several statistical and dynam-257

    ical forecast models, J. Climate, 22 (9), 2372–2388258

    Vautard, R., and M. Ghil (1989), Singular spectrum analysis in nonlinear dynamics, with259

    applications to paleoclimatic time series, Physica D, 35, 395–424.260

    Vitart, F., and F. Molteni (2010), Simulation of the Madden–Julian Oscillation and its261

    teleconnections in the ECMWF forecast system, Quart. J. Roy. Meteor. Soc., 136, 842–262

    855.263

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 15

    Wheeler, M. C., and H. Hendon (2004), An all-season real-time multivariate MJO index:264

    Development of an index for monitoring and prediction, Mon. Wea. Rev., 132, 1917–265

    1932.266

    Wouters, J., and V. Lucarini (2013), Multi-level Dynamical Systems: Connecting the267

    Ruelle Response Theory and the Mori-Zwanzig Approach, J. Stat. Physics, 151 (5),268

    850–860.269

    Zhang, C., J., Gottschalck, E. Maloney, M. Moncrie↵, F. Vitart, D. Waliser, B. Wang, M.270

    Wheeler (2013), Cracking the MJO nut, Geophys. Res. Lett., 40(6), 1223–1230.271

    Zhang, Q., and H. van den Dool (2012), Relative merit of model improvement versus avail-272

    ability of retrospective forecasts: The case of Climate Forecast System MJO prediction,273

    Wea. Forecasting, 27, 1045–1051.274

    D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 16 KONDRASHOV ET AL.: LOW-ORDER MJO

    20 40 60 80 100−0.5

    0

    0.5

    1

    AC

    F

    (a) RMM1

    EMRDatastd.dev

    20 40 60 80 100−0.5

    0

    0.5

    1

    Days

    AC

    F

    (b) RMM2

    EMRDatastd.dev

    9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4−2

    0

    2

    Months

    2006 2007

    (d)25days lead, RMM1 prediction

    5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1−2

    0

    2

    Months

    2008 2009

    DataSSAEMRPNF

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isphe

    rean

    d Af

    rica

    IndianOceanIndianOcean

    Maritim

    eContinent

    RMM1

    RM

    M2

    START

    (c) RMM1,RMM2 phase space for 26−8−2007 to 25−9−2007

    Figure 1. Comparison between the simulation-and-prediction capabilities of the EMR model

    with the PNF-modified version thereof. (a,b) Autocorrelation functions (ACFs) of the (a) RMM1

    and (b) RMM2 daily indices: observations—red; ensemble mean of the EMR simulations—blue,

    standard deviation of the EMR ensemble—black. (c) Improved prediction of the observed RMM1-

    RMM2 indices (light black) by PNF-based driving-noise selection: PNF ensemble mean (blue)

    vs. EMR ensemble mean (red), while individual members of the PNF ensemble are shown in

    green; the predictions in this panel, out to 30 days, all start on August 26, 2007, namely the

    date that is marked by the black dashed line in panel (d). (d) PNF prediction in the time

    domain of RMM1 (light black) at a 25-day lead is consistent with MJO’s LFM, captured by SSA

    RCs 1-2 (heavy cyan); x-axis is time in calendar months within the validation interval of July

    2005–January 2009.

    D R A F T September 24, 2013, 1:23pm D R A F T

  • KONDRASHOV ET AL.: LOW-ORDER MJO X - 17

    0 10 20 300

    0.2

    0.4

    0.6

    0.8

    1(a) Bivariate Corr

    Lead (days)

    EMRPNFPersistence

    0 10 20 300.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    Lead (days)

    (b) Bivariate RMS

    EMRPNFPersistence

    0 10 20 300

    0.2

    0.4

    0.6

    0.8

    1

    Lead (days)

    (c) Corr, EMR

    AllStrongWeak

    0 10 20 300

    0.5

    1

    1.5

    Lead (days)

    (d) RMS, EMR

    AllStrongWeak

    0 10 20 300

    0.5

    1

    1.5

    Lead (days)

    (f) RMS, PNF

    AllStrongWeak

    0 10 20 300

    0.2

    0.4

    0.6

    0.8

    1

    Lead (days)

    (e) Corr, PNF

    AllStrongWeak

    Figure 2. Skill scores for prediction by EMR alone and by EMR driven by PNF snippets. (a)

    Bivariate correlation (corr) and (b) root-mean-square error (rmse) between the observed and

    forecast RMM indices showing increase in predictability of RMM1-2 beyond 15 days by PNF

    (blue) vs. EMR (red); the black curve shows damped persistence as a basis for comparison. (c–f)

    Prediction skill as a function of the MJO strength at the start of the forecast: PNF improvement

    is most pronounced for weak MJO, especially in correlation.D R A F T September 24, 2013, 1:23pm D R A F T

  • X - 18 KONDRASHOV ET AL.: LOW-ORDER MJO

    mon

    th

    (e) Corr, EMR

    5 10 15 20 25 30

    2

    4

    6

    8

    10

    12

    0.2

    0.4

    0.6

    0.8

    1

    mon

    th

    (f) Corr, PNF

    5 10 15 20 25 30

    2

    4

    6

    8

    10

    12

    0.2

    0.4

    0.6

    0.8

    1

    (g) RMS, EMR

    mon

    th

    5 10 15 20 25 30

    2

    4

    6

    8

    10

    12

    0.5

    1

    1.5(h) RMS, PNF

    mon

    th

    5 10 15 20 25 30

    2

    4

    6

    8

    10

    12

    0.5

    1

    1.5

    MJO

    pha

    se

    (a) Corr, EMR

    5 10 15 20 25 30

    2

    4

    6

    8

    0.2

    0.4

    0.6

    0.8

    1

    MJO

    pha

    se

    (b) Corr, PNF

    5 10 15 20 25 30

    2

    4

    6

    8

    0.2

    0.4

    0.6

    0.8

    1

    (c) RMS, EMR

    MJO

    pha

    se

    5 10 15 20 25 30

    2

    4

    6

    8

    0.5

    1

    1.5(d) RMS, PNF

    MJO

    pha

    se

    5 10 15 20 25 30

    2

    4

    6

    8

    0.5

    1

    1.5

    Figure 3. Prediction skill conditioned (a–d) on the phase of the MJO; and (e–h) on the

    calendar month at the start of the forecast. Left column (a, c, e, g) EMR only; right column (b,

    d, f, h) EMR driven by PNF snippets.

    D R A F T September 24, 2013, 1:23pm D R A F T

  • Auxiliary materials for Low-order stochastic model and “past-noiseforecasting” of the Madden-Julian oscillation

    by D. Kondrashov, M.D. Chekroun, A.W. Robertsonand M. Ghil

    September 24, 2013

    The purpose of these Auxiliary Materials is to provide details on the discrete-timeformulation of our empirical model reduction (EMR) model for the Madden-Julian oscil-lation (MJO) and to help explain the success of past-noise forecasting (PNF) in improvingthe EMR predictions.

    Discrete EMR model formulation

    To derive a discrete-time formulation of any EMR model, as stated in Eq. (1) of theMain Text, we start from the main level and form time series of instantaneous tendencies�xk/�t = (xk+1 � xk)/�t from a given discrete time series xtk = xk. Then, given aparametric form of the main level in Eq. (1) of the Main Text — in which a quadraticnonlinearity is typically assumed — the coe�cients F, A, B are found by a least-squaresestimation procedure.

    The regression residual of the main level defines a time series of r(0)k , and the least-

    squares estimation procedure is then repeated at the next levelm = 0 to estimate �r(0)k /�t;

    this yields L(1) and a new time series of regression residuals r(1)k . We continue adding levels

    until the estimated regression residual at the last level ⇠t ⌘ r(M)k has an autocorrelationthat vanishes at unit lag, and its spatial covariance matrix has converged to a constantmatrix ⌃ =< ⇠Tt ⇠t >. Unless it is necessary for the application at hand to do otherwise,it is convenient to use a unit time step of �t = 1. For instance this is what we used forthe MJO data set of this study, given here by the pair of daily indices of the Real-timeMultivariate MJO Index (RMM1, RMM2).

    To ensure that the deterministic part of the dynamics is stable, we require thatBijkxixjxk ⌘ 0 and Aijxixj > 0 should hold for any x 6= 0; [Kondrashov et al.(2013)]. Thecondition for the linear-part coe�cients A is a combination of linear equality and inequal-ity constraints. In particular, skew-symmetry for the o↵-diagonal terms Aij + Aji = 0reflects the wave propagation nature of MJO in the phase space of the daily (RMM1,RMM2) indices, while the diagonal terms Aii > 0 account for dissipation.

    1

  • The condition for the quadratic-term coe�cients B is satisfied by introducing an ap-propriate number of linear constraints on the Bijk. In the MJO case, with d = 2, we haveBiii ⌘ 0, and skew-symmetry constraints for two pairs of coe�cients Bjjk +Bkjj = 0, andBjkk +Bkjk = 0, j 6= k. The least-square minimization problem with linear constraints issolved using a preconditioned conjugate gradient method; see [Gould et al., 2001].

    By applying this EMR energy-conserving formalism to daily time series of the RMMindices (x1,n, x2,n) for July 1974–June 2005, and including seasonal cycle dependence inF and A, we obtain the following EMR model:

    "x1,n+1 � x1,nx2,n+1 � x2,n

    #

    = F�A"x1,nx2,n

    #

    +B

    2

    64x21,n

    x1,nx2,nx22,n

    3

    75+ S

    2

    666666664

    sin(!n)x1,n sin(!n)x2,n sin(!n)cos(!n)

    x1,n cos(!n)x2,n cos(!n)

    3

    777777775

    +

    "r(0)1,nr(0)2,n

    #

    ,

    "r(0)1,n+1 � r

    (0)1,n

    r(0)2,n+1 � r(0)2,n

    #

    = L(1)

    2

    66664

    x1,nx2,nr(0)1,nr(0)2,n

    3

    77775+

    "r(1)1,nr(1)2,n

    #

    , (1)

    "r(1)1,n+1 � r

    (1)1,n

    r(1)2,n+1 � r(1)2,n

    #

    = L(2)

    2

    6666666664

    x1,nx2,nr(0)1,nr(0)2,nr(1)1,nr(1)2,n

    3

    7777777775

    +Q

    "⌘1,n⌘2,n

    #

    .

    Here

    F =

    "�0.00003�0.00728

    #

    , (2)

    A =

    "0.0268 0.1150�0.1150 0.0278

    #

    , (3)

    B =

    "0 �0.0064 �0.0002

    0.0064 0.0002 0

    #

    , (4)

    L(1) =

    "�0.0219 �0.0025 �0.4893 0.0987�0.0002 �0.0176 �0.0736 �0.5323

    #

    , (5)

    L(2) =

    "0.0022 0.0020 �0.0474 �0.0576 �0.9560 0.0306�0.0017 0.0017 0.0463 �0.0423 �0.0292 �0.9687

    #

    . (6)

    The matrix S in the main level of Eq. (1) here represents the linear interactions onthe main level between the pair of RMM indices and the seasonal cycle with frequency

    2

  • ! = 2⇡/365, and n accounting for a time index in days. This matrix is given by:

    S =

    "�0.0028 0.0228 0.0101 �0.0015 0.0011 �0.00690.0033 �0.0242 �0.0250 0.0063 0.0013 �0.0048

    #

    (7)

    The matrix Q in the equation governing the evolution of r(1)t accounts for the Choleskydecomposition QTQ of the spatial correlation matrix associated with the last-level regres-sion residual ⇠t ⌘ r(2)t :

    Q =

    "1.0 0.0

    0.0412 0.9992

    #

    . (8)

    This matrix allows us to scale properly the magnitude of the stochastic forcing by twoindependent normally distributed random variables ⌘1 and ⌘2, with ⌘1 ⇠ N (0, 0.1704)and ⌘2 ⇠ N (0, 0.1688).

    Note that the multi-level EMR model aims to reproduce the full power spectrum ofthe RMM indices, interpreted here as the resolved variables x in Eq. (1) of the MainText, i.e. as the leading MJO modes. Alternatively, we can think of them as selectedobservables for these modes. The EMR model construction proceeds by estimating thedynamics of the residual r-variables, at the additional levels; the latter are aimed to modelthe dynamics of the unresolved or unobserved modes. The e↵ective contribution of ther-variables to the dynamics of the RMM indices on the main level may be interpreted asred-noise type forcing with slow-decay time scales, and memory e↵ects related to the pasthistory of the RMM indices also present; [Kondrashov et al.(2013)].

    Practical insights into PNF prediction

    Proper initialization for prediction. When performing prediction by using time-discrete formulation of Eq. (1), unobserved r-variables of additional levels have to beproperly initialized in cross-validation interval where only observed RMM indices of mainlevel are available. It is easily achieved by explicitly using RHS of Eq. (1), i.e. timeseries of r-variables are estimated in a backward procedure by going sequentially from themain level to the last, in a same way as it has been described in the previous section forestimating regression residuals when fitting the model in training interval.

    Moreover, forward integration for prediction purposes has to be initialized a certainnumber of time steps back into the past, equal to number of additional levels in Eq. (1).

    The last level variable r(1)t can be forced either by spatiality correlated random variables ⌘t(standard EMR prediction) or by chosen PNF noise snippet from the regression residual⇠t of the last level in the training interval. Equation (1) is then integrated forward fromthe last level to the main one over an interval of length equals to the lead-time, the lattercorresponding also to the length of the snippet.

    Such “backward-forward” procedure ensures initial conditions for the forecast that areconsistent with observations in validation interval, and the random forcing introduced atthe last level is then properly accounted at the main level beyond start of the forecast.

    3

  • Snippet selection via Hilbert transform. To select PNF noise snippets associatedwith a particular t⇤ (the forecast start time in validation interval), we first perform aHilbert transform H(st) = st + iht, where st denotes the time series of MJO LFM asobtained by applying SSA to either RMM1 or RMM2 index. The instantaneous phase �tis uniquely defined as �t = arctan(ht/st) and can be thus determined for all data pointsup to the forecast start time t⇤. The noise snippets — whose length equals the forecastlead — are then selected from the history of ⇠t (as determined by EMR during the traininginterval) for LFM phase values �t that fall within a small ✏�-neighborhood of �⇤t , the valueat the forecast start time t⇤.

    We used ✏� ⇡ 0.1⇡, resulting in a typical subset of ⇡ 500 ensemble members — out ofa maximum possible number of ⇡ 11000, the total number of days in the training interval.In addition, we refined further this subset of noise snippets by choosing only those thatcorrespond to initial states in the raw RMM data closest to those at forecast start t⇤.This final ensemble of noise snippets reduces then to a fixed size of 20 members per t⇤.

    Behavior of individual members from PNF ensemble. The prediction by av-eraging over the PNF ensemble aims to follow the smooth behavior of the MJO’s LFM,as captured by SSA. Therefore, its phase speed is not expected to exhibit abrupt localchanges.

    This behavior, which is clearly due to averaging, is not shared by individual ensemblemembers. To illustrate this point, we included below separate plots for 20 individualmembers of the PNF ensemble used for the MJO case presented in Fig. 1c of the MainText. These 20 members are ranked by increasing bivariate root-mean-square error withrespect to the true evolution over 30 days. The 20 plots illustrate very rich dynamics forindividual trajectories, including both slow-down and acceleration in RMM phase space.This rich dynamics can be attributed to the dynamical interaction between the multi-levelvariables of the low-order model and the PNF selection of the driving noise at the lastlevel. Note that the leading PNF ensemble members (ranked 1 through 3 below) comevery close to actual MJO evolution over 30 days.

    References

    [Kondrashov et al.(2013)] Kondrashov, D., M. D. Chekroun, and M. Ghil (2013), Data-driven reduction by multilayered stochastic models with energy-preserving nonlineari-ties, Physica D., submitted.

    [Gould et al., 2001] Gould, N., I., M., M. E. Hribar and J. Nocedal (2001), On the solutionof equality constrained quadratic programming problems arising in optimization, SIAMJ. Sci. Comput., 23, 1376–1395.

    4

  • −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinentRMM1

    RM

    M2

    START

    1

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    2

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    3

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    4

    Figure 1: Individual members (1–4) of the PNF ensemble, observed RMM1-RMM2 indices(light black), PNF ensemble mean (blue); EMR ensemble mean (red)

    5

  • −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinentRMM1

    RM

    M2

    START

    5

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    6

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    7

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    8

    Figure 2: Individual members (5–8) of the PNF ensemble, observed RMM1-RMM2 indices(light black), PNF ensemble mean (blue); EMR ensemble mean (red)

    6

  • −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinentRMM1

    RM

    M2

    START

    9

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    10

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    11

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    12

    Figure 3: Individual members (9–12) of the PNF ensemble, observed RMM1-RMM2indices (light black), PNF ensemble mean (blue); EMR ensemble mean (red)

    7

  • −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinentRMM1

    RM

    M2

    START

    13

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    14

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    15

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    16

    Figure 4: Individual members (13–16) of the PNF ensemble, observed RMM1-RMM2indices (light black), PNF ensemble mean (blue); EMR ensemble mean (red)

    8

  • −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinentRMM1

    RM

    M2

    START

    17

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    18

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    19

    −4 −2 0 2 4−4

    −2

    0

    2

    4

    1

    2 3

    4

    5

    67

    8

    WesternPacific

    Wes

    tern

    Hem

    isph

    ere

    and

    Afric

    a

    IndianOceanIndianOcean

    Maritim

    eC

    ontinent

    RMM1

    RM

    M2

    START

    20

    Figure 5: Individual members (17–20) of the PNF ensemble, observed RMM1-RMM2indices (light black), PNF ensemble mean (blue); EMR ensemble mean (red)

    9