General-to-Specific Time Series...

75
General-to-Specific Time Series Modelling Felix Pretis University of Oxford Michaelmas 2017 Lecture 2: Indicator Saturation Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 1 / 65

Transcript of General-to-Specific Time Series...

Page 1: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

General-to-SpecificTime Series Modelling

Felix Pretis

University of Oxford

Michaelmas 2017

Lecture 2: Indicator Saturation

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 1 / 65

Page 2: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Indicator Saturation References

Core References for Lecture 2:

Hendry, Johansen, and Santos (2008)* – Impulse IndicatorSaturation (IIS)

Castle, Doornik, Hendry, and Pretis (2015)* – Step IndicatorSaturation (SIS)

Johansen and Nielsen (2009) – IIS Asymptotic Theory

Johansen and Nielsen (2016) – IIS Asymptotic Theory

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 2 / 65

Page 3: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

US Food Demand

US Food Demand (Expenditure) – Hendry and Mizon (2011):ef: food expenditure

1920 1940 1960 1980 2000

-12.50

-12.25

-12.00

-11.75ef: food expenditure pfp: price of food

1920 1940 1960 1980 2000

-0.25

0.00

pfp: price of food

e: total expenditure

1920 1940 1960 1980 2000

-11

-10e: total expenditure s: savings ratio

1920 1940 1960 1980 2000

0.0

0.1

0.2

0.3s: savings ratio

a: average family size

1920 1940 1960 1980 2000

1.3

1.5 a: average family size ∆ ef

1920 1940 1960 1980 2000-0.1

0.0

0.1 ∆ ef

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 3 / 65

Page 4: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

US Food Demand

US Food Demand (Expenditure) – Hendry and Mizon (2011):

The data variables are (lower case denoting logs):

. ef is constant price, per capita, expenditure on food

. e is constant price, per capita, total expenditure

. p is deflator of total expenditure

. y is constant price, per capita, income

. pf − p is real price of food

. s = (y− e) is an approximation to the savings ratio

. a is average family size–demographic effects

. n is total population of the USA–should be irrelevant as per capita data.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 4 / 65

Page 5: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Food Demand changes

There are considerable changes over the period:

ef and e fall sharply at the beginning of the Great Depression,rise substantially till WWII, fall after, then resume a gentle rise,

so ∆ef is much more volatile pre WWII: ∆e has a similar but lesspronounced pattern).

pf − p is quite volatile till after WWII, then is relatively stable,

s rises from ‘forced saving’ in WWII.

a has fallen considerably, partly reflecting changes in socialmores.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 5 / 65

Page 6: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Modelling Food Demand

Tobin (1950) modelled US food demand:used time series 1912-48.We use extended time-series data, updated by Reade (2008).

The basic theory is:

ef = f (e,pf − p, s,a) (1)

Conventional theory expects:

∂ef∂e

> 0,∂ef

∂ (pf − p)< 0,

∂ef∂a

< 0,∂ef∂n

= 0 (2)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 6 / 65

Page 7: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Static model

The static theory model estimates are:

ef,t = 5.30(4.02)

+ 0.77(0.14)

et + 0.11(0.08)

(pf − p)t + 0.72(0.14)

st − 0.36(0.23)

at − 0.73(0.22)

nt

R2 = 0.94 χ2nd(2) = 19.5∗∗ Farch(1, 72) = 216.8∗∗ Far(2, 66) = 44.3∗∗

σ = 0.055 Freset(2, 66) = 18.1∗∗ Fhet(10, 63) = 23.2∗∗

The static economic-theory model has a very poor fit, and doesnot adequately capture behaviour of observed data.

The price elasticity (pf − p)t has the ‘wrong sign’, contradicting(2), but is insignificant.

Although it is theoretically irrelevant, population nt is significant.

Finally, every mis-specification test strongly rejects.Next Figure shows the estimated model fails to describe the1930s.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 7 / 65

Page 8: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Static Model

ef ef

1940 1960 1980 2000

-12.50

-12.25

-12.00ef ef

scaled residuals

1940 1960 1980 2000

-4

-2

0

2

scaled residuals

residual density N(0,1)

-4 -2 0 2

0.1

0.2

0.3

0.4

0.5

residual density N(0,1)

residual autocorrelations

0 5 10

-0.5

0.0

0.5

1.0residual autocorrelations

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 8 / 65

Page 9: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

‘Discovering Indicator Saturation’

Magnus and Morgan (1999) (‘competition’):Most contributors found dynamic models were non-constant overfull sample 1931–1989, so modelled post 1950 only.

Hendry (1999):i) Added Indicators for inter-war & post-war

Food program, Great Depressionii) Reversed procedure: indicators for all observations from 1950sonwards

1970s

Retained significant and re-select.

Hendry (1999) found a constant equation over 1931–1989 byadding impulse indicators pre-1950 for large outliers, identified asbeing due to a food program and post-war de-rationing.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 9 / 65

Page 10: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

‘Discovering Indicator Saturation’

Magnus and Morgan (1999) (‘competition’):Most contributors found dynamic models were non-constant overfull sample 1931–1989, so modelled post 1950 only.

Hendry (1999):i) Added Indicators for inter-war & post-war

Food program, Great Depressionii) Reversed procedure: indicators for all observations from 1950sonwards

1970s

Retained significant and re-select.

Hendry (1999) found a constant equation over 1931–1989 byadding impulse indicators pre-1950 for large outliers, identified asbeing due to a food program and post-war de-rationing.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 9 / 65

Page 11: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

‘Discovering Indicator Saturation’

Magnus and Morgan (1999) (‘competition’):Most contributors found dynamic models were non-constant overfull sample 1931–1989, so modelled post 1950 only.

Hendry (1999):i) Added Indicators for inter-war & post-war

Food program, Great Depressionii) Reversed procedure: indicators for all observations from 1950sonwards

1970s

Retained significant and re-select.

Hendry (1999) found a constant equation over 1931–1989 byadding impulse indicators pre-1950 for large outliers, identified asbeing due to a food program and post-war de-rationing.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 9 / 65

Page 12: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Food Expenditure with Impulses

∆ef = 0.13(0.035)

∆ef,t−1 − 0.11(0.012)

I31 − 0.11(0.012)

I32

+ 0.028(0.0096)

I34 − 0.027(0.0096)

I43 + 0.031(0.0085)

I70

+ 0.59(0.04)

∆et − 0.32(0.031)

∆(pf − p)t − 0.19(0.1)

∆nt

+ 0.23(0.035)

∆st − 0.36(0.023)

ECMt−1

Far(2, 59) = 0.68 χ2nd(2) = 1.78 Farch(1, 70) = 0.27

Freset(2, 59) = 0.23 Fhet(12, 54) = 1.01

Solved cointegrating relation with dummies excluded:

ECM = ef − 0.63(0.01)

e+ 0.13(0.04)

(pf − p) − 1.12(0.08)

s+ 0.45(0.01)

n

∆ef Fitted

1930 1940 1950 1960 1970 1980 1990 2000

-0.1

0.0

0.1

∆ef Fitted

r: ∆ef (scaled)

1930 1940 1950 1960 1970 1980 1990 2000

-1

1

r: ∆ef (scaled)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 10 / 65

Page 13: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

‘Discovering Indicator Saturation’

Adding indicators for every observation?David Hendry trying to convince Søren Johansen at Engle and

Granger Nobel award ceremony 2003 Stockholm:

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 11 / 65

Page 14: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Motivation

Unknown unknowns: unknown number of location shifts/outliers ofunknown magnitudes at unknown times.

Testing model mis-specificationLearning from dataTesting super exogeneity

Unmodelled location shifts have pernicious effects:

in sample, mis-specified empirical models, distorting inference;out of sample, assess forecast failure.

Forecasts CHF/NOK - Exchange Rate

0.12

0.13

0.14

Spot the Step-Shift?

December 2014 - January 2015 15.1.2015

Swiss-Franc to Norwegian Krone Exchange Rate: Forecasts from 14.1.2015 onwards

CH

F/N

OK

Forecasts CHF/NOK - Exchange Rate

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 12 / 65

Page 15: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Detecting multiple breaks

Numbers and magnitudes of breaks in models usually unknown:obviously unknown for unknowingly omitted variables.General approach required to detect location shifts anywhere insample while also selecting over many candidate variables.

Theory-embedding in general model allowing for outlier/locationshift at any point in time.

Impulse-Indicator Saturation (IIS) creates complete set of indicatorvariables:

1j=t

= 1 when j = t and 0 otherwise for j = 1, . . . , T :

1 0 0 00 1 0 00 0 1 0

0 0 0. . .

(3)

add T impulse indicators to set of candidate variables when T obs.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 13 / 65

Page 16: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Detecting multiple breaks

Numbers and magnitudes of breaks in models usually unknown:obviously unknown for unknowingly omitted variables.General approach required to detect location shifts anywhere insample while also selecting over many candidate variables.

Theory-embedding in general model allowing for outlier/locationshift at any point in time.

Impulse-Indicator Saturation (IIS) creates complete set of indicatorvariables:

1j=t

= 1 when j = t and 0 otherwise for j = 1, . . . , T :

1 0 0 00 1 0 00 0 1 0

0 0 0. . .

(3)

add T impulse indicators to set of candidate variables when T obs.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 13 / 65

Page 17: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Impulse-indicator saturation (IIS)

Hendry, Johansen, and Santos (2008): impulse-indicator saturation(IIS): adding an indicator dummy variable for each observation tocandidate set of variables for the model.

yt = α0 + α1I1 + α2I2 + ...+ αT IT + ut. (4)

This has T + 1 parameters for T observations.However, the impulses can be added in blocks (say T = 100):

1 Partition in 2 blocks, B1 = I1, ..., I50, B2 = I51, ..., I100,C.f. estimating models over two subsamples of T/2.

2 Run model selection on each block, form union S,3 Run model selection on S.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 14 / 65

Page 18: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Impulse-indicator saturation

Consider yt ∼ IID[µ,σ2ε

]for t = 1, . . . , T

First, include half of indicators, record significant:just ‘dummying out’ T/2 observations for estimating µThen omit, include other half, record again.Combine sub-sample indicators, & select significant.

αT indicators selected on average at significance level αFeasible ‘split-sample’ (IIS) algorithm: see Hendry, Johansen, andSantos (2008)Many well-known procedures are variants of IIS.

Chow (1960) test is sub-sample IIS over T − k+ 1 to T withoutselection.

Salkever (1976) tests parameter constancy by indicators.

Recursive estimation equivalent to IIS over future sample,reducing indicators one at a time.Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 15 / 65

Page 19: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating IIS

Next Figure illustrates ‘split-half’ approach for yt ∼ IN[µ,σ2y

]Three rows correspond to the three stages:

first half of the indicators, second half, then selected indicatorscombined.

Three columns report:

indicators entered,

indicators retained,

and fitted and actual values of selected model.

Many indicators added, but only one is retained in row 1.When second half entered (row 2), none is retained.Combined retained dummies entered (here just one), andselection again retains it.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 16 / 65

Page 20: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Null ‘split-sample’ search in IIS

0 50 100

0.5

1.0Indicators included initially

Blo

ck 1

Selected model: actual and fitted

0 50 100

10.0

12.5 actualfitted

0 50 100

0.5

1.0Indicators retained

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 17 / 65

Page 21: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Null ‘split-sample’ search in IIS

0 50 100

0.5

1.0Indicators included initially

Blo

ck 1

Selected model: actual and fitted

0 50 100

10.0

12.5 actualfitted

0 50 100

0.5

1.0Indicators retained

0 50 100

0.5

1.0

Blo

ck 2

0 50 100

10.0

12.5

0 50 100

0.5

1.0

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 18 / 65

Page 22: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Null ‘split-sample’ search in IIS

0 50 100

0.5

1.0Dummies included initially

Blo

ck 1

Selected model: actual and fitted

0 50 100

10.0

12.5 actualfitted

0 50 100

0.5

1.0Dummies retained

0 50 100

0.5

1.0

Blo

ck 2

0 50 100

10.0

12.5

0 50 100

0.5

1.0

0 50 100

0.5

1.0

Fina

l

0 50 100

10.0

12.5

0 50 100

0.5

1.0

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 19 / 65

Page 23: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS Theory

Consider adding first half of the indicators:

yt = µ1 +

T/2∑j=1

δjdj,t + εt. (5)

The estimators are:

µ1 =1

T/2

T∑t=T/2+1

yt, (6)

s21 =1

T/2− 1

T∑t=T/2+1

(yt − µ1)2 (7)

δt = yt − µ1, t = 1, . . . , T/2 (8)

so that residuals are:εt = 0, t = 1, . . . , T/2

εt = yt − µ1, i = T/2+ 1, . . . , T

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 20 / 65

Page 24: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS Theory

Consider adding first half of the indicators:

yt = µ1 +

T/2∑j=1

δjdj,t + εt. (5)

The estimators are:

µ1 =1

T/2

T∑t=T/2+1

yt, (6)

s21 =1

T/2− 1

T∑t=T/2+1

(yt − µ1)2 (7)

δt = yt − µ1, t = 1, . . . , T/2 (8)

so that residuals are:εt = 0, t = 1, . . . , T/2

εt = yt − µ1, i = T/2+ 1, . . . , T

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 20 / 65

Page 25: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS Theory

Consider adding first half of the indicators:

yt = µ1 +

T/2∑j=1

δjdj,t + εt. (5)

The estimators are:

µ1 =1

T/2

T∑t=T/2+1

yt, (6)

s21 =1

T/2− 1

T∑t=T/2+1

(yt − µ1)2 (7)

δt = yt − µ1, t = 1, . . . , T/2 (8)

so that residuals are:εt = 0, t = 1, . . . , T/2

εt = yt − µ1, i = T/2+ 1, . . . , T

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 20 / 65

Page 26: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Split-Half Estimators

The final estimates are:

µ =

∑T1t=1 yt1|t1,δt |<cα

+∑Tt=T1+1 yt1|t2,δt |<cα∑T1

t=1 1|t1,δt |<cα+∑Tt=T1+1 1|t2,δt |<cα

(9)

and

σ2ε =

∑T1t=1(yt − µ1)

21|t1,δt

|<cα +∑Tt=T1+1(yt − µ2)

21|t2,δt

|<cα∑T1t=1 1|t1,δt |<cα

+∑Tt=T1+1 1|t2,δt |<cα

− 1.

(10)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 21 / 65

Page 27: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Properties of IIS

Let yt = µ+ σεεt, t = 1, . . . , T be i.i.d., where εt has symmetriccontinuous density f(.) with mean zero, variance one. Let T = T1 + T2and assume that T1/T → λ1 and T2/T → λ2 where 0 < λ1, λ2 < 1,with λ1 + λ2 = 1 then:

T 1/2 (µ− µ)D→ N

[0,σ2εσ

](7)

where

σ2µ =

(∫cα−cα

f(ε)dε

)−2 [∫cα−cα

ε2f(ε)dε(1+ 4cαf(cα)) +

(λ21λ2

+λ22λ1

)(2cαf(cα))

2

]where:

∫cα−cα

f(ε)dε = 1− α

measures the impact of truncating the residuals.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 22 / 65

Page 28: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Split-Half Estimator

Using∫cα−cα

f(ε)dε = 1− α, and for the normal distribution,f(ε) = φ(ε), using integration by parts we find:∫cα

−cα

ε2φ(ε)dε =

∫cα−cα

φ(ε)dε− 2cαφ(cα),

so that for λ1 = λ2 = 0.5 (split-half) above simplifies to:

σ2µ =1

(1− α)

(1+ 4cαφ(cα) −

2cαφ(cα)

(1− α)[1+ 2cαφ(cα)]

)where σ2µ → 1 as |cα|→∞.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 23 / 65

Page 29: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Variance Estimator

σ2ε =

∑T1i=1(yt − µ1)

21|t1,δt

|<cα +∑Tt=T1+1(yt − µ2)

21|t2,δt

|<cα∑T1t=1 1|t1,δt |<cα

+∑Tt=T1+1 1|t2,δt |<cα

− 1.

(11)

The estimator σ2ε, has the limit

σ2εP→ σ2εκ = V(ε||ε| < cα).

For the normal distribution, f(ε) = φ(ε),we have the expression:

κ = 1−2cαφ(cα)

1− α.

where κ→ 1 as |c|→∞Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 24 / 65

Page 30: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS under the null

Effects of Impulse Indicator Saturation under null:

Selection effects on mean and variance estimators similar to‘trimming’:

Small loss in efficiency (consistency effect on variance estimateσε, efficiency effect through σ2µ)Controllable by choosing more conservative α

IIS interpretable as robust estimator, but allows for joint selection overvariables.

Johansen and Nielsen (2016) ‘gauge’ is consistent:

g =1

T

T∑t=1

1(|yt−xtβ|>σεcα) (12)

E[g]→ P(|ε1| > σcα) = α (13)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 25 / 65

Page 31: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Dynamic generalizations of IIS

Johansen & Nielsen (2009) & (2016) extend IIS to both stationaryand unit-root autoregressions:

When distribution is symmetric, adding T impulse-indicators to aregression with n variables, coefficient β (not selected) and secondmoment Σ:

T 1/2(β − β)D→ Nn

[0,σ2εΣ

−1Ωβ]

Efficiency of IIS estimator β with respect to OLS β measured by Ωβdepends on cα and distribution

Must lose efficiency under null; small loss αT : 1 observation atα = 1/T if T = 100, despite T extra candidates.

Potential for major gain under alternatives of breaks and/or datacontamination: variant of robust estimationbut can be done jointly with all other selections

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 26 / 65

Page 32: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Dynamic generalizations of IIS

Johansen & Nielsen (2009) & (2016) extend IIS to both stationaryand unit-root autoregressions:

When distribution is symmetric, adding T impulse-indicators to aregression with n variables, coefficient β (not selected) and secondmoment Σ:

T 1/2(β − β)D→ Nn

[0,σ2εΣ

−1Ωβ]

Efficiency of IIS estimator β with respect to OLS β measured by Ωβdepends on cα and distribution

Must lose efficiency under null; small loss αT : 1 observation atα = 1/T if T = 100, despite T extra candidates.

Potential for major gain under alternatives of breaks and/or datacontamination: variant of robust estimationbut can be done jointly with all other selections

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 26 / 65

Page 33: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

A single outlier

Single Outlier at t = T1:

DGP: yt = λ11t=T1 + εt

matched by model: yt = γdt=T1

(γ− λ1) = (d ′t=T1dt=T1)−1d ′t=T1εt

= εT1

Unbiased but not consistent (Hendry and Santos (2005)). Withvariance:

V[γ] = V[εT1 ] = σ2ε

t-statistic:

tγ =γ

σε≈ (γ− λ1)

σε+λ1

σε∼ N(ψλ1 , 1)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 27 / 65

Page 34: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS for a location shift

Illustrate IIS for a location shift of λ over last k observations:

yt = µ+ λ1t>T−k+1 + εt (14)

where εt ∼ IN[0,σ2ε

]and λ 6= 0.

Optimal test is t-test for a break in (14) at T − k+ 1 onwards,requires:

knowledge of location-shift timing

knowing that it is the only break

is same magnitude break thereafter

The next slide records IIS for λ = 10σε in (14) at 0.75T = 75.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 28 / 65

Page 35: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Structural break example

yt Fitted

0 20 40 60 80 100

0

5

10

yt Fitted Scaled residuals

0 20 40 60 80 100-2

0

2 Scaled residuals

Size of the break is 10 standard errors at 0.75T

There are no outliers in this mis-specified modelas all residuals ∈ [−2, 2] SDs:

outliers 6= structural breaks

step-wise regression has zero power

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 29 / 65

Page 36: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

‘Split-sample’ search in IIS

0 50 100

Dummies included initially

actual fitted

0 50 100

0

5

10

15Selected model: actual and fitted

Blo

ck 1

actual fitted

0 50 100

Dummies retained

0 50 100 0 50 100

0

5

10

15

Blo

ck 2

0 50 100

0 50 100 0 50 100

0

5

10

15

Fina

l

0 50 100

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 30 / 65

Page 37: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

How IIS works for a location shift

Initially, many indicators now retained (top row),considerable discrepancy between the first-half and second-halfmeans.

When second set entered, all indicators for location shift periodare retained.

Once combined set entered, despite large number of dummies,selection reverts to just those for break period.

Under null, indicators significant in sub-sample would remain sooverall, for alternatives, sub-sample significance can be transient, dueto unmodeled features that occur elsewhere in data.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 31 / 65

Page 38: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS and SIS

Extension of IIS to step-indicator saturation (SIS):Regression model saturated with complete set of step indicators

S1 =1t6j, j = 1, . . . , T

where 1t6j = 1 for observations up to j, and zero otherwise.

Step indicators cumulate impulse indicators up to each nextobservation:

IIS: Impulses SIS: Step shifts1 0 0 00 1 0 00 0 1 0

0 0 0. . .

1 1 1 10 1 1 10 0 1 10 0 0 1

Step indicators take the form:ι′1 = (1, 0, 0, . . . , 0), ι′2 = (1, 1, 0, . . . , 0), . . . , ι′T = (1, 1, 1, . . . , 1),

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 32 / 65

Page 39: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

IIS and SIS

Extension of IIS to step-indicator saturation (SIS):Regression model saturated with complete set of step indicators

S1 =1t6j, j = 1, . . . , T

where 1t6j = 1 for observations up to j, and zero otherwise.

S:25 S:50

0 10 20 30 40 50 60 70 80 90 1000.0

0.5

1.0

Step Indicator for t=25

Step Indicatorfor t=50

S:25 S:50

0 10 20 30 40 50 60 70 80 90 100

0

1Linear Combination: S:50-S:25

Shift from t=25 to t=50

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 33 / 65

Page 40: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Differences between IIS and SIS

IIS and SIS - important differences necessitate a new analysis:

Impulse Indicators: mutually orthogonal. Step indicatorsoverlap increasingly as their second index increases.

Two indicators are required to characterize an outlier or shift notat the end of the sample: 1t6T2 − 1t<T1.

Opens the door to “designed break functions” (Volcanoes! SeePretis, Schneider, Smerdon, and Hendry (2016))

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 34 / 65

Page 41: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Step-indicator saturation

The Step-Indicator Saturation Model (Infeasible as N >> T ):

yt = β0 + β′1zt +

T−1∑j=1

δj1t6j + εt where εt ∼ IN[0,σ2ε

](15)

Split-Half Approach:Add the first T/2 indicators from the saturating set S1:

yt = β0 + β′1zt +

T/2∑j=1

δj1t6j + εt (16)

can be estimated directly, indicators retained when estimated

coefficients δj satisfy∣∣∣tδj∣∣∣ > cα where cα is the critical value for

significance level α.Locations are recorded, all those indicators are dropped, secondset is then investigated.Combine selected indicators and re-select.Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 35 / 65

Page 42: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Null rejection frequency of SIS

Under the null with α = 1/T , at both sub-steps on average, αT/2(namely 1/2 an indicator) will be retained by chance.

On average αT = 1 indicator will be retained from the combinedstage: gauge should equal nominal size.

One degree of freedom is lost on average.

When m indicators are selected in a congruent representation atsignificance level α:

yt = β0 + β′1zt +

m∑i=1

φi,α1t6Ti + vt (17)

where vt ∼ IN[0,σ2v

], and coefficients of significant indicators are

denoted φi,α.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 36 / 65

Page 43: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating SIS when no location shifts

‘Split-sample’ search by SIS at 1%.

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

Indicators included initially Selected model: actual and fitted

0 50 100

0.5

1.0Indicators retainedIndicators retained

actual fitted

0 50 100

10.0

12.5actual fitted

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 37 / 65

Page 44: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating SIS when no location shifts

‘Split-sample’ search by SIS at 1%.

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

0 50 100

0.5

1.0Indicators retained

actual fitted

0 50 100

10.0

12.5actual fitted

0 50 100

0.5

1.0

Bloc

k 2

0.0 0.5 1.0

0.5

1.0

0 50 100

10.0

12.5

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 37 / 65

Page 45: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating SIS when no location shifts

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

Indicators included initially Selected model: actual and fitted

0 50 100

0.5

1.0

Indicators retained

actual fitted

0 50 100

10.0

12.5actual fitted

0 50 100

0.5

1.0

Bloc

k 2

0.0 0.5 1.0

0.5

1.0

0 50 100

10.0

12.5

0 50 100

0.5

1.0

Fina

l

Indicators retained

0 50 100

0.5

1.0

Indicators retained

0 50 100

10.0

12.5

T = 100, and no shifts, retains 2 significant steps, so lose 2degrees of freedom–but could be combined to one dummy.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 37 / 65

Page 46: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Simulating SIS when no location shifts

Under the null of no shift:Model with n = 0: using the split-half approach where β0 = 0,εt ∼ IN[0,σ2ε] and σ2ε = 1, for a sample size T = 100 and variousvalues of α.

Retention frequency of irrelevant indicators: close to α, on averageαT irrelevant step indicators retained under the null:

Overall Gauge First Half Gauge Second Half Gauge 45 Deg. × p_alpha

0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035 0.040 0.045 0.050 0.055

0.025

0.050

45 Deg.

Nominal Significance Level α

Gau

ge

Retention frequency of irrelevant step indicators under no shift for varied αOverall Gauge First Half Gauge Second Half Gauge 45 Deg. × p_alpha

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 38 / 65

Page 47: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Simulating SIS when no location shifts

Properties of two out of the 100 step indicators (T1 = 20, T2 = 35)under the null of no shift:

γ20 Normal

-5.0 -2.5 0.0 2.5 5.0

0.1

0.2

0.3

Density of Estimators (under null)

T1=20

γ20 Normal

t(γ20) Normal

-4 -2 0 2 4

0.2

0.4

t-Statistics (under null)

t(γ20) Normal

γ35 Normal

-5.0 -2.5 0.0 2.5 5.0

0.1

0.2

T2=35

γ35 Normal

t(γ35) Normal

-2 0 2 4

0.1

0.2

0.3

0.4t(γ35) Normal

Both estimators γT1 and γT2 have densities close to Normal, centeredon zero, and central t-statistics.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 39 / 65

Page 48: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Power of a step-indicator test for aknown mean shift

Known mean shift from λ1 6= 0 to λ1 = 0 at time 0 < T1 < T/2 in DGP:

yt = µ+ λ11t6T1 + εt where εt ∼ IN[0,σ2ε

](18)

where λ1 6= 0: shift is from µ+ λ1 to µ.Nesting model of (18) when the break is known:

yt = ϕ+ δT11t6T1 + vt (19)

As∑Tt=1 1t6T1 =

∑T1t=1 1t6T1 = T1, estimating (19) delivers:

(ϕ− µ

δT1 − λ1

)=

(T T1T1 T1

)−1( ∑T

t=1 εt∑T1t=1 εt

)=

(ε(2)

ε(1) − ε(2)

)where ε(1) = T

−11

∑T1t=1 εt average over first T1 observations and

ε(2) = (T − T1)−1∑Tt=T1+1 εt

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 40 / 65

Page 49: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Power of a step-indicator test for aknown mean shift

And the variance is:

V

[(ϕ− µ

δT1 − λ1

)]= σ2ε (T− T1)

−1

(1 −1

−1 T−11 (T − T1) + 1

).

For the DGP in (18):

√T∗(δT1 − λ1

)∼ N

[0,σ2ε

](20)

Hence, neglecting the estimation uncertainty in σ2ε and letting(T−1

1 + (T − T1)−1)−1 = T∗:

tδT1

=

√T∗δT1

σε≈√T∗(δT1 − λ1)

σε+

√T∗λ1σε

∼ N[ψ∗λ1 , 1

](21)

where T∗ = T1 when there is no intercept. Yields√T∗ times the

corresponding non-centrality for an individual impulse indicator.Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 41 / 65

Page 50: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating ‘split-half’ SIS for asingle location shift

Add half indicators and select ones significant at 1%.

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

0 50 100

0.5

1.0

Indicators retained

actual fitted

0 50 100

0

5

10

15actual fitted

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 42 / 65

Page 51: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating ‘split-half’ SIS for asingle location shift

Drop, add other half indicators and again select at 1%.

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

0 50 100

0.5

1.0

Indicators retained

actual fitted

0 50 100

0

5

10

15actual fitted

0 50 100

0.5

1.0

Bloc

k 2

0 50 100

0.5

1.0

0 50 100

0

5

10

15

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 42 / 65

Page 52: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Illustrating ‘split-half’ SIS for asingle location shift

Combine retained indicators and re-select at 1%.

0 50 100

0.5

1.0

Indicators included initially Selected model: actual and fitted

Bloc

k 1

0 50 100

0.5

1.0

Indicators retained

actual fitted

0 50 100

0

5

10

15actual fitted

0 50 100

0.5

1.0

Bloc

k 2

0 50 100

0.5

1.0

0 50 100

0

5

10

15

0 50 100

0.5

1.0

Fina

l

0 50 100

0.5

1.0

0 50 100

0

5

10

15

Matching theory: initially retains last step indicator closest tomean shift, then finds correct shift, so eliminates redundantindicator. Just one step indicator needed.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 42 / 65

Page 53: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Potency for an unknown shift

Detection of single location shift falling within a half-sample of the data(0 < T1 < T/2) using split-half analysis of SIS where DGP is:

y = λ1ιT1 + ε (22)

Add first half of step indicators, model is:

yt =

T/2∑j=1

γj1t6j + vt (23)

Intercept of zero highlights main aspects of the algebra, written as:

y = D1γ(1) + v (24)

where γ(1) = (γ1 . . .γT/2)′ and D1 = (ι1 . . . ιT/2). Then:

γ(1) =(D′1D1

)−1D′1y = λ

(D′1D1

)−1D′1ιT1 +

(D′1D1

)−1D′1ε

(25)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 43 / 65

Page 54: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Properties following from D1

The inverse of (D′1D1) is the ‘double difference’ matrix:

(D′1D1

)−1=

2 −1 0 . . . 0 0−1 2 −1 . . . 0 00 −1 2 . . . 0 0...

. . . . . . . . . . . ....

0 0 0 . . . 2 −10 0 0 . . . −1 1

(26)

so:

(D′1D1

)−1D′1 =

1 −1 0 . . . 0 00 1 −1 . . . 0 00 0 1 . . . 0 0...

......

. . . . . ....

0 0 0 . . . 1 −10 0 0 . . . 0 1

is the forward-difference matrix.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 44 / 65

Page 55: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Split-half for an unknown shift

Letting ∇εt = εt − εt+1, from γ(1) = (D′1D1)−1

D′1y:

γ(1) = λ1r +∇ε(1)

where r is a T/2× 1 vector with unity at t = T1and zeroes elsewhere,so: (

γ(1) − λ1r)= ∇ε(1) (27)

where the (T/2× 1) vector ∇ε(1) = (∇ε1,∇ε2, . . . ,∇εT/2, εT/2)′.All elements of γ(1) up to the T1th are zero, only the T1th reflect λ1,corresponding to the location shift.

Only the value of λ1 at the shift is being picked up, incrementalinformation equivalent to an impulse indicator for T1:

γT1 = λ1 +∇εT1 (28)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 45 / 65

Page 56: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Low potency if no selection

Hence: (γ(1) − λ1r

)app

N[0,σ2ε

(D′1D1

)−1]

(29)

The estimated error variance adjusted for degrees of freedom:

σ2ε =2

T

T∑t=T/2+1

(yt − yt)2

will be an unbiased estimator of σ2ε. However, for IID errors (becauseof ∇εT1):

V [γT1 ] = 2σ2ε (30)

so that:

tγT1=

γT1√2σε

≈(γT1 − λ1)√

2σε+

λ1√2σε

∼ N

[ψλ1√

2, 1

](31)

where ψλ1/√2 is the non-centrality. Note: indep. of length of shift.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 46 / 65

Page 57: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Selection is essential

V [γT1 ] = 2σ2ε due to collinearity between step indicators:

Eliminating insignificant indicators by sequential selection ormulti-path search is essential.

Example: At 1%, cα ≈ 2.7, normalizing on σε = 1, requires λ1 > 3.8for even a 50% chance of significance before simplification.

When insignificant indicators are deleted, V[γT1 ] falls rapidly:

If all irrelevant indicators eliminated, just ιT1 remains, thenon-centrality for a single shift ψ1 =

√T∗λ1/σε which is

√2T∗ larger

than before selection.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 47 / 65

Page 58: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Second half

First half step indicators are then eliminated and second half,D2 = (ιT/2+1 . . . ιT ) added.

If significant steps from first half retained: only α/2 of estimatedcoefficients of D2 should be significant

If significant steps are not retained then model becomes:

yt =

T∑j=T/2+1

γj1t6j + vt (32)

written as:y = D2γ(2) + v (33)

where γ(2) = (γT/2+1 . . .γT )′ and D2 = (ιT/2+1 . . . ιT ). From (22):

γ(2) =(D′2D2

)−1D′2y = λ1

(D′2D2

)−1D′2ιT1 +

(D′2D2

)−1D′2ε

(34)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 48 / 65

Page 59: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Second half analyzed separately

(D′2D2

)=(D′1D1

)+

1

2Tcc′

where c is a T/2× 1 vector of ones and j is a T/2× 1 vector of zeroesother than unity at first element:

(D′2D2

)−1=

(IT/2 +

T

2jc′)−1 (

D′1D1

)−1

and:γ(2) = λ1T1

(IT/2 +

T

2jc′)−1

j +(D′2D2

)−1D′2ε (35)

Only first element of γ(2) depends on λ1: indicator nearest to shiftretained if relevant indicators not ‘carried forward’.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 49 / 65

Page 60: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Final Step

Finally, combine selected step indicators and re-select. When allirrelevant indicators are removed and the relevant one retained:

yt = γT11t6T1 + vt (36)

perfect selection coincides with DGP; retained irrelevant indicatorsreduce degrees of freedom, and increase variances.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 50 / 65

Page 61: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Simulating SIS for a singlelocation shift

Location Shift at T1 = 35, magnitude λ1 = 4σε, selection at α = 0.01.

ytE[yt]

0 20 40 60 80 100

0.0

2.5

5.0

Time SeriesaytE[yt]

γT1Normal~γT1Normal

0 2 4 6 8

0.5

1.0

1.5

2.0

2.5bγT1Normal~γT1Normal

t(γT1)

Normalt(~γT1

)Normal

0 5 10 15 20 25

0.1

0.2

0.3

ct(γT1)

Normalt(~γT1

)Normal

V[γT1]

NormalV[~γT1

]Normal

0 1 2 3 4

10

20

30dV[γT1

]NormalV[~γT1

]Normal

Variance

Estimator

t-Statistic

Sequential selection (grey) reduces variance vs. split-half (open, blue).

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 51 / 65

Page 62: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Simulating SIS for alternative methodsvarying break lengths and magnitudes

Exact (T1 = T1) retention frequency of break for alternative methods,varying break lengths l and magnitudes, λ1

Known Break: 2SD Split-Half: 2SD Multi-Path: 2SD

Known Break: 4D Split-Half: 4SD Multi-Path: 4SD

2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 25.0 27.5 30.0 32.5 35.0

0.2

0.4

0.6

0.8

1.0

Break Length

Pote

ncy

Known Break "Benchmark"

Multi-Path (4SD)

Multi-Path (2SD)

Split-Half (2SD)

Split-Half (4SD)

2SD Break: Dashed 4SD Break: SolidKnown Break: 2SD Split-Half: 2SD Multi-Path: 2SD

Known Break: 4D Split-Half: 4SD Multi-Path: 4SD

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 52 / 65

Page 63: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Mis-selected indicator

Selected step indicators may not exactly match location shift

Random draws of error: Mis-timed break indicator for shift at t = 25:

0 10 20 30 40 50 60 70 80 90 100

0

2

4True Break, t=25

SIS t-statistic = -9.24

SIS

SD

Detected: t=28

yt Fitted - SIS, T1=28

0 10 20 30 40 50 60 70 80 90 100

0

2

4True Break, t=25

Known shift t-statistic = -8.34

Known True Shift

SD

yt - Known shift

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 53 / 65

Page 64: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Mis-timing of selected indicators

SIS selection can ‘miss’ by periods. Low potency primarily due tomis-timing rather than not detecting the shift:

T1 T1±1 T1±2 T1±3

5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

0.25

0.50

0.75

1.00

Break Length

Pote

ncy

Exact timing: t=T1

Plus/Minus 1: t=T1±1Plus/Minus 2: t=T1±2

Plus/Minus 3: t=T1±3

Potency for varying break lengths and timing accuracy for step shift of 2SD

≈0.9

≈0.6

T1 T1±1 T1±2 T1±3

Even for λ = 2σε and short breaks, potency is 0.9 or higher by T1 ± 3.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 54 / 65

Page 65: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Generalization to retained regressors

Simulate SIS with n < T general regressors, with single step shift withunknown timing requiring two indicators, the DGP is:

yt = β′1zt+λ1(1t6T2 − 1t6T1

)+εt where εt ∼ IN

[0,σ2ε

](37)

Even including 10 relevant regressors (not selected over), densities ofthe two shift estimators, γi, centered around true value λ1 = 4σε:

Y Y Fitted Model

0 10 20 30 40 50 60 70 80 90 100

0.0

2.5

5.0 Model Fit

Step-Indicator Saturation with general regressors for a 4SD step shift

Y Y Fitted Model

D(γ25) Normal D(γ35) Normal

-8 -6 -4 -2 0 2 4 6 8

0.5

1.0 Densities of Break Indicators

D(γ25) Normal D(γ35) Normal

tγ25 Normal tγ35

Normal

-20 -15 -10 -5 0 5 10 15 20

0.1

0.2

0.3 t-Statistics of Break Indicators

tγ25 Normal tγ35

Normal

Potency unaffected by regressors (≈ 0.5 for 2SD, ≈ 0.9 for 4SD)Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 55 / 65

Page 66: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Implementation

Step and Impulse-Indicator Saturation (SIS, IIS) in PcGive/Ox

isat in in R-package ‘gets’ (Pretis et al. 2016)(with Genaro Sucarrat and James Reade)Note: SIS construction differs between PcGive & R

PcGive: dT1 = 1t6T1, γ > 0 implies negative shift.R: dT1 = 1t>T1, γ > 0 implies positive shift.

SIS in PcGive/Ox:

model.Autometrics(0.001, “SIS”, ...);

SIS in R: isat (gets)

isat(y, mxreg=..., ar=1:2, sis=TRUE, iis=FALSE, t.pval=0.005,...)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 56 / 65

Page 67: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Application: Sea Surface Temperature

Global SST – Climate/Weather IndicatorShape of trend? Breaks in series? El Nino Southern Oscillation?

1900 1920 1940 1960 1980 2000

−0.8

−0.4

0.0

0.4

SS

T, C

1900 1920 1940 1960 1980 2000

−20

010

20

SO

I

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 57 / 65

Page 68: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

SST: Model Setup

Model:yt = f(zt) + βxt + δD + εt (38)

T=108 (1900-2008)y= Global Mean SST Anomalies (C) relative to 1950-79z=t, x= Southern Oscillation Index (SOI) (atm. pressure at SL)N = 108 + 6 = 114 variables

Specification:SIS, pα = 0.001 (0.1%)Nonlinear trend: B-spline basis (5 degree polynomial)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 58 / 65

Page 69: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

SST: SIS Results (Fitted)

1900 1920 1940 1960 1980 2000

−0.8

−0.4

0.0

0.4

SS

T, C

1900 1920 1940 1960 1980 2000

−0.1

0.1

0.3

0.5

Bre

ak in

Inte

rcep

t, C

Two breaks: 1941, δ1 = 0.43∗∗∗C (se=0.046)1946, δ2 = −0.24∗∗∗C (se=0.048)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 59 / 65

Page 70: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

SST: Interpreting the structural break

WWII: 1941/1942 Measurements: buckets to engine intakeDanger of measurements (light)Americans joined 1941/1942

Post-WWII: Partly changed back

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 60 / 65

Page 71: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

SST: Interpreting the structural break

WWII: 1941/1942 Measurements: buckets to engine intakeDanger of measurements (light)Americans joined 1941/1942

Post-WWII: Partly changed backBuckets: cold bias (≈ 0.3C) (Matthews, 2012)

SIS: δ1 = 0.43 (±0.046)

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 61 / 65

Page 72: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

SST: Linear & Break Component

SOI Effect (linear)SIS: β = −0.004∗∗∗ (se=0.001)Theory consistent: SOI > 0 (La Nina)→ lower temp. (hiatus?!)

BreaksCorrect SST record for ’bucket bias’

1900 1920 1940 1960 1980 2000

−0.8

−0.4

0.0

0.4

SS

T, C

1900 1920 1940 1960 1980 2000

−0.8

−0.4

0.0

0.4

SS

T, C

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 62 / 65

Page 73: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

Conclusions

Commence with general formulation – general unrestricted model:

yt = β′zt+γ′wt+

T∑j=1

δIIS,j1j=t+

T−1∑j=1

δSIS,j1j6t+vt t = 1, . . . , T

Embed theory ztExpand model wt (almost costless if theory correct)

Indicators δt (almost costless under null)

Ensuring valid conditioning – exogeneity

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 63 / 65

Page 74: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

References I

Castle, J. L., J. A. Doornik, D. F. Hendry, and F. Pretis (2015).Detecting locations shifts by step-indicator saturation during model selection.Econometrics 3, 240–264.

Chow, G. C. (1960).Tests of equality between sets of coefficients in two linear regressions.Econometrica 28, 591–605.

Hendry, D. F. (1999).An econometric analysis of US food expenditure, 1931–1989.See Magnus and Morgan (1999), pp. 341–361.

Hendry, D. F., S. Johansen, and C. Santos (2008).Automatic selection of indicators in a fully saturated regression.Computational Statistics 23, 337–339.

Hendry, D. F. and G. E. Mizon (2011).Econometric modelling of time series with outlying observations.Journal of Time Series Econometrics 3(1).

Hendry, D. F. and C. Santos (2005).Regression models with data-based indicator variables.Oxford Bulletin of Economics and statistics 67 (5), 571–595.

Johansen, S. and B. Nielsen (2009).An analysis of the indicator saturation estimator as a robust regression estimator.pp. 1–36. Oxford: Oxford University Press.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 64 / 65

Page 75: General-to-Specific Time Series Modellingfelixpretis.climateeconometrics.org/wp-content/uploads/...David Hendry trying to convince Søren Johansen at Engle and Granger Nobel award

References II

Johansen, S. and B. Nielsen (2016).Asymptotic theory of outlier detection algorithms for linear time series regression models.Scandinavian Journal of Statistics 43(2), 321–348.

Magnus, J. R. and M. S. Morgan (Eds.) (1999).Methodology and Tacit Knowledge: Two Experiments in Econometrics.Chichester: John Wiley and Sons.

Pretis, F., L. Schneider, J. E. Smerdon, and D. F. Hendry (2016).Detecting volcanic eruptions in temperature reconstructions by designed break-indicator saturation.Journal of Economic Surveys, 10.1111/joes.12148.

Reade, J. J. (2008).Updating Tobin’s food expenditure time series data.Working paper, Department of Economics, University of Oxford.

Salkever, D. S. (1976).The use of dummy variables to compute predictions, prediction errors and confidence intervals.Journal of Econometrics 4, 393–397.

Tobin, J. (1950).A statistical demand function for food in the U.S.A.A, 113(2), 113–141.

Pretis (Oxford) 2: Indicator Saturation Michaelmas 2017 65 / 65