Structural Break Detection in Time Series
Richard A. Davis
Thomas Lee
Gabriel Rodriguez-Yam
Colorado State University (http://www.stat.colostate.edu/~rdavis/lectures)
This research supported in part by an IBM faculty award.
Introduction
• Background
• Setup for AR models
Model Selection Using Minimum Description Length (MDL)
• General principles
• Application to AR models with breaks
Optimization using Genetic Algorithms
• Basics
• Implementation for structural break estimation
Simulation examples
• Piecewise autoregressions
• Slowly varying autoregressions
Applications
• Multivariate time series
• EEG example
Application to nonlinear models (parameter-driven state-space models)
• Poisson model
• Stochastic volatility model
Introduction
Structural breaks: Kitagawa and Akaike (1978)
• fitting locally stationary autoregressive models using AIC
• computations facilitated by the use of the Householder transformation
Davis, Huang, and Yao (1995)
• likelihood ratio test for testing a change in the parameters and/or order of an AR process.
Kitagawa, Takanami, and Matsumoto (2001)
• signal extraction in seismology: estimate the arrival time of a seismic signal.
Ombao, Raz, von Sachs, and Malow (2001)
• orthogonal complex-valued transforms that are localized in time and frequency: the smooth localized complex exponential (SLEX) transform.
• applications to EEG time series and speech data.
Introduction (cont)
Locally stationary: Dahlhaus (1997, 2000,…)
• locally stationary processes
• estimation
Adak (1998)
• piecewise stationary
• applications to seismology and biomedical signal processing
MDL and coding theory:
Lee (2001, 2002)
• estimation of discontinuous regression functions
Hansen and Yu (2001)
• model selection
Introduction (cont)
Time Series: y1, . . . , yn
Piecewise AR model:
$$Y_t = \gamma_j + \phi_{j1} Y_{t-1} + \cdots + \phi_{jp_j} Y_{t-p_j} + \sigma_j \varepsilon_t, \quad \text{if } \tau_{j-1} \le t < \tau_j,$$
where τ0 = 1 < τ1 < . . . < τm-1 < τm = n + 1, and {εt} is IID(0,1).
Goal: Estimate
m = number of segments
τj = location of jth break point
γj = level in jth epoch
pj = order of AR process in jth epoch
(φj1, . . . , φjpj) = AR coefficients in jth epoch
σj = scale in jth epoch
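As an illustration of the piecewise AR model, the following is a minimal simulation sketch. The function name `simulate_piecewise_ar`, the tuple layout of `segments`, the use of Gaussian noise for {εt}, and the choice to treat values before t = 1 as zero are all assumptions made here for illustration; they are not part of the slides.

```python
import random


def simulate_piecewise_ar(n, segments, seed=0):
    """Simulate Y_1..Y_n from a piecewise AR model.

    segments: list of (start, gamma, phis, sigma), sorted by start, where
    `start` is the 1-based first time index of the segment (tau_{j-1}),
    `phis` holds the AR coefficients and `sigma` is the scale.
    """
    rng = random.Random(seed)
    y = []  # y[t - 1] stores Y_t
    for t in range(1, n + 1):
        # Parameters of the last segment whose start is <= t.
        _, gamma, phis, sigma = [s for s in segments if s[0] <= t][-1]
        # Sum of phi_k * Y_{t-k}; values before t = 1 are taken as 0 (assumption).
        past = sum(
            phi * (y[t - 1 - k] if t - 1 - k >= 0 else 0.0)
            for k, phi in enumerate(phis, start=1)
        )
        # Gaussian noise is an assumption; the slide only requires IID(0,1).
        y.append(gamma + past + sigma * rng.gauss(0.0, 1.0))
    return y


# Three segments with orders 1, 2, 2 and breaks at t = 513 and t = 769.
series = simulate_piecewise_ar(
    1024,
    [(1, 0.0, [0.9], 1.0), (513, 0.0, [1.69, -0.81], 1.0), (769, 0.0, [1.32, -0.81], 1.0)],
)
print(len(series))
```

The segment list in the usage example mirrors the first simulation example later in the talk.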
Introduction (cont)
Motivation for using piecewise AR models:
Piecewise AR is a special case of a piecewise stationary process (see Adak 1998),
$$\tilde{Y}_{t,n} = \sum_{j=1}^{m} Y_t^{j} \, I_{[\tau_{j-1},\, \tau_j)}(t/n),$$
where $\{Y_t^{j}\}$, j = 1, . . . , m, is a sequence of stationary processes. It is argued in Ombao et al. (2001) that if {Yt,n} is a locally stationary process (in the sense of Dahlhaus), then there exists a piecewise stationary process $\{\tilde{Y}_{t,n}\}$ with
$$m_n \to \infty \ \text{ with } \ m_n/n \to 0, \ \text{ as } n \to \infty,$$
that approximates {Yt,n} (in average mean square).
Roughly speaking: {Yt,n} is a locally stationary process if it has a time-varying spectrum that is approximately |A(t/n, ω)|², where A(u, ω) is a continuous function in u.
Example: Monthly Deaths & Serious Injuries, UK
Data: yt = number of monthly deaths and serious injuries in UK, Jan '75 to Dec '84 (t = 1, …, 120)
Remark: Seat belt legislation introduced in Feb '83 (t = 99).
[Figure: monthly counts (roughly 1200 to 2200) plotted against year, 1975 to 1984]
Example: Monthly Deaths & Serious Injuries, UK (cont)
Data: xt = number of monthly deaths and serious injuries in UK, differenced at lag 12; Jan '75 to Dec '84 (t = 13, …, 120)
Remark: Seat belt legislation introduced in Feb '83 (t = 99).
[Figure: differenced counts (roughly -600 to 200) plotted against year, 1976 to 1984]
Traditional regression analysis:
$$Y_t = a + b f(t) + W_t, \qquad f(t) = \begin{cases} 0, & \text{if } 1 \le t \le 98, \\ 1, & \text{if } 98 < t \le 120, \end{cases}$$
$$X_t = Y_t - Y_{t-12} = b g(t) + N_t, \qquad g(t) = \begin{cases} 1, & \text{if } 99 \le t \le 110, \\ 0, & \text{otherwise.} \end{cases}$$
Model: b = −373.4, {Nt} ~ AR(13).
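The step from the level-shift regressor f(t) to the pulse regressor g(t) follows from differencing at lag 12: the intercept a cancels, and f(t) − f(t − 12) equals 1 only for the twelve months after the shift. The short check below is a sketch that verifies this identity numerically; the function names `f` and `g` simply mirror the symbols on the slide.

```python
def f(t):
    """Level-shift indicator: 0 for 1 <= t <= 98, 1 for 98 < t <= 120."""
    return 1 if 98 < t <= 120 else 0


def g(t):
    """Pulse indicator: 1 for 99 <= t <= 110, 0 otherwise."""
    return 1 if 99 <= t <= 110 else 0


# Lag-12 differencing of the regressor, defined for t = 13, ..., 120.
diff = [f(t) - f(t - 12) for t in range(13, 121)]
expected = [g(t) for t in range(13, 121)]
print(diff == expected)  # True
```

Under this identity the differenced noise is Nt = Wt − Wt−12.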
Model Selection Using Minimum Description Length
Basics of MDL: Choose the model which maximizes the compression of the data or, equivalently, select the model that minimizes the code length of the data (i.e., amount of memory required to encode the data).
M = class of operating models for y = (y1, . . . , yn)
$L_F(y)$ = code length of y relative to F ∈ M
Typically, this term can be decomposed into two pieces (two-part code),
$$L_F(y) = L(\hat{F} \mid y) + L(\hat{e} \mid \hat{F}),$$
where
$L(\hat{F} \mid y)$ = code length of the fitted model for F
$L(\hat{e} \mid \hat{F})$ = code length of the residuals based on the fitted model
Model Selection Using Minimum Description Length (cont)
Applied to the segmented AR model:
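$$Y_t = \gamma_j + \phi_{j1} Y_{t-1} + \cdots + \phi_{jp_j} Y_{t-p_j} + \sigma_j \varepsilon_t, \quad \text{if } \tau_{j-1} \le t < \tau_j.$$
First term $L(\hat{F} \mid y)$: Let $n_j = \tau_j - \tau_{j-1}$ and $\psi_j = (\gamma_j, \phi_{j1}, \ldots, \phi_{jp_j}, \sigma_j)$ denote the length of the jth segment and the parameter vector of the jth AR process, respectively. Then
$$\begin{aligned} L(\hat{F} \mid y) &= L(m) + L(\tau_1, \ldots, \tau_m) + L(p_1, \ldots, p_m) + L(\hat{\psi}_1 \mid y) + \cdots + L(\hat{\psi}_m \mid y) \\ &= L(m) + L(n_1, \ldots, n_m) + L(p_1, \ldots, p_m) + L(\hat{\psi}_1 \mid y) + \cdots + L(\hat{\psi}_m \mid y). \end{aligned}$$
Encoding:
integer I: log2 I bits (if I unbounded); log2 IU bits (if I bounded by IU)
MLE $\hat{\theta}$: ½ log2 N bits (where N = number of observations used to compute $\hat{\theta}$; Rissanen (1989))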
Model Selection Using Minimum Description Length (cont)
So,
$$L(\hat{F} \mid y) = \log_2 m + m \log_2 n + \sum_{j=1}^{m} \log_2 p_j + \sum_{j=1}^{m} \frac{p_j + 2}{2} \log_2 n_j.$$
Second term $L(\hat{e} \mid \hat{F})$: Using Shannon's classical results on information theory, Rissanen demonstrates that the code length of $\hat{e}$ can be approximated by the negative of the log-likelihood of the fitted model, i.e., by
$$L(\hat{e} \mid \hat{F}) \approx \sum_{j=1}^{m} \frac{n_j}{2} \left( \log_2(2\pi\hat{\sigma}_j^2) + 1 \right).$$
For fixed values of m, (τ1,p1), . . . , (τm,pm), we define the MDL as
$$MDL(m, (\tau_1, p_1), \ldots, (\tau_m, p_m)) = \log_2 m + m \log_2 n + \sum_{j=1}^{m} \log_2 p_j + \sum_{j=1}^{m} \frac{p_j + 2}{2} \log_2 n_j + \sum_{j=1}^{m} \frac{n_j}{2} \log_2(2\pi\hat{\sigma}_j^2) + \frac{n}{2}.$$
The strategy is to find the best segmentation that minimizes MDL(m, τ1, p1, …, τm, pm). To speed things up, we use Yule-Walker (Y-W) estimates of the AR parameters.
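The MDL criterion for one candidate segmentation can be evaluated as in the sketch below. It uses Yule-Walker estimates computed with the Durbin-Levinson recursion on each segment. The function names, the per-segment mean removal, the biased autocovariance estimator, and the convention of treating log2(p) as 0 when p = 0 are assumptions made for this illustration, not details confirmed by the slides.

```python
import math
import random


def yule_walker_sigma2(x, p):
    """Yule-Walker AR(p) fit via Durbin-Levinson; returns (phis, sigma2)."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    # Biased sample autocovariances at lags 0..p (divide by n).
    acov = [sum(xc[i] * xc[i + h] for i in range(n - h)) / n for h in range(p + 1)]
    phi, v = [], acov[0]
    for k in range(1, p + 1):
        kappa = (acov[k] - sum(phi[i] * acov[k - 1 - i] for i in range(k - 1))) / v
        phi = [phi[i] - kappa * phi[k - 2 - i] for i in range(k - 1)] + [kappa]
        v *= 1.0 - kappa * kappa
    return phi, v


def mdl(y, starts, orders):
    """MDL of a segmentation; starts are 1-based segment starts (first is 1)."""
    n, m = len(y), len(starts)
    bounds = list(starts) + [n + 1]
    total = math.log2(m) + m * math.log2(n)
    for j in range(m):
        seg = y[bounds[j] - 1 : bounds[j + 1] - 1]
        n_j, p_j = len(seg), orders[j]
        _, s2 = yule_walker_sigma2(seg, p_j)
        # Assumption: the code length for p_j = 0 is taken as 0 bits.
        total += math.log2(p_j) if p_j > 0 else 0.0
        total += (p_j + 2) / 2 * math.log2(n_j)
        total += n_j / 2 * math.log2(2 * math.pi * s2)
    return total + n / 2


# Toy series: AR(1) with coefficient 0.9 for t <= 200, then -0.9.
rng = random.Random(0)
demo_y, prev = [], 0.0
for t in range(1, 401):
    prev = (0.9 if t <= 200 else -0.9) * prev + rng.gauss(0.0, 1.0)
    demo_y.append(prev)
print(mdl(demo_y, [1, 201], [1, 1]), mdl(demo_y, [1], [1]))
```

On this toy series the segmentation with a break at t = 201 should give the smaller MDL, since a single AR(1) fit cannot capture the sign change in the coefficient.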
Optimization Using Genetic Algorithms
Basics of GA: Class of optimization algorithms that mimic natural evolution.
• Start with an initial set of chromosomes, or population, of possible solutions to the optimization problem.
• Parent chromosomes are randomly selected (proportional to the rank of their objective function values), and produce offspring using crossover or mutation operations.
• After a sufficient number of offspring are produced to form a second generation, the process then restarts to produce a third generation.
• Based on Darwin’s theory of natural selection, the process should produce future generations that give a smaller (or larger) objective function.
Application to Structural Breaks (cont)
Genetic Algorithm: Chromosome consists of n genes, each taking the value of −1 (no break) or p (order of AR process). Use natural selection to find a near optimal solution.
Map the break points with a chromosome c via
$$(m, (\tau_1, p_1), \ldots, (\tau_m, p_m)) \longleftrightarrow c = (\delta_1, \ldots, \delta_n),$$
where
$$\delta_t = \begin{cases} -1, & \text{if no break point at } t, \\ p_j, & \text{if break point at time } t = \tau_{j-1} \text{ and AR order is } p_j. \end{cases}$$
For example,
c = (2, -1, -1, -1, -1, 0, -1, -1, -1, -1, 0, -1, -1, -1, 3, -1, -1, -1, -1, -1)
with break points at t = 1, 6, 11, 15, would correspond to a process as follows:
AR(2), t=1:5; AR(0), t=6:10; AR(0), t=11:14; AR(3), t=15:20
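The chromosome encoding can be made concrete with a small decoder. The sketch below turns a chromosome into a list of (start, end, order) segments and back; the function names and the tuple layout are hypothetical choices for illustration.

```python
def decode_chromosome(c):
    """Return [(start, end, order)] with 1-based inclusive time indices."""
    n = len(c)
    # Genes different from -1 mark the start of a segment and carry its AR order.
    starts = [(t, p) for t, p in enumerate(c, start=1) if p != -1]
    segments = []
    for k, (t, p) in enumerate(starts):
        end = starts[k + 1][0] - 1 if k + 1 < len(starts) else n
        segments.append((t, end, p))
    return segments


def encode_chromosome(segments, n):
    """Inverse mapping: place each segment's order at its start, -1 elsewhere."""
    c = [-1] * n
    for start, _end, order in segments:
        c[start - 1] = order
    return c


c = (2, -1, -1, -1, -1, 0, -1, -1, -1, -1, 0, -1, -1, -1, 3, -1, -1, -1, -1, -1)
print(decode_chromosome(c))
# [(1, 5, 2), (6, 10, 0), (11, 14, 0), (15, 20, 3)]
```

The printed segments match the example on the slide: AR(2) on 1:5, AR(0) on 6:10, AR(0) on 11:14, and AR(3) on 15:20.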
Implementation of Genetic Algorithm—(cont)
Generation 0: Start with L (200) randomly generated chromosomes, c1, . . . ,cL
with associated MDL values, MDL(c1), . . . , MDL(cL).
Generation 1: A new child in the next generation is formed from the chromosomes c1, . . . , cL of the previous generation as follows:
with probability πc, crossover occurs.
two parent chromosomes ci and cj are selected at random with probabilities proportional to the ranks of MDL(ci).
kth gene of child is δk = δi,k w.p. ½ and δj,k w.p. ½
with probability 1− πc, mutation occurs.
a parent chromosome ci is selected
kth gene of child is δk = δi,k w.p. π1 ; −1 w.p. π2 ; and p w.p. 1− π1−π2.
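The crossover and mutation steps can be sketched as follows. Several details are assumptions made here because the slide does not pin them down: the best chromosome (smallest MDL) receives the largest rank weight, the mutation parent is drawn with the same rank weights, parents are drawn with replacement, a new AR order is drawn uniformly from 0 to `max_order`, and the first gene is forced to carry an order because a segment always starts at t = 1. The names `rank_weights` and `make_child` are hypothetical.

```python
import random


def rank_weights(mdl_values):
    """Weights proportional to rank; the smallest MDL gets the largest weight."""
    order = sorted(range(len(mdl_values)), key=lambda i: mdl_values[i], reverse=True)
    weights = [0] * len(mdl_values)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return weights


def make_child(population, mdl_values, pi_c, pi_1, pi_2, max_order, rng):
    """Create one child chromosome by crossover (prob. pi_c) or mutation."""
    weights = rank_weights(mdl_values)
    if rng.random() < pi_c:
        # Crossover: each gene comes from either parent with probability 1/2.
        ci, cj = rng.choices(population, weights=weights, k=2)
        child = [a if rng.random() < 0.5 else b for a, b in zip(ci, cj)]
    else:
        # Mutation: keep the gene (pi_1), remove a break (pi_2), or draw a new order.
        (ci,) = rng.choices(population, weights=weights, k=1)
        child = []
        for gene in ci:
            u = rng.random()
            if u < pi_1:
                child.append(gene)
            elif u < pi_1 + pi_2:
                child.append(-1)
            else:
                child.append(rng.randint(0, max_order))  # uniform order: assumption
    if child[0] == -1:
        # Assumption: t = 1 always starts a segment, so gene 1 needs an order.
        child[0] = rng.randint(0, max_order)
    return child


rng = random.Random(1)
pop = [[2, -1, -1, 0, -1], [1, -1, 3, -1, -1]]
print(make_child(pop, [5.0, 7.0], pi_c=0.5, pi_1=0.3, pi_2=0.3, max_order=10, rng=rng))
```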
Implementation of Genetic Algorithm—(cont)
Execution of GA: Run GA until convergence or until a maximum number of generations has been reached.
Various Strategies:
include the top ten chromosomes from last generation in next generation.
use multiple islands, in which populations run independently, and then allow migration after a fixed number of generations. This implementation is amenable to parallel computing.
Simulation Examples-based on Ombao et al. (2001) test cases
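1. Piecewise stationary: Consider a time series following the model,
$$Y_t = \begin{cases} 0.9\, Y_{t-1} + \varepsilon_t, & \text{if } 1 \le t < 513, \\ 1.69\, Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, & \text{if } 513 \le t < 769, \\ 1.32\, Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, & \text{if } 769 \le t \le 1024, \end{cases}$$
where {εt} ~ IID N(0,1).
[Figure: a realization of the series (roughly -10 to 10) plotted against time, t = 1 to 1024]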
1. Piecewise stat (cont)
Implementation: Start with NI = 50 islands, each with population size L = 200.
After every Mi = 5 generations, allow migration.
• Replace worst 2 in Island 2 with best 2 from Island 1.
• Replace worst 2 in Island 3 with best 2 from Island 2.
• Replace worst 2 in Island 4 with best 2 from Island 3.
• Replace worst 2 in Island 1 with best 2 from Island 4.
[Diagram: migration among Islands 1, 2, 3, and 4]
Stopping rule: Stop when the max MDL does not change for 10 consecutive migrations or after 100 migrations.
Span configuration for model selection: Max AR order K = 10,

| p | 0 | 1 | 2 | 3 | 4 | 5 | 6-10 |
|---|---|---|---|---|---|---|---|
| mp | 25 | 25 | 30 | 35 | 40 | 45 | 50 |
| πp | .08 | .1 | .1 | .09 | .09 | .09 | .09 |
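The migration rule can be sketched as a ring in which each island receives the best chromosomes of its predecessor. In this sketch the emigrants from every island are chosen before any replacement happens, which is an assumption (the slides do not say whether migration is simultaneous or sequential). The function name `migrate` and the `mdl` callable are hypothetical.

```python
def migrate(islands, mdl, n_move=2):
    """Ring migration: island k receives the best n_move chromosomes of island k-1.

    islands: list of populations, each a list of chromosomes (lists of genes).
    mdl: callable giving the objective value of a chromosome (smaller is better).
    """
    # Choose emigrants from every island before replacing anything (assumption).
    best = [sorted(pop, key=mdl)[:n_move] for pop in islands]
    new_islands = []
    for k, pop in enumerate(islands):
        # Drop the worst n_move chromosomes of this island.
        survivors = sorted(pop, key=mdl)[: len(pop) - n_move]
        # best[k - 1] wraps around, so the first island receives from the last.
        new_islands.append(survivors + [list(c) for c in best[k - 1]])
    return new_islands


toy = [[[1], [2], [3]], [[4], [5], [6]]]
print(migrate(toy, mdl=lambda c: c[0], n_move=1))
# [[[1], [2], [4]], [[4], [5], [1]]]
```

Because the islands evolve independently between migrations, each island's generations can be run on a separate worker.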
1. Piecewise stat (cont)
GA results: 3 pieces with breaks at τ1=513 and τ2=769. Total run time 16.31 secs
[Figure: two time-frequency panels, "True Model" and "Fitted Model"; rescaled time 0.0 to 1.0 on the horizontal axis, frequency 0.0 to 0.5 on the vertical axis]
Fitted model:

| Segment | φ1 | φ2 | σ2 |
|---|---|---|---|
| 1-512 | .857 | | .9945 |
| 513-768 | 1.68 | -0.801 | 1.1134 |
| 769-1024 | 1.36 | -0.801 | 1.1300 |
1. Piecewise stat (cont)
Simulation: 200 replicates of time series of length 1024 were generated. (SLEX results from Ombao et al.)

| # of segments | GA % | GA change points: mean | GA change points: std | GA ASE | Auto-SLEX % | Auto-SLEX change points | Auto-SLEX ASE |
|---|---|---|---|---|---|---|---|
| 2 | 0 | | | | 0 | 1/2 | - |
| 3 | 82.0 | .500, .749 | .006, .006 | 3.64 (.13) | 60.0 | 1/4, 3/4 | 25.42 (4.56) |
| 4 | 17.5 | .476, .616, .761 | .080, .110, .037 | 3.73 (.13) | 34.0 | 1/4, 2/4, 3/4 | 34.09 (6.74) |
| 5 | 0 | | | | 5.0 | 2/8, 4/8, 5/8, 6/8, 7/8 | 32.77 (5.20) |
| ≥ 6 | 0.5 | | | 3.83 | 1.0 | | 50.01 (6.25) |

$$ASE = n^{-1} \sum_{t=1}^{n} \frac{1}{33} \sum_{j=0}^{32} \left\{ \log \hat{f}(t/n, \omega_j) - \log f(t/n, \omega_j) \right\}^2, \qquad \omega_j = 2\pi j/64.$$
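For piecewise AR models the ASE can be computed directly from the AR spectral densities. The sketch below assumes the usual AR spectrum f(ω) = σ² / (2π |1 − Σ φk e^{−ikω}|²) and represents a model as a list of (start, phis, sigma2) tuples; the helper names and the model layout are hypothetical, and the fitted values are the ones quoted for the single realization earlier in the talk.

```python
import cmath
import math


def ar_log_spectrum(phis, sigma2, omega):
    """log of the AR spectral density sigma2 / (2 pi |1 - sum phi_k e^{-ik omega}|^2)."""
    transfer = 1 - sum(phi * cmath.exp(-1j * k * omega) for k, phi in enumerate(phis, start=1))
    return math.log(sigma2 / (2 * math.pi * abs(transfer) ** 2))


def segment_at(model, t):
    """model: list of (start, phis, sigma2) sorted by 1-based start time."""
    current = model[0]
    for seg in model:
        if seg[0] <= t:
            current = seg
    return current


def ase(fitted, true, n):
    """Average squared error between log spectra over t = 1..n and 33 frequencies."""
    total = 0.0
    for t in range(1, n + 1):
        _, phis_f, s2_f = segment_at(fitted, t)
        _, phis_t, s2_t = segment_at(true, t)
        for j in range(33):
            w = 2 * math.pi * j / 64
            total += (ar_log_spectrum(phis_f, s2_f, w) - ar_log_spectrum(phis_t, s2_t, w)) ** 2
    return total / (33 * n)


true_model = [(1, [0.9], 1.0), (513, [1.69, -0.81], 1.0), (769, [1.32, -0.81], 1.0)]
fitted_model = [(1, [0.857], 0.9945), (513, [1.68, -0.801], 1.1134), (769, [1.36, -0.801], 1.1300)]
print(ase(fitted_model, true_model, 1024))
```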
1. Piecewise stat (cont)
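Simulation (cont):
True model:
$$Y_t = \begin{cases} 0.9\, Y_{t-1} + \varepsilon_t, & \text{if } 1 \le t < 513, \\ 1.69\, Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, & \text{if } 513 \le t < 769, \\ 1.32\, Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, & \text{if } 769 \le t \le 1024. \end{cases}$$
AR orders selected (percent):

| Order | 0 | 1 | 2 | 3 | 4 | 5 | ≥ 6 |
|---|---|---|---|---|---|---|---|
| p1 | 0 | 99.4 | 0.60 | 0 | 0 | 0 | 0 |
| p2 | 0 | 0 | 86.0 | 11.6 | 1.8 | 0.6 | 0 |
| p3 | 0 | 0 | 89.0 | 10.4 | 0.6 | 0 | 0 |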
Simulation Examples (cont)
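2. Piecewise stationary:
$$Y_t = \begin{cases} 0.9\, Y_{t-1} + \varepsilon_t, & \text{if } 1 \le t < 198, \\ -0.9\, Y_{t-1} + \varepsilon_t, & \text{if } 198 \le t \le 1024, \end{cases}$$
where {εt} ~ IID N(0,1).
[Figure: a realization of the series (roughly -5 to 5) plotted against time, t = 1 to 1024]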
2. Piecewise stationary (cont)
GA results: 2 pieces with break at τ1=196. Total run time 11.96 secs
[Figure: two time-frequency panels, "True Model" and "Fitted Model"; rescaled time 0.0 to 1.0 on the horizontal axis, frequency 0.0 to 0.5 on the vertical axis]
Fitted model:

| Segment | φ1 | σ2 |
|---|---|---|
| 1-195 | .872 | 1.081 |
| 196-1024 | -.883 | 1.078 |
2. Piecewise stationary (cont)
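Simulation: 200 replicates of time series of length 1024 were generated. AR(1) models with break at 196/1024 = .191.

GA:

| # of segments | % | change points: mean | change points: std | ASE |
|---|---|---|---|---|
| 1 | 0 | | | |
| 2 | 97.0 | .192 | .002 | .017 (.013) |
| 3 | 3.0 | .154, .323 | .023, .320 | .060 (.044) |
| ≥ 4 | 0 | | | |

Auto-SLEX:

| # of segments | % | ASE |
|---|---|---|
| ≤ 4 | 0 | - |
| 5 | 44.0 | 1.81 (.039) |
| 6 | 37.0 | .220 (.056) |
| 7 | 6.0 | .211 (.047) |
| 8 | 12.0 | .290 (.096) |
| ≥ 9 | 1.0 | .255 (.100) |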
Simulation Examples (cont)
3. Piecewise stationary:
$$Y_t = \begin{cases} 0.9\, Y_{t-1} + \varepsilon_t, & \text{if } 1 \le t < 51, \\ 0.25\, Y_{t-1} + \varepsilon_t, & \text{if } 51 \le t \le 500, \end{cases}$$
where {εt} ~ IID N(0,1).
[Figure: a realization of the series (roughly -2 to 2) plotted against time, t = 1 to 500]
GA results: 2 pieces with break at τ1=47
3. Piecewise stationary (cont)
Simulation results: Change occurred at time τ1 = 51; 51/500 ≈ .1

| # of segments | % | change points: mean | change points: std |
|---|---|---|---|
| 1 | 9.0 | | |
| 2 | 89.0 | .096 | .017 |
| 3 | 2.0 | .048, .092 | .020, .011 |
| ≥ 4 | 0 | | |
Simulation Examples (cont)
4. Slowly varying AR(2) model:
$$Y_t = a_t Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, \quad \text{if } 1 \le t \le 1024,$$
where
$$a_t = 0.8\,[1 - 0.5 \cos(\pi t / 1024)],$$
and {εt} ~ IID N(0,1).
[Figure: a realization of the series (roughly -6 to 4) plotted against time, t = 1 to 1024]
[Figure: a_t (0.4 to 1.2) plotted against time, 0 to 1000]
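A short sketch of the time-varying coefficient and a simulated path follows. The coefficient a_t rises smoothly from about 0.4 at the start of the series to 1.2 at the end, which matches the range in the plot of a_t. The function names and the zero initial values for Y_0 and Y_{-1} are assumptions made for illustration.

```python
import math
import random


def a(t, n=1024):
    """Time-varying lag-1 coefficient a_t = 0.8 * (1 - 0.5 * cos(pi * t / n))."""
    return 0.8 * (1 - 0.5 * math.cos(math.pi * t / n))


def simulate_slowly_varying_ar2(n=1024, seed=0):
    """Simulate Y_t = a_t Y_{t-1} - 0.81 Y_{t-2} + eps_t with N(0,1) noise."""
    rng = random.Random(seed)
    y = []
    for t in range(1, n + 1):
        # Pre-sample values are set to 0 (assumption).
        y1 = y[-1] if len(y) >= 1 else 0.0
        y2 = y[-2] if len(y) >= 2 else 0.0
        y.append(a(t, n) * y1 - 0.81 * y2 + rng.gauss(0.0, 1.0))
    return y


path = simulate_slowly_varying_ar2()
print(round(a(1), 3), round(a(512), 3), round(a(1024), 3))
```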
4. Slowly varying AR(2) (cont)
GA results: 3 pieces with breaks at τ1=293 and τ2=615. Total run time 27.45 secs
[Figure: two time-frequency panels, "True Model" and "Fitted Model"; rescaled time 0.0 to 1.0 on the horizontal axis, frequency 0.0 to 0.5 on the vertical axis]
Fitted model:

| Segment | φ1 | φ2 | σ2 |
|---|---|---|---|
| 1-292 | .365 | -0.753 | 1.149 |
| 293-614 | .821 | -0.790 | 1.176 |
| 615-1024 | 1.084 | -0.760 | 0.960 |
4. Slowly varying AR(2) (cont)
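Simulation: 200 replicates of time series of length 1024 were generated.

GA:

| # of segments | % | change points: mean | change points: std | ASE |
|---|---|---|---|---|
| 1 | 0 | | | |
| 2 | 36.0 | .502 | .044 | .127 (.014) |
| 3 | 63.0 | .258, .661 | .071, .075 | .080 (.016) |
| 4 | 1.0 | .309, .550, .860 | | |
| ≥ 5 | 0 | | | |

Auto-SLEX:

| # of segments | % | ASE |
|---|---|---|
| ≤ 4 | 14.0 | .238 (.030) |
| 5 | 27.0 | .228 (.025) |
| 6 | 35.0 | .232 (.029) |
| 7 | 18.0 | .243 (.033) |
| 8 | 15.0 | .269 (.040) |
| ≥ 9 | 1.0 | .308 |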
4. Slowly varying AR(2) (cont)
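Simulation (cont):
True model:
$$Y_t = a_t Y_{t-1} - 0.81\, Y_{t-2} + \varepsilon_t, \quad \text{if } 1 \le t \le 1024.$$
AR orders selected (percent): (2 segment realizations)

| Order | 0 | 1 | 2 | 3 | 4 | ≥ 5 |
|---|---|---|---|---|---|---|
| p1 | 0 | 0 | 97.2 | 1.4 | 1.4 | 0 |
| p2 | 0 | 0 | 97.2 | 2.8 | 0 | 0 |

AR orders selected (percent): (3 segment realizations)

| Order | 0 | 1 | 2 | 3 | 4 | ≥ 5 |
|---|---|---|---|---|---|---|
| p1 | 0 | 0 | 100 | 0 | 0 | 0 |
| p2 | 0 | 0 | 98.4 | 1.6 | 0 | 0 |
| p3 | 0 | 0 | 97.6 | 1.6 | 0.8 | 0 |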
4. Slowly varying AR(2) (cont)
In the right-hand graph ("Average Model"), we average the spectrogram over the GA fitted models generated from each of the 200 simulated realizations.
[Figure: two time-frequency panels, "True Model" and "Average Model"; rescaled time 0.0 to 1.0 on the horizontal axis, frequency 0.0 to 0.5 on the vertical axis]
Example: Monthly Deaths & Serious Injuries, UK
Data: Yt = number of monthly deaths and serious injuries in UK, Jan '75 to Dec '84 (t = 1, …, 120)
Remark: Seat belt legislation introduced in Feb '83 (t = 99).
[Figure: monthly counts (roughly 1200 to 2200) plotted against year, 1975 to 1984]
Example: Monthly Deaths & Serious Injuries, UK
Data: Yt = number of monthly deaths and serious injuries in UK, Jan '75 to Dec '84 (t = 1, …, 120)
Remark: Seat belt legislation introduced in Feb '83 (t = 99).
[Figure: differenced counts (roughly -600 to 200) plotted against year, 1976 to 1984]
Results from GA: 3 pieces; time = 4.4 secs
Piece 1: (t = 1, …, 98) IID; Piece 2: (t = 99, …, 108) IID; Piece 3: (t = 109, …, 120) AR(1)
Application to Multivariate Time Series
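Multivariate time series (d-dimensional): y1, . . . , yn
Piecewise AR model:
$$\mathbf{Y}_t = \boldsymbol{\gamma}_j + \Phi_{j1} \mathbf{Y}_{t-1} + \cdots + \Phi_{jp_j} \mathbf{Y}_{t-p_j} + \Sigma_j^{1/2} \mathbf{Z}_t, \quad \text{if } \tau_{j-1} \le t < \tau_j,$$
where τ0 = 1 < τ1 < . . . < τm-1 < τm = n + 1, and {Zt} is IID(0, Id).
In this case,
$$\begin{aligned} MDL(m, (\tau_1, p_1), \ldots, (\tau_m, p_m)) ={}& \log m + m \log n + \sum_{j=1}^{m} \log p_j + \sum_{j=1}^{m} \frac{p_j d^2 + d + d(d+1)/2}{2} \log n_j \\ &+ \frac{1}{2} \sum_{j=1}^{m} \sum_{t=\tau_{j-1}}^{\tau_j - 1} \left( \log|\hat{V}_t| + (\mathbf{Y}_t - \hat{\mathbf{Y}}_t)^T \hat{V}_t^{-1} (\mathbf{Y}_t - \hat{\mathbf{Y}}_t) \right), \end{aligned}$$
where $\hat{\mathbf{Y}}_t = E(\mathbf{Y}_t \mid \mathbf{Y}_{t-1}, \ldots, \mathbf{Y}_1)$ and $\hat{V}_t = E(\mathbf{Y}_t - \hat{\mathbf{Y}}_t)^2$, and the AR parameters are estimated by the multivariate Y-W equations based on Whittle's generalization of the Durbin-Levinson algorithm.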
Example: Bivariate Time Series
• {Yt1} same as the series in Example 1 (3 segments: AR(1), AR(2), AR(2))
• {Yt2} same as the series in Example 2 (2 segments: AR(1), AR(1))
[Figure: the two component series plotted against time, t = 1 to 1024; first panel roughly -10 to 10, second panel roughly -5 to 5]
GA results:
TS 1: 3 pieces with breaks at τ1=513 and τ2=769. Total run time 16.31 secs
TS 2: 2 pieces with break at τ1=196. Total run time 11.96 secs
Bivariate: 4 pieces with breaks at τ1=197, τ2=519, τ3=769: AR(1), AR(1), AR(2), AR(2). Total run time 1126 secs
Example: EEG Time series
Data: Bivariate EEG time series at channels T3 (left temporal) and P3 (left parietal). Female subject was diagnosed with left temporal lobe epilepsy. Data collected by Dr. Beth Malow and analyzed in Ombao et al. (2001). (n = 32,768; sampling rate of 100 Hz). Seizure started at about 185 seconds.
[Figure: two panels, "T3 Channel" and "P3 Channel"; EEG amplitude plotted against time in seconds, 1 to 300]
GA univariate results: 14 breakpoints for T3; 11 breakpoints for P3
GA bivariate results: 11 pieces with AR orders 17, 2, 6, 15, 2, 3, 5, 9, 5, 4, 1
Example: EEG Time series (cont)
Remarks:
• the general conclusions of this analysis are similar to those reached in Ombao et al.
• prior to seizure, power concentrated at lower frequencies and then spread to high frequencies.
• power returned to the lower frequencies at conclusion of seizure.
[Figure: two time-frequency panels, "T3 Channel" and "P3 Channel"; time in seconds (1 to 300) on the horizontal axis, frequency in Hertz (0 to 50) on the vertical axis]
Example: EEG Time series (cont)
Remarks (cont):
• T3 and P3 strongly coherent at 9-12 Hz prior to seizure.
• strong coherence at low frequencies just after onset of seizure.
• strong coherence shifted to high frequencies during the seizure.
[Figure: "T3/P3 Coherency"; time in seconds (1 to 300) on the horizontal axis, frequency in Hertz (0 to 50) on the vertical axis]
Application to Structural Breaks (Davis, Lee, Rodriguez-Yam)
State Space Model Setup:
Observation equation:
p(yt | αt) = exp{αt yt − b(αt) + c(yt)}.
State equation: {αt} follows the piecewise AR(1) model given by
αt = γk + φkαt-1 + σkεt , if τk-1 ≤ t < τk ,
where 1 = τ0 < τ1 < … < τm < n, and {εt} ~ IID N(0,1).
Parameters:
m = number of break points
τk = location of break points
γk = level in kth epoch
φk = AR coefficients in kth epoch
σk = scale in kth epoch
Application to Structural Breaks—(cont)
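Estimation: For (m, τ1, . . . , τm) fixed, calculate the approximate likelihood evaluated at the "MLE", i.e.,
$$L_a(\hat{\psi}; \mathbf{y}_n) = \frac{|G_n|^{1/2}}{|K + G_n|^{1/2}} \exp\left\{ \mathbf{y}_n^T \alpha^* - \mathbf{1}^T \{ b(\alpha^*) - c(\mathbf{y}_n) \} - (\alpha^* - \mu)^T G_n (\alpha^* - \mu)/2 \right\},$$
where $\hat{\psi} = (\hat{\gamma}_1, \ldots, \hat{\gamma}_m, \hat{\phi}_1, \ldots, \hat{\phi}_m, \hat{\sigma}_1^2, \ldots, \hat{\sigma}_m^2)$ is the MLE.
Goal: Optimize an objective function over (m, τ1, . . . , τm).
• use minimum description length (MDL) as an objective function
• use genetic algorithm for optimization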
Application to Structural Breaks—(cont)
Minimum Description Length (MDL): Choose the model which maximizes the compression of the data or, equivalently, select the model that minimizes the code length of the data (i.e., amount of memory required to store the data).
Code Length("data") = CL("fitted model") + CL("data | fitted model")
~ CL("parameters") + CL("residuals")
$$MDL(m, \tau_1, \ldots, \tau_m) = \underbrace{\log(m) + m \log(n) + 1.5 \sum_{j=1}^{m} \log(\tau_j - \tau_{j-1})}_{CL(\text{"parameters"})} \ \underbrace{{} - \sum_{j=1}^{m} \log\left( L_a(\hat{\psi}_j; \mathbf{y}_{\tau_{j-1}:\tau_j}) \right)}_{CL(\text{"residuals"})}$$
Generalization: AR(p) segments can have unknown order.
$$MDL(m, (\tau_1, p_1), \ldots, (\tau_m, p_m)) = \log(m) + m \log(n) + 0.5 \sum_{j=1}^{m} (p_j + 2) \log(\tau_j - \tau_{j-1}) - \sum_{j=1}^{m} \log\left( L_a(\hat{\psi}_j; \mathbf{y}_{\tau_{j-1}:\tau_j}) \right)$$
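Given per-segment values of the approximate log-likelihood, the criterion is simple arithmetic. The sketch below assumes natural logarithms, takes `taus` as the full list of segment boundaries τ0, …, τm (so there are m segments), and expects the caller to supply the log-likelihood values, which are not computed here; the function name `mdl_state_space` and the numbers in the usage example are hypothetical.

```python
import math


def mdl_state_space(n, taus, loglik, orders=None):
    """MDL for a segmented state-space model.

    taus: [tau_0, tau_1, ..., tau_m] segment boundaries.
    loglik: loglik[j] = log L_a for segment j + 1, supplied by the caller.
    orders: optional AR orders p_j; None means the AR(1) case (coefficient 1.5).
    """
    m = len(taus) - 1
    total = math.log(m) + m * math.log(n)
    for j in range(m):
        n_j = taus[j + 1] - taus[j]
        # 0.5 * (p_j + 2) reduces to 1.5 when p_j = 1.
        coef = 1.5 if orders is None else 0.5 * (orders[j] + 2)
        total += coef * math.log(n_j) - loglik[j]
    return total


# Hypothetical log-likelihood values for two segments of a series of length 500.
print(mdl_state_space(500, [1, 251, 501], [-500.0, -490.0]))
```

The AR(1) formula is the special case pj = 1 of the generalization, since 0.5 (1 + 2) = 1.5.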
Count Data Example
Model: Yt | αt ∼ Pois(exp{β + αt}), αt = φαt-1 + εt , {εt} ~ IID N(0, σ2)
[Figure: simulated count series y (0 to 15) plotted against time, t = 1 to 500]
True model:
Yt | αt ∼ Pois(exp{.7 + αt}), αt = .5αt-1 + εt , {εt} ~ IID N(0, .3), t < 250
Yt | αt ∼ Pois(exp{.7 + αt}), αt = −.5αt-1 + εt , {εt} ~ IID N(0, .3), t > 250.
GA estimate 251, time 267 secs
[Figure: MDL (roughly 1002 to 1014) plotted against breaking point, 1 to 500; blue = AL, red = IS]
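The count data model can be simulated with the standard library alone. In the sketch below the Poisson draws use Knuth's multiplication method, the second parameter of N(0, .3) is read as a variance, the state starts at 0, and t = 250 is assigned to the second regime; these choices and the function names are assumptions, since the slide leaves them open.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_poisson_ssm(n=500, beta=0.7, break_at=250, seed=0):
    """Y_t | alpha_t ~ Pois(exp(beta + alpha_t)), alpha_t a piecewise AR(1)."""
    rng = random.Random(seed)
    sd = math.sqrt(0.3)  # N(0, .3) read as variance .3 (assumption)
    alpha, y = 0.0, []   # initial state 0 (assumption)
    for t in range(1, n + 1):
        # t = 250 itself is assigned to the second regime (assumption).
        phi = 0.5 if t < break_at else -0.5
        alpha = phi * alpha + rng.gauss(0.0, sd)
        y.append(poisson(math.exp(beta + alpha), rng))
    return y


counts = simulate_poisson_ssm()
print(len(counts), min(counts), max(counts))
```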
SV Process Example
Model: Yt | αt ∼ N(0, exp{αt}), αt = γ + φ αt-1 + εt , {εt} ~ IID N(0, σ2)
[Figure: simulated series y (roughly -2 to 3) plotted against time, t = 1 to 3000]
True model:
Yt | αt ∼ N(0, exp{αt}), αt = -.05 + .975αt-1 + εt , {εt} ~ IID N(0, .05), t ≤ 750
Yt | αt ∼ N(0, exp{αt}), αt = -.25 + .900αt-1 + εt , {εt} ~ IID N(0, .25), t > 750.
GA estimate 754, time 1053 secs
[Figure: MDL (roughly 1295 to 1315) plotted against breaking point, 1 to 3000]
SV Process Example
Model: Yt | αt ∼ N(0, exp{αt}), αt = γ + φ αt-1 + εt , {εt} ~ IID N(0, σ2)
[Figure: simulated series y (roughly -0.5 to 0.5) plotted against time, t = 1 to 500]
True model:
Yt | αt ∼ N(0, exp{αt}), αt = -.175 + .977αt-1 + εt , {εt} ~ IID N(0, .1810), t ≤ 250
Yt | αt ∼ N(0, exp{αt}), αt = -.010 + .996αt-1 + εt , {εt} ~ IID N(0, .0089), t > 250.
GA estimate 251, time 269 secs
[Figure: MDL (roughly -530 to -500) plotted against breaking point, 1 to 500]
SV Process Example (cont)
True model:
Yt | αt ∼ N(0, exp{αt}), αt = -.175 + .977αt-1 + εt , {εt} ~ IID N(0, .1810), t ≤ 250
Yt | αt ∼ N(0, exp{αt}), αt = -.010 + .996αt-1 + εt , {εt} ~ IID N(0, .0089), t > 250.
Fitted model based on no structural break:
Yt | αt ∼ N(0, exp{αt}), αt = -.0645 + .9889αt-1 + εt , {εt} ~ IID N(0, .0935)
[Figure: two panels, "original series" and "simulated series"; y (roughly -0.5 to 1.0) plotted against time, t = 1 to 500]
SV Process Example (cont)
Fitted model based on no structural break:
Yt | αt ∼ N(0, exp{αt}), αt = -.0645 + .9889αt-1 + εt , {εt} ~ IID N(0, .0935)
[Figure: "simulated series"; y (roughly -0.5 to 1.0) plotted against time, t = 1 to 500]
[Figure: MDL (roughly -478 to -466) plotted against breaking point, 1 to 500]
Summary Remarks
1. MDL appears to be a good criterion for detecting structural breaks.
2. Optimization using a genetic algorithm is well suited to find a near optimal value of MDL.
3. This procedure extends easily to multivariate problems.
4. While estimating structural breaks for nonlinear time series models is more challenging, this paradigm of using MDL together with GA holds promise for break detection in parameter-driven models and other nonlinear models.