2013.06.17 Time Series Analysis Workshop: Applications in Physiology, Climate Change and Finance


Professor Dimitris Kugiumtzis, Aristotle University of Thessaloniki, Greece, presented this workshop on linear stochastic processes as part of the Summer School on Modern Statistical Analysis and Computational Methods hosted by the Social Sciences Computing Hub at the Whitaker Institute, NUI Galway on 17th-19th June 2013.

Transcript of 2013.06.17 Time Series Analysis Workshop: Applications in Physiology, Climate Change and Finance

Linear stochastic processes

Linear time series (stochastic process):

$X_t = \mu + \sum_{i=-\infty}^{\infty}\psi_i Z_{t-i}$, with $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$ and $\sum_i |\psi_i| < \infty$.

Then $\mathrm{E}[X_t] = \mu$ and $\gamma_X(\tau) = \sigma_Z^2\sum_i \psi_i\psi_{i+\tau}$. We assume μ = 0.

Using the lag (backshift) operator $B$, a linear time series is expressed as $X_t = \psi(B)Z_t$, where $\psi(B) = \sum_i \psi_i B^i$.

If $\psi_i = 0$ for $i < 0$:

1. Moving average process MA(∞): $X_t = Z_t + \psi_1 Z_{t-1} + \psi_2 Z_{t-2} + \cdots = Z_t + \sum_{i=1}^{\infty}\psi_i Z_{t-i}$, a linear filter with input $Z_t$ and output $X_t$.

2. Autoregressive process AR(∞): $X_t = \pi_1 X_{t-1} + \pi_2 X_{t-2} + \cdots + Z_t = \sum_{i=1}^{\infty}\pi_i X_{t-i} + Z_t$.

$X_t$ is stationary if $\sum_{i=0}^{\infty}|\psi_i| < \infty$.

$X_t$ is invertible if the stochastic component $Z_t$ can be expressed in terms of the current and past observations, $\pi(B)X_t = Z_t$ with $\pi(B) = \psi(B)^{-1}$.

Time series, Part 2

Autoregressive processes

Autoregressive process of order p, AR(p)

Starting from the AR(∞) form $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + Z_t$, we restrict the autoregression to the p most recent terms:

$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + Z_t$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$

In operator form: $(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)X_t = Z_t$, i.e. $\phi(B)X_t = Z_t$, where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ is the characteristic polynomial.

Condition of stationarity: the roots of $\phi(B) = 0$ must be outside the unit circle, or equivalently the roots of $\lambda^p - \phi_1\lambda^{p-1} - \cdots - \phi_{p-1}\lambda - \phi_p = 0$ must be inside the unit circle.

Autoregressive process of order one, AR(1)

$X_t = \phi X_{t-1} + Z_t$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$. Stationarity condition: $|\phi| < 1$.

Successive backward substitutions give $X_t = Z_t + \phi Z_{t-1} + \phi^2 Z_{t-2} + \cdots = \sum_{i=0}^{\infty}\phi^i Z_{t-i}$.

Variance: $\mathrm{Var}[X_t] = \sigma_Z^2(1 + \phi^2 + \phi^4 + \cdots) = \dfrac{\sigma_Z^2}{1-\phi^2} = \gamma_X(0)$.

Autocorrelation (we assume stationarity): multiplying by $X_{t-1}$ and taking expectations,
$\mathrm{E}[X_t X_{t-1}] = \phi\,\mathrm{E}[X_{t-1}X_{t-1}] + \mathrm{E}[X_{t-1}Z_t] \Rightarrow \gamma_X(1) = \phi\,\gamma_X(0)$,
and in general $\gamma_X(\tau) = \phi\,\gamma_X(\tau-1)$, so $\rho_X(\tau) = \phi^{\tau}$.

[Figure: autocorrelation $\rho_X(\tau)$, lags 0-10, of AR(1) with $\phi = 0.8$ and $\phi = -0.8$.]

Autoregressive process of order two, AR(2)

$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$

Roots of $\lambda^2 - \phi_1\lambda - \phi_2 = 0$: $\lambda_{1,2} = \dfrac{\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}}{2}$

two real distinct roots if $\phi_1^2 + 4\phi_2 > 0$; one double real root if $\phi_1^2 + 4\phi_2 = 0$; complex conjugate roots if $\phi_1^2 + 4\phi_2 < 0$.

Stationarity condition for AR(2): the roots of $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 = 0$ must be outside the unit circle, or equivalently the roots $\lambda_{1,2}$ must be inside the unit circle.

[Figure: stationarity region of AR(2) in the $(\phi_1,\phi_2)$ plane, with the sub-regions of real distinct roots, complex roots and a real double root.]

[Figure: autocorrelation $\rho_X(\tau)$, lags 0-20, of AR(2) for eight root configurations: (a) $\lambda_{1,2} = 0.8 \pm 0.5i$ ($\phi_1 = 1.6$, $\phi_2 = -0.89$); (b) $\lambda_{1,2} = -0.8 \pm 0.5i$ ($\phi_1 = -1.6$, $\phi_2 = -0.89$); (c) $\lambda_1 = \lambda_2 = 0.8$ ($\phi_1 = 1.6$, $\phi_2 = -0.64$); (d) $\lambda_1 = \lambda_2 = -0.8$ ($\phi_1 = -1.6$, $\phi_2 = -0.64$); (e) $\lambda_1 = 0.8$, $\lambda_2 = 0.95$ ($\phi_1 = 1.75$, $\phi_2 = -0.76$); (f) $\lambda_1 = 0.8$, $\lambda_2 = -0.95$ ($\phi_1 = -0.15$, $\phi_2 = 0.76$); (g) $\lambda_1 = -0.8$, $\lambda_2 = 0.95$ ($\phi_1 = 0.15$, $\phi_2 = 0.76$); (h) $\lambda_1 = -0.8$, $\lambda_2 = -0.95$ ($\phi_1 = -1.75$, $\phi_2 = -0.76$).]

Autocorrelation

Autoregressive process of order two, AR(2)

Autocorrelation (we assume stationarity): multiplying $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t$ by $X_{t-1}$ and taking expectations,

$\mathrm{E}[X_t X_{t-1}] = \phi_1\mathrm{E}[X_{t-1}X_{t-1}] + \phi_2\mathrm{E}[X_{t-2}X_{t-1}] + \mathrm{E}[X_{t-1}Z_t]$
$\Rightarrow \rho_X(1) = \phi_1 + \phi_2\rho_X(1) \Rightarrow \rho_X(1) = \dfrac{\phi_1}{1-\phi_2}$

Multiplying by $X_{t-2}$: $\rho_X(2) = \phi_1\rho_X(1) + \phi_2 = \dfrac{\phi_1^2}{1-\phi_2} + \phi_2$

For lag τ: $\rho_X(\tau) = \phi_1\rho_X(\tau-1) + \phi_2\rho_X(\tau-2)$, so the autocorrelation can be computed recursively.

The form of the decay is determined by the roots of the characteristic polynomial $1 - \phi_1 B - \phi_2 B^2 = 0$: real roots give exponential decay, complex roots give a decaying harmonic function.

Variance: multiplying by $X_t$ and taking expectations, $\gamma_X(0) = \phi_1\gamma_X(1) + \phi_2\gamma_X(2) + \sigma_Z^2$, so $\sigma_X^2 = \dfrac{\sigma_Z^2}{1-\phi_1\rho_X(1)-\phi_2\rho_X(2)}$.

Autoregressive process of order p, AR(p)

$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + Z_t$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$, i.e. $(1 - \phi_1 B - \cdots - \phi_p B^p)X_t = Z_t$.

Stationarity condition: the roots of the characteristic polynomial $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ must be outside the unit circle.

Autocorrelation (we assume stationarity): for lag τ, multiplying by $X_{t-\tau}$ and taking expectations,
$\mathrm{E}[X_t X_{t-\tau}] = \phi_1\mathrm{E}[X_{t-1}X_{t-\tau}] + \cdots + \phi_p\mathrm{E}[X_{t-p}X_{t-\tau}] + \mathrm{E}[X_{t-\tau}Z_t]$
$\Rightarrow \rho_X(\tau) = \phi_1\rho_X(\tau-1) + \phi_2\rho_X(\tau-2) + \cdots + \phi_p\rho_X(\tau-p)$, for $\tau > 0$.

Real roots of $\phi(B)$: exponential decay; complex roots: decaying harmonic function.

Autoregressive process of order p, AR(p)

Setting $\tau = 1, 2, \ldots, p$ in the recursion gives the Yule-Walker equations:

$\rho_1 = \phi_1 + \phi_2\rho_1 + \cdots + \phi_p\rho_{p-1}$
$\rho_2 = \phi_1\rho_1 + \phi_2 + \cdots + \phi_p\rho_{p-2}$
$\cdots$
$\rho_p = \phi_1\rho_{p-1} + \phi_2\rho_{p-2} + \cdots + \phi_p$

or, in matrix form,

$\begin{pmatrix}\rho_1\\ \rho_2\\ \vdots\\ \rho_p\end{pmatrix} = \begin{pmatrix}1 & \rho_1 & \cdots & \rho_{p-1}\\ \rho_1 & 1 & \cdots & \rho_{p-2}\\ \vdots & & \ddots & \vdots\\ \rho_{p-1} & \rho_{p-2} & \cdots & 1\end{pmatrix}\begin{pmatrix}\phi_1\\ \phi_2\\ \vdots\\ \phi_p\end{pmatrix}$

Variance: multiplying by $X_t$ and taking expectations, $\gamma_X(0) = \phi_1\gamma_X(1) + \cdots + \phi_p\gamma_X(p) + \sigma_Z^2$, so $\sigma_X^2 = \dfrac{\sigma_Z^2}{1-\phi_1\rho_X(1)-\cdots-\phi_p\rho_X(p)}$.

Partial autocorrelation

For each order k we solve the Yule-Walker equations of dimension k,

$\begin{pmatrix}1 & \rho_1 & \cdots & \rho_{k-1}\\ \rho_1 & 1 & \cdots & \rho_{k-2}\\ \vdots & & \ddots & \vdots\\ \rho_{k-1} & \rho_{k-2} & \cdots & 1\end{pmatrix}\begin{pmatrix}\phi_{k1}\\ \phi_{k2}\\ \vdots\\ \phi_{kk}\end{pmatrix} = \begin{pmatrix}\rho_1\\ \rho_2\\ \vdots\\ \rho_k\end{pmatrix}$

and compute the last coefficient $\phi_{kk}$: the partial autocorrelation for lag (order) k. By Cramer's rule it is a ratio of determinants, where the numerator matrix is the autocorrelation matrix with its last column replaced by $(\rho_1,\ldots,\rho_k)'$:

$k = 1:\ \phi_{11} = \rho_1$

$k = 2:\ \phi_{22} = \dfrac{\begin{vmatrix}1 & \rho_1\\ \rho_1 & \rho_2\end{vmatrix}}{\begin{vmatrix}1 & \rho_1\\ \rho_1 & 1\end{vmatrix}} = \dfrac{\rho_2-\rho_1^2}{1-\rho_1^2}$

$k = 3:\ \phi_{33} = \dfrac{\begin{vmatrix}1 & \rho_1 & \rho_1\\ \rho_1 & 1 & \rho_2\\ \rho_2 & \rho_1 & \rho_3\end{vmatrix}}{\begin{vmatrix}1 & \rho_1 & \rho_2\\ \rho_1 & 1 & \rho_1\\ \rho_2 & \rho_1 & 1\end{vmatrix}}$

Recursive algorithm of Durbin-Levinson: the coefficients $\phi_{p1}, \phi_{p2}, \ldots, \phi_{pp}$ of AR(p) are computed recursively, and for each order k the coefficients are computed from the coefficients of order k-1.
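As a concrete illustration of the Durbin-Levinson recursion just described, here is a minimal Python/NumPy sketch that computes the partial autocorrelations $\phi_{kk}$ from a given sequence of autocorrelations; the input array `rho` (holding ρ(1), ..., ρ(K)) and the function name are hypothetical, not part of the original slides.

```python
import numpy as np

def durbin_levinson(rho):
    """Partial autocorrelations phi_{k,k}, k = 1..K, from rho[0] = rho(1), ..., rho[K-1] = rho(K)."""
    K = len(rho)
    pacf = np.zeros(K)
    phi_prev = np.array([rho[0]])              # order-1 coefficients: phi_{1,1} = rho(1)
    pacf[0] = rho[0]
    for k in range(2, K + 1):
        # reflection coefficient phi_{k,k} computed from the order-(k-1) coefficients
        num = rho[k - 1] - phi_prev @ rho[k - 2::-1]
        den = 1.0 - phi_prev @ rho[:k - 1]
        phi_kk = num / den
        # update the remaining coefficients phi_{k,1}, ..., phi_{k,k-1}
        phi_prev = np.concatenate([phi_prev - phi_kk * phi_prev[::-1], [phi_kk]])
        pacf[k - 1] = phi_kk
    return pacf

# example: AR(1) with phi = 0.8 has rho(tau) = 0.8**tau
print(durbin_levinson(0.8 ** np.arange(1, 6)))
```

For an AR(1) with φ = 0.8, the example call returns 0.8 at lag 1 and values that are essentially zero at higher lags, matching the cut-off property of the partial autocorrelation.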

[Figure: partial autocorrelation $\phi_{\tau\tau}$, lags 0-20, of AR(2) for the same eight root configurations (a)-(h) as in the autocorrelation figure above.]

Moving average processes

Moving average process of order q, MA(q)

Starting from the MA(∞) form $X_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2} + \cdots$, we constrain the white noise terms to the q most recent:

$X_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2} + \cdots + \theta_q Z_{t-q}$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$

In operator form: $X_t = (1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q)Z_t = \theta(B)Z_t$, where $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$ is the characteristic polynomial.

MA(q) is always stationary.

MA(q) is invertible if $Z_t = \theta(B)^{-1}X_t$ exists; invertibility condition: the roots of $\theta(B) = 0$ must be outside the unit circle.

Moving average process of order one, MA(1)

$X_t = Z_t + \theta Z_{t-1}$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$. Invertibility condition: $|\theta| < 1$.

Variance: $\gamma_X(0) = \mathrm{E}[(Z_t+\theta Z_{t-1})^2] = (1+\theta^2)\sigma_Z^2$

Autocorrelation: $\gamma_X(1) = \mathrm{E}[(Z_t+\theta Z_{t-1})(Z_{t-1}+\theta Z_{t-2})] = \theta\sigma_Z^2 \Rightarrow \rho_X(1) = \dfrac{\theta}{1+\theta^2}$, while $\gamma_X(2) = \mathrm{E}[(Z_t+\theta Z_{t-1})(Z_{t-2}+\theta Z_{t-3})] = 0$, so $\rho_X(\tau) = 0$ for $\tau \geq 2$. Note that $|\rho_X(1)| \leq 1/2$.

For a given $\rho_X(1)$ there are two solutions for θ, and only one satisfies the invertibility condition $|\theta| < 1$.

Example: $X_t = Z_t + 0.4Z_{t-1}$ and $X_t = Z_t + 2.5Z_{t-1}$ have the same autocorrelation, $\rho_X(1) = 0.4/1.16 = 2.5/7.25 \approx 0.345$.

In general, $X_t = Z_t + \theta Z_{t-1}$ and $X_t = Z_t + (1/\theta)Z_{t-1}$ have the same autocorrelation: if the root of $1 + \theta B = 0$ is outside the unit circle, the root of $1 + (1/\theta)B = 0$ is inside the unit circle.

Moving average process of order one, MA(1)

Partial autocorrelation:

$\phi_{11} = \rho_1 = \dfrac{\theta}{1+\theta^2}$, $\phi_{22} = \dfrac{-\theta^2}{1+\theta^2+\theta^4}$, $\phi_{33} = \dfrac{\theta^3}{1+\theta^2+\theta^4+\theta^6}$,

and in general $\phi_{\tau\tau} = \dfrac{-(-\theta)^{\tau}(1-\theta^2)}{1-\theta^{2(\tau+1)}}$, $\tau \geq 1$.

[Figure: autocorrelation $\rho_X(\tau)$ and partial autocorrelation $\phi_{\tau\tau}$, lags 0-10, of MA(1) with $\theta = 0.8$ and $\theta = -0.8$.]

- $\phi_{\tau\tau}$ of MA(1) decays in the same way as $\rho_\tau$ of AR(1).
- $\rho_\tau$ of MA(1) cuts off in the same way as $\phi_{\tau\tau}$ of AR(1).
- ... but for MA(1), $|\rho_\tau|$ and $|\phi_{\tau\tau}|$ are always ≤ 0.5.

Moving average process of order two, MA(2)

$X_t = \theta(B)Z_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2}$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$, with characteristic polynomial $\theta(B) = 1 + \theta_1 B + \theta_2 B^2$.

MA(2) is always stationary. MA(2) is invertible if the roots of θ(B) are outside the unit circle.

Variance: $\sigma_X^2 = (1+\theta_1^2+\theta_2^2)\sigma_Z^2$

Autocorrelation: $\rho_1 = \dfrac{\theta_1(1+\theta_2)}{1+\theta_1^2+\theta_2^2}$, $\rho_2 = \dfrac{\theta_2}{1+\theta_1^2+\theta_2^2}$, $\rho_\tau = 0$ for $\tau > 2$.

Partial autocorrelation: $\phi_{11} = \rho_1$, $\phi_{22} = \dfrac{\rho_2-\rho_1^2}{1-\rho_1^2}$, $\phi_{33} = \dfrac{\rho_1^3 - \rho_1\rho_2(2-\rho_2)}{1-\rho_2^2-2\rho_1^2(1-\rho_2)}$, ... complicated expressions.

Autocorrelation and partial autocorrelation of MA(2)

[Figure: autocorrelation $\rho_X(\tau)$ and partial autocorrelation $\phi_{\tau\tau}$, lags 0-20, of MA(2) for four parameter settings, labelled by the roots $\lambda_{1,2}$ and the corresponding coefficient pairs: $\lambda_{1,2} = 0.8 \pm 0.5i$ (1.6, -0.89); $\lambda_{1,2} = -0.8 \pm 0.5i$ (-1.6, -0.89); $\lambda_1 = 0.8$, $\lambda_2 = 0.95$ (1.75, -0.76); $\lambda_1 = 0.8$, $\lambda_2 = -0.95$ (-0.15, 0.76).]

- $\phi_{\tau\tau}$ of MA(2) decays in the same way as $\rho_\tau$ of AR(2).
- $\rho_\tau$ of MA(2) cuts off in the same way as $\phi_{\tau\tau}$ of AR(2).
- ... but for MA(2), $|\rho_\tau|$ and $|\phi_{\tau\tau}|$ are always ≤ 0.5.

Moving average process of order q, MA(q)

$X_t = \theta(B)Z_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2} + \cdots + \theta_q Z_{t-q}$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$

Variance: $\sigma_X^2 = (1+\theta_1^2+\cdots+\theta_q^2)\sigma_Z^2$

Autocovariance: $\gamma_X(\tau) = \sigma_Z^2(\theta_\tau + \theta_1\theta_{\tau+1} + \cdots + \theta_{q-\tau}\theta_q)$ for $\tau = 1,2,\ldots,q$, and $\gamma_X(\tau) = 0$ for $\tau > q$.

Autocorrelation: $\rho_X(\tau) = \dfrac{\theta_\tau + \theta_1\theta_{\tau+1} + \cdots + \theta_{q-\tau}\theta_q}{1+\theta_1^2+\cdots+\theta_q^2}$ for $\tau = 1,2,\ldots,q$, and $\rho_X(\tau) = 0$ for $\tau > q$.

The partial autocorrelation decays in a way that is determined by the roots of the characteristic polynomial $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$; the expressions of $\phi_{\tau\tau}$ in terms of the coefficients $\theta_1, \theta_2, \ldots, \theta_q$ are complicated.

Relation between AR and MA processes

Autoregressive process of order p, AR(p): $X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t$, i.e. $\phi(B)X_t = Z_t$ with $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$.

Moving average process of order q, MA(q): $X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$, i.e. $X_t = \theta(B)Z_t$ with $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$.

AR(p) ↔ MA(∞): a stationary AR(p) can be written as $X_t = \psi(B)Z_t$, where $\psi(B) = \phi(B)^{-1} = 1 + \psi_1 B + \psi_2 B^2 + \cdots$ is such that $\phi(B)\psi(B) = 1$.

MA(q) ↔ AR(∞): an invertible MA(q) can be written as $\pi(B)X_t = Z_t$, where $\pi(B) = \theta(B)^{-1} = 1 + \pi_1 B + \pi_2 B^2 + \cdots$ is such that $\theta(B)\pi(B) = 1$.

AR(p) and MA(q) have a dual relation, and so do their autocorrelation and partial autocorrelation:
AR(p): $\rho_\tau$ decays exponentially to 0, $\phi_{\tau\tau}$ becomes zero for τ > p.
MA(q): $\phi_{\tau\tau}$ decays exponentially to 0, $\rho_\tau$ becomes zero for τ > q.

Wold's decomposition: every covariance-stationary time series can be written as an infinite moving average (MA(∞)) process of its innovation process.

Autoregressive moving average process ARMA(p,q)

Combining the autoregressive part of order p and the moving average part of order q:

$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$

i.e. $\phi(B)X_t = \theta(B)Z_t$, so $X_t = \dfrac{\theta(B)}{\phi(B)}Z_t$ and $\dfrac{\phi(B)}{\theta(B)}X_t = Z_t$.

Stationarity is determined by the AR part; invertibility is determined by the MA part.

Autocorrelation: multiplying by $X_{t-\tau}$ and taking expectations,
$\gamma_X(\tau) = \phi_1\gamma_X(\tau-1) + \cdots + \phi_p\gamma_X(\tau-p) + \mathrm{E}[X_{t-\tau}Z_t] + \theta_1\mathrm{E}[X_{t-\tau}Z_{t-1}] + \cdots + \theta_q\mathrm{E}[X_{t-\tau}Z_{t-q}]$

For τ > q: $\rho_X(\tau) = \phi_1\rho_X(\tau-1) + \cdots + \phi_p\rho_X(\tau-p)$, as for AR(p).
For τ ≤ q: a mixture of the autocorrelation patterns of AR(p) and MA(q).

Process ARMA(1,1)

$X_t = \phi X_{t-1} + Z_t + \theta Z_{t-1}$, i.e. $(1-\phi B)X_t = (1+\theta B)Z_t$ and $X_t = \dfrac{1+\theta B}{1-\phi B}Z_t$.

Stationarity condition: $|\phi| < 1$. Invertibility condition: $|\theta| < 1$.

Variance: multiplying by $X_t$ and by $X_{t-1}$ and taking expectations,
$\gamma_X(0) = \phi\gamma_X(1) + \sigma_Z^2 + \theta(\phi+\theta)\sigma_Z^2$, $\gamma_X(1) = \phi\gamma_X(0) + \theta\sigma_Z^2$
$\Rightarrow \gamma_X(0) = \sigma_X^2 = \dfrac{1+2\phi\theta+\theta^2}{1-\phi^2}\sigma_Z^2$

Autocorrelation: $\rho_X(1) = \dfrac{(\phi+\theta)(1+\phi\theta)}{1+2\phi\theta+\theta^2}$, and $\rho_X(\tau) = \phi\rho_X(\tau-1)$ for $\tau \geq 2$, as for AR(1).

Partial autocorrelation: it decays with the lag, as for MA(1).

An ARMA(p,q) process with small p, q exhibits a correlation pattern ($\rho_\tau$ and $\phi_{\tau\tau}$) that can be attained by an AR model only for a large order p, or by an MA model only for a large order q.

Estimation of AR, MA and ARMA models

From the (stationary) stochastic process $\{X_t\}$ to a (stationary) time series of n observations $x_1, x_2, \ldots, x_n$:

mean value $\mu = \mathrm{E}[X_t]$ → sample mean $\bar{x} = \dfrac{1}{n}\sum_{t=1}^{n}x_t$

autocovariance $\gamma(\tau) = \mathrm{E}[(X_t-\mu)(X_{t+\tau}-\mu)]$ → sample autocovariance $c(\tau) = \dfrac{1}{n}\sum_{t=1}^{n-\tau}(x_t-\bar{x})(x_{t+\tau}-\bar{x})$, $\tau = 0,1,\ldots,n-1$

autocorrelation $\rho(\tau) = \gamma(\tau)/\gamma(0)$ → sample autocorrelation $r(\tau) = c(\tau)/c(0)$
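A minimal sketch, in Python/NumPy, of the sample quantities defined above; the array `x` is a hypothetical observed series and the function name is illustrative.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample mean, autocovariance c(tau) and autocorrelation r(tau), tau = 0..max_lag,
    using the definitions above (divisor n for every lag)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    c = np.array([np.sum((x[:n - tau] - xbar) * (x[tau:] - xbar)) / n
                  for tau in range(max_lag + 1)])
    r = c / c[0]
    return xbar, c, r

# example with white noise: r(tau) should be near zero for tau >= 1
rng = np.random.default_rng(0)
xbar, c, r = sample_acf(rng.standard_normal(500), max_lag=5)
print(np.round(r, 3))
```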

stochastic process AR(p): $X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t$, $Z_t \sim \mathrm{WN}(0,\sigma_Z^2)$
stochastic process MA(q): $X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$
stochastic process ARMA(p,q): $X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$

Estimation of the process (model):
● AR, MA or ARMA? another model?
● order p or/and q?
● estimation of the model parameters: AR(p): $\phi_1,\ldots,\phi_p,\sigma_Z^2$; MA(q): $\theta_1,\ldots,\theta_q,\sigma_Z^2$; ARMA(p,q): $\phi_1,\ldots,\phi_p,\theta_1,\ldots,\theta_q,\sigma_Z^2$

Estimation of an AR(p) model

We assume that a stochastic process AR(p) generated the time series $x_1, x_2, \ldots, x_n$. The fit of an AR(p) model is the estimation of the parameters $\phi_1, \phi_2, \ldots, \phi_p, \sigma_Z^2$.

Method of moments, or method of Yule-Walker (YW): estimation of the parameters from the sample autocorrelations. First $r_1, r_2, \ldots, r_p, s_X^2$ are computed as estimates of $\rho_1, \rho_2, \ldots, \rho_p, \sigma_X^2$, and then they are substituted into the Yule-Walker equations:

$\begin{pmatrix}r_1\\ r_2\\ \vdots\\ r_p\end{pmatrix} = \begin{pmatrix}1 & r_1 & \cdots & r_{p-1}\\ r_1 & 1 & \cdots & r_{p-2}\\ \vdots & & \ddots & \vdots\\ r_{p-1} & r_{p-2} & \cdots & 1\end{pmatrix}\begin{pmatrix}\hat\phi_1\\ \hat\phi_2\\ \vdots\\ \hat\phi_p\end{pmatrix}$, i.e. $\hat{\boldsymbol\phi}_p = R_p^{-1}\mathbf{r}_p$

and $s_Z^2 = s_X^2(1 - \hat\phi_1 r_1 - \hat\phi_2 r_2 - \cdots - \hat\phi_p r_p)$.
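A minimal sketch of the Yule-Walker fit described above, assuming a NumPy array `x` of observations; the helper name is hypothetical.

```python
import numpy as np

def fit_ar_yule_walker(x, p):
    """Estimate phi_1..phi_p and s_Z^2 of an AR(p) by solving the sample Yule-Walker equations."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    c = np.array([np.sum((x[:n - k] - xbar) * (x[k:] - xbar)) / n for k in range(p + 1)])
    r = c / c[0]                                      # r[0] = 1, r[1..p] sample autocorrelations
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])   # autocorrelation matrix R_p
    phi_hat = np.linalg.solve(R, r[1:p + 1])          # hat(phi)_p = R_p^{-1} r_p
    s2_z = c[0] * (1.0 - phi_hat @ r[1:p + 1])        # s_Z^2 = s_X^2 (1 - sum phi_i r_i)
    return phi_hat, s2_z
```

Applied to a long series generated by a stationary AR(p), the returned coefficients should be close to the true ones.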

The estimation method of ordinary least squares (OLS)

Time series $x_1, x_2, \ldots, x_n$ with mean μ. General form of AR(p): $X_t - \mu = \phi_1(X_{t-1}-\mu) + \phi_2(X_{t-2}-\mu) + \cdots + \phi_p(X_{t-p}-\mu) + Z_t$.

Fit of the AR(p) model to the data: minimization of the sum of squares of the fitting errors

$S(\mu,\phi_1,\ldots,\phi_p) = \sum_{t=p+1}^{n}\left[(x_t-\mu) - \phi_1(x_{t-1}-\mu) - \cdots - \phi_p(x_{t-p}-\mu)\right]^2$

with respect to $\mu, \phi_1, \phi_2, \ldots, \phi_p$, giving the estimates $\hat\mu, \hat\phi_1, \hat\phi_2, \ldots, \hat\phi_p$.

Residuals: $\hat z_t = (x_t-\hat\mu) - \hat\phi_1(x_{t-1}-\hat\mu) - \cdots - \hat\phi_p(x_{t-p}-\hat\mu)$, $t = p+1, \ldots, n$, with $s_Z^2 = \dfrac{1}{n-p}\sum_{t=p+1}^{n}\hat z_t^2$ and $\bar x = \dfrac{1}{n}\sum_{t=1}^{n}x_t$.

AR(1): $X_t - \mu = \phi(X_{t-1}-\mu) + Z_t$, $S(\mu,\phi) = \sum_{t=2}^{n}\left[(x_t-\mu) - \phi(x_{t-1}-\mu)\right]^2$, giving

$\hat\mu = \dfrac{\bar x^{(2)} - \hat\phi\,\bar x^{(1)}}{1-\hat\phi}$, where $\bar x^{(2)} = \dfrac{1}{n-1}\sum_{t=2}^{n}x_t$ and $\bar x^{(1)} = \dfrac{1}{n-1}\sum_{t=2}^{n}x_{t-1}$,

$\hat\phi = \dfrac{\sum_{t=2}^{n}(x_t-\hat\mu)(x_{t-1}-\hat\mu)}{\sum_{t=2}^{n}(x_{t-1}-\hat\mu)^2}$

For large n, $\hat\mu \approx \bar x$ and $\hat\phi \approx \dfrac{c_1}{c_0} = r_1$, where $c_1 = \dfrac{1}{n}\sum_{t=2}^{n}(x_t-\bar x)(x_{t-1}-\bar x)$ and $c_0 = \dfrac{1}{n}\sum_{t=1}^{n}(x_t-\bar x)^2$.
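The minimization above can equivalently be done as a linear regression of $x_t$ on an intercept and the p lagged values (the intercept then equals $\mu(1-\phi_1-\cdots-\phi_p)$). A minimal Python/NumPy sketch with hypothetical names:

```python
import numpy as np

def fit_ar_ols(x, p):
    """OLS fit of AR(p) with intercept: regress x_t on (1, x_{t-1}, ..., x_{t-p}), t = p+1..n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = x[p:]                                            # responses x_{p+1}, ..., x_n
    X = np.column_stack([np.ones(n - p)] +
                        [x[p - i:n - i] for i in range(1, p + 1)])   # lagged regressors
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)         # beta = (phi_0, phi_1, ..., phi_p)
    resid = y - X @ beta                                 # fitting errors (residuals)
    s2_z = resid @ resid / (n - p)
    return beta, s2_z
```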

Other methods for the estimation of AR(p):
● forward-backward approach (FB)
● maximum likelihood (ML), conditional or unconditional
● Burg's algorithm (Burg)

The ML estimation is optimal; the other methods approximate it. ML reduces to OLS when the time series comes from a Gaussian process. Asymptotically (for large n) all methods converge to the same (ML) estimates; YW has the slowest convergence rate to ML.

Determination of the order p of an AR model

The criterion of partial autocorrelation: fit AR(τ) models of increasing order,
$x_t = \phi_{1,1}x_{t-1} + z_t$
$x_t = \phi_{1,2}x_{t-1} + \phi_{2,2}x_{t-2} + z_t$
$x_t = \phi_{1,3}x_{t-1} + \phi_{2,3}x_{t-2} + \phi_{3,3}x_{t-3} + z_t$
$\cdots$
The estimate $\hat\phi_{\tau,\tau}$ from the AR(τ) model is the partial autocorrelation for lag τ: the correlation of $x_t$ with $x_{t-\tau}$ accounting for the correlation with $x_{t-1}, \ldots, x_{t-\tau+1}$. The order is p if $\hat\phi_{p,p} \neq 0$ and $\hat\phi_{\tau,\tau} \approx 0$ for τ > p (a fall from non-zero to zero partial autocorrelation).

Criteria based on the fitting errors (see the sketch after this list):
● Akaike information criterion: $\mathrm{AIC}(p) = \ln(s_z^2) + \dfrac{2p}{n}$
● Bayesian information criterion: $\mathrm{BIC}(p) = \ln(s_z^2) + \dfrac{p\ln(n)}{n}$
● Final prediction error: $\mathrm{FPE}(p) = s_z^2\,\dfrac{n+p}{n-p}$
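A minimal sketch of order selection with AIC, reusing the OLS helper `fit_ar_ols` sketched above (a hypothetical name); BIC or FPE could be computed in the same loop.

```python
import numpy as np

def select_ar_order_aic(x, p_max):
    """AIC(p) = ln(s_z^2) + 2p/n for AR(p) fits, p = 1..p_max; return the AIC-minimizing order."""
    n = len(x)
    aic = []
    for p in range(1, p_max + 1):
        _, s2_z = fit_ar_ols(x, p)           # OLS fit sketched above (hypothetical helper)
        aic.append(np.log(s2_z) + 2.0 * p / n)
    return int(np.argmin(aic)) + 1, aic
```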

Example: growth rate of the gross national product (GNP) of the USA; quarterly observations, 2nd quarter 1947 to 1st quarter 1991 (n = 176). The time series is corrected for seasonality.

[Figure: increments of the USA GNP; their autocorrelation and partial autocorrelation (lags 0-20); AIC(p) of AR models. Is the series stationary? Is there correlation? Which order for the AR model: AR(3)?]

Parameter estimation (OLS): $\hat\mu = 0.0077 \approx \bar x$, $\hat\phi_0 = \hat\mu(1-\hat\phi_1-\hat\phi_2-\hat\phi_3) = 0.0047$, $\hat\phi_1 = 0.35$, $\hat\phi_2 = 0.18$, $\hat\phi_3 = -0.14$.

Fitted AR(3): $x_t = 0.0047 + 0.35x_{t-1} + 0.18x_{t-2} - 0.14x_{t-3} + z_t$

One-step fitted values: $\hat x_t = 0.0047 + 0.35x_{t-1} + 0.18x_{t-2} - 0.14x_{t-3}$, $t = 4, \ldots, 176$; errors (residuals) of the fit $\hat z_t = x_t - \hat x_t$, with $s_z = 0.0098$ ($s_z^2 = 0.0000989$).

Diagnostic check for model adequacy: is the residual time series $\{\hat z_t\}_{t=p+1}^{n}$ independent? Apply a test for independence.

[Figure: incr.GNP(USA), AR(3) fit over the whole record and a zoom on t = 100-140.]

Fit of an MA(q) model

We assume a stochastic process MA(q) for the time series $x_1, x_2, \ldots, x_n$. The fit of the MA(q) model is the estimation of the parameters $\theta_1, \theta_2, \ldots, \theta_q, \sigma_Z^2$.


Method of moments: the MA(q) autocorrelation

$\rho_\tau = \dfrac{\theta_\tau + \theta_1\theta_{\tau+1} + \cdots + \theta_{q-\tau}\theta_q}{1+\theta_1^2+\cdots+\theta_q^2}$, $\tau = 1,\ldots,q$ (and $\rho_\tau = 0$ for τ > q),

together with the variance $\sigma_X^2 = (1+\theta_1^2+\cdots+\theta_q^2)\sigma_Z^2$, gives a nonlinear equation system with respect to the parameters $\theta_1, \theta_2, \ldots, \theta_q$. The estimates $r_1, \ldots, r_q, s_X^2$ of $\rho_1, \ldots, \rho_q, \sigma_X^2$ are substituted and the system is solved.

Method of ordinary least squares: fit of the MA(q) model $X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$ to the data by minimization of the sum of squares of the fitting errors,

$S(\theta_1,\ldots,\theta_q) = \sum_t\left[x_t - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q}\right]^2$

with respect to $\theta_1, \theta_2, \ldots, \theta_q$, giving $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_q$. Because the errors $z_t$ are not observed, a numerical optimization method (or the innovations algorithm) is used.

MA(1): $X_t = Z_t + \theta Z_{t-1}$

Method of moments: from $r_1 = \dfrac{\hat\theta}{1+\hat\theta^2}$,

$\hat\theta = \dfrac{1 \pm \sqrt{1-4r_1^2}}{2r_1}$ for $|r_1| \leq 0.5$; for $|r_1| > 0.5$ there is no real solution.

We choose the solution that gives rise to invertibility, $|\hat\theta| < 1$. Then, from $\sigma_X^2 = (1+\theta^2)\sigma_Z^2$, $s_Z^2 = \dfrac{s_X^2}{1+\hat\theta^2}$.
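A minimal sketch of the MA(1) method-of-moments estimate above; `r1` and `s2_x` stand for the sample lag-1 autocorrelation and the sample variance (hypothetical inputs).

```python
import numpy as np

def ma1_moments(r1, s2_x):
    """Solve r1 = theta/(1+theta^2) for the invertible root |theta| < 1 and return (theta, s_Z^2)."""
    if abs(r1) > 0.5:
        raise ValueError("no real solution: |r1| must be <= 0.5 for an MA(1)")
    if r1 == 0.0:
        return 0.0, s2_x
    # r1*theta^2 - theta + r1 = 0; the two roots are reciprocal, the '-' root is the invertible one
    theta = (1.0 - np.sqrt(1.0 - 4.0 * r1 ** 2)) / (2.0 * r1)
    s2_z = s2_x / (1.0 + theta ** 2)
    return theta, s2_z

print(ma1_moments(0.345, 1.16))   # roughly (0.4, 1.0), matching theta = 0.4, sigma_Z^2 = 1
```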

Method of ordinary least squares for MA(1): we assume $z_0 = 0$ (and μ = 0), so that the errors $z_t = x_t - \theta z_{t-1}$ can be computed recursively:

$z_1 = x_1$
$z_2 = x_2 - \theta z_1 = x_2 - \theta x_1$
$z_3 = x_3 - \theta z_2 = x_3 - \theta x_2 + \theta^2 x_1$
$\cdots$
$z_n = x_n - \theta z_{n-1} = x_n - \theta x_{n-1} + \cdots + (-\theta)^{n-1}x_1$

Minimizing $\sum_t z_t^2 = x_1^2 + (x_2-\theta x_1)^2 + (x_3-\theta x_2+\theta^2 x_1)^2 + \cdots$ amounts to minimizing a polynomial in θ (up to 2n-2 solutions); we select the solution with $|\hat\theta| < 1$. Computational algorithm: least squares with constraints for invertibility.

Example: growth rate of the GNP of the USA (quarterly observations, 2nd quarter 1947 to 1st quarter 1991; corrected for seasonality).

[Figure: increments of the USA GNP; their autocorrelation; AIC(q) of MA models. Which order for the MA model: MA(2)?]

Parameter estimation (OLS): $\bar x = 0.0077$, $\hat\theta_1 = 0.312$, $\hat\theta_2 = 0.272$.

Fitted MA(2): $x_t = 0.0077 + z_t + 0.312z_{t-1} + 0.272z_{t-2}$, $t = 1, \ldots, 176$; standard deviation of the errors (residuals) $s_z = 0.00983$ ($s_z^2 = 0.000097$).

[Figure: incr.GNP(USA), MA(2) fit and AR(3) fit, whole record and a zoom on t = 100-140.]

Diagnostic check for model adequacy: is the residual time series independent? Apply a test for independence.

Estimation of an ARMA(p,q) model

We assume a stochastic process ARMA(p,q) for the time series $x_1, x_2, \ldots, x_n$. The fit of the ARMA(p,q) model is the estimation of the parameters $\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q, \sigma_Z^2$.


The methods of moments and least squares are applied as for MA(q).

ARMA(1,1): $X_t = \phi X_{t-1} + Z_t + \theta Z_{t-1}$

Method of moments: from $\rho_1 = \dfrac{(\phi+\theta)(1+\phi\theta)}{1+2\phi\theta+\theta^2}$, $\rho_2 = \phi\rho_1$ and $\sigma_X^2 = \dfrac{1+2\phi\theta+\theta^2}{1-\phi^2}\sigma_Z^2$: estimate $\rho_1, \rho_2, \sigma_X^2$ by $r_1, r_2, s_X^2$, solve the equation system with respect to $\phi, \theta$, and then $s_Z^2 = \dfrac{1-\hat\phi^2}{1+2\hat\phi\hat\theta+\hat\theta^2}s_X^2$.

Method of ordinary least squares: we assume $z_0 = 0$ (and $x_0 = 0$) and compute the errors recursively,

$z_1 = x_1$
$z_2 = x_2 - \phi x_1 - \theta z_1$
$z_3 = x_3 - \phi x_2 - \theta z_2$
$\cdots$
$z_n = x_n - \phi x_{n-1} - \theta z_{n-1}$

and minimize $\sum_t z_t^2$: a computational algorithm of least squares with constraints for invertibility and stationarity.

Example: growth rate of the GNP of the USA (quarterly observations, 2nd quarter 1947 to 1st quarter 1991; corrected for seasonality).

[Figure: increments of the USA GNP; their autocorrelation and partial autocorrelation; AIC(p,q) of ARMA models for q = 0,...,5. Which order for the ARMA model: ARMA(2,2)?]

[Figure: incr.GNP(USA), AR(3) fit and MA(2) fit for comparison, whole record and a zoom on t = 100-140.]

Parameter estimation (OLS): $\bar x = 0.0077$, $\hat\phi_1 = 0.614$, $\hat\phi_2 = -0.455$, $\hat\theta_1 = 0.301$, $\hat\theta_2 = 0.600$.

Fitted ARMA(2,2): $x_t = 0.0065 + 0.614x_{t-1} - 0.455x_{t-2} + z_t + 0.301z_{t-1} + 0.600z_{t-2}$, $t = 1, \ldots, 176$; standard deviation of the errors (residuals) $s_z = 0.00983$ ($s_z^2 = 0.000097$).

[Figure: incr.GNP(USA), ARMA(2,2) fit, whole record and a zoom on t = 100-140.]

Model for time series with trends (ARIMA)

Random walk: $Y_t = Y_{t-1} + X_t$, where $X_t$ is an iid process with $\mathrm{E}[X_t] = 0$ and $\mathrm{E}[X_t^2] = \sigma^2$. It is an AR(1) process with φ = 1 (a non-stationary process). First differences, $X_t = (1-B)Y_t = Y_t - Y_{t-1}$, recover the iid process.

More generally, let $\{Y_t\}$ be a non-stationary process that exhibits trends. First differences $X_t = Y_t - Y_{t-1}$: a stationary process? If NO, take second-order differences $X_t = Y_t - 2Y_{t-1} + Y_{t-2}$, and so on, until the differenced series is stationary (YES); then ask which of AR(p), MA(q), ARMA(p,q) fits it.

Non-stationary process ARIMA(p,d,q): $\{Y_t\}$ becomes stationary after d-order differences $X_t = (1-B)^d Y_t$, and $X_t$ follows an ARMA(p,q), $\phi(B)X_t = \theta(B)Z_t$, i.e.

$\phi(B)(1-B)^d Y_t = \theta(B)Z_t$

The polynomial $\phi(B)(1-B)^d$ has a unit root and all its other roots outside the unit circle. Usually d = 1.
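A tiny simulation of the point above, assuming synthetic data only: differencing a random walk returns exactly its iid increments.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(500)      # iid increments X_t
y = np.cumsum(x)                  # random walk Y_t = Y_{t-1} + X_t (non-stationary)
dy = np.diff(y)                   # first differences (1-B)Y_t

# the differenced series is exactly the iid increment series again
print(np.allclose(dy, x[1:]))     # True
```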

Fit of an ARIMA model (the Box-Jenkins approach)

Time series $y_1, y_2, \ldots, y_n$ → history diagram (time plot) of the time series → indication that there is a trend? (autocorrelation strong and slowly decaying; possibly other indications)

● If yes: take d-order differences $x_t = (1-B)^d y_t$ → stationary time series $x_1, x_2, \ldots, x_n$ (if the autocorrelation decays to zero, the time series is stationary) → fit a model AR(p), MA(q) or ARMA(p,q): choose the model order, estimate the model parameters, check model adequacy with a diagnostic test → model ARMA(p,q) for $x_1, \ldots, x_n$; then, using the inverse of the transform $x_t = (1-B)^d y_t$, we get the model ARIMA(p,d,q) for $y_1, \ldots, y_n$.

● If the autocorrelation is statistically not significant: is the series iid? Apply a test for independence. If YES: STOP. If NO: a nonlinear model? prediction?

Example: annual index of global temperature (anomaly of the surface temperature of the northern hemisphere on a 5° x 5° grid), time period 1850-2011. Source: http://www.cru.uea.ac.uk/cru/data/temperature

$y_1, y_2, \ldots, y_n$: real observations. Is the time series stationary? NO → first differences $x_1, x_2, \ldots, x_n$. Is the differenced time series stationary? YES.

[Figure: annual land air temperature anomalies and their autocorrelation (strong and slowly decaying); first differences of the anomalies and their autocorrelation (decaying quickly to zero).]

Model for the time series $x_1, \ldots, x_n$? Autocorrelation, partial autocorrelation and the AIC criterion. [Figure: autocorrelation and partial autocorrelation of the differenced series; AIC(p,q) of ARMA models for q = 0,...,5.] The most appropriate model?

Fit of MA(4) ($\bar x = 0.008$): estimated coefficients $\hat\theta_1 = 0.758$, $\hat\theta_2 = 0.022$, $\hat\theta_3 = 0.219$, $\hat\theta_4 = 0.275$, constant 0.008, with $s_z = 0.2035$ ($s_z^2 = 0.0414$).

[Figure: first differences of the global temperature with the ARMA(0,4) fit, full record and a zoom on 1930-1960.]

Model for the time series $y_1, \ldots, y_n$: ARIMA(0,1,4), $(1-B)Y_t = \theta(B)Z_t$ with θ(B) of order 4.

Model of time series with seasonality (ARMAs)

Given a time series $x_1, x_2, \ldots, x_n$ without trend and with seasonality of period s (k = n/s cycles of period s): $x_1, \ldots, x_s,\ x_{s+1}, \ldots, x_{2s},\ x_{2s+1}, \ldots, x_{3s},\ \ldots, x_n$.

Removal of the seasonality of period s from a series $y_t$:
● estimation of the periodic components $s_i$, i = 1, ..., s: $s_i = \dfrac{1}{k}\sum_{j=0}^{k-1}y_{i+js}$, and $x_t = y_t - s_t$
● symmetric moving average of order s: for s even, $x_t = \dfrac{1}{s}\left(0.5y_{t-s/2} + y_{t-s/2+1} + \cdots + y_{t+s/2-1} + 0.5y_{t+s/2}\right)$; for s odd, $x_t = \dfrac{1}{s}\sum_{i=-(s-1)/2}^{(s-1)/2}y_{t+i}$
● s-differences (difference of lag s): $X_t = \nabla_s Y_t = (1-B^s)Y_t = Y_t - Y_{t-s}$

Hypothesis: there are correlations, but only between the same components of each period (the dependence occurs at time steps s). Then the same ARMA(P,Q) model holds for each of the s subseries $x_i, x_{i+s}, x_{i+2s}, \ldots, x_{i+ks}$, i = 1, 2, ..., s:

$X_t = \Phi_1 X_{t-s} + \cdots + \Phi_P X_{t-Ps} + Z_t + \Theta_1 Z_{t-s} + \cdots + \Theta_Q Z_{t-Qs}$, $t = Ps+1, \ldots, n$

i.e. $\Phi(B^s)X_t = \Theta(B^s)Z_t$: the model ARMA(P,Q)s for $x_1, x_2, \ldots, x_n$. A sketch of the seasonality-removal operations follows.
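A minimal sketch of two of the seasonality-removal operations listed above (period averages and lag-s differences), with a hypothetical monthly series `y` and s = 12; function names are illustrative.

```python
import numpy as np

def remove_seasonality(y, s):
    """Periodic components s_i (mean over the k = n/s cycles) and the deseasoned series x_t = y_t - s_t."""
    y = np.asarray(y, dtype=float)
    k = len(y) // s
    y = y[:k * s]                              # keep whole cycles only
    seasonal = y.reshape(k, s).mean(axis=0)    # s_1, ..., s_s
    x = y - np.tile(seasonal, k)
    return seasonal, x

def seasonal_difference(y, s):
    """Lag-s differences x_t = y_t - y_{t-s} (the operator 1 - B^s)."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]
```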

Model of time series with seasonality (ARIMAs)

Given a time series $y_1, y_2, \ldots, y_n$ with "seasonal trend" (trend at the time points t, t+s, t+2s, ...) and correlations only between components of the same periodic order, this is an extension of ARMA(P,Q)s.

s-differences (difference of lag s): $x_t = (1-B^s)y_t = y_t - y_{t-s}$, turning $y_1, \ldots, y_n$ into $x_{s+1}, \ldots, x_n$.

ARMA(P,Q)s: $\Phi(B^s)X_t = \Theta(B^s)Z_t$
ARIMA(P,1,Q)s: $\Phi(B^s)(1-B^s)Y_t = \Theta(B^s)Z_t$
In general, ARIMA(P,D,Q)s: $\Phi(B^s)(1-B^s)^D Y_t = \Theta(B^s)Z_t$

Example: mean monthly temperature at the Thessaloniki station, period 1930-2000 ($x_1, \ldots, x_n$ with n = 71 x 12 = 852).

Part of the record:

YR   JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC
1930  6.7  6.7 11.3 15.7 19.0 22.6 26.2 26.0 22.8 17.5 12.1  8.9
1931  7.9  8.8 10.0 12.7 19.7 24.9 27.4 26.9 21.5 16.0 10.8  4.1
1932  5.0  2.9  7.4 14.1 19.4 24.1 27.2 26.1 24.1 21.5 11.7  9.2
1933  5.2  7.6  9.1 13.5 17.6 22.8 25.5 25.3 20.5 17.7 14.2  5.5
1934  5.3  5.7 12.6 15.9 20.5 23.9 26.9 26.3 23.1 17.6 13.4 10.0
1935  4.5  6.8  8.2 14.7 18.7 24.9 26.1 26.0 22.8 19.8 12.1  9.2
1936 10.5  8.3 12.2 16.0 18.4 23.0 27.1 25.9 21.6 16.3 12.3  7.2
1937  4.8  8.2 12.9 14.5 19.7 24.4 26.4 26.0 23.6 16.9 12.8  8.0
1938  4.8  6.6 11.0 12.9 18.4 24.1 27.5 27.1 22.2 17.9 12.8  8.8
1939  7.9  7.9  8.5 15.5 19.8 23.1 27.2 26.6 21.9 18.5 12.3  7.7
1940  3.1  6.8  8.1 13.9 17.5 22.9 26.6 24.3 21.4 18.5 13.0  4.2
1941  6.9 10.2 11.0 15.6 19.2 23.7 26.0 26.0 18.9 15.7 10.1  4.8
1942  0.9  5.6  8.6 13.9 20.7 24.6 25.4 26.1 23.8 17.5 10.7  7.3

[Figure: Thessaloniki temperature, period 1/1930-12/2000, plotted over the years and by month; its autocorrelation, showing strong seasonality (periodicity); the series after subtracting the average of each month; the 12-term moving average; the 12-order differences. Same ARMA(P,Q)s model for each month?]

[Figure: AIC(p,q) of ARMA12 models, q = 0,...,5; ARMA(1,1)12 fit of the Thessaloniki temperature.]

Fit of ARMA(1,1)12: $\bar x = 15.928$, $x_t = 0.0075 + 0.9995x_{t-12} + z_t + 0.5242z_{t-12}$, with $s_z = 1.932$ ($s_z^2 = 3.733$); standard deviation of the residual time series: 1.603; fit with the estimation of the periodic component: $s_z = 1.427$.

Model of time series with trend and seasonality (SARIMA)

$y_1, y_2, \ldots, y_n$ → removal of trend and removal of seasonality → $x_1, x_2, \ldots, x_n$

● dependence between successive observations (time step 1): $\ldots, x_{t-2}, x_{t-1}, x_t, x_{t+1}, x_{t+2}, \ldots$
● dependence between seasonal components of the same seasonal order (time step s): $\ldots, x_{t-2s}, x_{t-s}, x_t, x_{t+s}, x_{t+2s}, \ldots$

Seasonal multiplicative model SARIMA(p,d,q)×(P,D,Q)s:

$\phi(B)\Phi(B^s)(1-B)^d(1-B^s)^D Y_t = \theta(B)\Theta(B^s)Z_t$

Given that the time series has trend and seasonality s, this combines ARIMA(p,1,q), $\phi(B)(1-B)Y_t = \theta(B)Z_t$, with ARIMA(P,1,Q)s, $\Phi(B^s)(1-B^s)^D Y_t = \Theta(B^s)Z_t$; most often d = 1 and D = 0. With d = 0 and D = 0 the model is SARMA(p,q)×(P,Q)s.

Example: monthly index of global temperature (anomaly of the surface temperature of the northern hemisphere on a 5° x 5° grid), time period 1850-2011. Source: http://www.cru.uea.ac.uk/cru/data/temperature

$y_1, y_2, \ldots, y_n$: real observations.

[Figure: monthly land air temperature anomalies, period 1/1850-12/2011 (also marked separately for January, May and September), and their autocorrelation.]

Removal of trend? Removal of seasonality / periodicity? Dependence between successive observations (time step 1)? Dependence between seasonal components of the same seasonal order (time step s)?

First differences and differences of lag 12:

[Figure: first differences of the monthly global temperature and its lag-12 differences (zoom on Jan 1950 - Jan 1962); the autocorrelation of the first differences and of the 12-differences, with significant autocorrelations at τ = 1, 2, ... and at τ = 12, 24, ...; the partial autocorrelation of the differenced series.]

[Figure: AIC(p,q) of SARMA(p,q)×(P,Q)12 models fitted to the differenced monthly global temperature, for (P,Q) = (0,0), (1,0), (1,1), (0,1), (0,2), (2,0), (2,3) and (1,2), each with q = 0,...,4.]

min(AIC) = -1.622 for SARMA(3,3)×(1,2)12; SARMA(1,2)×(1,1)12 gives AIC = -1.618.

50 60 70 80 90 00 10 20 30 40 50 60 70 80 90 00 10 20-4

-2

0

2

4

time t

x(t

)

diff global temp: ARMA(3,3)x(1,2)12

fit

60-4

-2

0

2

4

time t

x(t

)

diff global temp: ARMA(3,3)x(1,2)12

fit

SARMA(3,3) (1,2)12

1 2 3 12 13 14 15

1 2 3 12 13 14 15

24 25 26 27

1.12 0.70 0.22 0.95 1.11 0.70 0.18

0.42 0.22 0.95 1.01 0.48 0.23 0.93

0.13 0.08 0.05 0.08

t t t t t t t t

t t t t t t t t

t t t t

x x x x x x x x

z z z z z z z z

z z z z

0.445zs

0.0013x

50 60 70 80 90 00 10 20 30 40 50 60 70 80 90 00 10 20-4

-2

0

2

4

time t

x(t

)

diff global temp: ARMA(1,2)x(1,1)12

fit

60-4

-2

0

2

4

time t

x(t

)

diff global temp: ARMA(1,2)x(1,1)12

fit

SARMA(1,2) (1,1)12

1 12 13

1 2 12 13 14

0.35 0.98 0.34

1.04 0.1 0.93 0.99 0.12

t t t t

t t t t t t

x x x x

z z z z z z

0.446zs

Prediction of time series

Models for time series (AR, MA, ARMA, ARIMA, SARIMA) → prediction. Many applications:

● Index and volume of the Athens Stock Exchange (ASE): can we predict the index or the volume for the first day(s) of May 2002, given the observations until the end of April 2002?

● General index of consumer prices (GICP): at what level will the GICP move in the next months? [Figure: General Index of Consumer Prices, period Jan 2001 - Aug 2005.]

● Sunspots: given the number of sunspots up to the current date, how many sunspots will there be in the next year(s)? [Figure: annual sunspots, periods 1700-2001, 1900-2001 and 1960-2001.]

● Heart rate: what are the next heart rate value(s)?

The problem of time series prediction

● We are given the time series $x_1, x_2, \ldots, x_n$ up to time n.
● We want to estimate $x_{n+k}$.

Prediction: $x_n(k)$. Prediction error: $e_n(k) = x_{n+k} - x_n(k)$.

For a stochastic process $\{X_n\}$, the prediction $X_n(k)$ is the estimate of the observation $X_{n+k}$. Best prediction: $X_n(k) = \mathrm{E}[X_{n+k} \mid X_n, X_{n-1}, \ldots, X_1]$.

Properties of a good prediction:
● unbiasedness: $\mathrm{E}[X_n(k)] = \mathrm{E}[X_{n+k}]$
● efficiency, meaning a small prediction error: $\mathrm{Var}[X_{n+k} - X_n(k)]$ small

Optimizing both unbiasedness and efficiency leads to the minimization of the mean square prediction error $\mathrm{E}\left[(X_{n+k}-X_n(k))^2\right]$.

Statistical measures of error. Evaluation of a prediction model: given $x_1, \ldots, x_n$ (learning / training set) and also $x_{n+1}, x_{n+2}, \ldots, x_{n+l}$ (test / validation set), we form the predictions $x_n(k), x_{n+1}(k), \ldots, x_{n+l-k}(k)$ at k time steps ahead and the prediction errors $e_j(k) = x_{j+k} - x_j(k)$, $j = n, n+1, \ldots, n+l-k$, and compute:

mean square error: $\mathrm{mse}(k) = \dfrac{1}{l-k+1}\sum_{j=n}^{n+l-k}e_j(k)^2 = \dfrac{1}{l-k+1}\sum_{j=n}^{n+l-k}\left(x_{j+k}-x_j(k)\right)^2$

root mean square error: $\mathrm{rmse}(k) = \sqrt{\mathrm{mse}(k)}$

normalized root mean square error: $\mathrm{nrmse}(k) = \sqrt{\dfrac{\sum_{j=n}^{n+l-k}\left(x_{j+k}-x_j(k)\right)^2}{\sum_{j=n}^{n+l-k}\left(x_{j+k}-\bar x\right)^2}}$

nrmse ≈ 0: very good prediction; nrmse ≈ 1: prediction at the level of the mean value prediction.

Prediction limits: $x_n(k) \pm c_{1-\alpha/2}\sqrt{\mathrm{Var}[e_n(k)]}$; if $z_t \sim \mathrm{N}(0,\sigma_z^2)$ then $c_{1-\alpha/2} = z_{1-\alpha/2}$, the normal quantile.

Two prediction settings: prediction many steps ahead for a given current time ($x_n(1), x_n(2), \ldots, x_n(k)$), and prediction at a given time step ahead for different current times ($x_n(k), x_{n+1}(k), \ldots, x_{n+l-k}(k)$).

To evaluate the predictability of a prediction model:
1. We estimate the model parameters based on the time series $x_1, x_2, \ldots, x_n$.
2. We compute predictions for some time step ahead k: $x_n(k), x_{n+1}(k), \ldots, x_{n+l-k}(k)$.
3. We compute a statistic of the prediction errors, e.g. rmse(k).
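A minimal sketch of the nrmse statistic above; here the normalization uses the mean of the observed test values (the slides' $\bar x$ could equally be the training-set mean), and the input arrays are hypothetical.

```python
import numpy as np

def nrmse(x_true, x_pred):
    """Normalized root mean square error of predictions against observations,
    relative to the naive mean-value prediction."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    num = np.sum((x_true - x_pred) ** 2)
    den = np.sum((x_true - x_true.mean()) ** 2)
    return np.sqrt(num / den)
```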

Simple prediction techniques

Deterministic trend (revisited): $x_t = \mu_t + z_t$, where $\mu_t$ is the trend, a slowly varying function of time, and $z_t \sim \mathrm{WN}(0,\sigma_z^2)$ is white noise.

Prediction: $x_n(k) = \mathrm{E}[\mu_{n+k} + z_{n+k} \mid x_n, \ldots, x_1] = \mu_{n+k}$. Prediction error: $e_n(k) = z_{n+k}$.

$\mu_t$ = ? If it is known: simple substitution. If it is unknown: estimation, e.g. by a polynomial $p(t) = c_0 + c_1 t + \cdots + c_m t^m$, fitted globally (to $x_1, \ldots, x_n$) or locally (only to the m last observations $x_{n-m+1}, \ldots, x_n$). The solution is the extrapolation of the function $\mu_t$ for times > n.

[Figure: General Index of Consumer Prices, period Jan 2001 - Aug 2005, prediction with trend extrapolation (polynomial fit of the trend); the same idea applies to the index and volume of the ASE.]

Deterministic seasonal term: $x_t = s_t + z_t$; deterministic seasonal term and deterministic trend: $x_t = \mu_t + s_t + z_t$. Same approach: estimation of the deterministic terms.

-3

-2

-1

0

1

2

3

years

detr

ended G

ICP

General Index of Comsumer Prices, linear trend is subtracted

t t tx x

01 02 03 04 05 06-4

-3

-2

-1

0

1

2

3

4

years

year

cycle

of

GIC

P

General Index of Comsumer Prices, year cycle

1

n

t ts

01 02 03 04 05 06-4

-3

-2

-1

0

1

2

3

4

years

detr

ended a

nd d

eseasoned G

ICP

General Index of Comsumer Prices, trend and period comp. subtracted

t t t t t tz x s x s

( )n n k n kx k s

103.9 + 0.31t t

57 103.9 + 0.31*57 121.70

Prediction of Sept 2005

9 0.16s

56(1) 121.86x

GICP, January 2001 – August 2005

Exponential smoothing

Estimation of $x_{n+k}$ as a weighted sum of former observations:

$x_n(k) = c_0 x_n + c_1 x_{n-1} + \cdots + c_{n-1}x_1 = \sum_{j=0}^{n-1}c_j x_{n-j}$

Desired conditions on the weights: $c_0 > c_1 > \cdots > c_{n-1}$ and $\sum_{j=0}^{n-1}c_j \approx 1$.

Determination of the weights with a single parameter λ: $c_j = \lambda(1-\lambda)^j$, $j = 0, 1, \ldots, n-1$, $0 < \lambda < 1$.

Recursive relation: $x_n(k) = \lambda x_n + (1-\lambda)x_{n-1}(k)$.

Prediction with exponential smoothing, examples (index and volume of the ASE, heart rate, sunspots): prediction at one time step ahead for all days in May 2002, and comparison of the prediction performance of exponential smoothing for different λ. A large λ (weighting most the most recent observations) gives the best prediction.
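A minimal sketch of the exponential smoothing recursion above for one-step-ahead prediction; the initialization with the first observation is an assumption, not from the slides.

```python
import numpy as np

def exp_smoothing_forecast(x, lam):
    """One-step-ahead exponential smoothing: x_n(1) = lam*x_n + (1-lam)*x_{n-1}(1).
    forecast[i] is the prediction of x[i+1] made after observing x[i]."""
    x = np.asarray(x, dtype=float)
    forecast = np.empty(len(x))
    forecast[0] = x[0]                       # initialize with the first observation (assumption)
    for t in range(1, len(x)):
        forecast[t] = lam * x[t] + (1.0 - lam) * forecast[t - 1]
    return forecast
```

Comparing the prediction errors of this recursion for several values of λ reproduces the kind of comparison described above.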

Predictions with AR, MA and ARMA

Prediction with autoregressive models (AR)

AR(1) model: $x_t = \phi x_{t-1} + z_t$, $z_t \sim \mathrm{WN}(0,\sigma_z^2)$. Given the time series $x_1, x_2, \ldots, x_n$:

For t = n+1: $x_{n+1} = \phi x_n + z_{n+1}$. Optimal prediction at time step 1: $x_n(1) = \phi x_n$; prediction error $e_n(1) = z_{n+1}$, $\mathrm{Var}[e_n(1)] = \sigma_z^2$.

For t = n+2: $x_{n+2} = \phi x_{n+1} + z_{n+2}$. Optimal prediction at time step 2: $x_n(2) = \phi x_n(1) = \phi^2 x_n$; prediction error $e_n(2) = z_{n+2} + \phi z_{n+1}$, $\mathrm{Var}[e_n(2)] = (1+\phi^2)\sigma_z^2$.

For t = n+k: optimal prediction at time step k: $x_n(k) = \phi^k x_n$; prediction error $e_n(k) = z_{n+k} + \phi z_{n+k-1} + \cdots + \phi^{k-1}z_{n+1}$, $\mathrm{Var}[e_n(k)] = \dfrac{1-\phi^{2k}}{1-\phi^2}\sigma_z^2$.

AR(p) model: $x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + z_t$.

For t = n+1: $x_n(1) = \phi_1 x_n + \cdots + \phi_p x_{n-p+1}$; prediction error $e_n(1) = z_{n+1}$, $\mathrm{Var}[e_n(1)] = \sigma_z^2$.

For t = n+2: $x_n(2) = \phi_1 x_n(1) + \phi_2 x_n + \cdots + \phi_p x_{n-p+2}$; prediction error $e_n(2) = z_{n+2} + \phi_1 z_{n+1}$, $\mathrm{Var}[e_n(2)] = (1+\phi_1^2)\sigma_z^2$.

Optimal prediction at time step k: $x_n(k) = \phi_1 x_n(k-1) + \phi_2 x_n(k-2) + \cdots + \phi_p x_n(k-p)$, where $x_n(j)$ is the prediction if j > 0 and the observation $x_{n+j}$ if j ≤ 0.

Prediction error: $e_n(k) = \sum_{j=0}^{k-1}b_j z_{n+k-j}$ for suitable coefficients $b_j$ (with $b_0 = 1$), so $\mathrm{Var}[e_n(k)] = \sigma_z^2\sum_{j=0}^{k-1}b_j^2$.
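A minimal sketch of the iterated multi-step AR(p) prediction rule above (zero-mean form, no constant); the function and argument names are hypothetical, and `x` must hold at least p observations.

```python
import numpy as np

def ar_predict(x, phi, k_max):
    """Iterated predictions x_n(1..k_max) of an AR(p) with coefficients phi:
    x_n(k) = phi_1*x_n(k-1) + ... + phi_p*x_n(k-p), with x_n(j) = x_{n+j} for j <= 0."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    hist = list(np.asarray(x, dtype=float)[-p:])   # last p observations x_{n-p+1}, ..., x_n
    preds = []
    for _ in range(k_max):
        nxt = float(phi @ np.array(hist[::-1]))    # phi_1*(most recent) + ... + phi_p*(oldest)
        preds.append(nxt)
        hist = hist[1:] + [nxt]                    # slide the window: predictions replace observations
    return np.array(preds)

# example: AR(1) with phi = 0.8 and x_n = 1 gives 0.8, 0.64, 0.512, ...
print(ar_predict([1.0], [0.8], 3))
```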

Estimated AR coefficients $\phi_1, \ldots, \phi_p$ for the example series:

ASE index: AR(1): 0.9995; AR(6): 1.1535, -0.2126, 0.0944, -0.0655, 0.0103, 0.0194; AR(11): 1.1523, -0.2131, 0.0961, -0.0663, 0.0135, -0.0031, -0.0058, 0.0190, -0.0175, 0.0458, -0.0213.
[Figure: ASE index, multi-step prediction for May 2002, $x_n(k)$, n = 30.04.2002, k = 1,...,20; one step ahead prediction in May 2002, $x_n(1)$, n = 2.05.2002 to 31.05.2002.]

ASE volume: AR(1): 0.9097; AR(6): 0.3412, 0.2092, 0.1557, 0.1369, 0.0773, 0.0528; AR(11): 0.3251, 0.1955, 0.1380, 0.1138, 0.0455, 0.0009, 0.0350, 0.0068, 0.0249, 0.0420, 0.0527.
[Figure: ASE volume, multi-step prediction for May 2002, $x_n(k)$, n = 30.04.2002, k = 1,...,20; one step ahead prediction in May 2002, $x_n(1)$, n = 2.05.2002 to 31.05.2002.]

Sunspots: AR(1): 0.8205; AR(6): 1.3231, -0.5297, -0.1655, 0.1895, -0.2576, 0.1702; AR(11): 1.1848, -0.4385, -0.1718, 0.1933, -0.1324, 0.0311, 0.0157, -0.0203, 0.1993, -0.0186, 0.0352.
[Figure: sunspots, multi-step prediction from 1991 to 2001, $x_n(k)$, n = 1990, k = 1,...,21; prediction one year ahead in the period 1991-2001, $x_n(1)$, n = 1991,...,2001.]

Heart rate: AR(1): 0.8065; AR(6): 0.7850, -0.1205, 0.1983, 0.1438, -0.1407, -0.0465; AR(11): 0.7803, -0.0736, 0.1759, 0.0858, -0.1239, -0.1899, 0.1413, 0.0761, 0.0073, -0.0463, 0.0347.
[Figure: heart rate, prediction of the next 21 heart rates, $x_n(k)$, n = 1060, k = 1,...,21; one step ahead prediction, $x_n(1)$, n = 1061,...,1081.]

Growth rate of GNP of USA: the observations are quarterly, from the second quarter of 1947 to the first quarter of 1991 (n = 176).

[Figure: rate of growth of the GNP of the USA; its autocorrelation and partial autocorrelation; AIC(p) of AR models.]

AR(3): $x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \phi_3 x_{t-3} + z_t$, with $\hat\mu = 0.0077$, $\hat\phi_0 = \hat\mu(1-\hat\phi_1-\hat\phi_2-\hat\phi_3) = 0.0047$, $\hat\phi_1 = 0.35$, $\hat\phi_2 = 0.18$, $\hat\phi_3 = -0.14$, $s_z = 0.0098$:
$x_t = 0.0047 + 0.35x_{t-1} + 0.18x_{t-2} - 0.14x_{t-3} + z_t$

AR(1): $x_t = 0.0047 + 0.38x_{t-1} + z_t$, $s_z = 0.0099$.

[Figure: prediction of the rate growth with AR(3) and with AR(1), $x_n(k)$, n = 170, k = 1,...,6.]

Predictability for k steps ahead with AR(p), p = 1,...,10: one-step and two-step predictions on the last 50 data ($x_n(1)$, n = 126,...,176) and on the last 30 data ($x_n(1)$, n = 146,...,176). [Figure: nrmse(k) for k = 1, 2 as a function of p, on the last 50 and the last 30 data.]

Prediction with moving average models (MA)

MA(1) model: $x_t = z_t + \theta z_{t-1}$, $z_t \sim \mathrm{WN}(0,\sigma_z^2)$. Note that $\mathrm{E}[z_{n+j} \mid x_n, x_{n-1}, \ldots] = z_{n+j}$ if $j \leq 0$ and $= 0$ if $j > 0$.

For t = n+1: $x_{n+1} = z_{n+1} + \theta z_n$. Optimal prediction at time step 1: $x_n(1) = \theta z_n$; prediction error $e_n(1) = z_{n+1}$, $\mathrm{Var}[e_n(1)] = \sigma_z^2$.

For t = n+2: $x_{n+2} = z_{n+2} + \theta z_{n+1}$. Optimal prediction at time step 2: $x_n(2) = 0$; prediction error $e_n(2) = x_{n+2} = z_{n+2} + \theta z_{n+1}$, $\mathrm{Var}[e_n(2)] = (1+\theta^2)\sigma_z^2 = \sigma_x^2$.

For time step k: $x_n(k) = \theta z_n$ for k = 1 and $x_n(k) = 0$ for k > 1; $e_n(k) = z_{n+1}$ for k = 1 and $e_n(k) = z_{n+k} + \theta z_{n+k-1}$ for k > 1.

MA(q) model: $x_t = z_t + \theta_1 z_{t-1} + \cdots + \theta_q z_{t-q}$.

For t = n+1: $x_n(1) = \theta_1 z_n + \theta_2 z_{n-1} + \cdots + \theta_q z_{n-q+1}$.
For t = n+2: $x_n(2) = \theta_2 z_n + \theta_3 z_{n-1} + \cdots + \theta_q z_{n-q+2}$.
Optimal prediction at time step k: $x_n(k) = \theta_k z_n + \theta_{k+1}z_{n-1} + \cdots + \theta_q z_{n-q+k}$ if $k \leq q$, and $x_n(k) = 0$ if $k > q$.
Prediction error: $e_n(k) = \sum_{j=0}^{k-1}\theta_j z_{n+k-j}$ (with $\theta_0 = 1$ and $\theta_j = 0$ for j > q), $\mathrm{Var}[e_n(k)] = \sigma_z^2\sum_{j=0}^{k-1}\theta_j^2$.

Growth rate of GNP of USA (quarterly observations, from the second quarter of 1947 to the first quarter of 1991, n = 176).

Fitted MA(2): $x_t = 0.0077 + z_t + 0.41z_{t-1} + 0.40z_{t-2}$, $s_z = 0.0109$.

[Figure: prediction of the rate growth with MA(2), $x_n(k)$, n = 170, k = 1,...,6; nrmse(k) with MA(q) on the last 30 data ($x_n(1)$, n = 146,...,176), k = 1, 2.]

ARMA(p,q) model: $x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + z_t + \theta_1 z_{t-1} + \cdots + \theta_q z_{t-q}$.

For t = n+1: $x_{n+1} = \phi_1 x_n + \cdots + \phi_p x_{n-p+1} + z_{n+1} + \theta_1 z_n + \cdots + \theta_q z_{n-q+1}$.
Optimal prediction at time step 1: $x_n(1) = \phi_1 x_n + \cdots + \phi_p x_{n-p+1} + \theta_1 z_n + \cdots + \theta_q z_{n-q+1}$; prediction error $e_n(1) = z_{n+1}$, $\mathrm{Var}[e_n(1)] = \sigma_z^2$.

Optimal prediction at time step k:
$x_n(k) = \phi_1 x_n(k-1) + \cdots + \phi_p x_n(k-p) + \theta_k z_n + \cdots + \theta_q z_{n-q+k}$ if $k \leq q$
$x_n(k) = \phi_1 x_n(k-1) + \cdots + \phi_p x_n(k-p)$ if $k > q$

Prediction with ARMA: a merging of the predictions with AR and MA.

Growth rate of GNP of USA, fitted ARMA(3,2): $x_t = 0.0034 + 0.15x_{t-1} + 0.29x_{t-2} + 0.12x_{t-3} + z_t + 0.33z_{t-1} + 0.13z_{t-2}$, $s_z = 0.0105$.

[Figure: prediction of the rate growth with ARMA(3,2), $x_n(k)$, n = 170, k = 1,...,6; nrmse(k) with ARMA(p,1) on the last 30 data ($x_n(1)$, n = 146,...,176), k = 1, 2.]

Prediction of non-stationary time series

Given a non-stationary time series $y_1, y_2, \ldots, y_n$, the standard decomposition model for $y_t$ is $y_t = \mu_t + s_t + x_t$, and the prediction of $y_{n+k}$ is $y_n(k) = \mu_{n+k} + s_{n+k} + x_n(k)$.

Stages of prediction:
1. Transformation to a stationary series $x_1, x_2, \ldots, x_n$: estimation of $\mu_t$ and $s_t$ as functions of time t, or removal of $\mu_t$ and $s_t$ using differences (models ARIMA or SARIMA).
2. Prediction of $x_{n+k}$ with some model (of ARMA type): $x_n(k)$.
3. Inverse transform on the prediction: $y_n(k)$.

ARIMA(p,1,q):
1. Transformation: $x_t = y_t - y_{t-1}$, i.e. $y_1, y_2, \ldots, y_n \to x_2, x_3, \ldots, x_n$ (stationary).
2. Prediction of $x_{n+1}$ with ARMA(p,q): $x_n(1)$.
3. Inverse transform: $y_n(1) = y_n + x_n(1)$. Prediction error: $\tilde e_n(1) = e_n(1)$, the prediction error of $x_n(1)$.

For prediction at k steps ahead: $y_n(k) = y_n(k-1) + x_n(k)$, where $y_n(k-1)$ is known from the prediction of $y_{n+k-1}$ and $x_n(k)$ is the ARMA(p,q) prediction of $x_{n+k}$.

A similar procedure applies for the prediction with ARIMA(p,d,q) or SARIMA(p,d,q)(P,D,Q)s models.
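A minimal sketch of the three stages above for d = 1, using an AR(p) model for the differences instead of a full ARMA to keep it short; `fit_ar_ols` is the hypothetical OLS helper sketched earlier.

```python
import numpy as np

def arima_d1_predict(y, p, k_max):
    """Predict y_{n+1..n+k_max}: difference, fit AR(p) on x_t = y_t - y_{t-1},
    predict x_n(1..k_max), then invert with y_n(k) = y_n(k-1) + x_n(k)."""
    y = np.asarray(y, dtype=float)
    x = np.diff(y)                                   # stage 1: first differences
    beta, _ = fit_ar_ols(x, p)                       # intercept + AR coefficients (helper above)
    c, phi = beta[0], beta[1:]
    x_pred = []
    hist = list(x[-p:])
    for _ in range(k_max):                           # stage 2: AR predictions of the differences
        nxt = c + float(phi @ np.array(hist[::-1]))
        x_pred.append(nxt)
        hist = hist[1:] + [nxt]
    return y[-1] + np.cumsum(x_pred)                 # stage 3: inverse transform (cumulative sum)
```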

ASE index, period from January 2002 to September 2005.

[Figure: ASE general index (close), Jan 2002 - Sep 2005, and its autocorrelation (strong and slowly decaying).]

Returns: $x_t = \dfrac{y_t - y_{t-1}}{y_{t-1}}$.

[Figure: returns of the ASE general index and their autocorrelation; partial autocorrelation of the returns; AIC(p) of AR models for the returns. Order of the AR model?]

Prediction: since $y_t = y_{t-1}(1 + x_t)$, the predictions of the index are obtained from the predictions of the returns, $y_n(1) = y_n(1 + x_n(1))$ and $y_n(k) = y_n(k-1)(1 + x_n(k))$.

[Figure: prediction of many steps ahead, all for current time 20/9/2005: $x_n(k)$ of the index returns with an AR(7) model, and the corresponding $y_n(k)$ of the general index.]

[Figure: one step ahead prediction for the period 20/9/2005 - 12/10/2005: the general index with AR(1) and AR(7) predictions; nrmse of AR(p) models for the general index over 20.9.2005-12.10.2005, for k = 1, 2 and 5 steps ahead.]

Estimation of the prediction error with AR(p) models for the period 20/9/2005 - 12/10/2005.