ARIMA Models
Jamie Monogan
University of Georgia
January 16, 2018
Jamie Monogan (UGA) ARIMA Models January 16, 2018 1 / 27
Objectives
By the end of this meeting, participants should be able to:
Argue why standard regressions on trending series are inappropriate.
Describe the logic of the Box-Jenkins modeling strategy.
Define stationarity and describe the processes of a stationary series.
Explain why Maximum Likelihood Estimation is essential for ARIMA models and how MLE is tailored to time series' needs.
Identify, estimate, and diagnose ARIMA models.
Identify and estimate seasonal elements of an ARIMA model.
Spurious Regressions in Time Series
Trend on Trend Spuriousness
We also start with error specification because it is essential to testing our theories accurately.
Imagine that we have a simple hypothesis, say
Population −→ GDP
Our measures of each trend (upward, but it doesn't matter if they are opposite).
Granger and Newbold thesis: any two variables which trend will produce spurious evidence of a causal connection.
In the negative: it is impossible to observe the absence of a statistical relationship between any two trending series!
Source: Granger, C. W. J., and P. Newbold. 1974. "Spurious Regressions in Econometrics." Journal of Econometrics 2: 111-120.
Spurious Regressions in Time Series
What’s the bias with trending series?
The definition of unbiased is: E(β̂) = β
For the case of two trending series we estimate two components:
1. β itself, and
2. the ratio of the two trends.
Thus β̂ = β + Trend-Ratio.
Thus β̂ is unbiased if and only if Trend-Ratio = 0.0.
The Trend-Ratio cannot be zero if both series trend.
Spurious Regressions in Time Series
First Differencing Trending Series?
First differencing is performing the operation (1 - B)zt = wt
or simply: wt = zt - zt−1
In R: use the diff function (available in base R; it works on ts objects)
In Stata:
tsset month
then use the difference operator, e.g., d.approval
So, under what conditions is it allowable to regress one trending time series on another?
Never!
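The differencing operation above is easy to check numerically. A minimal Python sketch (the slides use R and Stata; the trend slope of 2.0 is an arbitrary illustrative value):

```python
import random

# First differencing: w_t = z_t - z_{t-1}, i.e., (1 - B)z.
# A linear trend z_t = 2t + noise becomes a roughly constant
# series of increments after a single difference.
random.seed(42)
z = [2.0 * t + random.gauss(0, 0.5) for t in range(200)]
w = [z[t] - z[t - 1] for t in range(1, len(z))]

# The differenced series hovers around the trend's slope (2.0),
# and the trend itself is gone.
mean_w = sum(w) / len(w)
```

Differencing removes the deterministic trend here for the same reason it removes a stochastic one: all systematic growth lives in the level, not in the increments.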
Structure, Error, and Procedure
Modeling Form: End Game for Inference & Forecasting
Regression
yt = {β0 + β1x1t + β2x2t + · · · } + ut
= {structure}+ error
The two components are structure and error: Y = structure + error
Box-Jenkins
yt = [transfer function] + ARIMA Model
yt = f(x) + Nt
So we start out working on ARIMA models of error aggregation, the Nt, and then later develop transfer functions as tests of the theories we care about. (The opposite of what we usually do.)
The causal flow of the transfer function cannot be observed until we successfully model the error processes.
Structure, Error, and Procedure
How a Time Series is Produced
We assume the data generating process is:
at −→ Linear filter −→ zt
A white noise input is systematically filtered into the observed time series.
Getting Back to White Noise
at −→ Linear filter −→ zt
implies
zt −→ Inverse of Linear filter −→ at
So, if we can solve for z = f(a), then we can invert f and recover a (which is white noise).
Structure, Error, and Procedure
The Box-Jenkins Procedure
Identification What class of models probably produced zt?
Estimation What are the model parameters?
Diagnosis Are the residuals, at , from the estimated model white noise?
Empirically Identifying the Error Process
We can infer the data generating process because, knowing the mathematics, we know the empirical "signature."
We will develop the signatures of a family of error aggregation models that are autoregressive (AR), integrated (I), and moving average (MA), and all combinations: ARIMA(P,D,Q).
Structure, Error, and Procedure AR(1) Example
AR(1): A very important special case
Notation: AR(1) means autoregressive, 1st order. Only the first lag of z appears in the equation:
zt = θ0 + φ1zt−1 + at
What is its signature? To answer that question it is useful to transform the equation into "shock" form, where z is a function of all previous a's:
zt = θ0 + at + φat−1 + φ^2 at−2 + φ^3 at−3 + · · · + φ^(t−1) a1
Simplify
That's an ugly equation with T terms, but it has useful information about the expected association of each of the a's with zt:
Lag 1: φ (that is, φ^1)
Lag 2: φ^2
Lag 3: φ^3
Lag k: φ^k
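This signature can be verified by simulation: the sample ACF of a simulated AR(1) tracks the powers of φ. A minimal Python sketch (φ = 0.7 is an arbitrary illustrative value; the slides themselves use R and Stata):

```python
import random

# Simulate z_t = phi * z_{t-1} + a_t and compare the sample
# autocorrelation at lag k with the theoretical value phi**k.
random.seed(1)
phi, T = 0.7, 20000
z = [0.0]
for _ in range(T):
    z.append(phi * z[-1] + random.gauss(0, 1))
z = z[1000:]  # drop burn-in so the start value does not matter

def acf(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

# acf(z, 1), acf(z, 2), acf(z, 3) should be close to 0.7, 0.49, 0.343.
```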
Structure, Error, and Procedure AR(1) Example
Autocorrelation Function for AR(1)
Since |φ| is constrained to be < 1.0, each exponential power of φ is a progressively smaller number, looking like:
Structure, Error, and Procedure AR(1) Example
Next Steps
If we observe an empirical series that shows this (very common) pattern of autocorrelation (and a couple of other details), we judge it to be AR(1).
This is the IDENTIFICATION stage: we're using empirical evidence such as correlograms to determine the error process.
Once we've tentatively judged the class of model, we estimate the parameters of such a model using MLE (ARIMA ESTIMATION, more later).
After we estimate the model, calculate the residuals, at.
Now the really neat part: if our judgment was correct, the estimated residuals at must be white noise, and we know how to test for that property! (DIAGNOSIS)
If it is white noise, then we can use this filtered series for our analysis.
Stationarity and Integration
Stationarity and Integration
A stationary series is one that tends to return to some equilibrium level after being disturbed.
A nonstationary series, or an integrated series, has no equilibrium. The most common integrated series is the random walk (e.g., the DJIA).
Box-Jenkins models are defined only for stationary time series.
The good news: integrated series can be made stationary by differencing them. (Which you know how to do.)
How to Know Stationarity
The ACF of a stationary series tends to approach zero after just a few lags
And stay there.
Integrated series show systematic behavior over very long lag lengths.
In the regression tradition, we will develop the Dickey-Fuller test for unit roots.
Stationarity and Integration
Macropartisanship: A Non-stationary Series
Slowly Decaying Autocorrelation Function
Notation
Three Models and Their Notation
The AR(1) Model
Functional: zt = φzt−1 + at , where -1 < φ < 1
Backshift: (1 - φB)zt = at
Shock form: zt = at + φat−1 + φ^2 at−2 + φ^3 at−3 + ... + φ^t a0
The MA(1) Model
Functional: zt = θ0 + at - θ1at−1
Backshift: zt = θ0 + (1 - θ1B)at
Shock form: the MA(1) is already written in terms of shocks, so there is no expansion
The I(1) Model
Functional: zt = zt−1 + at
Shock form: zt = at + at−1 + at−2 + at−3 + ... + a0
Different from AR(1): No decay terms on shocks, permanent memory.
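The "permanent memory" claim can be made concrete by injecting a single extra shock and watching whether its effect survives. A Python sketch (the shock timing and φ = 0.7 are illustrative choices):

```python
import random

# Drive an I(1) series (coef = 1.0) and a stationary AR(1)
# (coef = 0.7) with identical noise, once with an extra unit
# shock at t = 50 and once without, then compare the endpoints.
random.seed(7)
T, phi = 300, 0.7
a = [random.gauss(0, 1) for _ in range(T)]

def simulate(coef, shock):
    z = [0.0]
    for t in range(T):
        z.append(coef * z[-1] + a[t] + (1.0 if shock and t == 50 else 0.0))
    return z

rw_gap = simulate(1.0, True)[-1] - simulate(1.0, False)[-1]  # I(1)
ar_gap = simulate(phi, True)[-1] - simulate(phi, False)[-1]  # AR(1)
# rw_gap stays at 1.0 forever; ar_gap has decayed to phi**250, i.e., ~0.
```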
Notation
The General ARMA(P,Q) Model
zt = θ0 + Σ_{i=1}^p φi zt−i − Σ_{i=1}^q θi at−i + at

In econometric notation:

yt = α0 + Σ_{i=1}^p αi yt−i + Σ_{i=1}^q βi εt−i + εt
Identification Issues
Some Low Order Models
White noise: z = a
Random walk: zt − zt−1 = at; (1 − B)z = a; ∆z = a [i.e., z is cumulated white noise]
AR(1): zt = φzt−1 + at; (1 − φB)z = a
Note that (1 − φB)z = a with φ = 1.0 is then exactly a random walk. φ = 1.0 is called a unit root.
MA(1): zt = at − θat−1; z = (1 − θB)a
IMA(1,1): (1 − B)z = (1 − θB)a. Thus an IMA(1,1) is simply an MA(1) operating on first differences.
ARMA(1,1): (1 − φB)z = (1 − θB)a
Identification Issues
Autocorrelation Function Expectations
AR(1): zt = φzt−1 + at
zt−1 = φzt−2 + at−1 (from stationarity)
zt = φ(φzt−2 + at−1) + at (substituting)
zt = φat−1 + φ^2 zt−2 + at
zt = φat−1 + φ^2 at−2 + φ^3 zt−3 + at
...
zt = at + φat−1 + φ^2 at−2 + φ^3 at−3 + ... + φ^ℓ at−ℓ
We expect exponential decay in the ACF: 1, φ, φ^2, φ^3, . . .
MA(1): zt = θ0 + at − θ1at−1
ACF: E(ρ1) = −θ1/(1 + θ1^2)
Crude empirical rule of thumb: ACF(1) = −θ1/2.
This implies ACF(1) is negative (for θ1 > 0) and less than .5 in absolute value.
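The MA(1) ACF formula checks out in simulation. A Python sketch (θ1 = 0.5 is an illustrative value, giving E(ρ1) = −0.5/1.25 = −0.4):

```python
import random

# z_t = a_t - theta1 * a_{t-1}; theoretical lag-1 autocorrelation
# is -theta1 / (1 + theta1**2).
random.seed(5)
theta1, T = 0.5, 50000
a = [random.gauss(0, 1) for _ in range(T + 1)]
z = [a[t] - theta1 * a[t - 1] for t in range(1, T + 1)]

m = sum(z) / len(z)
num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, len(z)))
den = sum((v - m) ** 2 for v in z)
rho1 = num / den

theoretical = -theta1 / (1 + theta1 ** 2)  # -0.4 here
# rho1 lands near -0.4: negative and below .5 in absolute
# value, as the rule of thumb says.
```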
Identification Issues
Cookbook Rules for Identification
Fuller list: Table 6.1 in Box, Jenkins, & Reinsel (2008, 199)
AR(P): exponential decay in the ACF, P significant spikes in the PACF
I(1): slow decay in the ACF, 1 significant spike in the PACF
MA(Q): Q significant spikes in the ACF, exponential decay in the PACF
The Partial Autocorrelation Function
The PACF shows the autocorrelation at lag k controlling for all previous lags.
Thus it shows the effects at lag k which could not have been predicted from lower lags.
In effect then, it shows the independent effects of processes at lag k.
PACF(1) = ACF(1)
Estimation
Least Squares and MLE
Some ARIMA models, e.g., AR(1), are essentially linear and could be estimated by least squares. For example, zt = φ1zt−1 + at can be estimated by least squares regression if you just drop the first case.
R:
z <- ts(data$z1)
l.z <- lag(z, -1)
data2 <- ts.union(z, l.z)
reg.1 <- lm(z ~ l.z, data = data2)
Stata: tsset month, then reg z l.z
The coefficient on the lagged dependent variable is an LS estimate of φ1.
In practice, ARIMA software uses a generalized maximum likelihood algorithm for all ARIMA models. The φ's estimated by LS and ML are not identical, but the difference is nearly always trivial. This is not a case like OLS, where the LS and ML solutions are proven identical when the OLS assumptions hold.
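The same least-squares idea can be written out without any regression library: with no intercept, the LS slope is just Σ zt zt−1 / Σ zt−1^2. A Python sketch on simulated AR(1) data (φ = 0.6 is an illustrative value):

```python
import random

# Simulate z_t = phi * z_{t-1} + a_t, then estimate phi by
# regressing z_t on z_{t-1} (dropping the first case).
random.seed(11)
phi, T = 0.6, 5000
z = [0.0]
for _ in range(T):
    z.append(phi * z[-1] + random.gauss(0, 1))
z = z[500:]  # burn-in

y = z[1:]    # z_t
x = z[:-1]   # z_{t-1}
phi_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
# phi_hat should land near the true value, 0.6
```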
Estimation
Maximum Likelihood Unmasked
Maximum Likelihood Estimation is really nothing more than efficient trial and error. It has three components:
1. A function to be maximized: the log of the likelihood for the equation. (Why the log instead of the likelihood itself?)
2. An algorithm for generating efficient guesses of parameter values.
3. Starting values for the parameters.
Log of Likelihood for ARIMA Estimation
LL(θ) = −(T/2) log(2π) − (T/2) log(σ^2) − Σ_{t=1}^T at^2/(2σ^2)
This applies to any ARIMA(P,D,Q) model.
When the likelihood is known, as here, the problem reduces to finding out how to estimate θ and at.
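The log-likelihood above is easy to evaluate once residuals are in hand. A Python sketch on artificial white-noise residuals (the candidate σ^2 values are illustrative); the LL is largest near the true variance, which is what the maximizer exploits:

```python
import math
import random

# LL = -T/2 * log(2*pi) - T/2 * log(sigma2) - sum(a_t**2) / (2*sigma2)
random.seed(2)
a = [random.gauss(0, 1) for _ in range(1000)]
T = len(a)

def log_lik(resid, sigma2):
    return (-T / 2 * math.log(2 * math.pi)
            - T / 2 * math.log(sigma2)
            - sum(v * v for v in resid) / (2 * sigma2))

# The likelihood is maximized near the true variance (1.0 here),
# so wrong guesses for sigma2 score worse.
```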
Estimation
MA(1) illustration
zt = -θ1at−1 + at + θ0
Drop θ0 for simplicity
Note inherent nonlinearity of -θ1at−1
Both θ and at−1 are unobserved quantities to be estimated
Step by step
Presume for the moment that we somehow know θ
How do we estimate at?
Except for the first case, just solve one case at a time: zt is given.
Estimation
Solve the MA(1) Equation for at
Just Algebra
1. zt = −θat−1 + at
2. zt + θat−1 = at (adding θat−1 to both sides)
3. at = zt + θat−1 (reversing)
So, beginning at time zero: if we know a0, we can solve for a1; if we know a1, we can solve for a2; if we know a2, we can solve for a3; and so on, recursively, all the way to aT.
So assuming or computing a value for a0 is the key to everything.
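The recursion is short enough to run directly. A Python sketch (θ = 0.4 is an illustrative value): build an MA(1) series from known shocks, then recover the shocks with at = zt + θat−1 starting from an assumed a0 = 0. The error from the wrong starting value decays like θ^t, previewing the conditional-ML argument on the next slide.

```python
import random

# Construct z_t = a_t - theta * a_{t-1} from known shocks, then
# recover the shocks via a_t = z_t + theta * a_{t-1}, assuming a_0 = 0.
random.seed(9)
theta, T = 0.4, 200
true_a = [random.gauss(0, 1) for _ in range(T + 1)]
z = [true_a[t] - theta * true_a[t - 1] for t in range(1, T + 1)]

a_hat = [0.0]  # the (false) assumption a_0 = 0
for z_t in z:
    a_hat.append(z_t + theta * a_hat[-1])

# The startup error is theta**t times the true a_0, so by the end
# of the series the recovered shock matches the true one.
err_late = abs(a_hat[-1] - true_a[-1])
```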
Estimation Conditional Maximum Likelihood
Conditional Maximum Likelihood
E(at)=0.0;
Therefore initialize a0 = 0.0
Then maximize log-likelihood conditional on that starting value
The assumption will be false, but its effect is transient
That is guaranteed by the stationarity condition
Estimation Unconditional Maximum Likelihood with Backforecasting
Unconditional Maximum Likelihood
Backforecast a0
Because the backforecast is a product of the known z and maximum likelihood estimates of θ and a, it will be optimum. Hence full ML is preferred to conditional ML.
Backforecasting for MA(1)
Given: zt = −θ1at−1 + at
zt − at = −θ1at−1
(zt − at)/(−θ1) = at−1
Initialize aT+1 = 0. Solve each previous at−1 recursively, all the way back to a0.
Why is this optimal?
Backforecasting’s assumption about unobservables is at a series’ end,
and the error in that assumption is transient, so
the transient error will not affect the a0 forecast at the series’ origin.
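A simplified sketch of why the end-of-series assumption is harmless: an MA(1) has the same second-order structure run backward in time, so we can sweep a stable recursion over the reversed series, and two very different guesses for the unobserved shock past the end reach the origin essentially identically, because the startup error shrinks by a factor of θ at each step. (This is an illustrative simplification of full backforecasting, not the complete algorithm; θ = 0.4 is arbitrary.)

```python
import random

# Build an MA(1) series, then sweep e_t = z_t + theta * e_{t+1}
# from the end of the series back toward the origin, starting
# from an assumed value for the shock just past the end.
random.seed(13)
theta, T = 0.4, 300
a = [random.gauss(0, 1) for _ in range(T + 1)]
z = [a[t] - theta * a[t - 1] for t in range(1, T + 1)]

def back_sweep(series, theta, e_start):
    e = e_start
    for z_t in reversed(series):
        e = z_t + theta * e  # startup error shrinks by theta each step
    return e

# Wildly different end-of-series assumptions agree at the origin.
diff = abs(back_sweep(z, theta, 0.0) - back_sweep(z, theta, 10.0))
```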
Seasonality
Cyclical Data
So far, difference equations have focused on trend and error processes in relation to recent values.
Perhaps the data cycle across quarters or months.
One Approach: Linear Seasonal Terms
Perhaps just add another autoregressive or moving-average term into the difference equation:
ARMA(1,4): yt = a1yt−1 + εt + β1εt−1 + β4εt−4
ARMA(4,1): yt = a1yt−1 + a4yt−4 + εt + β1εt−1
The rub: nonseasonal patterns will interact with the seasonal.
This influences the empirical signature.
Seasonality
Multiplicative Seasonality
The multiplicative model accounts for interaction among terms:
ARMA(1,1), MA seasonal: (1 − a1L)yt = (1 + β1L)(1 + β4L^4)εt
ARMA(1,1), AR seasonal: (1 − a1L)(1 − a4L^4)yt = (1 + β1L)εt
MA seasonal, in functional terms:
yt = a1yt−1 + εt + β1εt−1 + β4εt−4 + β1β4εt−5
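The cross term at lag 5 falls out of ordinary polynomial multiplication in the lag operator. A Python sketch (β1 = 0.3 and β4 = 0.6 are illustrative values), expanding the seasonal MA product written with plus signs to match the functional form above:

```python
# Multiply lag polynomials stored as coefficient lists (index = lag).
def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

b1, b4 = 0.3, 0.6                    # illustrative values
nonseasonal = [1.0, b1]              # 1 + b1*L
seasonal = [1.0, 0.0, 0.0, 0.0, b4]  # 1 + b4*L^4
coeffs = poly_mul(nonseasonal, seasonal)
# coeffs: [1, b1, 0, 0, b4, b1*b4] -- the interaction term b1*b4
# appears at lag 5, exactly as in the functional form above.
```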
Now our task will be spotting the empirical signature of such a series.
Notation
The seasonal ARIMA model is now: ARIMA(p,d,q)(P,D,Q)s
When differencing series, subscripts refer to the seasonal period, and superscripts refer to the number of differences.
Using the above terms, our differencing notation is: ∆^d ∆_s^D
Assignments
For Next Time
Write down the research question for your term paper. What is the status of the project? Brand new? Data gathered? What?
Complete questions #1 and #2.a-2.c from page 185 of Political Analysis Using R.
Reading: Time Series Analysis for the Social Sciences, Chapter 2 (pp. 58-67) and Section 7.3 (pp. 187-205).