Stationarity and Cointegration analysis By Tinashe Bvirindi [email protected]

Layout

• Unit root testing

• Cointegration

• Vector Auto-regressions

• Cointegration in Multivariate systems

Introduction

• The stationarity or otherwise of a series can strongly influence its behaviour and properties.

• For instance, a ‘shock’ dies away if the series is stationary but is persistent if it is non-stationary.

• ‘Spurious regressions’: if variables are trended over time, a regression may produce significant coefficients and a high R² even though the relationship is meaningless.

• Two types of trend

  • Stochastic trend [random walk]

  • Deterministic trend

• Why distinguish between them?

• May look the same but have very different properties

Deterministic trends

• Taking the first difference of a trend stationary series removes the non-stationarity but at the cost of introducing an MA(1) process in the residuals.

• The resulting MA process is non-invertible, i.e. it cannot be written as an AR process
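A minimal sketch of why differencing a trend-stationary series produces an MA(1) error, assuming the standard form with a white-noise error $\varepsilon_t$ (the symbols $\alpha$ and $\beta$ are assumed here for illustration):

```latex
\begin{aligned}
y_t &= \alpha + \beta t + \varepsilon_t \\
\Delta y_t &= y_t - y_{t-1} = \beta + \varepsilon_t - \varepsilon_{t-1}
\end{aligned}
```

The differenced error $\varepsilon_t - \varepsilon_{t-1}$ is an MA(1) with coefficient $-1$. That is a unit root in the MA polynomial, which is why the process cannot be inverted into an AR form.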

Deterministic trend

Stochastic trends

• Consider the following

Stochastic trends

• Take this process forward S periods in time:

• As S approaches infinity:

  • the values of Y do not become independent of the error terms; and

  • the drift term increases over time

• This process is known as a stochastic trend because it depends on the drift and the stochastic progression of the error terms
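As a hedged illustration, assume the process on the slide is the random walk with drift Y_t = μ + Y_{t-1} + ε_t. Iterating forward S periods gives Y_{t+S} = Y_t + Sμ + (ε_{t+1} + ... + ε_{t+S}), so a past shock never fades and the drift accumulates. The sketch below compares this with a stationary AR(1); the function names and parameter values are illustrative only.

```python
def shock_path(phi, horizon):
    """Effect on Y_{t+s} of a one-unit shock at time t for Y_t = mu + phi * Y_{t-1} + e_t."""
    return [phi ** s for s in range(horizon + 1)]


def drift_contribution(mu, horizon):
    """Accumulated drift s * mu in a random walk with drift after s periods."""
    return [mu * s for s in range(horizon + 1)]


stationary = shock_path(0.5, 20)   # stationary AR(1): the shock dies away
unit_root = shock_path(1.0, 20)    # random walk: the shock is permanent
drift = drift_contribution(0.2, 20)

print(stationary[-1], unit_root[-1], drift[-1])
```

With phi = 1 the shock is still fully present after 20 periods, while with phi = 0.5 it has all but disappeared.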

Stochastic trend

Detecting unit roots: Dickey-Fuller tests

• Dickey and Fuller (Fuller, 1976; Dickey and Fuller, 1979) pioneered testing for a unit root in time series

• The basic objective of the test is to examine the null hypothesis that:

• Against a one sided alternative

Dickey Fuller tests

Reject the null if the DF statistic is more negative than the critical value
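The test can be sketched as follows, assuming the simplest Dickey-Fuller regression Δy_t = ψ y_{t-1} + u_t with no constant or trend, where the null is ψ = 0 (a unit root) and the one-sided alternative is ψ < 0. The helper names and the simulated data are hypothetical; in practice the statistic is compared with Dickey-Fuller critical values rather than the usual t tables.

```python
import math
import random


def df_statistic(y):
    """t-ratio on psi in: dy_t = psi * y_{t-1} + u_t (no constant, no trend)."""
    lagged = y[:-1]
    diff = [y[i] - y[i - 1] for i in range(1, len(y))]
    sxx = sum(x * x for x in lagged)
    psi = sum(x * d for x, d in zip(lagged, diff)) / sxx
    resid = [d - psi * x for x, d in zip(lagged, diff)]
    s2 = sum(u * u for u in resid) / (len(diff) - 1)
    return psi / math.sqrt(s2 / sxx)


def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


random_walk = simulate_ar1(1.0, 500, seed=1)   # unit root
stationary = simulate_ar1(0.5, 500, seed=1)    # no unit root
df_rw = df_statistic(random_walk)
df_st = df_statistic(stationary)
print(df_rw, df_st)
```

The stationary series gives a strongly negative statistic, so the unit root null is rejected for it; the random walk typically does not.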

Augmented Dickey Fuller Unit root Test

• The ADF test requires a specific lag length to augment the autoregressive process of Yt so as to soak up any dynamic structure present in the dependent variable and to expunge any possible serial correlation in the regression residuals.

• However:

  • a larger lag length may increase the standard errors of the coefficients, because degrees of freedom are used up.

  • a shorter lag length will not remove all the autocorrelation and will bias the estimated results (Enders, 2010; and Brooks, 2010).

• Use an information criterion to choose the lag length

An Eviews Demonstration

Unit root testing

Augmented Dickey Fuller results

Computed value

Order of integration

• A time series is said to be integrated of order d, I(d), if after differencing d times it becomes stationary

• If a variable is stationary it is said to be I(0)

• If the first difference of a non stationary variable is stationary it is said to be I(1)

• Most economic data is I(1)
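A small sketch of what "differencing d times" means in practice. The `difference` helper and the simulated series are illustrative assumptions: the level series is built as a running sum of white-noise shocks, so one round of differencing recovers the stationary shocks and the level is I(1).

```python
import random


def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]

# An I(1) series: the running sum of white-noise shocks (a random walk).
level = []
total = 0.0
for e in shocks:
    total += e
    level.append(total)

# Differencing once recovers the stationary shocks, so the level is I(1).
recovered = difference(level, d=1)
print(len(level), len(recovered))
```

Each round of differencing costs one observation, which is why `recovered` is one element shorter than `level`.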

Why are we concerned about the order of integration?

• It has a direct bearing on the appropriateness and statistical validity of regression results

• If we wish to regress Y on X when:

  • Yt and Xt are stationary, classical OLS is valid

  • Yt and Xt are integrated of different orders, the regression is meaningless

  • Yt and Xt are integrated of the same order and the residuals are non-stationary, the regression may be spurious

  • Yt and Xt are integrated of the same order and the residuals are stationary, the regression may indicate a cointegrating relationship

• The classical linear regression (CLR) model is founded on asymptotic theory, which implies that sample variances converge to constants

• This is not the case when variables are non-stationary: sample moments converge to functions of Brownian motion (Wiener processes)

Consequences of non stationarity

• Sampling distributions take a non standard form

• We can no longer rely on t and F distributions in statistical inference

• Normal hypothesis testing is invalidated

• There is a tendency to reject the null of no association, both for individual regressors and for all regressors jointly; the problem only intensifies with an increase in the sample size

High-level overview of cointegration

Introduction

• Modern econometric analysis emphasises the importance of unit root testing in conducting empirical econometric work.

• Granger and Newbold (1974): non-stationary data yield misleading or spurious regression results, i.e. regressions that do not make sense, e.g.

Introduction

• Results exhibit high R² values which converge to 1, high F and t-statistics and very low Durbin-Watson statistics (serial correlation in the residuals).

• Phillips (1986), a pioneer of asymptotic theory with I(1) variables, concurs with Granger and Newbold and proves that in the above regression:

  • While beta parameters should converge to zero as sample size approaches infinity, they are non-zero;

  • R² statistics approach ; and

  • t-statistics approach infinity.

• Brooks (2010): the results reflect contemporaneously correlated time trends instead of the true underlying relationships
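The spurious regression problem can be illustrated with a small simulation. This is a sketch under stated assumptions: two independent random walks are generated, one is regressed on the other by OLS, and the share of replications with |t| > 1.96 is recorded. The sample size, number of replications and seed are arbitrary choices.

```python
import math
import random


def slope_t_stat(x, y):
    """t-ratio of the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    rss = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    return beta / math.sqrt(rss / (n - 2) / sxx)


def random_walk(n, rng):
    """Cumulative sum of standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


rng = random.Random(42)
reps, n = 200, 100
rejections = sum(
    abs(slope_t_stat(random_walk(n, rng), random_walk(n, rng))) > 1.96
    for _ in range(reps)
)
rejection_rate = rejections / reps
print(rejection_rate)
```

Although the two series are unrelated by construction, the nominal 5% test rejects far more often than 5% of the time, which is the pattern Granger and Newbold describe.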

Introduction

• To avoid spurious regressions and to compute BLUE parameters, variables are often differenced to achieve stationarity

• However, economic and finance theory is anchored on long-run relationships, and differencing removes long-run information from the data series

Cointegration

• Engle and Granger (1987): it is possible to estimate valid regressions using non-stationary data.

• They develop a technique to estimate valid parameters and to test for long-run relationships between non-stationary variables (Granger Representation Theorem)

• A set of non-stationary variables integrated of the same order, say I(1), is said to be cointegrated if the variables combine to form a series of lower order, I(0). The variables are then linked by an equilibrium relationship spanning the long run.

Cointegration

• This implies that the variables will move closely together and will not drift arbitrarily over time, and the distance between them will be stationary

• The concept of cointegration mimics the existence of a long-run equilibrium relationship to which the variables converge over time.

• The distance that the system is away from equilibrium at any given time is termed the equilibrium error

Cointegration

• It allows for a richer study of the short-run dynamics of adjustment towards equilibrium through the use of error correction models.

Cointegration in bivariate systems

Testing for Cointegration (residuals based test)

Cointegration and error correction

Procedure in testing for Cointegration

Two-step Engle and Granger procedure

• Step 1: Run a static regression in levels between the variables

• Save the residuals series: and

• Step 2: Test for stationarity of the residuals

  • If stationary: cointegration, proceed to estimate the ECM

  • If non-stationary: no cointegration
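The two steps can be sketched outside EViews as follows. This is a minimal illustration on simulated data, not the money demand data used in the demonstration: x is a random walk and y = 1 + 2x + stationary noise, so the pair is cointegrated by construction. The helper names are hypothetical, and the residual test uses a Dickey-Fuller regression with no constant and no lag augmentation.

```python
import math
import random


def ols(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - beta * mx, beta


def df_statistic(u):
    """t-ratio on psi in du_t = psi * u_{t-1} + v_t (no constant)."""
    lagged = u[:-1]
    diff = [u[i] - u[i - 1] for i in range(1, len(u))]
    sxx = sum(a * a for a in lagged)
    psi = sum(a * d for a, d in zip(lagged, diff)) / sxx
    s2 = sum((d - psi * a) ** 2 for a, d in zip(lagged, diff)) / (len(diff) - 1)
    return psi / math.sqrt(s2 / sxx)


# Simulated data (hypothetical): x is a random walk, y = 1 + 2x + stationary noise.
rng = random.Random(7)
x, level = [], 0.0
for _ in range(300):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y = [1.0 + 2.0 * a + rng.gauss(0.0, 1.0) for a in x]

# Step 1: static regression in levels, then save the residuals (equilibrium error).
alpha_hat, beta_hat = ols(x, y)
resid = [b - alpha_hat - beta_hat * a for a, b in zip(x, y)]

# Step 2: unit root test on the residuals.
eg_stat = df_statistic(resid)
print(beta_hat, eg_stat)
```

The estimated slope lands very close to the true value of 2 even though both series are non-stationary, and the residual statistic is strongly negative, pointing to cointegration. With real data the statistic should be compared with Engle-Granger or MacKinnon critical values, which are more negative than the standard ADF ones.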

Step 1: Estimating a static long-run equation

Go to Quick, then select Estimate Equation on the drop-down menu

Step 1: Estimating a static long-run equation

In the equation dialog box, type the equation you wish to estimate. Always remember to include a constant

Step 1: Long-run equation results

Recall spurious regressions

Caution

• Check whether the coefficients in the long-run equation conform to a priori expectations in terms of the direction of the impact; do not rely on the magnitude or significance of the coefficients.

• The R² statistic is uninformative here and should not be interpreted.

Step 1: Creating a residual series/ equilibrium error

To create a residual series, go to the Proc button and select the Make Residual Series option

Creating residual series/ Equilibrium error

Name the residual series and click OK

Testing for unit roots in the residuals

Once the residual series is created, click on View and select the Unit Root Test button

Step 2: Testing for Cointegration

Select the ADF test in the Test Type window

Select the Level button

Always select None for the ADF model when conducting unit root tests on the residuals

Step 2 residual test

Theoretically, the standard ADF critical values are not valid for residual-based tests

They should ordinarily be based on MacKinnon response surface functions (Harris, 1995).

However, in practice the ADF critical values are used as a proxy for the true critical values

Since the ADF statistic is more negative than the critical values, we reject the null that the variables are not cointegrated

Step 2: Estimating an error correction model

• The error correction model, also known as the dynamics of adjustment, is estimated using the lagged differences of the data series and the lag of the equilibrium error calculated above
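A sketch of the second-step equation in its simplest two-variable form. The symbols are assumptions for illustration: $y_t$ stands for the dependent variable (here lm3), $x_t$ for a long-run regressor, and $\hat{u}_{t-1}$ for the lagged residual saved from the static regression.

```latex
\Delta y_t = \alpha + \beta \, \Delta x_t + \gamma \, \hat{u}_{t-1} + v_t,
\qquad
\hat{u}_{t-1} = y_{t-1} - \hat{a} - \hat{b} \, x_{t-1}
```

Here $\gamma$ is the error correction term (speed of adjustment). It is expected to be negative, so that a positive deviation from equilibrium in the previous period pulls $y_t$ back down. Further lagged differences can be added on the right-hand side when the residuals $v_t$ are serially correlated.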

Estimating an error correction model

In the equation dialogue box enter the variables in their differences

First difference of lm3 i.e. money supply

Constant

Lagged equilibrium error

Estimate an error correction (2 step model)

Valid regressions?

Error correction term/ Speed of adjustment

Estimating a one-step EG cointegration ECM equation

• After testing for cointegration as above, proceed to estimate an error correction model

• In the equation dialogue box type the following equation

lm3(-1): captures the speed of adjustment towards equilibrium

Long-run component

Short-run dynamics

Cointegration: One-step Engle-Granger procedure

Error correction term/ speed of adjustment should always be negative

Calculation of Elasticities

• The elasticities of money demand with respect to the long-run variables are calculated as follows, e.g. the elasticity with respect to nominal GDP:

• LNGDP elasticity = (coefficient of GDP)/(coefficient of adjustment) = 0.1575/0.0875 = 1.8

• Therefore we would say that a 10% change in nominal income will result in an 18% change in the money demanded in the long run.

• Note that the coefficient in the long-run equation estimated in the two-step procedure and the elasticity above are almost the same

• This is due to the superconsistency property of OLS
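The elasticity arithmetic can be written as a small helper. This is a sketch that assumes the one-step equation has the form Δy_t = ... + γ y_{t-1} + δ x_{t-1}, so that the long-run elasticity is -δ/γ. The slide reports magnitudes only; the negative sign on the adjustment coefficient is an assumption in line with the requirement that the speed of adjustment be negative.

```python
def long_run_elasticity(level_coef, adjustment_coef):
    """Long-run elasticity implied by a one-step ECM.

    level_coef      : coefficient on the lagged level of the regressor
    adjustment_coef : coefficient on the lagged level of the dependent variable
                      (the speed of adjustment, expected to be negative)
    """
    return -level_coef / adjustment_coef


# Values from the slide; the negative sign on the adjustment term is assumed.
lngdp_elasticity = long_run_elasticity(0.1575, -0.0875)
print(round(lngdp_elasticity, 4))  # 1.8
```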

Diagnostic tests- subject equations to a battery of tests

Whilst the equation is still open, click on View to see the menu of diagnostic tests

Diagnostic testing (plot of residual series)

Correlogram of residuals

Correlogram of squared residuals

Residual tests- normality test

Serial correlation test: fail to reject the null of no serial correlation

Heteroskedasticity tests: fail to reject the null of no heteroskedasticity at 5%

Cumulative sum of residuals test

Cumulative sum of squared residuals tests

Equation is unstable

Investigate and find out why. In this case it is the run-up to the attainment of independence and the end of apartheid. To control for this we need dummy variables

Multivariate Cointegration and Vector Auto-regressions

Advantages of the Engle and Granger approach

• Relatively simple

• Useful as a first indication of the existence of a long-run equilibrium relationship

• Where there is a consistent cointegrating vector, it allows us to use the superconsistency property of OLS to obtain consistent estimates of the cointegrating vector

• Provides long-run equilibrium information and the short-term dynamics

• Provides the speed of adjustment to equilibrium

Limitations of the E-G approach

• The distribution of the test statistics is only a rough guide and will be slightly different in any application

• For more than two variables it is no longer possible to demonstrate the uniqueness of the cointegrating vector

  • If we have a vector of N variables each integrated of the same order, we can have up to N-1 cointegrating vectors

• It has no systematic procedure to estimate multiple cointegrating vectors

• Results are based on asymptotic theory but we do not have infinitely large samples in practice

• Carry-over error bias: errors made in the first step carry over into the second step

Multivariate cointegration

• Johansen and Juselius (1988) and Stock and Watson (1988) develop maximum likelihood procedures to test for cointegration

• Their tests can estimate and test the number of cointegrating equations, and test restricted versions of the cointegrating vectors and speeds of adjustment

• This allows verification of theories through coefficient restrictions, etc.

• The test is based on the stationary VAR

Vector Autoregressive (VAR) models

• Were popularised by Sims (1980) as a natural generalisation of univariate autoregressive models.

• Variables should be treated symmetrically to avoid ‘incredible identification restrictions’

• Let the data speak for itself, i.e. no a priori assumption about exogeneity of variables

• Very helpful in identifying the relationships among a set of macroeconomic variables

Vector Autoregressive (VAR) models

• Multiequation time series model

• Considers a number of interrelated variables

• Imposes no (zero) restrictions on the parameters to be estimated

• Atheoretical i.e. no strict reliance on theory to formulate the model

• ‘Everything causes everything’

• However, the number of estimated parameters makes the model difficult to interpret

Vector Autoregressive (VAR) models

• Advantages of VARs over simple regression models:

  • Every variable is endogenous (no incredible exogeneity assumptions).

  • Every variable depends on the others (no incredible exclusion restrictions).

  • Simple to estimate and use.

• General disadvantages:

  • It is a reduced form model; no economic interpretation of the dynamics is possible.

  • Potentially difficult to relate VAR dynamics with DSGE dynamics (which have an ARMA structure)

  • Can’t be used for certain policy analyses (Lucas critique).

Vector Autoregressive (VAR) models

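• Multi-equation time series model, which can be specified as follows:

$Y_t = \mu + \sum_{i=1}^{k} \theta_i Y_{t-i} + \varepsilon_t$

where

$Y_t$ is an $(m \times 1)$ vector of I(0) variables,

$\mu$ is an $(m \times 1)$ vector of constants,

$\theta_1, \ldots, \theta_k$ are $(m \times m)$ matrices of parameters,

$k$ is the appropriate lag length of the model, and

$\varepsilon_t$ is an $(m \times 1)$ vector of normally distributed error terms.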

Vector Autoregressive (VAR) models

• The properties are:

  • The variables are stationary

  • Error terms are white noise disturbances with a constant variance

  • Error terms are not serially correlated

• The structure of the system allows for feedback effects

• If contemporaneous effects are assumed to be zero, the VAR is said to be in standard form and estimation can proceed using OLS
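The stationarity requirement can be checked from the coefficient matrix. As a minimal sketch, assume a two-variable VAR(1), Y_t = μ + A Y_{t-1} + ε_t; it is stationary when every eigenvalue of A lies strictly inside the unit circle. The matrices below are hypothetical examples.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_stable(a):
    """A VAR(1) Y_t = mu + A Y_{t-1} + e_t is stationary if all |eigenvalues| < 1."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(a))


stable_var = [[0.5, 0.1], [0.2, 0.3]]      # hypothetical stationary system
unit_root_var = [[1.0, 0.0], [0.0, 0.5]]   # first variable is a random walk
print(is_stable(stable_var), is_stable(unit_root_var))
```

For a VAR with more lags the same check is applied to the companion matrix; EViews reports the equivalent information in its AR roots view.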

Vector Autoregressive (VAR) models

Stationarity and VAR

• Brooks (2010): it is important that all of the variables in the VAR process be stationary, otherwise hypothesis tests are invalid

• Sims (1980) and Sims, Stock and Watson (1990), as cited in Enders (2010), recommend against differencing even if variables contain a unit root.

• They argue that the objective of VAR analysis is to determine interrelationships among variables and not to determine parameter estimates.

• They also argue against detrending data in a VAR

• Canova (2005): if we want a constant coefficient VAR, we need stationarity of the variables. If non-stationarities are present a VAR representation exists, but with time-varying coefficients.

Vector Autoregressive (VAR) models

• To determine the appropriate lag length/ order of the VAR:

  • Akaike information criterion

  • Schwarz information criterion

  • Likelihood ratio test

  • Final prediction error

  • HQ information criterion

  • Maximum lag for autoregressive models

  • Experimentation, i.e. general-to-specific modelling

• You choose the lag length that soaks up or expunges serial correlation in the residuals

• If maximum lag length is p, then it’s a VAR(p)

Example of a VAR

Variance decompositions

• Enders (2010): variance decomposition enables us to study the variation in Y that is due to its own shocks versus the component of the variation that is due to shocks in other variables

• It helps determine the relative importance of each innovation in explaining the variables in the system.

• To conduct variance decompositions, the AR process is inverted into an MA process of the errors using Wold's decomposition theorem

• This rewrites the AR process $Y_t = \mu + \sum_{i=1}^{k} \theta_i Y_{t-i} + \varepsilon_t$ as $X_t = \mu + \sum_{i=0}^{\infty} \theta_i \varepsilon_{t-i}$

Forecast error variance decomposition

In a 2 variable case

Forecast error

This reduces to Variance in Y due to itself

Due to others

Variance decomposition

• If the forecast error variance of a variable is explained entirely by shocks to the variable itself, then the variable is exogenous

• It is typical for a variable to explain almost all its forecast error variance for short horizons and smaller proportions at longer horizons (Enders, 2010)

• It is also subject to an under-identification problem, as is the impulse response function; thus there may be a need to place additional restrictions on the system in order to obtain the decomposition and impulse responses

• One such restriction is the Cholesky decomposition

  • The contemporaneous value of Y has no contemporaneous effect on X

  • This implies an ordering of the variables

• Brooks and Tsolacos (1998) and Enders (2010): the Cholesky ordering of the variables has important ramifications for the resulting impulse responses and variance decompositions and is equivalent to an identifying restriction on the VAR.
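A sketch of a forecast error variance decomposition with a Cholesky ordering, assuming a two-variable VAR(1) with coefficient matrix A and residual covariance SIGMA (both hypothetical). The variable ordered first is allowed to affect the second contemporaneously, but not the other way round.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def cholesky_2x2(s):
    """Lower-triangular P with P P' = s for a 2x2 covariance matrix."""
    p11 = math.sqrt(s[0][0])
    p21 = s[1][0] / p11
    p22 = math.sqrt(s[1][1] - p21 * p21)
    return [[p11, 0.0], [p21, p22]]


def fevd(a, sigma, horizon):
    """Share of each variable's h-step forecast error variance due to each orthogonalised shock."""
    p = cholesky_2x2(sigma)
    psi = [[1.0, 0.0], [0.0, 1.0]]          # Psi_0 = I, Psi_i = A^i for a VAR(1)
    contrib = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(horizon):
        theta = matmul(psi, p)              # orthogonalised MA coefficients
        for i in range(2):
            for j in range(2):
                contrib[i][j] += theta[i][j] ** 2
        psi = matmul(psi, a)
    return [[c / sum(row) for c in row] for row in contrib]


A = [[0.5, 0.2], [0.1, 0.4]]        # hypothetical VAR(1) coefficients
SIGMA = [[1.0, 0.3], [0.3, 1.0]]    # hypothetical residual covariance
short_run = fevd(A, SIGMA, 1)
long_run = fevd(A, SIGMA, 10)
print(short_run)
print(long_run)
```

At horizon 1 the first-ordered variable explains all of its own forecast error variance, purely because of the ordering. Reversing the order of the variables changes the decomposition, which is the ramification the slide warns about.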

Impulse response functions

• Impulse responses allow for tracing the time profile of various shocks on the variables in the VAR system

Impulse response function

• Impulse response functions are a practical tool which aid in visualising the behaviour of the variables under study in response to various shocks.

• They show the dynamics of transmission of shocks, direction and magnitude of the shocks.

• In practice you should always plot your impulse responses together with their standard deviation bands
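A minimal sketch of how impulse responses are traced out, assuming a VAR(1) Y_t = A Y_{t-1} + ε_t with hypothetical coefficients. The response h periods after a unit shock is the h-th power of A. These are non-orthogonalised responses, and the standard deviation bands mentioned above are not computed here.

```python
def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def impulse_responses(a, horizon):
    """Responses Psi_0..Psi_h of a VAR(1) Y_t = A Y_{t-1} + e_t to unit shocks: Psi_i = A^i."""
    n = len(a)
    psi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    path = [psi]
    for _ in range(horizon):
        psi = matmul(psi, a)
        path.append(psi)
    return path


A = [[0.5, 0.2], [0.1, 0.4]]   # hypothetical coefficients
irf = impulse_responses(A, 8)

# irf[h][i][j] is the response of variable i, h periods after a unit shock to variable j.
response_of_first_to_second = [step[0][1] for step in irf]
print(response_of_first_to_second)
```

Because this system is stationary, every response decays towards zero as the horizon grows.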

Multivariate Cointegration

• Johansen and Juselius enhance the VAR(p) by including the long-run components (cointegrating relations) in the VAR(p) process, i.e. separating permanent effects from transitory effects.

• It specifies a VECM among variables

Multivariate Cointegration

The cointegrated VAR and VECM

Multivariate cointegration

Testing for Cointegration

• JJ suggest five assumptions under which the test can be conducted

1. No deterministic trends in the VAR system and the cointegrating relationship has no intercept and no trend;

2. No deterministic trends in the VAR system and the cointegrating relationship has an intercept and no trend;

3. Linear trend in the VAR system and the cointegrating relationship has no trend but has an intercept;

4. Linear trend in the VAR system and the cointegrating relationship only has a deterministic trend; and

5. A quadratic trend in the VAR and the cointegrating relationship has a linear deterministic trend.

Testing for cointegration

• The trace statistic tests the null hypothesis that there are at most r cointegrating vectors against the general alternative of more than r. The first test in the sequence has the null r = 0 (i.e. no cointegration) against the alternative r > 0 (i.e. one or more cointegrating vectors).

• The maximum eigenvalue statistic, on the other hand, tests the null hypothesis that the number of cointegrating vectors is r against the specific alternative of r + 1 cointegrating vectors
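Both statistics are simple functions of the estimated eigenvalues. As a sketch, assuming eigenvalues λ_1 > ... > λ_n from the Johansen procedure and T usable observations, the trace statistic is -T Σ_{i>r} ln(1 - λ_i) and the maximum eigenvalue statistic is -T ln(1 - λ_{r+1}). The eigenvalues below are hypothetical.

```python
import math


def trace_statistic(eigenvalues, r, n_obs):
    """Trace statistic for H0: at most r cointegrating vectors."""
    ordered = sorted(eigenvalues, reverse=True)
    return -n_obs * sum(math.log(1.0 - lam) for lam in ordered[r:])


def max_eigen_statistic(eigenvalues, r, n_obs):
    """Maximum eigenvalue statistic for H0: r vectors against H1: r + 1 vectors."""
    ordered = sorted(eigenvalues, reverse=True)
    return -n_obs * math.log(1.0 - ordered[r])


lams = [0.30, 0.15, 0.05, 0.01]   # hypothetical eigenvalues
T = 100
for r in range(len(lams)):
    print(r, trace_statistic(lams, r, T), max_eigen_statistic(lams, r, T))
```

The trace statistic for a given r is the sum of the maximum eigenvalue statistics from r onwards, so the two coincide for the last hypothesis in the sequence.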

Testing for Cointegration rank

• The critical values for the tests are obtained using a Monte Carlo approach

• The distribution of the statistics depends on two components:

  • The number of non-stationary components under the null hypothesis

  • The form of the deterministic components (constant, trend or both); this is similar to the Dickey-Fuller test

• Sometimes the two tests may give conflicting results

• Harris (1995): the maximum eigenvalue test has a sharper alternative hypothesis and is preferred to pin down the number of cointegrating vectors.

• The sequence of the Trace tests leads to a consistent procedure.

Testing for cointegration rank

• Cheung and Lai (1993) propose choosing cointegration rank based on the trace statistic. They state that the trace statistic is more robust to skewness and excess kurtosis in residuals than the maximum eigenvalue statistic.

• Enders (2010) concurs with these findings and states that when the two tests for cointegration rank are in conflict the trace statistic is likely to give more reliable results.

• Current practice is to only consider the trace test.

Practical demonstration of Multivariate Cointegration in EViews

Step 1: Pretest data

• Pretest all variables to determine their order of integration, i.e. test for unit roots

• Plot the variables to see if a linear time trend is likely to appear in the data series

Step 2: Estimating an Unrestricted VAR

Go to Quick and select Estimate VAR

Estimating a VAR

Enter the variables of interest and click OK

Choose the sample size over which to estimate the VAR

Step 3: Choosing the optimal lag length

In the estimated VAR window, go to View, Lag structure, lag length criteria

Choosing the optimal lag length

Leave the default and click ok

Selecting the lag length

If a long lag is required to make the residuals white noise, reconsider the choice of variables and look for another important explanatory variable to include in the information set

Asterisk indicates lag length selected by Information criteria

Selecting the lag length

• A summary test statistic that measures the magnitude of the residual autocorrelation is given by the Portmanteau test

• EViews uses Wald lag exclusion tests to determine the default lag

• In our case we will select the lag based on the SIC (a stricter test)

  • AIC gives a generous lag length

  • HQ is a middle-of-the-road approach

• Chosen lag length for this exercise is 2
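The different strictness of the criteria comes from their penalty terms. The sketch below uses the textbook forms based on the log determinant of the residual covariance matrix, which may be scaled differently from the values EViews prints. The fitted values in `fits` are hypothetical and are chosen so that the three criteria disagree.

```python
import math


def information_criteria(log_det_sigma, n_params, n_obs):
    """AIC, SIC and HQ for a VAR, given the log determinant of the residual covariance."""
    aic = log_det_sigma + 2.0 * n_params / n_obs
    sic = log_det_sigma + math.log(n_obs) * n_params / n_obs
    hq = log_det_sigma + 2.0 * math.log(math.log(n_obs)) * n_params / n_obs
    return {"AIC": aic, "SIC": sic, "HQ": hq}


def pick_lag(candidates, criterion):
    """candidates maps lag -> (log_det_sigma, n_params, n_obs); return the lag minimising the criterion."""
    return min(candidates, key=lambda p: information_criteria(*candidates[p])[criterion])


# Hypothetical fits for a 4-variable VAR with a constant: 4 * (4 * p + 1) parameters.
fits = {1: (-10.00, 20, 100), 2: (-10.60, 36, 100), 3: (-10.95, 52, 100)}
print(pick_lag(fits, "AIC"), pick_lag(fits, "HQ"), pick_lag(fits, "SIC"))
```

With 100 observations the per-parameter penalty is 2 for AIC, about 3.05 for HQ and about 4.61 for SIC, which is why AIC tends to pick the longest lag and SIC the shortest.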

Step 4: Deterministic trend specification of the VAR

• The variables may have non-zero means and deterministic and stochastic trends

• Similarly, cointegrating equations may have intercepts and deterministic trends

• Since the asymptotic distribution of the LR test statistic for cointegration is not the usual Chi-square distribution and depends on the restrictions we make with respect to deterministic trends, we need to make assumptions regarding the trends underlying our data.

• EViews allows for the 5 trend specifications of Johansen and Juselius

Step 5: Estimation and Determination of Rank

Go to view and select Cointegration tests

Step 5: Estimation and Determination of Rank

Choose the trend assumption you made in Step 4; only in exceptional circumstances would you consider a trend in the cointegrating vector

Enter the chosen lag length from step 3 and click OK

Step 5: Estimation and determination of rank

The test is done in a specific order, from the largest eigenvalue to the smallest.

We use the ‘Pantula Principle’, where we test for significance until we no longer reject the null

The first null is that there are no stationary relations in the data (r=0)…

So long as the trace statistic (TS) or maximum eigenvalue statistic (MES) exceeds its critical value, reject the null

Use p-values to make the decision
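The sequential decision rule can be sketched as a short loop. The statistics and critical values below are illustrative numbers for a four-variable system, not output from the demonstration, and `select_rank` is a hypothetical helper.

```python
def select_rank(statistics, critical_values):
    """Return the first r whose null hypothesis is not rejected.

    statistics[r] and critical_values[r] refer to the null of (at most) r
    cointegrating vectors, ordered r = 0, 1, 2, ...
    """
    for r, (stat, crit) in enumerate(zip(statistics, critical_values)):
        if stat <= crit:
            return r
    return len(statistics)   # every null rejected: full rank


# Illustrative trace statistics and 5% critical values for a 4-variable system.
trace_stats = [68.2, 33.1, 12.4, 2.1]
crit_5pct = [47.9, 29.8, 15.5, 3.8]
print(select_rank(trace_stats, crit_5pct))  # 2
```

Here the nulls r = 0 and r ≤ 1 are rejected and r ≤ 2 is not, so two cointegrating equations would be entered in the next step.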

Step 5: Estimation and determination of rank

Click Estimate, select Vector Error Correction

Cross-check that the lag interval is correct and click on the Cointegration tab

Step 5: Estimation and determination of rank

Enter the number of cointegrating equations and click OK

Estimations

Cointegration equations/ Long-run component

Speed of adjustment

The 2 separate long-run relationships enter into each of the 4 equations

Estimation

• NB: even if we are only interested in the first cointegrating relationship, both cointegrating relationships should enter that equation separately.

• There are in effect two ECM terms in the equation

• There is a long-run positive relationship between money supply and inflation and a negative relationship between money supply and the interest rate

• And approximately 8% of deviations in the money supply from its long run equilibrium are cleared in the next quarter
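One way to read the speed of adjustment is through the implied half-life of a deviation. This is a rough sketch: it assumes the deviation shrinks by the same 8% share every quarter and ignores the short-run dynamics and the second ECM term, so it is an approximation only.

```python
import math


def half_life(adjustment_coef):
    """Periods needed to close half of a deviation when a share |adjustment_coef| is corrected each period."""
    return math.log(0.5) / math.log(1.0 - abs(adjustment_coef))


print(round(half_life(-0.08), 1))  # 8.3
```

Under these assumptions it takes a little over eight quarters for half of a deviation from the long-run equilibrium to be cleared.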

Step 6: Diagnostic testing

• Once a VEC is estimated, a number of diagnostic tests should be performed

• These tests assist in checking the appropriateness of the estimated VAR

• Residual tests:

  • Portmanteau autocorrelation test: computes the multivariate Box-Pierce/Ljung-Box Q statistics for serial correlation up to a specified order. EViews reports both tests under the null hypothesis of no serial correlation

  • Autocorrelation LM test: reports the multivariate LM test statistics for residual serial correlation. Under the null hypothesis of no serial correlation of order h, the LM test is asymptotically Chi-square distributed with K^2 degrees of freedom
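To show what a portmanteau statistic measures, here is a sketch of the univariate Ljung-Box Q statistic, Q = T(T+2) Σ ρ_k² / (T - k). EViews reports the multivariate version for the VEC residuals, so this is an analogue and not the same calculation. The simulated series are hypothetical.

```python
import random


def ljung_box_q(resid, max_lag):
    """Univariate Ljung-Box Q statistic for serial correlation up to max_lag."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


rng = random.Random(3)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

# Serially correlated residuals: an AR(1) with coefficient 0.8 built from the same shocks.
ar1 = [0.0]
for e in white_noise[1:]:
    ar1.append(0.8 * ar1[-1] + e)

q_wn = ljung_box_q(white_noise, 4)
q_ar = ljung_box_q(ar1, 4)
print(q_wn, q_ar)
```

Under the null of no serial correlation, Q with 4 lags is compared with a Chi-square distribution with 4 degrees of freedom (5% critical value about 9.49). The autocorrelated series produces a Q far above that.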

Step 6: Diagnostic testing

To conduct residual tests go to View, Residual Tests, Portmanteau test

Residual tests

Reject the null if the probabilities of the Q-stat and Adj Q-stat are less than 0.05

Diagnostics residual tests

We reject the null hypothesis of no serial correlation at lag order 2 in the residuals of the VECM and conclude that the residuals are autocorrelated

Normality tests

• Reports the multivariate extension of the Jarque-Bera test

• For the multivariate test you must choose a factorisation of the residuals that makes them orthogonal to each other:

  • Cholesky

  • Inverse square root of residual correlation matrix, Doornik and Hansen (1994)

  • Inverse square root of residual covariance matrix, Urzua (1997)

  • Factorisation from identified VECM

• EViews reports the test statistics for each orthogonal component

Normality

Choose factorisation method, select Cholesky

Normality

The P is the inverse of the lower triangular Cholesky factor of the residual covariance matrix

Reports the joint normality test for our four component equation

Normality

• NB: Paruolo (1997) points out that if normality of the error terms is rejected because of excess kurtosis rather than skewness, the Johansen results are not affected.

• That is, we should not worry as long as our skewness results are fine.

White's heteroskedasticity test

• It is an extension of White's (1980) test

  • No cross terms: uses only levels and squares of regressors

  • Cross terms: includes all non-redundant cross products of regressors (heteroskedasticity of an unknown form)

Heteroskedasticity

Reject the null of no heteroskedasticity

Impulse responses

Click on Impulse, then in the Impulses box select the variables you wish to shock and in the Responses box select the variables you want to be affected. Then click on Impulse Definition

Choose the number of quarters for the graph

Cholesky ordering

Select the order of the variables i.e. as identified in theory …

This is similar to an identification restriction on the impulses

Impulse responses

Profile and direction of shocks

Variance decompositions

Click on view then choose variance decomposition

Variance decomposition

Select the table option and specify the Cholesky ordering

Variance decomposition

LNGDP accounts for about 28% of observed variations in money supply

Testing theoretical restrictions

Click Estimate, then VEC Restrictions, and impose restrictions

What can go wrong in Johansen Methodology

• We need normally distributed white noise

• The test is asymptotic and can be sensitive to how we formulate the VECM model in limited samples

• Test assumes there are no structural breaks

• If we put a stationary variable in the model, the number of cointegrating vectors may increase

• Weak exogeneity: if weak exogeneity is found then use a single equation model

References

• Enders, W., 2010, Applied Econometric Time Series 3e, Wiley, USA

• Brookes