CHAPTER-3
VOLATILITY: CONCEPTS AND VARIOUS MODELS
The present chapter draws together material from various textbooks and articles on the concept of volatility and certain issues related to it. There are four sections in this chapter. The first section defines and explains the concept of return. The second section covers the various definitions of the term "volatility". The third section deals with volatility modeling and explains the various models for measuring volatility. The fourth section summarizes the chapter.
3.1. DEFINITIONS OF RETURNS
Although prices are what we observe in financial markets, most empirical studies focus
on returns. The reason is that, in general, prices are non-stationary whereas returns are
stationary. There are several return definitions, some of which are given as follows:
(1) Simple Price Differences
If $P_t$ is the asset price at time t, then the simple price difference, denoted by $D_t$, is defined as:

$D_t = P_t - P_{t-1}$
The variability of the simple price difference is an increasing function of the price level
and this might bias the inference at least when the price level increases significantly
during the analysis period. Fortunately, the use of logarithmic return neutralizes most of
this effect (Fama 1965a, 45).
(2) Percentage Returns or Simple Returns
The one-period simple return for holding an asset, which is numerically very close to
logarithmic return for small changes, is defined as:
$R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$ or $P_t = P_{t-1}(1 + R_t)$
where $P_t$ is the price (including dividends) of the asset at time t, and $R_t$ is the one-period simple return from time t−1 to t.1
(3) Log Returns
The log return, rt, is defined as:
$r_t = \log(P_t) - \log(P_{t-1}) = p_t - p_{t-1}$

where $p_t = \log(P_t)$ is the log price. In addition, the change in logarithmic price is the yield, with continuous compounding, from holding a security for the period in question.
Proof of this follows from Fama (1965a, 45):

$P_t = P_{t-1}\exp(r_t) \iff r_t = \ln(P_t) - \ln(P_{t-1})$
Table 3.1: Return Aggregation

Aggregation         | Temporal                                        | Cross-section
Percent Return      | $R_t[k] = \prod_{j=0}^{k-1}(1 + R_{t-j}) - 1$   | $R_{p,t} = \sum_i w_i R_{i,t}$
Logarithmic Return  | $r_t[k] = \sum_{j=0}^{k-1} r_{t-j}$             | $r_{p,t} = \ln\left(\sum_i w_i e^{r_{i,t}}\right)$

Source: RiskMetrics™ – Technical Document 1996, 49.
When applying logarithmic return, the continuous time generalizations of discrete time
results are easier and returns over more than one day are simple functions of single day
returns (Taylor 1986, 13). In other words, a key advantage of the log return is that the
multiple-period return is simply the sum of one-period returns, so that
$r_t[k] = \log(1 + R_t[k]) = \sum_{j=0}^{k-1}\log(1 + R_{t-j}) = \sum_{j=0}^{k-1} r_{t-j}$

1 If the asset is held for k periods, from t−k to t, the k-period simple return is calculated as $R_t[k] = \frac{P_t - P_{t-k}}{P_{t-k}}$ or $P_t = P_{t-k}(1 + R_t[k]) = P_{t-k}(1 + R_{t-k+1}) \times \cdots \times (1 + R_t)$, where $R_t[k]$ is the k-period simple return from t−k to t. The relationship between the one-period simple returns and the k-period return is therefore non-linear. In some cases, if returns are small, we may use the approximation $R_t[k] \approx \sum_{j=0}^{k-1} R_{t-j}$, but it is too crude in many applications.
In addition to these points, return aggregation is of importance in most financial applications. Table 3.1 summarizes the difference between percentage returns and logarithmic returns: $w_i$ denotes the weight of asset i, t denotes the time and p denotes the portfolio. The table indicates that when the aggregation is done across time it is more convenient to work with logarithmic returns, whereas in the case of aggregation across assets the percentage return results in a simpler expression.
Figure 3.1: Terms of Measurement

[Four-panel chart of daily NIFTY data from 2-Jan-1991 to 2-Jan-2009: Closing Prices of NIFTY; Price Differences of NIFTY; Percentage Returns for NIFTY; Log Returns for NIFTY.]
However, since percentage return and logarithmic return are very close for small
changes, it is common to approximate a portfolio return in case of logarithmic return as:
" ( !")
*
This leads to a situation where the one-day model computed with $r_t$ extends easily to returns of more than one day. In general, the use of logarithmic returns is widely accepted among financial researchers. Use of different return concepts may lead to different interpretations. Figure 3.1 illustrates the behavior of the three measures when applied to the Nifty 50. In particular, the effect of the price level on price variability in the case of simple price differences is clearly visible from it.
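To make the three measures concrete, the following Python sketch (a minimal illustration using a short hypothetical price series, not actual NIFTY data) computes simple price differences, percentage returns and log returns, and verifies the aggregation properties from Table 3.1:

```python
import numpy as np

# Hypothetical closing prices (not actual NIFTY data)
P = np.array([1000.0, 1010.0, 995.0, 1020.0, 1018.0])

D = P[1:] - P[:-1]        # simple price differences: D_t = P_t - P_{t-1}
R = P[1:] / P[:-1] - 1.0  # percentage (simple) returns: R_t = (P_t - P_{t-1})/P_{t-1}
r = np.diff(np.log(P))    # log returns: r_t = log(P_t) - log(P_{t-1})

print("D_t:", D)
print("R_t:", R)
print("r_t:", r)

# Temporal aggregation: the k-period log return is the sum of one-period log returns
print(np.isclose(np.log(P[-1] / P[0]), r.sum()))            # True

# For simple returns, temporal aggregation is multiplicative (non-linear)
print(np.isclose(np.prod(1.0 + R) - 1.0, P[-1] / P[0] - 1))  # True
```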
3.2. CONCEPT OF VOLATILITY
Volatility is a crucial concept in financial theory and practice. Nonetheless, it is not easy
to define the concept of volatility. Since volatility is often calculated as a sample
standard deviation, people often mistakenly assume that volatility is equivalent to
standard deviation, which is in fact only a biased estimator of true volatility. As
described by Andersen, Bollerslev, Christoffersen and Diebold (2005):
“In everyday language, volatility refers to the fluctuations observed in some phenomena
over time. Within economics, it is used slightly more formally to describe, without a
specific implied metric, the variability of the random (unforeseen) component of a time
series.” (p. 1)
This description of volatility provides a relatively clear, intuitive explanation of
volatility. Andersen, Bollerslev, Christoffersen and Diebold (2005) further explain:
“More precisely, or narrowly, in financial economics, volatility is often defined as the
(instantaneous) standard deviation (or sigma) of the random Wiener-driven component
in a continuous-time diffusion model. Expressions such as the “implied volatility” from
option prices rely on this terminology.” (p. 1)
Volatility, in other words, is defined as the spread of all likely outcomes of an uncertain
variable. Typically, in financial markets, we are often concerned with the spread of
asset returns. Volatility is the degree to which financial prices tend to fluctuate. Large
volatility means that returns (that is, the relative price changes) fluctuate in a wide range.
Statistically, volatility most frequently refers to the sample standard deviation of the continuously compounded returns (see section 3.1) of a financial instrument over a specific time horizon. The sample standard deviation is defined as:
$\hat{\sigma} = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}(r_t - \mu)^2}$ [3.1]
where $r_t$ is the return on day t, and $\mu$ is the average return over the T-day period.
Standard deviation is often used to quantify the risk of the instrument over that time
period. Volatility is typically expressed in annualized terms, and it may either be an
absolute number ($5) or a fraction of the mean (5%). For a financial instrument, whose
price follows a Gaussian random walk, or Wiener process, the volatility increases as
time increases. Conceptually, this is because there is an increasing probability that the
instrument's price will be farther away from the initial price as time increases. However,
rather than increase linearly, the volatility increases with the square-root of time as time
increases, because some fluctuations are expected to cancel each other out, so the most
likely deviation after twice the time will not be twice the distance from zero.
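As a numerical illustration of equation [3.1] and the square-root-of-time scaling just described, the following sketch (assuming, as is common but not universal, 252 trading days per year) estimates daily volatility from simulated returns and annualizes it:

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.normal(loc=0.0005, scale=0.012, size=500)   # simulated daily returns

T = len(r)
mu = r.mean()
sigma_daily = np.sqrt(((r - mu) ** 2).sum() / (T - 1))   # equation [3.1]

# Under a Gaussian random walk, volatility grows with the square root of time
sigma_annual = sigma_daily * np.sqrt(252)   # assumes 252 trading days per year
print(f"daily: {sigma_daily:.4%}, annualized: {sigma_annual:.2%}")
```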
More broadly, volatility refers to the degree of (typically short-term) unpredictable
change over time of a certain variable. It may be measured via the standard deviation of
a sample, as mentioned above. However, price changes actually do not follow Gaussian
distributions. Better distributions used to describe them actually have "fat tails"
although their variance remains finite. Therefore, other metrics may be used to describe
the degree of spread of the variable. As such, volatility reflects the degree of risk faced
by someone with exposure to that variable.
Different Definitions of Volatility
Volatility is a theoretical construct. Models for volatility often use an unobservable
variable that controls the degree of fluctuations of the financial return process. This
variable is usually called the volatility. Generally, two different volatility models will
lead to different concepts of volatility. For example, in GARCH models the volatility is
thought of as conditional variance (or standard deviation) of the return, whereas in
diffusion models (stochastic differential equations) the volatility refers to either the
instantaneous diffusion coefficient or the quadratic variation over a given time period
(often called the integrated volatility). It is useful to keep the following questions in mind in the context of any volatility definition: (i) What is the asset price model? (ii) What is the time horizon for this volatility? (iii) Is the volatility forward looking, backward looking or instantaneous? (iv) Is the volatility a model variable, or the estimator of a model variable? (v) In the case of an estimator of a model variable: is the estimated volatility extracted from returns data (price fluctuations), or from option prices (implied volatilities)? Each of the following bullets treats a different volatility definition:
• Conditional Variance/Conditional Standard Deviation: Given the information up to now, $\mathcal{F}_n$, the variance of the financial return $r_{n+1}$ over the next period is:

$\mathrm{Var}(r_{n+1} \mid \mathcal{F}_n)$ [3.2]

The square root of this quantity is the conditional standard deviation. Note that this variance depends on the information set. To obtain an explicit number, one has to make model assumptions for the returns. If one assumes that the financial returns are iid Normal, then the volatility is constant and the usual variance estimator is appropriate:

$\hat{\sigma}^2 = \frac{1}{N-1}\sum_{n=1}^{N}(r_n - \bar{r})^2$ [3.3]

Time series models are designed to deal with the situation of time-varying volatility.
• Time Series Volatility: Discrete time models for time-varying volatility often have a product structure:

$r_n = \sigma_n\,\varepsilon_n$ [3.4]

The financial return $r_n$ over period n is the product of the volatility $\sigma_n$ and the zero-mean, unit-variance innovation $\varepsilon_n$. Models for $\sigma_n$ include ARCH/GARCH, Stochastic Volatility, Long Memory, Markov Switching, etc.
• Spot Volatility: the instantaneous volatility. One needs a model describing the continuous-time price movements. Consider a stochastic differential equation like:

$dp(t) = \sigma(t)\,dB(t)$ [3.5]

where B denotes standard Brownian motion and p(t) the log price process. The variable $\sigma(t)$ is called the spot volatility. Models of this class are often used in option pricing.
• Quadratic Variation: Consider a continuous-time stochastic process over a given time period. Divide the time period into small, adjacent intervals. Determine the sum of the squared returns over these intervals:

$\sum_i \left(p(t_{i+1}) - p(t_i)\right)^2$ [3.6]

The quadratic variation is defined as the limit of these sums as the length of the sampling intervals goes to zero.
• Realized Volatility Measures: The finite-sample quantities in equation (3.6) are often called "realized volatility" or "realized variance" (a computational sketch follows this list). Popular sampling frequencies are 5 minutes and 30 minutes. It is possible to construct alternative proxies for volatility, for example, using the high-low ranges over intraday intervals.
• Implied Volatility: Given a specific asset price model, one can determine the
volatility that matches the theoretical option prices from the model to the real life
option prices in the market. This volatility is called the implied volatility. The
Black-Scholes model is often used for determining implied volatilities. There also exist "model-free implied volatilities".
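The sketch below illustrates the realized volatility measure referred to in the list above: it sums squared five-minute log returns over one trading day to obtain a (nearly) model-free daily variance estimate. The intraday returns are simulated here, since no intraday data accompany this chapter:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 78                       # 5-minute intervals in a 6.5-hour trading session
true_daily_vol = 0.015
r_5min = rng.normal(0.0, true_daily_vol / np.sqrt(n), size=n)  # intraday log returns

realized_variance = np.sum(r_5min ** 2)    # finite-sample version of equation (3.6)
realized_vol = np.sqrt(realized_variance)
print(f"realized daily volatility: {realized_vol:.4f} (true: {true_daily_vol})")
```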
Standard Deviation Versus Variance
Sometimes, variance, $\sigma^2$, is also used to measure volatility. Since variance is simply the square of the standard deviation, it makes no difference which measure we use when we compare the volatility of two assets. However, variance is much less stable and
less desirable than standard deviation as an object for computer estimation and volatility
forecast evaluation. Moreover, standard deviation has the same unit of measure as the
mean, i.e. if the mean is in rupees, then standard deviation is expressed in rupees
whereas variance is expressed in rupee square. For this reason, standard deviation is
more convenient and intuitive when volatility is examined.
Volatility Versus Risk
Volatility is related to, but not exactly the same as, risk. Risk is associated with an undesirable outcome, whereas volatility, as a measure strictly of uncertainty, could be due to a positive outcome. Moreover, volatility is not a good or perfect measure of risk because volatility (or standard deviation) is only a measure of the spread of a distribution and carries no information on its shape. The only exception is the case of the normal or lognormal distribution, where the mean and the standard deviation are sufficient statistics for the entire distribution (i.e. through these two alone one can reproduce the empirical distribution).
Volatility Versus Direction
Volatility does not imply direction. This is due to the fact that all changes are squared.
An instrument that is more volatile is likely to increase or decrease in value more than
one that is less volatile. For example, a savings account has low volatility. While it
won’t lose 50% in a year, it also won’t gain 50%.
Volatility is Unobservable
Common to all models of volatility is that volatility itself is not observed. The situation may be compared to rolling a die: one can observe the realizations of the die [1, 2, 3, 4, 5, 6], but cannot observe the tendency of the (unfair) die to yield extreme outcomes like "1" or "6". This tendency may be estimated, though, and after sufficiently many observations there remains only a small uncertainty about the probability of obtaining a "6". The situation when estimating volatility is comparable: here too one observes the realizations (the returns), and not the tendency to yield extreme returns. However, the situation is less encouraging in an important respect: in contrast to the setting of rolling a die, the estimate of volatility does not necessarily improve as one collects more data, since volatility itself changes. That is, data one month from now may have little to do
with the current volatility. So there will always be uncertainty about the current and past
values of volatility.
3.3. VOLATILITY MEASUREMENT
Volatility estimation in the context of option pricing must be considered within the
broader context of asset price and return volatility. Mathematical option pricing models
require estimation of volatility. The future volatility of the underlying asset price is a
parameter that must be input into the option pricing model. The volatility estimate used
in option valuation is the annualized standard deviation of the logarithmic asset returns (or the continuously compounded asset returns). The volatility estimate is a
measure of the uncertainty about the returns on the asset. It is used to generate the
distribution of asset prices at the option expiration to calculate the fair value of the option.
There are several aspects of the volatility estimate that should be considered: Firstly, the
volatility parameter required to derive option value is forward looking, that is, the
relevant volatility is the asset return volatility in the period to option expiry. Secondly,
volatility is assumed to be constant between pricing date and option expiry. Thirdly,
volatility is assumed to be time homogeneous, that is, it is the same over the life of the
option. And lastly, uncertainty about the asset price at option maturity is assumed to be
directly proportional to the asset price at commencement. The estimation of the
volatility of the underlying asset price is particularly problematic because it is the only
parameter of most mathematical models that is not observable directly. Moreover, the
sensitivity of the option value to this parameter places additional demands on the
estimation of volatility.
Types of Volatility Models
There are various classes of models and estimators, which have been proposed in the
literature for measuring volatility of asset returns. Models and estimators, assuming
volatility to be constant are the oldest ones among the models, which have been used to
estimate and forecast volatility. These models measure “unconditional volatility”. With
the recognition of empirical regularity that the volatility in financial markets is clustered
in time and is time varying, these models gave way to models measuring “conditional
volatility”. In addition, volatility estimated from the value of options, in which typically
volatility is the only unobservable parameter for valuation, allowed researchers and
practitioners to use “implied volatility”, i.e., the market forecast of volatility in valuing
the traded options. Moreover, Andersen et al. (2001a, 2001b) show that volatility
becomes observable and does not remain latent, if high frequency data is available. The
“realized volatility” estimated using high frequency data is model-free under very weak
assumptions.
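To illustrate how an implied volatility is backed out of a traded option price, the following sketch inverts the Black-Scholes call formula by bisection. The market inputs are hypothetical, and a production system would use a vendor pricing library rather than this hand-rolled pricer:

```python
import math

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=3.0, tol=1e-8):
    """Bisection search for the volatility that matches the market price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical data: spot 100, strike 100, 3 months, 5% rate, call trading at 4.00
print(implied_vol(4.00, S=100.0, K=100.0, T=0.25, r=0.05))
```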
Moreover, volatility models can be linear or non-linear2 in nature. A time series model is called linear when the function of the regressor variables is linear; in other words, if the function relating the observed time series Xt to the underlying shocks, say ut, is linear, then the model is a linear time series model. Linear volatility models are linear in the parameters, so that there is one parameter multiplied by each variable in the model. For example, a structural model could be something like:
$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$

or, more compactly, $y = X\beta + u$. It is additionally assumed that $u_t \sim N(0, \sigma^2)$. The linear
paradigm is a useful one. The properties of linear estimators are very well understood.
Many models that appear, prima facie, to be nonlinear, can be made linear by taking
logarithms or some suitable transformation. However, linear structural (and time series) models, as explained above, are unable to explain a number of important features common to financial data, like leptokurtosis (the tendency for financial asset returns to have distributions that exhibit fat tails and excess peakedness at the mean), volatility clustering or volatility pooling (the tendency for volatility in financial markets to appear in bunches) and leverage effects (the tendency for volatility to rise more following a large price fall than following a price rise of the same magnitude). Examples of linear time series models include the autoregressive (AR) models, the moving average (MA) models, the ARMA(p,q) model, the ARIMA(p,d,q) model, etc.
2 Treating an asset return (e.g. the log return rt of a stock) as a collection of random variables over time, we have a time series {rt}. Linear time series analysis provides a natural framework to study the dynamic structure of such a series. The theory of linear time series covers stationarity, dynamic dependence, autocorrelation functions, modeling and forecasting. Examples of econometric models in the category of linear time series models are simple autoregressive (AR) models, simple moving average (MA) models, seasonal models, unit-root non-stationarity, regression models with time series errors, fractionally differenced models for long-range dependence, etc. For an asset return rt, simple models attempt to capture the linear relationship between rt and the information available prior to time t.
During the last two decades or so, a new area of non-linear time series modeling has been developing fast. Here, there are basically two possibilities, viz. parametric or non-parametric approaches. Evidently, if in a particular situation there is certainty about the functional form, then the former should be used; otherwise the latter may be applied. When dealing with non-linearities, Campbell et al. (1997) noted that in linear time series the shocks are assumed to be uncorrelated but not necessarily identically and independently distributed (iid), whereas in non-linear time series the shocks are assumed to be iid, but there is a non-linear function relating the observed time series Xt to the underlying shocks ut. There can be models that are non-linear in mean, like the non-linear moving average model (non-linear in mean but linear in variance), and there can be models that are non-linear in variance, like Engle's ARCH model (non-linear in variance but linear in mean).
The combination of powerful methodological advances and important applications
within empirical finance produced explosive growth in the financial econometrics of
volatility dynamics, with the econometrics and finance literatures cross-fertilizing each
other furiously. Initial developments were tightly parametric, but the recent literature
has moved in less parametric, and even fully non-parametric directions. The parametric
approaches to volatility are based on explicit functional form assumptions regarding the
expected and/or instantaneous volatility corresponding to the strength of volatility
process at a point of time. In the discrete-time ARCH class of models, the expectations
are formulated in terms of directly observable variables, while the discrete- and
continuous-time stochastic volatility (SV) models both involve latent state variable(s).
The non-parametric approaches to volatility are generally free from such functional
form assumptions and hence afford estimates of notional volatility that are flexible yet
consistent (as the sampling frequency of the underlying returns increases). The non-
parametric approaches include ARCH filters and smoothers designed to measure the
volatility over infinitesimally short horizons, as well as the recently popularized realized
volatility measures for (nontrivial) fixed-length time intervals.
Examples of univariate volatility models include Engle's ARCH model, the GARCH
model, the exponential GARCH (EGARCH) model, the threshold GARCH (TGARCH)
model, the conditional heteroscedasticity autoregressive moving average (CHARMA)
model, the random coefficient autoregressive (RCA) model, the stochastic volatility
models of Melino and Turnbull (1990), Taylor (1994), Harvey, Ruiz, and Shephard
(1994), etc. The univariate volatility models can be generalized to the multivariate case
in which there are some simple methods for modeling the dynamic relationships
between volatility processes of multiple asset returns. Multivariate volatilities measure
the conditional covariance matrix of multiple asset returns. The multivariate volatility
models include the GARCH models for bivariate returns (includes the constant
correlation model as well as the time-varying correlation bivariate GARCH model), etc.
If $\rho_j$ denotes the autocorrelation between $r_t$ and $r_{t-j}$, then a time series is said to have short memory provided $\sum_{j=1}^{n}\rho_j$ converges to a constant as n becomes large. A series with long memory has autocorrelation values that decline slowly at a hyperbolic rate.
For example, the popular ARCH class of models is a short-memory volatility model.
The examples of long-memory models include the break model [Granger and Hyung
(2004), Starica and Granger (2004)], regime switching models [Hamilton and Susmel
(1994), Diebold and Inoue (2001), Hillebrand (2005)], fractionally integrated GARCH,
etc. Most previous research has focused on linear models, fractionally integrated models
in particular, to study long memory. More recent studies have shown that a number of
non-linear volatility models can also produce long memory characteristics in volatility.
Examples of such models include the previously mentioned break models and the regime
switching models. In these two categories of models, volatility is characterized by short
memory between breaks and within each regime. Without controlling for the breaks and
the changing regimes, volatility will produce spurious long-memory characteristics.
Each of these models represents a very different volatility structure and produces
volatility forecasts that are very different from each other and different from those of
the fractionally integrated and short-memory models.
3.3.1. Historical Volatility Models
Historical volatility is a measure of price fluctuation over time. Historical volatility uses historical (daily, weekly, monthly, quarterly and yearly) price data to empirically measure the volatility of an asset in the past. Compared with the other types of volatility models, historical volatility (HIS) models are easy to manipulate and construct. HIS
models can be categorized into two major categories: single state models and regime
switching models. All HIS models differ in the number of lagged volatility terms included in the model and the weights assigned to them, reflecting the tradeoff between increasing the amount of information and favoring more up-to-date information.
(1) Single State Models
(i) Random Walk Model
The simplest historical price model is the random walk model, where the difference between consecutive periods' volatility is modeled as random noise:

$\sigma_t = \sigma_{t-1} + v_t$
So the best forecast for tomorrow's volatility is today's volatility:

$\hat{\sigma}_{t+1} = \sigma_t$

where $\sigma_t$ alone is used as a forecast for $\sigma_{t+1}$.
(ii) Historical Average Method
In contrast, the Historical Average method makes a forecast based on the entire history:

$\hat{\sigma}_{t+1} = \frac{1}{t}\left(\sigma_t + \sigma_{t-1} + \cdots + \sigma_1\right)$

In other words, under the assumption of a stationary mean, the best forecast of today's volatility is a long-term average of past observed volatilities.
(iii) Moving Average Method
The simple Moving Average method:

$\hat{\sigma}_{t+1} = \frac{1}{\tau}\left(\sigma_t + \sigma_{t-1} + \cdots + \sigma_{t-\tau+1}\right)$

is similar to the historical average method, except that older information is discarded. The value of $\tau$ (i.e. the lag length to past information used) could be subjectively chosen or based on minimizing the in-sample forecast error, $e_t = \sigma_t - \hat{\sigma}_t$. The multi-period forecast $\hat{\sigma}_{t+\tau}$ for $\tau > 1$ will be the same as the one-step-ahead forecast $\hat{\sigma}_{t+1}$ for all three methods described above.
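A minimal sketch of the three single-state forecasts described so far (random walk, historical average and moving average), applied to a short hypothetical series of past observed volatilities:

```python
import numpy as np

sigma = np.array([0.010, 0.012, 0.011, 0.015, 0.014, 0.013])  # past observed volatilities

rw_forecast   = sigma[-1]            # random walk: tomorrow's forecast = today's value
hist_forecast = sigma.mean()         # historical average: long-term mean of the whole history
tau = 3
ma_forecast   = sigma[-tau:].mean()  # moving average: older information is discarded

print(rw_forecast, hist_forecast, ma_forecast)
```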
(iv) Exponential Smoothing Methods
The Exponential method:

$\hat{\sigma}_{t+1} = (1 - \beta)\sigma_t + \beta\hat{\sigma}_t$, with $0 \le \beta \le 1$

is similar to the historical average method, but more weight is given to the recent past and less weight to the distant past. The smoothing parameter $\beta$ is estimated by minimizing the in-sample forecast error $e_t$. The exponential smoothing methods are essentially methods of fitting a suitable curve to historical data of a given time series. There are a variety of these methods, such as single exponential smoothing, Holt's linear method, Holt-Winters' method and their variations. Although used in several areas of business and economic forecasting, these methods have now been supplanted by other, better methods.
(v) Exponentially Weighted Moving Average Method (EWMA)
The Exponentially Weighted Moving Average method:

$\hat{\sigma}_{t+1} = \frac{\sum_{i=1}^{\tau}\beta^{i-1}\sigma_{t-i+1}}{\sum_{i=1}^{\tau}\beta^{i-1}}$

is the moving average method with exponential weights. Again, the smoothing parameter $\beta$ is estimated by minimizing the in-sample forecast errors $e_t$.
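The following sketch implements the EWMA forecast above with a fixed smoothing parameter (β = 0.94, the value popularized by RiskMetrics for daily data, used here purely for illustration rather than estimated by minimizing in-sample errors):

```python
import numpy as np

def ewma_forecast(sigma, beta=0.94):
    """EWMA of past volatilities: weights beta**(i-1), normalized to sum to one."""
    tau = len(sigma)
    weights = beta ** np.arange(tau)      # weight beta**0 for the most recent value
    weights /= weights.sum()
    return np.dot(weights, sigma[::-1])   # reversed so sigma[-1] gets the largest weight

sigma = np.array([0.010, 0.012, 0.011, 0.015, 0.014, 0.013])
print(ewma_forecast(sigma))
```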
All the Historical volatility methods above have a fixed weighting scheme or a
weighting scheme that follows some declining pattern. Other types of historical models
have weighting schemes that are not pre-specified.
(vi) Simple Regression Method
The simplest method with a weighting scheme that is not pre-specified is the simple regression method:
$\sigma_t = \gamma + \beta_1\sigma_{t-1} + \beta_2\sigma_{t-2} + \cdots + \beta_n\sigma_{t-n} + v_t$

$\hat{\sigma}_{t+1} = \hat{\gamma} + \hat{\beta}_1\sigma_t + \hat{\beta}_2\sigma_{t-1} + \cdots + \hat{\beta}_n\sigma_{t-n+1}$
which expresses volatility as a function of its past values and an error term. It is
principally an autoregressive method.
(vii) ARIMA Model
If Y, a variable at time t, is modeled as:

$Y_t - \mu = \alpha_1(Y_{t-1} - \mu) + u_t$

where $\mu$ is the mean of Y and $u_t$ is an uncorrelated random error term with zero mean and constant variance $\sigma^2$ (i.e. it is white noise), then $Y_t$ is said to follow a first-order autoregressive, or AR(1), stochastic process3.
If the variable Y is modeled as:

$Y_t = \mu + \beta_0 u_t + \beta_1 u_{t-1}$

where $\mu$ is a constant and u, as before, is the white noise stochastic error term, then Y is said to follow a first-order moving average, or MA(1), process. Here, Y at time t is equal to a constant plus a moving average of the current and past error terms. In a similar fashion one can think of a second-order, MA(2), or a qth-order, MA(q), moving average process. In short, a moving average process is simply a linear combination of white noise error terms.
If Y has characteristics of both AR and MA then it is said to follow an ARMA process.
Y following an ARMA(1,1) process is modeled as:

$Y_t = \theta + \alpha_1 Y_{t-1} + \beta_0 u_t + \beta_1 u_{t-1}$

where $\theta$ represents a constant term. The above equation includes one autoregressive and
one moving average term. In general, in an ARMA (p,q) process, there will be p
autoregressive and q moving average terms.
In all the above three equations (for the AR, MA and ARMA processes), it is assumed that
the time series involved are (weakly) stationary in the sense that the mean and variance
of the time series are constant and its covariance is time-invariant. But, in reality, economic time series are generally non-stationary, that is, they are integrated.
3 Here the value of Y at time t depends on its value in the previous time period and a random term; the Y values are expressed as deviations from their mean value. In other words, the forecast value of Y at time t is simply some proportion ($\alpha_1$) of its value at time (t−1) plus a random shock or disturbance at time t. In a similar fashion one can think of a second-order autoregressive process AR(2) or a pth-order autoregressive process AR(p) for the variable Y.
If a time series is integrated of order 1 (i.e. it is I(1)), its first differences are I(0), that is
stationary4. If a time series is differenced d times to make it stationary, and then the
ARMA (p,q) is applied to model it, then the original time series is ARIMA (p,d,q), that
is, it is an autoregressive integrated moving average time series, where p denotes the
number of autoregressive terms, d the number of times the series is differenced before it
becomes stationary, and q the number of moving average terms. If d=0 (i.e. a series is
stationary to begin with), then ARIMA(p,d=0,q)=ARMA(p,q). An ARIMA (p,0,0)
process means a purely AR(p) stationary process; an ARIMA(0,0,q) means a purely
MA(q) stationary process. Given the values of p,d,q, one can tell what process is being
modeled. Popularly known as the Box-Jenkins methodology, but technically known as
the ARIMA methodology, the emphasis of this method is not on constructing single-
equation or simultaneous-equation models but on analyzing the probabilistic, or
stochastic, properties of economic time series on their own under the philosophy: let the
data speak for themselves. Unlike the regression models, in which Yt is explained by k
regressors X1,X2,……Xk, the BJ-type time series models allow Yt to be explained by
past, or lagged, values of Y itself and the stochastic error terms. For this reason,
ARIMA models are sometimes called atheoretic models because they are not derived
from any economic theory. ARIMA models pertaining to a single time series are
categorized as univaraite ARIMA models, whereas those pertaining to multiple time
series are categorized as multivariate ARIMA models.
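In practice, Box-Jenkins models are rarely coded by hand. The sketch below fits an ARIMA(1,1,1) to a simulated I(1) series using statsmodels (assuming the package is installed; the order (1,1,1) is illustrative, not the outcome of an identification exercise):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(0.1, 1.0, size=300))   # an I(1) series: random walk with drift

model = ARIMA(y, order=(1, 1, 1))   # p=1 AR term, d=1 difference, q=1 MA term
res = model.fit()
print(res.summary().tables[1])      # estimated AR and MA coefficients
print(res.forecast(steps=5))        # five-step-ahead forecasts
```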
(2) Regime-Switching and Transition Exponential Smoothing
Regime switching allows the stock price process to switch between k regimes
randomly; each regime is characterized by different model parameters, and the process describing which regime the price process is in at any time is assumed to be Markov (that
is, the probability of changing regime depends only on the current regime, not on the
history of the process). The rationale behind the regime-switching framework is that the
market may switch from time to time between, say, a stable low-volatility state and a
more unstable high-volatility regime. Periods of high volatility may arise, for example,
because of short-term political or economic uncertainty. In fact, an important stylized
4 Similarly, if a time series is I(2), its second difference is I(0). In general, if a time series is I(d),
after differencing it d times we obtain an I(0) series.
fact of stock volatility known as non-linearity, can be explained as a realization of
regime switching (Robe and Kosfeld (2001), de Lima (1998)). The switching is able to
describe a number of features inherent in the stock market returns, such as leptokurtosis
and mean reversion. Blanchard and Watson (1982) identify surviving bubbles or collapsing bubbles as a regime-switching process.
The basic intuition of regime-switching models is that the data generation process can change across different states (regimes). In particular, regime-switching models assume that the relevant relationships are linear within each regime, but possibly different across regimes. Example:

$y_t = x_t\beta_i + \varepsilon_t$

where $\varepsilon_t \sim N(0, \sigma_i^2)$ and i = 1, 2. In regime-switching models the transitions across regimes are assumed to be stochastic.
There are two types of regime-switching models: threshold models, in which the
regimes are characterized by an observable variable (SETAR, STAR, etc.); and the Markov-switching models, in which regimes are characterized by an unobservable variable.
(i) Threshold Models
Threshold models are an interesting alternative for modeling both returns and
volatilities. The fundamental idea behind these models is the introduction of regimes
based on thresholds, thus allowing the analysis of complex stochastic systems from
simple subsystems. In this category of models, the regime changes are characterized as
a deterministic function of past realizations of some observed variable. The overall
process is non-linear, while following a linear AR model in each regime. This category
of models can further have following variants:
(a) Threshold Autoregressive (TAR) Models
Given by Tong (1983) and Tsay (1989), the regime in this model switches according to the observable past history of the system. These models have several attractive features, like limit cycles, amplitude-dependent frequencies and jump phenomena; but there is a somewhat arbitrary identification of the threshold variable and the threshold value(s).
(b) Self Exciting Threshold Autoregressive (SETAR) Model
Given a time series of data Xt, the SETAR model is a tool for predicting future values in this series, assuming that the behavior of the series changes once the series enters a different regime. The switch from one regime to another depends on the past values of the X series (hence the Self-Exciting portion of the name). The model consists of k autoregressive parts, each for a different regime. The model is usually referred to as the SETAR(k,p) model, where k is the number of regimes and p is the order of the autoregressive part (since those can differ between regimes, the p portion is sometimes dropped and models are denoted simply as SETAR(k)).
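A minimal simulation of a SETAR(2,1) process, with an illustrative threshold at zero and AR(1) dynamics that differ between the two regimes (all parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = np.zeros(n)
for t in range(1, n):
    if x[t - 1] <= 0.0:                          # regime 1: below the threshold
        x[t] = 0.8 * x[t - 1] + rng.normal(0, 1.0)
    else:                                        # regime 2: above the threshold
        x[t] = -0.4 * x[t - 1] + rng.normal(0, 1.0)

# The regime is 'self-exciting': it depends only on the past value of x itself
print(np.mean(x <= 0))   # fraction of time spent in regime 1
```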
The SETAR model is a special case of Tong’s general threshold autoregressive models.
The latter allows the threshold variable to be very flexible, such as an exogenous time
series in the open-loop threshold autoregressive system, a Markov chain in the Markov-
chain driven threshold autoregressive models, which is also known as the Markov
switching model.
Recent research has led to the introduction of hybrid models, known as second-generation models, by combining the SETAR model with the threshold Stochastic Volatility (THSV) and the threshold GARCH (TGARCH) models. The
conditional heteroscedasticity models (GARCH(1,1), SV) are non-linear time series
models most commonly used in literature. Generally, these two models cannot capture
the asymmetry of volatility. This characteristic, known as the leverage effect, is related
to the asymmetric behavior of the market, in the sense that it is more volatile after a
continuous decrease in prices than after a rise (both of the same magnitude). The non-
linear SETAR models, however, allow the asymmetries in the mean (non-linearity in
the mean of returns) to be captured. The former models belong to the class of “first-
generation models”. To capture the asymmetric effect in the volatility, the introduction
of thresholds in the volatility equation has been proposed, thus obtaining the Threshold
GARCH (TGARCH) MODEL OR Threshold Stochastic Volatility (THSV) model. If
both kinds of non-linearity is explained in terms of the mean and variance, the SETAR
Chapter-3 Volatility: Concepts and Various Models
134
model is combined with the previous ones, obtaining the SETAR-TGARCH model and
the SETAR-THSV model. These econo-generation models capture the main features of
volatility. The difference is that the SETAR-THSV is more flexible than the SETAR-
TGARCH to capture kurtosis though the estimation procedure is computationally more
costly.
(c) Smooth Transition Autoregressive (STAR) Model
STAR models are applied to time series data as an extension of autoregressive models
in order to allow for higher degree of flexibility in model parameters through a smooth
transition. Given a time series of data Xt , the STAR model is a tool for predicting
future values in the series, assuming that the behavior of the series changes depending
on the value of the transition variable. The transition might depend on the past values of
the X series (similar to SETAR model) or exogenous variables. The model consists of
two autoregressive parts linked by the transition function. The model is usually referred to as a STAR(p) model, preceded by the letter describing the transition function, where p is the order of the autoregressive part. The most popular transition functions include the exponential function and the first- and second-order logistic functions. These give rise to the Logistic STAR (LSTAR) and Exponential STAR (ESTAR) models.
STAR models were introduced and developed by Kung-sik Chan and Howell Tong in
1986, in which the same acronym was used; it originally stood for Smooth Threshold Autoregressive. The models can be understood as a two-regime SETAR model with smooth transition between regimes, or as a continuum of regimes. In both cases the
presence of the transition function is the defining feature of the model as it allows for
changes in values of the parameters.
(ii) Markov-Switching Models
A Markov-switching model is a non-linear specification in which different states of the
world affect the evolution of the time series. The dynamic properties depend on the
present regime, with the regimes being realizations of a hidden Markov chain with a
finite state space. Markov switching models were introduced to the econometric
mainstream by Hamilton (1988, 1989) and continue to gain popularity in the financial
time series analysis. Even the most basic Markov-switching model with constant regime
parameters is capable of describing many typical characteristics of financial time series.
Hamilton’s original motivation was to model long swings in the growth rate of output,
but instead he found evidence for discrete switches in the growth rate at business cycle
frequencies. Output growth was modeled as the sum of a discrete Markov chain and a
Gaussian Autoregression:
$y_t = n_t + x_t$

where

$n_t = \alpha_0 + \alpha_1 s_t$, with $s_t = 0$ or $1$,

$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \phi_3 x_{t-3} + \phi_4 x_{t-4} + \sigma\epsilon_t$,

$P(s_t = 1 \mid s_{t-1} = 1) = p$, $P(s_t = 0 \mid s_{t-1} = 0) = q$, $\epsilon_t \sim N(0,1)$
The major estimation difficulty with the model is the lack of separate observability of $n_t$ and $x_t$.
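The following sketch simulates a simplified version of Hamilton's specification, with a two-state Markov chain shifting the mean of an AR(1) component (Hamilton used an AR(4); the shorter lag and all parameter values here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
p, q = 0.95, 0.90            # P(s_t=1|s_{t-1}=1) and P(s_t=0|s_{t-1}=0)
a0, a1, phi, sig = 0.0, 1.0, 0.5, 0.5
n = 300

s = np.zeros(n, dtype=int)   # hidden regime
x = np.zeros(n)              # autoregressive component
y = np.zeros(n)              # observed series: y_t = n_t + x_t
for t in range(1, n):
    stay = p if s[t - 1] == 1 else q
    s[t] = s[t - 1] if rng.random() < stay else 1 - s[t - 1]
    x[t] = phi * x[t - 1] + sig * rng.normal()
    y[t] = a0 + a1 * s[t] + x[t]      # n_t = a0 + a1 * s_t

# Only y is observable; s and x must be inferred jointly, which is
# exactly the estimation difficulty noted in the text.
print(y[:10].round(2))
```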
There have been many extensions and generalizations of the original Markov switching
model given by Hamilton. In the original model the conditional distribution of a
realization depends on the previous four values of the Markov process, which makes the
estimation of the model computationally demanding. Hamilton (1990, 1994) suggests
alternative Markov switching models where the regression parameters, rather than the
mean, switch between states. This apparently minor modification substantially simplifies estimation, which can be carried out as an application of the EM algorithm. Multivariate Markov-switching models have also been introduced.
According to Hansen (1992), once one generalizes the switching model to all regression
parameters, current states are independent of their previous realizations, so that a simple
switching model (mixture model), rather than a Markov switching model, seems
appropriate. A generalization of Markov switching GARCH models was developed by
Gray (1996) and subsequently modified by Klaassen (2002). While the model of Gray
is attractive since it combines Markov-switching with the GARCH effects, its analytical
intractability is a serious drawback. As a consequence, conditions for covariance
stationarity have yet to be established. Closely related is the lack of an analytic
expression for the covariance structure of the squared process. Cai (1994) and Hamilton
and Susmel (1994) propose Markov-switching ARCH models, though (standard)
GARCH (1,1) models are known to provide better descriptions of market volatility than
even high-order ARCH specifications. Their restriction to ARCH models was due to the path dependence in Markov switching GARCH models that arises when "literally" translating the GARCH model of Bollerslev (1986) to a regime switching setting.
3.3.2. ARCH Model and its Variants
Time series models were initially introduced either for descriptive purposes, like prediction and seasonal correction, or for dynamic control. In the 1970s, the research
focused on a specific class of time series models, the so-called autoregressive moving
average process (ARMA), which were very easy to implement. In these models the
current value of the series of interest is written as a linear function of its own lagged
values and current and past values of some noise process, which can be interpreted as
innovations to the system. However, this approach has two major drawbacks: (1) it is
essentially a linear setup, which automatically restricts the type of dynamics to be
approximated; (2) it is generally applied without imposing a priori constraints on the
autoregressive and moving average parameters, which is inadequate for structural
interpretations.
Among the fields of application where the standard ARMA fit is poor are financial and monetary problems. Financial time series feature various forms of non-linear dynamics, the crucial one being the strong dependence of the instantaneous variability of the series on its own past. Moreover, financial theories based on concepts like
equilibrium or rational behavior of the investors would naturally suggest including and
testing some structural constraints on the parameters. In this context, ARCH models,
introduced by Engle in 1982, arise as an appropriate framework for studying these
problems. Currently, there exist more than a hundred papers and dozens of Ph.D. theses on this topic, which reflects the importance of this approach for statistical theory,
finance and empirical work.
(1) Basic ARCH Model
The underlying logic of this category of volatility models is that of volatility clustering
in financial time series, that is, there are periods in which prices show wide swings for
an extended time period followed by periods in which there is relative calm. In reality
one often observes that large positive and large negative observations in financial time
series tend to appear in clusters. A characteristic of most of these financial time series is
that in their level form they are random walks; that is, they are non-stationary. On the
other hand, in the first difference form, they are generally stationary. Therefore, instead
of modeling the levels of financial time series, authors try to model their first
differences. But these first differences often exhibit wide swings, or volatility,
suggesting that the variance of financial time series varies over time. The so-called
autoregressive conditional heteroscedasticity (ARCH) model, originally developed by
Engle (1982), comes in handy in modeling such “varying variance”. As the name
suggests, heteroscedasticity, or unequal variance, may have an autoregressive structure in that heteroscedasticity observed over different periods may be autocorrelated.
Suppose the following AR(1), or ARIMA(1,0,0)5 model is considered:
$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + u_t$ [3.7]

where $X_t$ is the mean-adjusted relative change in stock prices and $u_t$ is a white noise error term. This model postulates that volatility in the current period is related to its value in the previous period plus a white noise error term. If $\beta_1$ is positive, it suggests that if volatility was high in the previous period, it will continue to be high in the current period, indicating volatility clustering. If $\beta_1$ is equal to zero, then there is no volatility clustering. The statistical significance of the estimated $\beta_1$ can be judged by the usual t-test. Suppose the following AR(p) model is considered:

$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + \beta_2 X_{t-2}^2 + \cdots + \beta_p X_{t-p}^2 + u_t$ [3.8]
5 For details of AR and ARIMA models refer to section 3.3.1 point (1) (vii).
This model suggests that volatility in the current period is related to volatility in the past
p periods, the value of p being an empirical question. Equation (3.7) is an example of
ARCH(1) model and equation (3.8) is an example of ARCH(p) model, where p
represents the number of autoregressive terms in the model.
An important feature of ARCH models is that they more readily explain the "fat-tailed" leptokurtic distributions of asset price changes.
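To see how even the simplest ARCH process generates volatility clustering and fat tails, the sketch below simulates an ARCH(1) with illustrative parameters and compares the sample kurtosis with the Gaussian value of 3:

```python
import numpy as np

rng = np.random.default_rng(5)
n, b0, b1 = 10_000, 0.1, 0.5           # illustrative ARCH(1) parameters
u = np.zeros(n)
h = np.full(n, b0 / (1 - b1))          # conditional variance, started at its mean
for t in range(1, n):
    h[t] = b0 + b1 * u[t - 1] ** 2     # variance responds to last squared shock
    u[t] = np.sqrt(h[t]) * rng.normal()

kurt = np.mean(u**4) / np.mean(u**2) ** 2
print(f"sample kurtosis: {kurt:.2f} (Gaussian: 3)")   # > 3: fat tails
```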
(2) Basic GARCH Model
Since its discovery in 1982, ARCH modeling has become a growth industry, with all its
variations on the original model. One that has become popular is the Generalized
Autoregressive Conditional Heteroscedasticity (GARCH) model, originally proposed
by Bollerslev in 1986. The simplest GARCH model is the GARCH(1,1) model, which
can be written as:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2\sigma_{t-1}^2$

which says that the conditional variance of u at time t depends not only on the squared error term in the previous time period [as in ARCH(1)] but also on its conditional variance in the previous time period. This model can be generalized to a GARCH(p,q)
model in which there are p lagged terms of the squared error term and q terms of the
lagged conditional variances.
It can be noted that, by recursive substitution, a GARCH(1,1) model can be written as an ARCH model of infinite order with geometrically declining coefficients; more generally, a GARCH(p,q) model corresponds to an infinite-order ARCH model.
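Given parameter values, the GARCH(1,1) conditional variance follows from a simple recursion over the residuals, as in the sketch below (the parameters are hypothetical; in applications they would be estimated by maximum likelihood, for example with the arch package):

```python
import numpy as np

def garch_variance(u, alpha0, alpha1, beta1):
    """Filter the GARCH(1,1) conditional variance series from residuals u."""
    h = np.empty_like(u)
    h[0] = alpha0 / (1 - alpha1 - beta1)      # start at the unconditional variance
    for t in range(1, len(u)):
        h[t] = alpha0 + alpha1 * u[t - 1] ** 2 + beta1 * h[t - 1]
    return h

rng = np.random.default_rng(6)
u = rng.normal(0, 0.01, size=1000)            # stand-in residuals
h = garch_variance(u, alpha0=1e-6, alpha1=0.08, beta1=0.90)
print(np.sqrt(h[-5:]))                        # latest conditional volatilities
```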
(3) Variants of ARCH/GARCH
There have been breathtaking variations, extensions and applications in the basic
ARCH/GARCH model since their introduction. The list of these variations, extensions or applications is very long and beyond the scope of the present study; therefore, only some of these are explained below. The details of the respective models can be gathered
from their respective original articles.
(i) Augmented ARCH (AARCH)
The AARCH model of Bera, Higgins and Lee (1992) extends the linear ARCH(q)
model to allow the conditional variance to depend on cross-products of the lagged
innovations. Defining the q×1 vector $e_{t-1} \equiv (\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q})'$, the AARCH(q) model may be expressed as:

$\sigma_t^2 = \omega + e_{t-1}'\,A\,e_{t-1}$

where A denotes a q×q symmetric positive definite matrix. If A is diagonal, the model
reduces to the standard linear ARCH(q) model. The Generalized AARCH, or
GAARCH model is obtained by including lagged conditional variances on the right-
hand-side of the equation. The slightly more general GQARCH (Generalized Quadratic
ARCH) representation was proposed independently by Sentana (1995). The
GQARCH(p,q) model is defined as:
$\sigma_t^2 = \omega + \sum_{i=1}^{q}\psi_i\epsilon_{t-i} + \sum_{i=1}^{q}\alpha_i\epsilon_{t-i}^2 + 2\sum_{i=1}^{q}\sum_{j=i+1}^{q}\alpha_{ij}\epsilon_{t-i}\epsilon_{t-j} + \sum_{i=1}^{p}\beta_i\sigma_{t-i}^2$
The model simplifies to the linear GARCH(p,q) model if all of the $\psi_i$'s and the $\alpha_{ij}$'s are equal to zero. Defining the q×1 vector $e_{t-1} \equiv (\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q})'$, the model may alternatively be expressed as:

$\sigma_t^2 = \omega + \Psi'e_{t-1} + e_{t-1}'\,A\,e_{t-1} + \sum_{i=1}^{p}\beta_i\sigma_{t-i}^2$

where $\Psi$ denotes the q×1 vector of $\psi_i$ coefficients and A refers to the q×q symmetric matrix of $\alpha_i$ and $\alpha_{ij}$ coefficients. Conditions on the parameters for the conditional variance to be positive almost surely, and the model to be well-defined, are given in Sentana (1995).
(ii) Modified ARCH or Multiplicative ARCH (GARCH) or Multivariate GARCH or
Mixture GARCH (i.e. MARCH/MGARCH)
Friedman, Laibson and Minsky (1989) denote the class of GARCH(1,1) models in
which the conditional variance depends non-linearly on the lagged squared innovations
as Modified ARCH models:
$\sigma_t^2 = \omega + \alpha F(\epsilon_{t-1}^2) + \beta\sigma_{t-1}^2$

where F(·) denotes a positive-valued function. In their estimation of the model, Friedman, Laibson and Minsky (1989) use the function $F(\epsilon_{t-1}^2) = \sin(\theta\epsilon_{t-1}^2)\,I(\theta\epsilon_{t-1}^2 \le \pi/2) + I(\theta\epsilon_{t-1}^2 > \pi/2)$.
Slightly different versions of the univaraite Multiplicative GARCH model were
proposed independently by Geweke (1986), Pantula (1986) and Milhoj (1987). The
model is more commonly referred to as the Log-GARCH model. The Logarithmic-
GARCH model parameterizes the logarithmic conditional variance as a function of the
lagged logarithmic variances and the lagged logarithmic squared innovations:
$\log\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i\log\epsilon_{t-i}^2 + \sum_{i=1}^{p}\beta_i\log\sigma_{t-i}^2$
The model may alternatively be expressed as:
$\sigma_t^2 = \exp(\omega)\prod_{i=1}^{q}\left(\epsilon_{t-i}^2\right)^{\alpha_i}\prod_{i=1}^{p}\left(\sigma_{t-i}^2\right)^{\beta_i}$
*
In light of this alternative representation, the model is also sometimes referred to as a
Multiplicative GARCH, or MGARCH model.
MGARCH may also denote the Multivariate GARCH models, which were first analysed and estimated by Bollerslev, Engle and Wooldridge in 1988. The unrestricted linear MGARCH(p,q) model is defined by:

$\mathrm{vech}(\Omega_t) = C + \sum_{i=1}^{q}A_i\,\mathrm{vech}(\epsilon_{t-i}\epsilon_{t-i}') + \sum_{i=1}^{p}B_i\,\mathrm{vech}(\Omega_{t-i})$

where vech(·) denotes the operator that stacks the lower triangular portion of a symmetric N×N matrix into an N(N+1)/2 × 1 vector of the corresponding unique elements, and the $A_i$ and $B_i$ matrices are all of compatible dimension N(N+1)/2 × N(N+1)/2. This vectorized representation is also sometimes referred to as a VECH GARCH model. The general vech representation does not guarantee that the
resulting conditional covariance matrices $\Omega_t$ are positive definite. Also, the model involves a total of $N(N+1)/2 + (p+q)(N^4 + 2N^3 + N^2)/4$ parameters, which becomes prohibitively expensive from a practical computational point of view for anything but the bivariate case, or N = 2.
The MGARCH may also denote another category of models known as Mixture GARCH (MGARCH) models. The MAR-ARCH model of Wong and Li (2001) and the MAGARCH model of Zhang, Li and Yuen (2006) postulate that the time t conditional variance is given by a time-invariant mixture of different GARCH models.
(iii) AGARCH (Asymmetric or Absolute Value GARCH)
The Asymmetric GARCH model was introduced by Engle (1990) to allow for
asymmetric effects of negative and positive innovations. The AGARCH(1,1) model is
defined as:
$\sigma_t^2 = \omega + \alpha\epsilon_{t-1}^2 + \gamma\epsilon_{t-1} + \beta\sigma_{t-1}^2$

where negative values of $\gamma$ imply that positive shocks will result in smaller increases in future volatility than negative shocks of the same absolute magnitude. The model may alternatively be expressed as:

$\sigma_t^2 = \omega' + \alpha(\epsilon_{t-1} + \gamma')^2 + \beta\sigma_{t-1}^2$

for which $\omega' > 0$, $\alpha \ge 0$ and $\beta \ge 0$ readily ensure that the conditional variance is positive almost surely.
AGARCH may also denote another category of models known as Absolute Value
GARCH or TS-GARCH (Taylor-Schwert GARCH) model. The TS-GARCH(p,q)
model of Taylor (1986) and Schwert (1989) parameterizes the conditional standard deviation as a distributed lag of the absolute innovations and the lagged conditional standard deviations:

$\sigma_t = \omega + \sum_{i=1}^{q}\alpha_i|\epsilon_{t-i}| + \sum_{i=1}^{p}\beta_i\sigma_{t-i}$
This formulation mitigates the influence of large, in an absolute sense, observations
relative to the traditional GARCH(p,q) model. The TS-GARCH model is also
sometimes referred to as an Absolute Value GARCH, or AVGARCH, model, or simply
an AGARCH model. It is a special case of the more general Power GARCH, or
NGARCH6, formulation.
(iv) ANN-ARCH Model (Artificial Neural Network ARCH Model)
Donaldson and Kamstra (1997) term the GJR model7, augmented with a logistic function as commonly used in neural networks, the ANN-ARCH model.
(v) ANST-GARCH (Asymmetric Non-Linear Smooth Transition GARCH)
The ANST-GARCH(1,1) model of Nam, Pyun and Arize (2002) postulates that:

$\sigma_t^2 = \omega + \alpha\epsilon_{t-1}^2 + \beta\sigma_{t-1}^2 + \left(\xi + \mu\epsilon_{t-1}^2 + \rho\sigma_{t-1}^2\right)F(\epsilon_{t-1}, \gamma)$

where F(·) denotes a smooth transition function. The model simplifies to the ST-GARCH(1,1) model of Gonzalez-Rivera (1998) for $\xi = \rho = 0$, and to the standard GARCH(1,1) model for $\xi = \mu = \rho = 0$.

(vi) APARCH Model (Asymmetric Power ARCH)
The APARCH, or APGARCH, model of Ding, Granger and Engle (1993) nests several of the most popular univariate parameterizations. In particular, the APGARCH(p,q) model,

$\sigma_t^{\delta} = \omega + \sum_{i=1}^{q}\alpha_i\left(|\epsilon_{t-i}| - \gamma_i\epsilon_{t-i}\right)^{\delta} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^{\delta}$

reduces to the standard linear GARCH(p,q) model for $\delta = 2$ and $\gamma_i = 0$, the TS-GARCH(p,q) model for $\delta = 1$ and $\gamma_i = 0$, the NGARCH(p,q) model for $\gamma_i = 0$, the GJR-GARCH model for $\delta = 2$ and $0 \le \gamma_i \le 1$, and the TGARCH(p,q) model for $\delta = 1$ and $0 \le \gamma_i \le 1$, while the log-GARCH(p,q) model is obtained as the limiting case of the model for $\delta = 0$ and $\gamma_i = 0$.
6 NGARCH model is explained later.
7 Explained later in the list.
(vii) ARCH-M Model (ARCH-in-Mean Model)
The ARCH-M model was first introduced by Engle, Lilien and Robins (1987) for
modeling risk-return tradeoffs in the term structure of U.S. interest rates. The model
extends the ARCH regression model in Engle (1982) by allowing the conditional mean
to depend directly on the conditional variance,
$y_t \mid \mathcal{F}_{t-1} \sim N\!\left(x_t'\beta + \delta\sigma_t^2,\;\sigma_t^2\right)$
This breaks the block-diagonality between the parameters in the conditional mean and
the parameters in the conditional variance, so that the two sets of parameters must be
estimated jointly to achieve asymptotic efficiency. Non-linear functions of the
conditional variance may be included in the conditional mean in a similar fashion. The
final preferred model estimated in Engle, Lilien and Robins (1987) parameterizes the
conditional mean as a function of $\log\sigma_t^2$. Multivariate extensions of the ARCH-M model were first analysed and estimated by Bollerslev, Engle and Wooldridge (1988).
(viii) ARCH-SM (ARCH Stochastic Mean) Model
The ARCH-SM acronym was coined by Lee and Taniguchi (2005) to distinguish ARCH models in which $\epsilon_t \equiv y_t - E(y_t \mid \mathcal{F}_{t-1}) \neq y_t - E(y_t)$.

(ix) ARCH-Smoothers
ARCH-Smoothers, first developed by Nelson (1996b) and Foster and Nelson (1996),
extend the ARCH and GARCH models and corresponding ARCH-Filters based solely
on past observations to allow for the use of both current and future observations in the
estimation of the latent volatility.
(x) ATGARCH (Asymmetric Threshold GARCH)
The ATGARCH(1,1) model of Crouhy and Rockinger (1997) combines and extends the TS-GARCH(1,1) and GJR(1,1) models by allowing the threshold used in characterizing the asymmetric response to differ from zero:

$\sigma_t = \omega + \alpha|\epsilon_{t-1}|\,I(\epsilon_{t-1} \ge \gamma) + \delta|\epsilon_{t-1}|\,I(\epsilon_{t-1} < \gamma) + \beta\sigma_{t-1}$
Higher order ATGARCH(p,q) models may be defined analogously.
(xi) β-ARCH (Beta ARCH) Model

The β-ARCH(1) model of Guegan and Diebolt (1994) allows the conditional variance to depend asymmetrically on positive and negative lagged innovations:

$\sigma_t^2 = \omega + \alpha\left[I(\epsilon_{t-1} > 0) + \gamma I(\epsilon_{t-1} < 0)\right]\epsilon_{t-1}^{2\beta}$

where I(·) denotes the indicator function. For $\alpha = \gamma$ and $\beta = 1$ the model reduces to the standard linear ARCH(1) model. More general β-ARCH(q) and β-GARCH(p,q) models may be defined in a similar fashion.
(xii) CGARCH (Component GARCH/Composite GARCH Model)
The component GARCH model of Engle and Lee (1999) was designed to better account for long-run volatility dependencies. Rewriting the GARCH(1,1) model as:
\[ \sigma_t^2 = \sigma^2 + \alpha(\varepsilon_{t-1}^2 - \sigma^2) + \beta(\sigma_{t-1}^2 - \sigma^2) \]
where σ² ≡ ω(1 − α − β)⁻¹ refers to the unconditional variance, the CGARCH model is obtained by relaxing the assumption of a constant σ². Specifically,
\[ \sigma_t^2 = \zeta_t + \alpha(\varepsilon_{t-1}^2 - \zeta_{t-1}) + \beta(\sigma_{t-1}^2 - \zeta_{t-1}) \]
with the corresponding long-run variance parameterized by the separate equation,
\[ \zeta_t = \omega + \rho\zeta_{t-1} + \varphi(\varepsilon_{t-1}^2 - \sigma_{t-1}^2) \]
Substituting this expression for ζ_t into the former equation, the CGARCH model may alternatively be expressed as a restricted GARCH(2,2) model.
The CGARCH acronym may also denote the Composite GARCH model of den Hertog (1994), which represents ε_t² as the sum of a latent permanent random walk component and another latent AR(1) component.
(xiii) COGARCH (Continuous GARCH Model)
The continuous-time COGARCH(1,1) model was proposed by Klüppelberg, Lindner and
Maller (2004). The model is obtained by backward solution of the difference equation
defining the discrete-time GARCH(1,1) model, replacing the standardized innovations by
the increments to a Lévy process, L(t). In contrast to the GARCH diffusion model of
Nelson (1990b), which involves two independent Brownian motions, the COGARCH
model is driven by a single innovation process. Higher order COGARCH(p,q) processes
have been developed by Brockwell, Chadraa and Lindner (2006).
(xiv) Copula GARCH
Any joint distribution function may be expressed in terms of its marginal distribution
functions and a copula function linking these. The class of copula GARCH models
builds on this idea in the formulation of multivariate GARCH models by linking
univariate GARCH models through a sequence of possibly time-varying conditional copulas. Jondeau and Rockinger (2006) and Patton (2006) discuss at length the estimation of, and inference in, copula GARCH models.
(xv) CorrARCH (Correlated ARCH) Model
The bivariate CorrARCH model of Christodoulakis and Satchell (2002) parameterizes
the time-varying conditional correlations as a distributed lag of the product of the
standardized innovations from univariate GARCH models for each of the two series. A
Fisher transform is used to ensure that the resulting correlations always lie between -1
and 1.
(xvi) DAGARCH (Dynamic Asymmetric GARCH) Model
The DAGARCH model of Caporin and McAleer (2006) extends the GJR-GARCH
model to allow for multiple thresholds and time-varying asymmetric effects.
(xvii) DTARCH (Double Threshold ARCH) Model
The DTARCH model of Li and Li (1996) allows the parameters in both the conditional
mean and the conditional variance to change across regimes, with the m different regimes determined by a set of threshold parameters for some lag d ≥ 1 of the observed process y_t, say r_{j−1} < y_{t−d} ≤ r_j, where r_0 = −∞ < r_1 < ⋯ < r_m = ∞.

(xviii) EGARCH (Exponential GARCH) Model
The EGARCH model was developed by Nelson (1991). The model explicitly allows for
asymmetries in the relationship between return and volatility. In particular, let z_t ≡ ε_t σ_t⁻¹ denote the standardized innovations. The EGARCH(1,1) model may then be expressed as:
\[ \log\sigma_t^2 = \omega + \alpha(|z_{t-1}| - E|z_{t-1}|) + \gamma z_{t-1} + \beta\log\sigma_{t-1}^2 \]
For γ < 0 negative shocks will obviously have a bigger impact on future volatility than
positive shocks of the same magnitude. This effect, which is typically observed
empirically with equity index returns, is often referred to as a “leverage effect”, although
it is now widely agreed that the apparent asymmetry has little to do with actual financial
leverage. By parameterizing the logarithm of the conditional variance as opposed to the
conditional variance, the EGARCH model also avoids complications from having to
ensure that the process remains positive. Meanwhile, the logarithmic transformation
complicates the construction of unbiased forecasts for the level of future variance.
(xix) EVT-GARCH (Extreme Value Theory GARCH)
The EVT-GARCH approach, pioneered by McNeil and Frey (2000), relies on extreme
value theory for i.i.d. random variables and corresponding generalized Pareto
distributions for more accurately characterizing the tails of the distributions of the
standardized innovations from GARCH models. This idea may be used in the
calculation of low-probability quantile, or Value-at-Risk, type predictions.
(xx) F-ARCH (Factor ARCH) Model
The multivariate factor ARCH model developed by Diebold and Nerlove (1989) and
the factor GARCH model of Engle, Ng and Rothschild (1990) assume that the temporal variation in the N×N conditional covariance matrix for a set of N returns can be described by univariate GARCH models for a smaller set of K < N portfolios,
\[ \Omega_t = \Omega + \sum_{k=1}^{K}\lambda_k\lambda_k'\sigma_{k,t}^2 \]
where λ_k and σ²_{k,t} refer to the time-invariant N×1 vector of factor loadings and the time t conditional variance for the kth factor, respectively. More specifically, the single-factor F-GARCH(1,1) model may be expressed as:
\[ \Omega_t = \Omega + \lambda\lambda'\big(\alpha(w'\varepsilon_{t-1})^2 + \beta w'\Omega_{t-1}w\big) \]
where w denotes an N×1 vector, and α and β are both scalar parameters.
(xxi) FIAPARCH (Fractionally Integrated Power ARCH) Model
The FIAPARCH(p,d,q) model of Tse (1998) combines the FIGARCH(p,d,q) and the APARCH(p,q) models in parameterizing σ_t^δ as a fractionally integrated distributed lag of (|ε_t| − γε_t)^δ.

(xxii) FIGARCH (Fractionally Integrated GARCH) Model
The FIGARCH model proposed by Bollerslev and Mikkelsen (1996) relies on an
ARFIMA type representation to better capture the long-run dynamic dependencies in
the conditional variance. The model may be seen as a natural extension of the IGARCH model, allowing for fractional orders of integration in the autoregressive polynomial in the corresponding ARMA representation,
\[ \varphi(L)(1 - L)^d\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
where ν_t ≡ ε_t² − σ_t², 0 < d < 1, and the roots of φ(z) = 0 and 1 − β(z) = 0 are all outside the unit circle. For values of 0 < d < 1/2 the model implies an eventual slow hyperbolic decay in the autocorrelations for σ_t².
(xxiii) GQARCH (Generalized Quadratic ARCH) Model
The GQARCH(p,q) model of Sentana (1995) is defined by:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q}\psi_i\varepsilon_{t-i} + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + 2\sum_{i=1}^{q}\sum_{j>i}\alpha_{ij}\varepsilon_{t-i}\varepsilon_{t-j} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \]
The model simplifies to the linear GARCH(p,q) model if all of the ψ_i and the α_{ij} are equal to zero. Defining the q×1 vector e_{t−1} ≡ (ε_{t−1}, ε_{t−2}, …, ε_{t−q})′, the model may alternatively be expressed as:
\[ \sigma_t^2 = \omega + \Psi'e_{t-1} + e_{t-1}'A\,e_{t-1} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \]
where Ψ denotes the q×1 vector of ψ_i coefficients and A refers to the q×q symmetric matrix of α_i and α_{ij} coefficients. Conditions on the parameters for the conditional variance to be positive almost surely and the model well-defined are given in Sentana (1995).
(xxiv) GRS-GARCH (Generalized Regime-Switching GARCH) Model
The GRS-GARCH model proposed by Gray (1996) allows the parameters in the
GARCH model to depend upon an unobservable latent state variable governed by a
first-order Markov process. By aggregating the conditional variances over all of the possible states at each point in time, the model is formulated in such a way that it breaks the path-dependence which complicates the estimation of the SWARCH models of Cai (1994) and Hamilton and Susmel (1994).
(xxv) HARCH (Heterogeneous ARCH) Model
The HARCH(n) model of Müller, Dacorogna, Davé, Olsen, Pictet and von Weizsäcker (1997) parameterizes the conditional variance as a function of the square of the sum of lagged innovations, or the squared lagged returns, over different horizons,
\[ \sigma_t^2 = \omega + \sum_{j=1}^{n}\gamma_j\Big(\sum_{i=1}^{j}\varepsilon_{t-i}\Big)^2 \]
The model is motivated as arising from the interaction of traders with different
investment horizons. The HARCH model may be interpreted as a restricted QARCH
model.
(xxvi) IGARCH (Integrated GARCH) Model
Estimates of the standard linear GARCH(p,q) model often result in the sum of the estimated α and β coefficients being close to unity. Rewriting the GARCH(p,q) model as an ARMA(max(p,q), p) model for the squared innovations:
\[ [1 - \alpha(L) - \beta(L)]\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
where ν_t ≡ ε_t² − σ_t², and α(L) and β(L) denote appropriately defined lag polynomials, the IGARCH model of Engle and Bollerslev (1986) imposes an exact unit root in the corresponding autoregressive polynomial, 1 − α(L) − β(L) = φ(L)(1 − L), so that the model may be written as:
\[ \varphi(L)(1 - L)\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
Even though the IGARCH model is not covariance stationary, it is still strictly stationary with a well-defined non-degenerate limiting distribution; see Nelson (1990a). Also, as shown
by Lee and Hansen (1994) and Lumsdaine (1996), standard inference procedures may
be applied in testing the hypothesis of a unit root, or α(1) + β(1) = 1.

(xxvii) LARCH (Linear ARCH) Model
The ARCH(∞) representation:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{\infty}\alpha_i\varepsilon_{t-i}^2 \]
is sometimes referred to as an LARCH model. This representation was first used by
Robinson (1991) in the derivation of general tests for conditional heteroscedasticity.
(xxviii) LMGARCH (Long Memory GARCH) Model
The LMGARCH(p,d,q) model is defined by:
\[ \sigma_t^2 = \omega + \big(\varphi(L)^{-1}[1 - \beta(L)](1 - L)^{-d} - 1\big)\nu_t \]
where ν_t ≡ ε_t² − σ_t² and 0 < d < 0.5. Provided that the fourth-order moment exists, the resulting process for ε_t² is covariance stationary and exhibits long memory.
(xxix) GJR (Glosten, Jagannathan and Runkle GARCH) Model
The GJR-GARCH, or just GJR, model of Glosten, Jagannathan and Runkle (1993)
allows the conditional variance to respond differently to the past negative and positive
innovations. The GJR(1,1) model may be expressed as:
\[ \sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \gamma\varepsilon_{t-1}^2 I(\varepsilon_{t-1} < 0) + \beta\sigma_{t-1}^2 \]
where I(.) denotes the indicator function. The model is also sometimes referred to as a
Sign-GARCH model. The GJR formulation is closely related to the Threshold GARCH,
or TGARCH model proposed independently by Zakoian (1994), and the Asymmetric
GARCH (AGARCH) model of Engle (1990). When estimating the GJR model with
equity index returns, γ is typically found to be positive, so that the volatility increases
proportionally more following negative than positive shocks. This asymmetry is
sometimes referred to in the literature as a “leverage effect”, although it is now widely
agreed that it has little to do with actual financial leverage.
(xxx) Miscellaneous other Versions
In addition to the above variations there are numerous other versions (in fact, more than 50) of ARCH/GARCH models available. For example, the PNP-ARCH (Partially Non-
Parametric ARCH) model, QTARCH (Qualitative Threshold ARCH) model,
REGARCH (Range EGARCH) model, RGARCH (Randomized GARCH) model, or
Robust GARCH model, S-GARCH (Simplified GARCH) model, SPARCH (Semi-
Parametric ARCH) model, Spline-GARCH model, STARCH (Structural ARCH)
model, Stdev-ARCH (Standard Deviation ARCH) model, etc.
3.3.3. Implied Volatility
The value of the volatility of the underlying asset that would equate an option’s price to
its fair value is called “implied volatility”. In other words, implied volatility is the volatility which is implicitly contained in option prices. The implied volatility
approach calculates volatility implied by the current market value of the option
contracts. This is undertaken by specifying the option price and calculating the volatility
that would be needed in a mathematical option pricing formula such as that given by
Black and Scholes (1973) to derive the specified market price as a fair value of the
option. One can take the Black and Scholes option pricing model or any other
acceptable option pricing model to extract the implied volatility from the given
parameters. Given an observed European call option price C^{ob} for a contract with strike price K and expiration date T, the implied volatility σ_iv is defined as the input value of the volatility parameter to, say, the Black and Scholes (BS) formula such that
\[ C^{BS}(S_t, t; K, T; \sigma_{iv}) = C^{ob} \]
where C^{BS} is the fair value of the option calculated from the BS model. The option
implied volatility is often interpreted as a market’s expectation of volatility over the
option’s maturity, i.e. the period from t to T. The implied volatilities from put and call
options of the same strike price and time to maturity are the same because of put-call
parity.
Suppose that the true (unconditional) volatility is σ over a period T. If the BS model is correct, then
\[ C^{BS}(S_t, t; K, T; \sigma_{iv}(K)) = C^{BS}(S_t, t; K, T; \sigma) \]
for all strike prices. That is, the function (or graph) of σ_iv(K) against K, for fixed t, S, T and r, observed from market option prices, should be a straight horizontal line. However, it is well known that the Black and Scholes σ_iv differs across strikes: there is plenty of documented empirical evidence that implied volatilities differ across options of different strikes, and when one plots the BS implied volatility σ_iv against the strike price K the shape is anything but a straight line; it often resembles a smile.
There are at least two theoretical explanations (viz. distributional assumptions and stochastic volatility) for the smile puzzle. Other explanations, based on
market microstructure and measurement errors (like liquidity, bid-ask spread and tick
size) and investor risk preference (like model risk, lottery premium and portfolio
insurance) have also been proposed. The BS model assumes that the stock price follows
a lognormal distribution or, equivalently, that logarithmic stock returns have a normal distribution.
There is widely documented empirical evidence that risky financial asset returns have
leptokurtic tails. In the case where the strike price is very high, the call option is deep-
out-of-the-money and the probability for this option to be exercised is very low.
Nevertheless, a leptokurtic right tail gives this option a higher probability than under a normal distribution that the terminal asset price will exceed the strike price and the call option will finish in-the-money. This higher probability leads to a higher call price and a
higher BS implied volatility at high strike.
Next comes the case of a low strike price. It is well known that option value has two
components: intrinsic value and time value. Intrinsic value reflects how deep an option is
in-the-money. Time value reflects the amount of uncertainty before the option expires;
hence it is most influenced by volatility. A deep-in-the-money call option has high
intrinsic value and little time value, and a small amount of bid-ask spread or transaction
tick size is sufficient to perturb the implied volatility estimation. One could, however,
make use of the previous argument and apply it to an out-of-the-money (OTM) put
option at low strike price. An OTM put price has a close to nil intrinsic value and the put
option price is due mainly to time value. Again because of the thicker tail on the left, one
expects the probability that the OTM put option finishes in-the-money to be higher than
that for a normal distribution. Hence the put option price (and hence the call option price
through put-call parity) should be greater than that predicted by the BS model. If the BS model is used to invert volatility estimates from these option prices, the BS implied volatility will be higher than the actual volatility. This results in a volatility smile, where implied
volatility is much higher at very low and very high strikes. The above argument applies readily to the currency market, where exchange rates exhibit thick-tailed distributions that are approximately symmetrical. In the stock market, the volatility skew (i.e. low implied at
high strike but high implied at low strike) is more common than volatility smile after the
October 1987 stock market crash. Since the distribution is skewed to the far left, the right
tail can be thinner than the normal distribution. In this case implied volatility at high
strike will be lower than that expected from a volatility smile.
3.3.4. Realized Volatility
Financial markets are the source of high frequency data. The original form of market
prices is tick-by-tick data: each “tick” is one logical unit of information, like a quote or
a transaction price. By nature, these data are irregularly spaced in time. High frequency data are nowadays the primary object of study for those who are interested in understanding financial markets, especially because practitioners base their decisions on high frequency or tick-by-tick data. Yet, most of
the studies published in the financial literature deal with low frequency or regularly
spaced data. There are two main reasons for this. Firstly, it is still costly and time
consuming to collect, collate, store, retrieve, and manipulate high frequency data. That
is why most of the available data are at daily or lower frequency. The second reason is
somewhat more subtle but still quite important: most of the statistical apparatus has been developed and designed for homogeneous (i.e. equally spaced in time) time series. There
is little work done to adapt the methods to data that arrive at random intervals.
Unfortunately in finance, regularly spaced data are not original data but artifacts derived
from the original market prices. Nowadays, with the development of computer
technology, data availability is becoming less and less of a problem. Slowly, high
frequency data are becoming a fantastic experimental bench for understanding market
microstructure and more generally for analyzing financial markets.
Accurate volatility estimates are vital in many areas of finance. As seen above,
historical and implied methods are two main approaches to modeling volatility. The
historical approach employs econometric time series analysis by fitting, for example, an
Autoregressive Moving Average (ARMA) model to an historical series of estimated
volatilities or by using an ARCH or Stochastic Volatility type model to estimate the
conditional second moment of the return series. Implied methods on the other hand,
make use of option pricing models such as the BS model in conjunction with market prices
of options. The volatility level that equates these observable prices to the ones produced
by the model is recovered typically via the use of a numerical algorithm such as
Newton-Raphson. An increasingly studied concept within the historical approach is the
idea of Realized Volatility or its square, the Realized Variance (RV). This departs from more traditional historical estimation methods by using high-frequency returns.
Generally speaking, “high frequency” can be defined as the use of measurement
intervals of less than one day (e.g. an hour, one minute, five seconds, etc). RV itself can
be loosely defined as the sum of squared intra-period returns.
The essential aspects of RV can be largely attributed to Merton (1980), who introduced the notion that an estimate of the variance for a given period can be obtained by summing the intra-period squared returns from a Gaussian diffusion process. The key
powerful result is that this estimator can be made arbitrarily accurate, such that as the
interval over which the returns are calculated becomes infinitesimally small, one is able
to treat volatility as essentially observable. Early empirical work in this area, such as
that by French, Schwert, and Stambaugh (1987), typically involved the construction of
monthly volatility figures from the aggregation of squared daily returns. More recently,
as high frequency data have become more widely and cheaply available, the literature
has increasingly concentrated on the use of intraday data to produce daily RV measures.
Interest in the early work was reignited by Andersen and Bollerslev (1998b), who
criticize the use of sampling frequencies of one day, one week, one month, etc., because
the additional information held within intraday returns is lost. They build on the earlier
theory and establish the properties of consistency and efficiency for the RV estimator
when log-prices follow a semi-martingale process. Andersen et al. (2001a) utilized 5-minute intraday return series to estimate daily volatility and state that “by summing intraday returns sufficiently frequently, the
realized volatility can be made arbitrarily close to the underlying integrated volatility”.
They further state that “for practical purposes we can treat the daily volatility as
observed, which enables us to examine its properties directly”. By treating daily
volatility as an observation one has a powerful tool to analyze a broad range of issues in
financial economics, both within and beyond the realm of market microstructure.
A number of authors have used the notion of realized volatility to estimate daily realized
volatility from intraday data. Beckers (1983), Anderson (1995), Parkinson (1980) and
Rogers and Satchell (1991) all propose estimators of daily volatility based upon high,
low, opening and closing prices. Schwert (1990), Hsieh (1991), Andersen and Bollerslev
(1998a) propose efficient unconditional daily volatility estimators based upon the intraday
return series. The practical value of the concept of daily realized volatility has also been
extensively demonstrated in the literature. Andersen and Bollerslev (1997, 1998a) show
how high frequency intraday returns contain valuable information for the measurement of
volatility at the daily level. Moosa and Bollen (2001) utilize realized volatility to test the
relationships between volatility and time to maturity in futures markets. Moosa and
Bollen (2002) also employ the concept of realized volatility to test for bias in Value at
Risk estimates. Using daily data, Canina and Figlewski (1993) examine the information
content of an option’s implied volatility using realized volatility as the benchmark.
Putting to one side the evidence cited above that supports the empirical efficacy of RV-based estimates and forecasts, there are several other appealing qualities of RV. Firstly, RV has a solid theoretical foundation. Secondly, compared with complex models which are highly parametric and often difficult to calibrate (e.g. SV models), RV is non-parametric or ‘model-free’, and one therefore avoids the risk of model misspecification. Linked with this, and on a more practical level, RVs are extremely simple and straightforward to calculate: all that is needed are market prices. Thirdly, in
contrast to implied volatilities, which can only be calculated for securities that have
options written on them, the universe of securities and instruments that have sufficient
data needed for the RV calculation is much wider. Finally, the RV approach does not
require information outside the estimation interval; the volatility for a period such as 24
hours can be derived entirely from the intraday returns within that period. In this
respect, RV can be likened to estimators such as the one proposed by Parkinson (1980),
who shows that a security’s intraday high and low price can be analytically transformed
into its volatility when prices move in geometric Brownian motion (see diBartolomeo
(2002) for a good discussion). This can be contrasted with more elementary (and
commonly used) methods, such as estimating the daily volatility as the standard
deviation of daily returns over some longer period (e.g. one month).
While the theoretical results are only strictly valid as the return sampling frequency
goes to infinity, in reality prices are not observed continuously and one is therefore
constrained by the number of transactions that actually occur. For example, the
calculation of RV using 5-minute returns on an asset that usually only trades every two
days is not possible. Another important potential drawback of RV relates to market
microstructure effects, such as bid-ask bounce and non-synchronous trading that
‘contaminate’ the data. There is much literature in this area, and Owens and Steigerwald
(2005) provide a good general overview. Microstructure effects are often not apparent
at lower return frequencies (e.g. daily or weekly) but are usually very evident in
intraday data and manifest as autocorrelation within the time series of returns. Oomen
(2004) makes the interesting point that such autocorrelation does not necessarily
contradict the efficient market hypothesis, which would initially seem to imply that
autocorrelation should not persist for any significant period of time. His reason is that the serial correlation is of statistical relevance (i.e. it results from microstructure and transaction cost effects) rather than of economic relevance. Moreover, he shows that autocorrelation
causes the RV estimator to be biased. There is general agreement in the literature (see
for example, Fang 1996; Oomen 2002, etc) that these microstructure effects will have a
greater adverse effect upon the statistical performance of RV, the lower the trading
liquidity of the security and the higher the frequency at which returns are sampled.
3.3.5. Stochastic Volatility Models
Stochastic volatility is another important concept, which is used in financial economics
and mathematical finance to deal with the time-varying volatility in evaluating
derivatives, such as options. In the broadest sense, any model that deals with non-
constant volatility can be seen as a stochastic volatility model and this explanation of
stochastic volatility models is commonly accepted among market participants.
However, in academic literature, stochastic volatility models refer to models that treat
the volatility as a random process governed by state variables, such as the price of
underlying securities. The fact that the latent volatility process is not measurable with respect to the observable filtration distinguishes stochastic volatility models from ARCH models. The
Heston (1993) model is one of the most commonly used stochastic volatility models, in
which the volatility itself is used as the state variable. Compared to ARCH models, the
outcomes of stochastic volatility models are more difficult to estimate due to the latent
character of some of the variables in these models. Commonly used estimation methods
for stochastic volatility models include the Generalized Method of Moments (GMM)
approach and Monte Carlo simulation.
Autoregressive conditional heteroscedasticity (ARCH) processes are often described as
SV, but they are different in nature. The essential feature of ARCH models is that they
explicitly model the conditional variance of returns given past returns observed by the econometrician. This one-step-ahead prediction approach to volatility modeling is very
powerful, particularly in the field of risk management. It is convenient from an
econometric viewpoint as it immediately delivers the likelihood function as the product
of one-step-ahead predictive densities. In the SV approach, the predictive distribution of
returns is specified indirectly, via the structure of the model, rather than explicitly. For a
small number of SV models this predictive distribution can be calculated explicitly but,
invariably, for empirically realistic representations, it has to be computed numerically.
This move away from direct one-step-ahead predictions has some advantages. In
particular, in continuous time it is more convenient, and perhaps more natural, to model
directly the volatility of asset prices as having its own stochastic process without
worrying about the implied one-step-ahead distribution of returns recorded over an
arbitrary time interval convenient for the econometrician, such as a day or a month.
This does, however, raise some difficulties as the likelihood function for SV models is
not directly available, much to the frustration of econometricians in the late 1980s and
early 1990s. Since the mid-1980s continuous-time SV has dominated the option pricing
literature but early on econometricians struggled with the difficulties of estimating and
testing these models. Only in the 1990s were novel simulation strategies developed to
efficiently estimate SV models. These computationally intensive methods enable us,
given enough coding and computing time, to efficiently estimate a broad range of fully
parametric SV models. This has led to refinements of the models, with many earlier
tractable models being rejected from an empirical viewpoint. The resulting enriched SV
literature has brought us much closer to the empirical realities we face in financial
markets.
Academics often study the continuous-time SV models within the context of option
pricing, with one of the most well known papers in this area being that by Hull and
White (1987), which considers a diffusion volatility model with leverage effects.
Wiggins (1987) generalized this by allowing the correlation between the two Brownian motions to be nonzero. Scott (1987) considered a model in which the logarithm of the volatility is an Ornstein-Uhlenbeck (OU) process, Stein and Stein (1991) modeled the volatility itself as an OU process, and Heston (1993) proposed a SV model in which the variance follows a square-root diffusion. More variants of these models have been derived in an attempt to
capture phenomena documented in empirical findings. For instance, the fractional stochastic volatility model of Comte, Coutin and Renault (2003) was intended to reflect the long memory property of volatility, while Cont and Tankov (2004) tried to model jumps in the volatility process. Discrete-time SV models center on the Mixture-of-Distributions Hypothesis (MDH), under which returns are governed by a latent information arrival process.
The actual parameterizations of most discrete-time SV models are often based on
specific continuous-time SV models. Similar to the GARCH class of models, most
discrete-time SV models in related literature rely on the autoregressive formulation of a
latent volatility process. One of the two main types of discrete-time SV models is the lognormal stochastic autoregressive volatility model of Taylor (1986) and Harvey, Ruiz and Shephard (1994). The square-root stochastic autoregressive volatility model of Meddahi and Renault (2004) is the other main class of discrete-time SV models.
3.4. CONCLUSIONS
The present chapter is a short description of the concept of volatility and of how it can be measured. It presents the concept of volatility as defined by various authors and as accepted in the field of research, as well as the various concepts of returns and their bearing on results and inferences. In the last section, five categories of models that can be used to measure the volatility of a time series of returns on different assets in the stock market are explained. These include: historical volatility models, the ARCH/GARCH class of models, implied volatility models, realized volatility models and, lastly, stochastic volatility models. In each category, only the most popular models are explained in detail, as the list of volatility models in each category is a long one.