CHAPTER-3
VOLATILITY: CONCEPTS AND VARIOUS MODELS
The present chapter draws together material from various textbooks and articles on the concept of volatility and certain issues related to it. There are four sections in this chapter. The first section defines and explains the concept of return. The second section covers the various definitions of the term "volatility". The third section deals with volatility modeling and explains the various models for measuring volatility. The fourth section summarizes the chapter.
3.1. DEFINITIONS OF RETURNS
Although prices are what we observe in financial markets, most empirical studies focus
on returns. The reason is that, in general, prices are non-stationary whereas returns are
stationary. There are several return definitions, some of which are given as follows:
(1) Simple Price Differences
If $P_t$ is the asset price at time t, then the simple price difference, denoted by $D_t$, is defined as:

$D_t = P_t - P_{t-1}$
The variability of the simple price difference is an increasing function of the price level
and this might bias the inference at least when the price level increases significantly
during the analysis period. Fortunately, the use of logarithmic return neutralizes most of
this effect (Fama 1965a, 45).
(2) Percentage Returns or Simple Returns
The one-period simple return for holding an asset, which is numerically very close to
logarithmic return for small changes, is defined as:
$R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$ or $P_t = P_{t-1}(1 + R_t)$
where $P_t$ is the price (including dividends) of the asset at time t, and $R_t$ is the one-period simple return from time t−1 to t.1
(3) Log Returns
The log return, rt, is defined as:
$r_t = \log(P_t) - \log(P_{t-1}) = p_t - p_{t-1}$

where $p_t = \log(P_t)$ is the log price. In addition, the change in logarithmic price is the yield, with continuous compounding, from holding a security for the period in question.
Proof of this follows from Fama (1965a, 45):

$P_t = P_{t-1}\exp(r_t) \iff r_t = \ln(P_t) - \ln(P_{t-1})$
Table 3.1: Return Aggregation

Aggregation         | Temporal                                        | Cross-section
Percent Return      | $R_t[k] = \prod_{j=0}^{k-1}(1 + R_{t-j}) - 1$   | $R_{p,t} = \sum_i w_i R_{i,t}$
Logarithmic Return  | $r_t[k] = \sum_{j=0}^{k-1} r_{t-j}$             | $r_{p,t} = \ln\left(\sum_i w_i e^{r_{i,t}}\right)$

Source: RiskMetrics™ – Technical Document 1996, 49.
When applying logarithmic return, the continuous time generalizations of discrete time
results are easier and returns over more than one day are simple functions of single day
returns (Taylor 1986, 13). In other words, a key advantage of the log return is that the
multiple-period return is simply the sum of one-period returns, so that
$r_t[k] = \log(1 + R_t[k]) = \sum_{j=0}^{k-1}\log(1 + R_{t-j}) = \sum_{j=0}^{k-1} r_{t-j}$

1 If the asset is held for k periods, from t−k to t, the k-period simple return is calculated as $R_t[k] = \frac{P_t - P_{t-k}}{P_{t-k}}$ or $P_t = P_{t-k}(1 + R_t[k]) = P_{t-k}(1 + R_{t-k+1}) \times \cdots \times (1 + R_t)$, where $R_t[k]$ is the k-period simple return from t−k to t. The relationship between the one-period simple returns and the k-period return is therefore non-linear. In some cases, if returns are small, we may use the approximation $R_t[k] \approx \sum_{j=0}^{k-1} R_{t-j}$, but it is too crude in many applications.
In addition to these points, return aggregation is of importance in most financial applications. Table 3.1 summarizes the difference between percentage returns and logarithmic returns: $w_i$ denotes the weight of asset i, t denotes the time and p denotes the portfolio. The table indicates that when the aggregation is done across time it is more convenient to work with logarithmic returns, whereas in the case of aggregation across assets the percentage return results in a simpler expression.
Figure 3.1: Terms of Measurement

[Four-panel chart of daily NIFTY data from 2-Jan-1991 to 2-Jan-2009: Closing Prices of NIFTY; Price Differences of NIFTY; Percentage Returns for NIFTY; Log Returns for NIFTY.]
However, since percentage return and logarithmic return are very close for small
changes, it is common to approximate a portfolio return in case of logarithmic return as:
" ( !")
*
This leads to a situation where the one-day model computed with $r_t$ extends easily to returns of more than one day. In general, the use of logarithmic returns is widely accepted among financial researchers. Use of different return concepts may lead to different interpretations. Figure 3.1 illustrates the behavior of the three measures when applied to the Nifty 50. In particular, the effect of the price level on price variability in the case of simple price differences is clearly visible from it.
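To make the three measures concrete, the following Python sketch (a minimal illustration using a short hypothetical price series, not actual NIFTY data) computes simple price differences, percentage returns and log returns, and verifies the aggregation properties from Table 3.1:

```python
import numpy as np

# Hypothetical closing prices (not actual NIFTY data)
P = np.array([1000.0, 1010.0, 995.0, 1020.0, 1018.0])

D = P[1:] - P[:-1]        # simple price differences: D_t = P_t - P_{t-1}
R = P[1:] / P[:-1] - 1.0  # percentage (simple) returns: R_t = (P_t - P_{t-1})/P_{t-1}
r = np.diff(np.log(P))    # log returns: r_t = log(P_t) - log(P_{t-1})

print("D_t:", D)
print("R_t:", R)
print("r_t:", r)

# Temporal aggregation: the k-period log return is the sum of one-period log returns
print(np.isclose(np.log(P[-1] / P[0]), r.sum()))            # True

# For simple returns, temporal aggregation is multiplicative (non-linear)
print(np.isclose(np.prod(1.0 + R) - 1.0, P[-1] / P[0] - 1))  # True
```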
3.2. CONCEPT OF VOLATILITY
Volatility is a crucial concept in financial theory and practice. Nonetheless, it is not easy
to define the concept of volatility. Since volatility is often calculated as a sample
standard deviation, people often mistakenly assume that volatility is equivalent to
standard deviation, which is in fact only a biased estimator of true volatility. As
described by Andersen, Bollerslev, Christoffersen and Diebold (2005):
“In everyday language, volatility refers to the fluctuations observed in some phenomena
over time. Within economics, it is used slightly more formally to describe, without a
specific implied metric, the variability of the random (unforeseen) component of a time
series.” (p. 1)
This description of volatility provides a relatively clear, intuitive explanation of
volatility. Andersen, Bollerslev, Christoffersen and Diebold (2005) further explain:
“More precisely, or narrowly, in financial economics, volatility is often defined as the
(instantaneous) standard deviation (or sigma) of the random Wiener-driven component
in a continuous-time diffusion model. Expressions such as the “implied volatility” from
option prices rely on this terminology.” (p. 1)
Volatility, in other words, is defined as the spread of all likely outcomes of an uncertain
variable. Typically, in financial markets, we are often concerned with the spread of
asset returns. Volatility is the degree to which financial prices tend to fluctuate. Large
volatility means that returns (that is, the relative price changes) fluctuate in a wide range.
Statistically, volatility most frequently refers to the sample standard deviation of the continuously compounded returns (see section 3.1) of a financial instrument over a specific time horizon. The sample standard deviation is defined as:
$\hat{\sigma} = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}(r_t - \mu)^2}$ [3.1]
where $r_t$ is the return on day t, and $\mu$ is the average return over the T-day period.
Standard deviation is often used to quantify the risk of the instrument over that time
period. Volatility is typically expressed in annualized terms, and it may either be an
absolute number ($5) or a fraction of the mean (5%). For a financial instrument, whose
price follows a Gaussian random walk, or Wiener process, the volatility increases as
time increases. Conceptually, this is because there is an increasing probability that the
instrument's price will be farther away from the initial price as time increases. However,
rather than increase linearly, the volatility increases with the square-root of time as time
increases, because some fluctuations are expected to cancel each other out, so the most
likely deviation after twice the time will not be twice the distance from zero.
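As a numerical illustration of equation [3.1] and the square-root-of-time scaling just described, the following sketch (assuming, as is common but not universal, 252 trading days per year) estimates daily volatility from simulated returns and annualizes it:

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.normal(loc=0.0005, scale=0.012, size=500)   # simulated daily returns

T = len(r)
mu = r.mean()
sigma_daily = np.sqrt(((r - mu) ** 2).sum() / (T - 1))   # equation [3.1]

# Under a Gaussian random walk, volatility grows with the square root of time
sigma_annual = sigma_daily * np.sqrt(252)   # assumes 252 trading days per year
print(f"daily: {sigma_daily:.4%}, annualized: {sigma_annual:.2%}")
```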
More broadly, volatility refers to the degree of (typically short-term) unpredictable
change over time of a certain variable. It may be measured via the standard deviation of
a sample, as mentioned above. However, price changes actually do not follow Gaussian
distributions. Better distributions used to describe them actually have "fat tails"
although their variance remains finite. Therefore, other metrics may be used to describe
the degree of spread of the variable. As such, volatility reflects the degree of risk faced
by someone with exposure to that variable.
Different Definitions of Volatility
Volatility is a theoretical construct. Models for volatility often use an unobservable
variable that controls the degree of fluctuations of the financial return process. This
variable is usually called the volatility. Generally, two different volatility models will
lead to different concepts of volatility. For example, in GARCH models the volatility is
thought of as conditional variance (or standard deviation) of the return, whereas in
diffusion models (stochastic differential equations) the volatility refers to either the
instantaneous diffusion coefficient or the quadratic variation over a given time period
(often called the integrated volatility). It is useful to keep the following questions in mind in the context of any volatility definition: (i) What is the asset price model? (ii) What is the time horizon for this volatility? (iii) Is the volatility forward looking, backward looking or instantaneous? (iv) Is the volatility a model variable, or the estimator of a model variable? (v) In the case of an estimator of a model variable: is the estimated volatility extracted from returns data (price fluctuations), or from option prices (implied volatilities)? Each of the following bullets treats a different volatility definition:
• Conditional Variance/Conditional Standard Deviation: Given the information up to now, $\mathcal{F}_n$, the variance of the financial return $r_{n+1}$ over the next period is:

$\mathrm{Var}(r_{n+1} \mid \mathcal{F}_n)$ [3.2]

The square root of this quantity is the conditional standard deviation. Note that this variance depends on the information set. To obtain an explicit number, one has to make model assumptions for the returns. If one assumes that the financial returns are iid Normal, then the volatility is constant and the usual variance estimator is appropriate:

$\hat{\sigma}^2 = \frac{1}{N-1}\sum_{n=1}^{N}(r_n - \bar{r})^2$ [3.3]

Time series models are designed to deal with the situation of time-varying volatility.
• Time Series Volatility: Discrete time models for time-varying volatility often have a product structure:

$r_n = \sigma_n\,\varepsilon_n$ [3.4]

The financial return $r_n$ over period n is the product of the volatility $\sigma_n$ and the zero-mean, unit-variance innovation $\varepsilon_n$. Models for $\sigma_n$ include ARCH/GARCH, Stochastic Volatility, Long Memory, Markov Switching, etc.
• Spot Volatility: the instantaneous volatility. One needs a model describing the continuous-time price movements. Consider a stochastic differential equation like:

$dp(t) = \sigma(t)\,dB(t)$ [3.5]

where B denotes standard Brownian motion and p(t) the log price process. The variable $\sigma(t)$ is called the spot volatility. Models of this class are often used in option pricing.
• Quadratic Variation: Consider a continuous-time stochastic process over a given time period. Divide the time period into small, adjacent intervals. Determine the sum of the squared returns over these intervals:

$\sum_i \left(p(t_{i+1}) - p(t_i)\right)^2$ [3.6]

The quadratic variation is defined as the limit of these sums as the length of the sampling intervals goes to zero.
• Realized Volatility Measures: The finite-sample quantities in equation (3.6) are often called "realized volatility" or "realized variance" (a computational sketch follows this list). Popular sampling frequencies are 5 minutes and 30 minutes. It is possible to construct alternative proxies for volatility, for example, using the high-low ranges over intraday intervals.
• Implied Volatility: Given a specific asset price model, one can determine the
volatility that matches the theoretical option prices from the model to the real life
option prices in the market. This volatility is called the implied volatility. The
Black-Scholes model is often used for determining implied volatilities. There also exist "model-free implied volatilities".
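The sketch below illustrates the realized volatility measure referred to in the list above: it sums squared five-minute log returns over one trading day to obtain a (nearly) model-free daily variance estimate. The intraday returns are simulated here, since no intraday data accompany this chapter:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 78                       # 5-minute intervals in a 6.5-hour trading session
true_daily_vol = 0.015
r_5min = rng.normal(0.0, true_daily_vol / np.sqrt(n), size=n)  # intraday log returns

realized_variance = np.sum(r_5min ** 2)    # finite-sample version of equation (3.6)
realized_vol = np.sqrt(realized_variance)
print(f"realized daily volatility: {realized_vol:.4f} (true: {true_daily_vol})")
```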
Standard Deviation Versus Variance
Sometimes, variance, $\sigma^2$, is also used to measure volatility. Since variance is simply the square of the standard deviation, it makes no difference which measure we use when we compare the volatility of two assets. However, variance is much less stable and
less desirable than standard deviation as an object for computer estimation and volatility
forecast evaluation. Moreover, standard deviation has the same unit of measure as the
mean, i.e. if the mean is in rupees, then standard deviation is expressed in rupees
whereas variance is expressed in rupee square. For this reason, standard deviation is
more convenient and intuitive when volatility is examined.
Volatility Versus Risk
Volatility is related to, but not exactly the same as, risk. Risk is associated with an undesirable outcome, whereas volatility, as a measure strictly of uncertainty, could be due to a positive outcome. Moreover, volatility is not a good or perfect measure of risk because volatility (or standard deviation) is only a measure of the spread of a distribution and carries no information on its shape. The only exception is the case of the normal or lognormal distribution, where the mean and the standard deviation are sufficient statistics for the entire distribution (i.e. through these two alone one can reproduce the empirical distribution).
Volatility Versus Direction
Volatility does not imply direction. This is due to the fact that all changes are squared.
An instrument that is more volatile is likely to increase or decrease in value more than
one that is less volatile. For example, a savings account has low volatility. While it
won’t lose 50% in a year, it also won’t gain 50%.
Volatility is Unobservable
Common to all models of volatility is that volatility itself is not observed. The situation may be compared to rolling a die: one can observe the realizations of the die [1, 2, 3, 4, 5, 6], but cannot observe the tendency of the (unfair) die to yield extreme outcomes like "1" or "6". This tendency may be estimated, though, and after sufficiently many observations there remains only a small uncertainty about the probability of obtaining a "6". The situation when estimating volatility is comparable: here too one observes the realizations (the returns), and not the tendency to yield extreme returns. However, the situation is less encouraging in an important respect: in contrast to the setting of rolling a die, the estimate of volatility does not necessarily improve as one collects more data, since volatility itself changes. That is, data one month from now may have little to do
with the current volatility. So there will always be uncertainty about the current and past
values of volatility.
3.3. VOLATILITY MEASUREMENT
Volatility estimation in the context of option pricing must be considered within the
broader context of asset price and return volatility. Mathematical option pricing models
require estimation of volatility. The future volatility of the underlying asset price is a
parameter that must be input into the option pricing model. The volatility estimate used
in option valuation is the annualized standard deviation of the logarithmic asset returns (or the continuously compounded asset returns). The volatility estimate is a
measure of the uncertainty about the returns on the asset. It is used to generate the
distribution of asset prices at the option expiration to calculate the fair value of the option.
There are several aspects of the volatility estimate that should be considered: Firstly, the
volatility parameter required to derive option value is forward looking, that is, the
relevant volatility is the asset return volatility in the period to option expiry. Secondly,
volatility is assumed to be constant between pricing date and option expiry. Thirdly,
volatility is assumed to be time homogeneous, that is, it is the same over the life of the
option. And lastly, uncertainty about the asset price at option maturity is assumed to be
directly proportional to the asset price at commencement. The estimation of the
volatility of the underlying asset price is particularly problematic because it is the only
parameter of most mathematical models that is not observable directly. Moreover, the
sensitivity of the option value to this parameter places additional demands on the
estimation of volatility.
Types of Volatility Models
There are various classes of models and estimators, which have been proposed in the
literature for measuring volatility of asset returns. Models and estimators, assuming
volatility to be constant are the oldest ones among the models, which have been used to
estimate and forecast volatility. These models measure “unconditional volatility”. With
the recognition of empirical regularity that the volatility in financial markets is clustered
in time and is time varying, these models gave way to models measuring “conditional
volatility”. In addition, volatility estimated from the value of options, in which typically
volatility is the only unobservable parameter for valuation, allowed researchers and
practitioners to use “implied volatility”, i.e., the market forecast of volatility in valuing
the traded options. Moreover, Andersen et al. (2001a, 2001b) show that volatility
becomes observable and does not remain latent, if high frequency data is available. The
“realized volatility” estimated using high frequency data is model-free under very weak
assumptions.
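To illustrate how an implied volatility is backed out of a traded option price, the following sketch inverts the Black-Scholes call formula by bisection. The market inputs are hypothetical, and a production system would use a vendor pricing library rather than this hand-rolled pricer:

```python
import math

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=3.0, tol=1e-8):
    """Bisection search for the volatility that matches the market price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical data: spot 100, strike 100, 3 months, 5% rate, call trading at 4.00
print(implied_vol(4.00, S=100.0, K=100.0, T=0.25, r=0.05))
```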
Moreover, volatility models can be linear or non-linear2 in nature. A time series model is called linear when the function of the regressor variables is linear; in other words, if the function relating the observed time series Xt to the underlying shocks, say ut, is linear, then the model is a linear time series model. Linear volatility models are linear in the parameters, so that there is one parameter multiplied by each variable in the model. For example, a structural model could be something like:
$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$

or, more compactly, $y = X\beta + u$. It is additionally assumed that $u_t \sim N(0, \sigma^2)$. The linear
paradigm is a useful one. The properties of linear estimators are very well understood.
Many models that appear, prima facie, to be nonlinear, can be made linear by taking
logarithms or some suitable transformation. However, linear structural (and time series) models, as explained above, are unable to explain a number of important features common to financial data, like leptokurtosis (the tendency for financial asset returns to have distributions that exhibit fat tails and excess peakedness at the mean), volatility clustering or volatility pooling (the tendency for volatility in financial markets to appear in bunches) and leverage effects (the tendency for volatility to rise more following a large price fall than following a price rise of the same magnitude). Examples of linear time series models include the autoregressive (AR) models, the moving average (MA) models, the ARMA(p,q) model, the ARIMA(p,d,q) model, etc.
2 Treating an asset return (e.g. the log return rt of a stock) as a collection of random variables over time, we have a time series {rt}. Linear time series analysis provides a natural framework to study the dynamic structure of such a series. The theory of linear time series covers stationarity, dynamic dependence, autocorrelation functions, modeling and forecasting. Examples of econometric models in the category of linear time series models are simple autoregressive (AR) models, simple moving average (MA) models, seasonal models, unit-root non-stationarity, regression models with time series errors, fractionally differenced models for long-range dependence, etc. For an asset return rt, simple models attempt to capture the linear relationship between rt and the information available prior to time t.
During the last two decades or so, a new area of non-linear time series modeling has been developing fast. Here, there are basically two possibilities, viz. parametric or non-parametric approaches. Evidently, if in a particular situation there is certainty about the functional form, then the former should be used; otherwise the latter may be applied. When dealing with non-linearities, Campbell et al. (1997) noted that in linear time series the shocks are assumed to be uncorrelated but not necessarily identically and independently distributed (iid), whereas in non-linear time series the shocks are assumed to be iid, but there is a non-linear function relating the observed time series Xt to the underlying shocks ut. There can be models that are non-linear in mean, like the non-linear moving average model (non-linear in mean but linear in variance), and there can be models that are non-linear in variance, like Engle's ARCH model (non-linear in variance but linear in mean).
The combination of powerful methodological advances and important applications
within empirical finance produced explosive growth in the financial econometrics of
volatility dynamics, with the econometrics and finance literatures cross-fertilizing each
other furiously. Initial developments were tightly parametric, but the recent literature
has moved in less parametric, and even fully non-parametric directions. The parametric
approaches to volatility are based on explicit functional form assumptions regarding the
expected and/or instantaneous volatility corresponding to the strength of volatility
process at a point of time. In the discrete-time ARCH class of models, the expectations
are formulated in terms of directly observable variables, while the discrete- and
continuous-time stochastic volatility (SV) models both involve latent state variable(s).
The non-parametric approaches to volatility are generally free from such functional
form assumptions and hence afford estimates of notional volatility that are flexible yet
consistent (as the sampling frequency of the underlying returns increases). The non-
parametric approaches include ARCH filters and smoothers designed to measure the
volatility over infinitesimally short horizons, as well as the recently popularized realized
volatility measures for (nontrivial) fixed-length time intervals.
Examples of univariate volatility models include Engle's ARCH model, the GARCH
model, the exponential GARCH (EGARCH) model, the threshold GARCH (TGARCH)
model, the conditional heteroscedasticity autoregressive moving average (CHARMA)
model, the random coefficient autoregressive (RCA) model, the stochastic volatility
models of Melino and Turnbull (1990), Taylor (1994), Harvey, Ruiz, and Shephard
(1994), etc. The univariate volatility models can be generalized to the multivariate case
in which there are some simple methods for modeling the dynamic relationships
between volatility processes of multiple asset returns. Multivariate volatilities measure
the conditional covariance matrix of multiple asset returns. The multivariate volatility
models include the GARCH models for bivariate returns (includes the constant
correlation model as well as the time-varying correlation bivariate GARCH model), etc.
If $\rho_j$ denotes the autocorrelation between $r_t$ and $r_{t-j}$, then a time series is said to have short memory provided $\sum_{j=1}^{n}\rho_j$ converges to a constant as n becomes large. A series with long memory has autocorrelation values that decline slowly at a hyperbolic rate.
For example, the popular ARCH class of models is a short-memory volatility model.
The examples of long-memory models include the break model [Granger and Hyung
(2004), Starica and Granger (2004)], regime switching models [Hamilton and Susmel
(1994), Diebold and Inoue (2001), Hillebrand (2005)], fractionally integrated GARCH,
etc. Most previous research has focused on linear models, fractionally integrated models
in particular, to study long memory. More recent studies have shown that a number of
non-linear volatility models can also produce long memory characteristics in volatility.
Examples of such models include the previously mentioned break models and the regime
switching models. In these two categories of models, volatility is characterized by short
memory between breaks and within each regime. Without controlling for the breaks and
the changing regimes, volatility will produce spurious long-memory characteristics.
Each of these models represents a very different volatility structure and produces
volatility forecasts that are very different from each other and different from those of
the fractionally integrated and short-memory models.
3.3.1. Historical Volatility Models
Historical volatility is a measure of price fluctuation over time. Historical volatility uses historical (daily, weekly, monthly, quarterly and yearly) price data to empirically measure the volatility of an asset in the past. Compared with the other types of volatility models, historical volatility (HIS) models are easy to manipulate and construct. HIS
models can be categorized into two major categories: single state models and regime
switching models. All HIS models differ in the number of lagged volatility terms included in the model and the weights assigned to them, reflecting the tradeoff between increasing the amount of information and favoring more up-to-date information.
(1) Single State Models
(i) Random Walk Model
The simplest historical price model is the random walk model, where the difference between consecutive periods' volatility is modeled as random noise:

$\sigma_t = \sigma_{t-1} + v_t$
So the best forecast for tomorrow's volatility is today's volatility:

$\hat{\sigma}_{t+1} = \sigma_t$

where $\sigma_t$ alone is used as a forecast for $\sigma_{t+1}$.
(ii) Historical Average Method
In contrast, the Historical Average method makes a forecast based on the entire history:

$\hat{\sigma}_{t+1} = \frac{1}{t}\left(\sigma_t + \sigma_{t-1} + \cdots + \sigma_1\right)$

In other words, under the assumption of a stationary mean, the best forecast of today's volatility is a long-term average of past observed volatilities.
(iii) Moving Average Method
The simple Moving Average method:

$\hat{\sigma}_{t+1} = \frac{1}{\tau}\left(\sigma_t + \sigma_{t-1} + \cdots + \sigma_{t-\tau+1}\right)$

is similar to the historical average method, except that older information is discarded. The value of $\tau$ (i.e. the lag length to past information used) could be subjectively chosen or based on minimizing the in-sample forecast error, $e_t = \sigma_t - \hat{\sigma}_t$. The multi-period forecast $\hat{\sigma}_{t+\tau}$ for $\tau > 1$ will be the same as the one-step-ahead forecast $\hat{\sigma}_{t+1}$ for all three methods described above.
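A minimal sketch of the three single-state forecasts described so far (random walk, historical average and moving average), applied to a short hypothetical series of past observed volatilities:

```python
import numpy as np

sigma = np.array([0.010, 0.012, 0.011, 0.015, 0.014, 0.013])  # past observed volatilities

rw_forecast   = sigma[-1]            # random walk: tomorrow's forecast = today's value
hist_forecast = sigma.mean()         # historical average: long-term mean of the whole history
tau = 3
ma_forecast   = sigma[-tau:].mean()  # moving average: older information is discarded

print(rw_forecast, hist_forecast, ma_forecast)
```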
(iv) Exponential Smoothing Methods
The Exponential method:

$\hat{\sigma}_{t+1} = (1 - \beta)\sigma_t + \beta\hat{\sigma}_t$, with $0 \le \beta \le 1$

is similar to the historical average method, but more weight is given to the recent past and less weight to the distant past. The smoothing parameter $\beta$ is estimated by minimizing the in-sample forecast error $e_t$. The exponential smoothing methods are essentially methods of fitting a suitable curve to historical data of a given time series. There are a variety of these methods, such as single exponential smoothing, Holt's linear method, Holt-Winters' method and their variations. Although used in several areas of business and economic forecasting, these methods have now been supplanted by other, better methods.
(v) Exponentially Weighted Moving Average Method (EWMA)
The Exponentially Weighted Moving Average method:

$\hat{\sigma}_{t+1} = \frac{\sum_{i=1}^{\tau}\beta^{i-1}\sigma_{t-i+1}}{\sum_{i=1}^{\tau}\beta^{i-1}}$

is the moving average method with exponential weights. Again, the smoothing parameter $\beta$ is estimated by minimizing the in-sample forecast errors $e_t$.
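The following sketch implements the EWMA forecast above with a fixed smoothing parameter (β = 0.94, the value popularized by RiskMetrics for daily data, used here purely for illustration rather than estimated by minimizing in-sample errors):

```python
import numpy as np

def ewma_forecast(sigma, beta=0.94):
    """EWMA of past volatilities: weights beta**(i-1), normalized to sum to one."""
    tau = len(sigma)
    weights = beta ** np.arange(tau)      # weight beta**0 for the most recent value
    weights /= weights.sum()
    return np.dot(weights, sigma[::-1])   # reversed so sigma[-1] gets the largest weight

sigma = np.array([0.010, 0.012, 0.011, 0.015, 0.014, 0.013])
print(ewma_forecast(sigma))
```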
All the Historical volatility methods above have a fixed weighting scheme or a
weighting scheme that follows some declining pattern. Other types of historical models
have weighting schemes that are not pre-specified.
(vi) Simple Regression Method
The simplest method with a weighting scheme that is not pre-specified is the simple regression method:
$\sigma_t = \gamma + \beta_1\sigma_{t-1} + \beta_2\sigma_{t-2} + \cdots + \beta_n\sigma_{t-n} + v_t$

$\hat{\sigma}_{t+1} = \hat{\gamma} + \hat{\beta}_1\sigma_t + \hat{\beta}_2\sigma_{t-1} + \cdots + \hat{\beta}_n\sigma_{t-n+1}$
which expresses volatility as a function of its past values and an error term. It is
principally an autoregressive method.
(vii) ARIMA Model
If Y, a variable at time t, is modeled as:

$Y_t - \mu = \alpha_1(Y_{t-1} - \mu) + u_t$

where $\mu$ is the mean of Y and $u_t$ is an uncorrelated random error term with zero mean and constant variance $\sigma^2$ (i.e. it is white noise), then $Y_t$ is said to follow a first-order autoregressive, or AR(1), stochastic process3.
If the variable Y is modeled as:

$Y_t = \mu + \beta_0 u_t + \beta_1 u_{t-1}$

where $\mu$ is a constant and u, as before, is the white noise stochastic error term, then Y is said to follow a first-order moving average, or MA(1), process. Here, Y at time t is equal to a constant plus a moving average of the current and past error terms. In a similar fashion one can think of a second-order, MA(2), or a qth-order, MA(q), moving average process. In short, a moving average process is simply a linear combination of white noise error terms.
If Y has characteristics of both AR and MA then it is said to follow an ARMA process.
Y following an ARMA(1,1) process is modeled as:

$Y_t = \theta + \alpha_1 Y_{t-1} + \beta_0 u_t + \beta_1 u_{t-1}$

where $\theta$ represents a constant term. The above equation includes one autoregressive and
one moving average term. In general, in an ARMA (p,q) process, there will be p
autoregressive and q moving average terms.
In all the above three equations (for the AR, MA and ARMA processes), it is assumed that
the time series involved are (weakly) stationary in the sense that the mean and variance
of the time series are constant and its covariance is time-invariant. But, in reality, economic time series are generally non-stationary, that is, they are integrated.
3 Here the value of Y at time t depends on its value in the previous time period and a random term; the Y values are expressed as deviations from their mean value. In other words, the forecast value of Y at time t is simply some proportion ($\alpha_1$) of its value at time (t−1) plus a random shock or disturbance at time t. In a similar fashion one can think of a second-order autoregressive process AR(2) or a pth-order autoregressive process AR(p) for the variable Y.
If a time series is integrated of order 1 (i.e. it is I(1)), its first differences are I(0), that is
stationary4. If a time series is differenced d times to make it stationary, and then the
ARMA (p,q) is applied to model it, then the original time series is ARIMA (p,d,q), that
is, it is an autoregressive integrated moving average time series, where p denotes the
number of autoregressive terms, d the number of times the series is differenced before it
becomes stationary, and q the number of moving average terms. If d=0 (i.e. a series is
stationary to begin with), then ARIMA(p,d=0,q)=ARMA(p,q). An ARIMA (p,0,0)
process means a purely AR(p) stationary process; an ARIMA(0,0,q) means a purely
MA(q) stationary process. Given the values of p,d,q, one can tell what process is being
modeled. Popularly known as the Box-Jenkins methodology, but technically known as
the ARIMA methodology, the emphasis of this method is not on constructing single-
equation or simultaneous-equation models but on analyzing the probabilistic, or
stochastic, properties of economic time series on their own under the philosophy: let the
data speak for themselves. Unlike the regression models, in which Yt is explained by k
regressors X1,X2,……Xk, the BJ-type time series models allow Yt to be explained by
past, or lagged, values of Y itself and the stochastic error terms. For this reason,
ARIMA models are sometimes called atheoretic models because they are not derived
from any economic theory. ARIMA models pertaining to a single time series are
categorized as univaraite ARIMA models, whereas those pertaining to multiple time
series are categorized as multivariate ARIMA models.
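In practice, Box-Jenkins models are rarely coded by hand. The sketch below fits an ARIMA(1,1,1) to a simulated I(1) series using statsmodels (assuming the package is installed; the order (1,1,1) is illustrative, not the outcome of an identification exercise):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(0.1, 1.0, size=300))   # an I(1) series: random walk with drift

model = ARIMA(y, order=(1, 1, 1))   # p=1 AR term, d=1 difference, q=1 MA term
res = model.fit()
print(res.summary().tables[1])      # estimated AR and MA coefficients
print(res.forecast(steps=5))        # five-step-ahead forecasts
```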
(2) Regime-Switching and Transition Exponential Smoothing
Regime switching allows the stock price process to switch between k regimes
randomly; each regime is characterized by different model parameters, and the process describing which regime the price process is in at any time is assumed to be Markov (that
is, the probability of changing regime depends only on the current regime, not on the
history of the process). The rationale behind the regime-switching framework is that the
market may switch from time to time between, say, a stable low-volatility state and a
more unstable high-volatility regime. Periods of high volatility may arise, for example,
because of short-term political or economic uncertainty. In fact, an important stylized
4 Similarly, if a time series is I(2), its second difference is I(0). In general, if a time series is I(d),
after differencing it d times we obtain an I(0) series.
fact of stock volatility known as non-linearity, can be explained as a realization of
regime switching (Robe and Kosfeld (2001), de Lima (1998)). The switching is able to
describe a number of features inherent in the stock market returns, such as leptokurtosis
and mean reversion. Blanchard and Watson (1982) identify surviving bubbles or collapsing bubbles as a regime-switching process.
The basic intuition of regime-switching models is that the data generation process can change across different states (regimes). In particular, regime-switching models assume that the relevant relationships are linear within each regime, but possibly different across regimes. Example:

$y_t = x_t\beta_i + \varepsilon_t$

where $\varepsilon_t \sim N(0, \sigma_i^2)$ and i = 1, 2. In regime-switching models the transitions across regimes are assumed to be stochastic.
There are two types of regime-switching models: threshold models, in which the
regimes are characterized by an observable variable (SETAR, STAR, etc.); and the Markov-switching models, in which regimes are characterized by an unobservable variable.
(i) Threshold Models
Threshold models are an interesting alternative for modeling both returns and
volatilities. The fundamental idea behind these models is the introduction of regimes
based on thresholds, thus allowing the analysis of complex stochastic systems from
simple subsystems. In this category of models, the regime changes are characterized as
a deterministic function of past realizations of some observed variable. The overall
process is non-linear, while following a linear AR model in each regime. This category
of models can further have following variants:
(a) Threshold Autoregressive (TAR) Models
Given by Tong (1983) and Tsay (1989), the regime in this model switches according to the observable past history of the system. These models have several attractive features, like limit cycles, amplitude-dependent frequencies and jump phenomena; but there is a somewhat arbitrary identification of the threshold variable and the threshold value(s).
(b) Self Exciting Threshold Autoregressive (SETAR) Model
Given a time series of data Xt, the SETAR model is a tool for predicting future values in this series, assuming that the behavior of the series changes once the series enters a different regime. The switch from one regime to another depends on the past values of the X series (hence the Self-Exciting portion of the name). The model consists of k autoregressive parts, each for a different regime. The model is usually referred to as the SETAR(k,p) model, where k is the number of regimes and p is the order of the autoregressive part (since those can differ between regimes, the p portion is sometimes dropped and models are denoted simply as SETAR(k)).
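A minimal simulation of a SETAR(2,1) process, with an illustrative threshold at zero and AR(1) dynamics that differ between the two regimes (all parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = np.zeros(n)
for t in range(1, n):
    if x[t - 1] <= 0.0:                          # regime 1: below the threshold
        x[t] = 0.8 * x[t - 1] + rng.normal(0, 1.0)
    else:                                        # regime 2: above the threshold
        x[t] = -0.4 * x[t - 1] + rng.normal(0, 1.0)

# The regime is 'self-exciting': it depends only on the past value of x itself
print(np.mean(x <= 0))   # fraction of time spent in regime 1
```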
The SETAR model is a special case of Tong’s general threshold autoregressive models.
The latter allows the threshold variable to be very flexible, such as an exogenous time
series in the open-loop threshold autoregressive system, a Markov chain in the Markov-
chain driven threshold autoregressive models, which is also known as the Markov
switching model.
Recent research has led to the introduction of hybrid models, known as second-generation models, by combining the SETAR model with the threshold Stochastic Volatility (THSV) and the threshold GARCH (TGARCH) models. The
conditional heteroscedasticity models (GARCH(1,1), SV) are non-linear time series
models most commonly used in literature. Generally, these two models cannot capture
the asymmetry of volatility. This characteristic, known as the leverage effect, is related
to the asymmetric behavior of the market, in the sense that it is more volatile after a
continuous decrease in prices than after a rise (both of the same magnitude). The non-
linear SETAR models, however, allow the asymmetries in the mean (non-linearity in
the mean of returns) to be captured. The former models belong to the class of “first-
generation models”. To capture the asymmetric effect in the volatility, the introduction
of thresholds in the volatility equation has been proposed, thus obtaining the Threshold
GARCH (TGARCH) MODEL OR Threshold Stochastic Volatility (THSV) model. If
both kinds of non-linearity is explained in terms of the mean and variance, the SETAR
Chapter-3 Volatility: Concepts and Various Models
134
model is combined with the previous ones, obtaining the SETAR-TGARCH model and
the SETAR-THSV model. These econo-generation models capture the main features of
volatility. The difference is that the SETAR-THSV is more flexible than the SETAR-
TGARCH to capture kurtosis though the estimation procedure is computationally more
costly.
(c) Smooth Transition Autoregressive (STAR) Model
STAR models are applied to time series data as an extension of autoregressive models
in order to allow for higher degree of flexibility in model parameters through a smooth
transition. Given a time series of data Xt , the STAR model is a tool for predicting
future values in the series, assuming that the behavior of the series changes depending
on the value of the transition variable. The transition might depend on the past values of
the X series (similar to SETAR model) or exogenous variables. The model consists of
two autoregressive parts linked by the transition function. The model is usually referred to as a STAR(p) model, preceded by the letter describing the transition function, where p is the order of the autoregressive part. The most popular transition functions include the exponential function and the first- and second-order logistic functions. These give rise to the Logistic STAR (LSTAR) and Exponential STAR (ESTAR) models.
STAR models were introduced and developed by Kung-sik Chan and Howell Tong in
1986, in which the same acronym was used; it originally stood for Smooth Threshold Autoregressive. The models can be understood as a two-regime SETAR model with smooth transition between regimes, or as a continuum of regimes. In both cases the
presence of the transition function is the defining feature of the model as it allows for
changes in values of the parameters.
(ii) Markov-Switching Models
A Markov-switching model is a non-linear specification in which different states of the
world affect the evolution of the time series. The dynamic properties depend on the
present regime, with the regimes being realizations of a hidden Markov chain with a
finite state space. Markov switching models were introduced to the econometric
mainstream by Hamilton (1988, 1989) and continue to gain popularity in the financial
time series analysis. Even the most basic Markov-switching model with constant regime
parameters is capable of describing many typical characteristics of financial time series.
Hamilton’s original motivation was to model long swings in the growth rate of output,
but instead he found evidence for discrete switches in the growth rate at business cycle
frequencies. Output growth was modeled as the sum of a discrete Markov chain and a
Gaussian Autoregression:
$y_t = n_t + x_t$

where

$n_t = \alpha_0 + \alpha_1 s_t$, with $s_t = 0$ or $1$,

$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \phi_3 x_{t-3} + \phi_4 x_{t-4} + \sigma\epsilon_t$,

$P(s_t = 1 \mid s_{t-1} = 1) = p$, $P(s_t = 0 \mid s_{t-1} = 0) = q$, $\epsilon_t \sim N(0,1)$
The major estimation difficulty with the model is the lack of separate observability of $n_t$ and $x_t$.
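The following sketch simulates a simplified version of Hamilton's specification, with a two-state Markov chain shifting the mean of an AR(1) component (Hamilton used an AR(4); the shorter lag and all parameter values here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
p, q = 0.95, 0.90            # P(s_t=1|s_{t-1}=1) and P(s_t=0|s_{t-1}=0)
a0, a1, phi, sig = 0.0, 1.0, 0.5, 0.5
n = 300

s = np.zeros(n, dtype=int)   # hidden regime
x = np.zeros(n)              # autoregressive component
y = np.zeros(n)              # observed series: y_t = n_t + x_t
for t in range(1, n):
    stay = p if s[t - 1] == 1 else q
    s[t] = s[t - 1] if rng.random() < stay else 1 - s[t - 1]
    x[t] = phi * x[t - 1] + sig * rng.normal()
    y[t] = a0 + a1 * s[t] + x[t]      # n_t = a0 + a1 * s_t

# Only y is observable; s and x must be inferred jointly, which is
# exactly the estimation difficulty noted in the text.
print(y[:10].round(2))
```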
There have been many extensions and generalizations of the original Markov switching
model given by Hamilton. In the original model the conditional distribution of a
realization depends on the previous four values of the Markov process, which makes the
estimation of the model computationally demanding. Hamilton (1990, 1994) suggests
alternative Markov switching models where the regression parameters, rather than the
mean, switch between states. This apparently minor modification substantially simplifies estimation, which can be carried out as an application of the EM algorithm. Multivariate Markov-switching models have also been introduced.
According to Hansen (1992), once one generalizes the switching model to all regression
parameters, current states are independent of their previous realizations, so that a simple
switching model (mixture model), rather than a Markov switching model, seems
appropriate. A generalization of Markov switching GARCH models was developed by
Gray (1996) and subsequently modified by Klaassen (2002). While the model of Gray
is attractive since it combines Markov-switching with the GARCH effects, its analytical
intractability is a serious drawback. As a consequence, conditions for covariance
stationarity have yet to be established. Closely related is the lack of an analytic
expression for the covariance structure of the squared process. Cai (1994) and Hamilton
and Susmel (1994) propose Markov-switching ARCH models, though (standard)
GARCH (1,1) models are known to provide better descriptions of market volatility than
even high-order ARCH specifications. Their restriction to ARCH models was due to the path dependence in Markov switching GARCH models that arises when "literally" translating the GARCH model of Bollerslev (1986) to a regime switching setting.
3.3.2. ARCH Model and its Variants
Time series models were initially introduced either for descriptive purposes, like prediction and seasonal correction, or for dynamic control. In the 1970s, the research
focused on a specific class of time series models, the so-called autoregressive moving
average process (ARMA), which were very easy to implement. In these models the
current value of the series of interest is written as a linear function of its own lagged
values and current and past values of some noise process, which can be interpreted as
innovations to the system. However, this approach has two major drawbacks: (1) it is
essentially a linear setup, which automatically restricts the type of dynamics to be
approximated; (2) it is generally applied without imposing a priori constraints on the
autoregressive and moving average parameters, which is inadequate for structural
interpretations.
Among the fields of application where the standard ARMA fit is poor are financial and monetary problems. Financial time series feature various forms of non-linear dynamics, the crucial one being the strong dependence of the instantaneous variability of the series on its own past. Moreover, financial theories based on concepts like
equilibrium or rational behavior of the investors would naturally suggest including and
testing some structural constraints on the parameters. In this context, ARCH models,
introduced by Engle in 1982, arise as an appropriate framework for studying these
problems. Currently, there exist more than a hundred papers and dozens of Ph.D. theses on this topic, which reflects the importance of this approach for statistical theory,
finance and empirical work.
(1) Basic ARCH Model
The underlying logic of this category of volatility models is that of volatility clustering
in financial time series, that is, there are periods in which prices show wide swings for
an extended time period followed by periods in which there is relative calm. In reality
one often observes that large positive and large negative observations in financial time
series tend to appear in clusters. A characteristic of most of these financial time series is
that in their level form they are random walks; that is, they are non-stationary. On the
other hand, in the first difference form, they are generally stationary. Therefore, instead
of modeling the levels of financial time series, authors try to model their first
differences. But these first differences often exhibit wide swings, or volatility,
suggesting that the variance of financial time series varies over time. The so-called
autoregressive conditional heteroscedasticity (ARCH) model, originally developed by
Engle (1982), comes in handy in modeling such “varying variance”. As the name
suggests, heteroscedasticity, or unequal variance, may have an autoregressive structure in that heteroscedasticity observed over different periods may be autocorrelated.
Suppose the following AR(1), or ARIMA(1,0,0)5 model is considered:
$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + u_t$ [3.7]

where $X_t$ is the mean-adjusted relative change in stock prices and $u_t$ is a white noise error term. This model postulates that volatility in the current period is related to its value in the previous period plus a white noise error term. If $\beta_1$ is positive, it suggests that if volatility was high in the previous period, it will continue to be high in the current period, indicating volatility clustering. If $\beta_1$ is equal to zero, then there is no volatility clustering. The statistical significance of the estimated $\beta_1$ can be judged by the usual t-test. Suppose the following AR(p) model is considered:

$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + \beta_2 X_{t-2}^2 + \cdots + \beta_p X_{t-p}^2 + u_t$ [3.8]
5 For details of AR and ARIMA models refer to section 3.3.1 point (1) (vii).
This model suggests that volatility in the current period is related to volatility in the past
p periods, the value of p being an empirical question. Equation (3.7) is an example of
ARCH(1) model and equation (3.8) is an example of ARCH(p) model, where p
represents the number of autoregressive terms in the model.
An important feature of ARCH models is that they more readily explain the "fat-tailed" leptokurtic distributions of asset price changes.
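To see how even the simplest ARCH process generates volatility clustering and fat tails, the sketch below simulates an ARCH(1) with illustrative parameters and compares the sample kurtosis with the Gaussian value of 3:

```python
import numpy as np

rng = np.random.default_rng(5)
n, b0, b1 = 10_000, 0.1, 0.5           # illustrative ARCH(1) parameters
u = np.zeros(n)
h = np.full(n, b0 / (1 - b1))          # conditional variance, started at its mean
for t in range(1, n):
    h[t] = b0 + b1 * u[t - 1] ** 2     # variance responds to last squared shock
    u[t] = np.sqrt(h[t]) * rng.normal()

kurt = np.mean(u**4) / np.mean(u**2) ** 2
print(f"sample kurtosis: {kurt:.2f} (Gaussian: 3)")   # > 3: fat tails
```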
(2) Basic GARCH Model
Since its discovery in 1982, ARCH modeling has become a growth industry, with all its
variations on the original model. One that has become popular is the Generalized
Autoregressive Conditional Heteroscedasticity (GARCH) model, originally proposed
by Bollerslev in 1986. The simplest GARCH model is the GARCH(1,1) model, which
can be written as:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2\sigma_{t-1}^2$

which says that the conditional variance of u at time t depends not only on the squared error term in the previous time period [as in ARCH(1)] but also on its conditional variance in the previous time period. This model can be generalized to a GARCH(p,q)
model in which there are p lagged terms of the squared error term and q terms of the
lagged conditional variances.
It can be noted that, by recursive substitution, a GARCH(1,1) model can be written as an ARCH model of infinite order with geometrically declining coefficients; more generally, a GARCH(p,q) model corresponds to an infinite-order ARCH model.
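Given parameter values, the GARCH(1,1) conditional variance follows from a simple recursion over the residuals, as in the sketch below (the parameters are hypothetical; in applications they would be estimated by maximum likelihood, for example with the arch package):

```python
import numpy as np

def garch_variance(u, alpha0, alpha1, beta1):
    """Filter the GARCH(1,1) conditional variance series from residuals u."""
    h = np.empty_like(u)
    h[0] = alpha0 / (1 - alpha1 - beta1)      # start at the unconditional variance
    for t in range(1, len(u)):
        h[t] = alpha0 + alpha1 * u[t - 1] ** 2 + beta1 * h[t - 1]
    return h

rng = np.random.default_rng(6)
u = rng.normal(0, 0.01, size=1000)            # stand-in residuals
h = garch_variance(u, alpha0=1e-6, alpha1=0.08, beta1=0.90)
print(np.sqrt(h[-5:]))                        # latest conditional volatilities
```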
(3) Variants of ARCH/GARCH
There have been breathtaking variations, extensions and applications in the basic
ARCH/GARCH model since their introduction. The list of these variations, extensions or applications is very long and beyond the scope of the present study; therefore, only some of these are explained below. The details of the respective models can be gathered
from their respective original articles.
(i) Augmented ARCH (AARCH)
The AARCH model of Bera, Higgins and Lee (1992) extends the linear ARCH(q)
model to allow the conditional variance to depend on cross-products of the lagged
innovations. Defining the q×1 vector $e_{t-1} \equiv (\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q})'$, the AARCH(q) model may be expressed as:

$\sigma_t^2 = \omega + e_{t-1}'\,A\,e_{t-1}$

where A denotes a q×q symmetric positive definite matrix. If A is diagonal, the model
reduces to the standard linear ARCH(q) model. The Generalized AARCH, or
GAARCH model is obtained by including lagged conditional variances on the right-
hand-side of the equation. The slightly more general GQARCH (Generalized Quadratic
ARCH) representation was proposed independently by Sentana (1995). The
GQARCH(p,q) model is defined as:
$\sigma_t^2 = \omega + \sum_{i=1}^{q}\psi_i\epsilon_{t-i} + \sum_{i=1}^{q}\alpha_i\epsilon_{t-i}^2 + 2\sum_{i=1}^{q}\sum_{j=i+1}^{q}\alpha_{ij}\epsilon_{t-i}\epsilon_{t-j} + \sum_{i=1}^{p}\beta_i\sigma_{t-i}^2$
The model simplifies to the linear GARCH(p,q) model if all of the $\psi_i$'s and the $\alpha_{ij}$'s are equal to zero. Defining the q×1 vector $e_{t-1} \equiv (\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q})'$, the model may alternatively be expressed as:

$\sigma_t^2 = \omega + \Psi'e_{t-1} + e_{t-1}'\,A\,e_{t-1} + \sum_{i=1}^{p}\beta_i\sigma_{t-i}^2$

where $\Psi$ denotes the q×1 vector of $\psi_i$ coefficients and A refers to the q×q symmetric matrix of $\alpha_i$ and $\alpha_{ij}$ coefficients. Conditions on the parameters for the conditional variance to be positive almost surely, and the model to be well-defined, are given in Sentana (1995).
(ii) Modified ARCH or Multiplicative ARCH (GARCH) or Multivariate GARCH or
Mixture GARCH (i.e. MARCH/MGARCH)
Friedman, Laibson and Minsky (1989) denote the class of GARCH(1,1) models in
which the conditional variance depends non-linearly on the lagged squared innovations
as Modified ARCH models:
$\sigma_t^2 = \omega + \alpha F(\epsilon_{t-1}^2) + \beta\sigma_{t-1}^2$

where F(·) denotes a positive-valued function. In their estimation of the model, Friedman, Laibson and Minsky (1989) use the function $F(\epsilon_{t-1}^2) = \sin(\theta\epsilon_{t-1}^2)\,I(\theta\epsilon_{t-1}^2 \le \pi/2) + I(\theta\epsilon_{t-1}^2 > \pi/2)$.
Slightly different versions of the univaraite Multiplicative GARCH model were
proposed independently by Geweke (1986), Pantula (1986) and Milhoj (1987). The
model is more commonly referred to as the Log-GARCH model. The Logarithmic-
GARCH model parameterizes the logarithmic conditional variance as a function of the
lagged logarithmic variances and the lagged logarithmic squared innovations:
$\log\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i\log\epsilon_{t-i}^2 + \sum_{i=1}^{p}\beta_i\log\sigma_{t-i}^2$
The model may alternatively be expressed as:
$\sigma_t^2 = \exp(\omega)\prod_{i=1}^{q}\left(\epsilon_{t-i}^2\right)^{\alpha_i}\prod_{i=1}^{p}\left(\sigma_{t-i}^2\right)^{\beta_i}$
*
In light of this alternative representation, the model is also sometimes referred to as a
Multiplicative GARCH, or MGARCH model.
MGARCH may also denote the Multivariate GARCH models, which were first analysed and estimated by Bollerslev, Engle and Wooldridge in 1988. The unrestricted linear MGARCH(p,q) model is defined by:

$\mathrm{vech}(\Omega_t) = C + \sum_{i=1}^{q}A_i\,\mathrm{vech}(\epsilon_{t-i}\epsilon_{t-i}') + \sum_{i=1}^{p}B_i\,\mathrm{vech}(\Omega_{t-i})$

where vech(·) denotes the operator that stacks the lower triangular portion of a symmetric N×N matrix into an N(N+1)/2 × 1 vector of the corresponding unique elements, and the $A_i$ and $B_i$ matrices are all of compatible dimension N(N+1)/2 × N(N+1)/2. This vectorized representation is also sometimes referred to as a VECH GARCH model. The general vech representation does not guarantee that the
resulting conditional covariance matrices $\Omega_t$ are positive definite. Also, the model involves a total of $N(N+1)/2 + (p+q)(N^4 + 2N^3 + N^2)/4$ parameters, which becomes prohibitively expensive from a practical computational point of view for anything but the bivariate case, or N = 2.
The MGARCH may also denote another category of models known as Mixture GARCH (MGARCH) models. The MAR-ARCH model of Wong and Li (2001) and the MAGARCH model of Zhang, Li and Yuen (2006) postulate that the time t conditional variance is given by a time-invariant mixture of different GARCH models.
(iii) AGARCH (Asymmetric or Absolute Value GARCH)
The Asymmetric GARCH model was introduced by Engle (1990) to allow for
asymmetric effects of negative and positive innovations. The AGARCH(1,1) model is
defined as:
$\sigma_t^2 = \omega + \alpha\epsilon_{t-1}^2 + \gamma\epsilon_{t-1} + \beta\sigma_{t-1}^2$

where negative values of $\gamma$ imply that positive shocks will result in smaller increases in future volatility than negative shocks of the same absolute magnitude. The model may alternatively be expressed as:

$\sigma_t^2 = \omega' + \alpha(\epsilon_{t-1} + \gamma')^2 + \beta\sigma_{t-1}^2$

for which $\omega' > 0$, $\alpha \ge 0$ and $\beta \ge 0$ readily ensure that the conditional variance is positive almost surely.
AGARCH may also denote another category of models known as Absolute Value
GARCH or TS-GARCH (Taylor-Schwert GARCH) model. The TS-GARCH(p,q)
model of Taylor (1986) and Schwert (1989) parameterizes the conditional standard deviation as a distributed lag of the absolute innovations and the lagged conditional standard deviations:

$\sigma_t = \omega + \sum_{i=1}^{q}\alpha_i|\epsilon_{t-i}| + \sum_{i=1}^{p}\beta_i\sigma_{t-i}$
This formulation mitigates the influence of large, in an absolute sense, observations
relative to the traditional GARCH(p,q) model. The TS-GARCH model is also
sometimes referred to as an Absolute Value GARCH, or AVGARCH, model, or simply
an AGARCH model. It is a special case of the more general Power GARCH, or
NGARCH6, formulation.
(iv) ANN-ARCH Model (Artificial Neural Network ARCH Model)
Donaldson and Kamstra (1997) term the GJR model7, augmented with a logistic function as commonly used in neural networks, the ANN-ARCH model.
(v) ANST-GARCH (Asymmetric Non-Linear Smooth Transition GARCH)
The ANST-GARCH(1,1) model of Nam, Pyun and Arize (2002) postulates that:

$\sigma_t^2 = \omega + \alpha\epsilon_{t-1}^2 + \beta\sigma_{t-1}^2 + \left(\xi + \mu\epsilon_{t-1}^2 + \rho\sigma_{t-1}^2\right)F(\epsilon_{t-1}, \gamma)$

where F(·) denotes a smooth transition function. The model simplifies to the ST-GARCH(1,1) model of Gonzalez-Rivera (1998) for $\xi = \rho = 0$, and to the standard GARCH(1,1) model for $\xi = \mu = \rho = 0$.

(vi) APARCH Model (Asymmetric Power ARCH)
The APARCH, or APGARCH, model of Ding, Granger and Engle (1993) nests several of the most popular univariate parameterizations. In particular, the APGARCH(p,q) model,

$\sigma_t^{\delta} = \omega + \sum_{i=1}^{q}\alpha_i\left(|\epsilon_{t-i}| - \gamma_i\epsilon_{t-i}\right)^{\delta} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^{\delta}$

reduces to the standard linear GARCH(p,q) model for $\delta = 2$ and $\gamma_i = 0$, the TS-GARCH(p,q) model for $\delta = 1$ and $\gamma_i = 0$, the NGARCH(p,q) model for $\gamma_i = 0$, the GJR-GARCH model for $\delta = 2$ and $0 \le \gamma_i \le 1$, and the TGARCH(p,q) model for $\delta = 1$ and $0 \le \gamma_i \le 1$, while the log-GARCH(p,q) model is obtained as the limiting case of the model for $\delta = 0$ and $\gamma_i = 0$.
6 NGARCH model is explained later.
7 Explained later in the list.
(vii) ARCH-M Model (ARCH-in-Mean Model)
The ARCH-M model was first introduced by Engle, Lilien and Robins (1987) for
modeling risk-return tradeoffs in the term structure of U.S. interest rates. The model
extends the ARCH regression model in Engle (1982) by allowing the conditional mean
to depend directly on the conditional variance,
$y_t \mid \mathcal{F}_{t-1} \sim N\!\left(x_t'\beta + \delta\sigma_t^2,\;\sigma_t^2\right)$
This breaks the block-diagonality between the parameters in the conditional mean and
the parameters in the conditional variance, so that the two sets of parameters must be
estimated jointly to achieve asymptotic efficiency. Non-linear functions of the
conditional variance may be included in the conditional mean in a similar fashion. The
final preferred model estimated in Engle, Lilien and Robins (1987) parameterizes the
conditional mean as a function of $\log\sigma_t^2$. Multivariate extensions of the ARCH-M model were first analysed and estimated by Bollerslev, Engle and Wooldridge (1988).
(viii) ARCH-SM (ARCH Stochastic Mean) Model
The ARCH-SM acronym was coined by Lee and Taniguchi (2005) to distinguish ARCH models in which $\epsilon_t \equiv y_t - E(y_t \mid \mathcal{F}_{t-1}) \neq y_t - E(y_t)$.

(ix) ARCH-Smoothers
ARCH-Smoothers, first developed by Nelson (1996b) and Foster and Nelson (1996),
extend the ARCH and GARCH models and corresponding ARCH-Filters based solely
on past observations to allow for the use of both current and future observations in the
estimation of the latent volatility.
(x) ATGARCH (Asymmetric Threshold GARCH)
The ATGARCH(1,1) model of Crouhy and Rockinger (1997) combines and extends the TS-GARCH(1,1) and GJR(1,1) models by allowing the threshold used in characterizing the asymmetric response to differ from zero:

$\sigma_t = \omega + \alpha|\epsilon_{t-1}|\,I(\epsilon_{t-1} \ge \gamma) + \delta|\epsilon_{t-1}|\,I(\epsilon_{t-1} < \gamma) + \beta\sigma_{t-1}$
Higher order ATGARCH(p,q) models may be defined analogously.
(xi) β-ARCH (Beta ARCH) Model

The β-ARCH(1) model of Guegan and Diebolt (1994) allows the conditional variance to depend asymmetrically on positive and negative lagged innovations:

$\sigma_t^2 = \omega + \alpha\left[I(\epsilon_{t-1} > 0) + \gamma I(\epsilon_{t-1} < 0)\right]\epsilon_{t-1}^{2\beta}$

where I(·) denotes the indicator function. For $\alpha = \gamma$ and $\beta = 1$ the model reduces to the standard linear ARCH(1) model. More general β-ARCH(q) and β-GARCH(p,q) models may be defined in a similar fashion.
(xii) CGARCH (Component GARCH/Composite GARCH Model)
The component GARCH model of Engle and Lee (1999) was designed to better account for long-run volatility dependencies. Rewriting the GARCH(1,1) model as:
\[ \sigma_t^2 = \sigma^2 + \alpha(\varepsilon_{t-1}^2 - \sigma^2) + \beta(\sigma_{t-1}^2 - \sigma^2) \]
where σ² ≡ ω(1 − α − β)⁻¹ refers to the unconditional variance, the CGARCH model is obtained by relaxing the assumption of a constant σ². Specifically,
\[ \sigma_t^2 = \zeta_t + \alpha(\varepsilon_{t-1}^2 - \zeta_{t-1}) + \beta(\sigma_{t-1}^2 - \zeta_{t-1}) \]
with the corresponding long-run variance parameterized by the separate equation,
\[ \zeta_t = \omega + \rho\zeta_{t-1} + \varphi(\varepsilon_{t-1}^2 - \sigma_{t-1}^2) \]
Substituting this expression for ζ_t into the former equation, the CGARCH model may alternatively be expressed as a restricted GARCH(2,2) model.
The CGARCH acronym may also denote the Composite GARCH model of den Hertog (1994), which represents ε_t² as the sum of a latent permanent random walk component and another latent AR(1) component.
(xiii) COGARCH (Continuous GARCH Model)
The continuous-time COGARCH(1,1) model was proposed by Klüppelberg, Lindner and
Maller (2004). The model is obtained by backward solution of the difference equation
defining the discrete-time GARCH(1,1) model, replacing the standardized innovations by
the increments to a Lévy process, L(t). In contrast to the GARCH diffusion model of
Nelson (1990b), which involves two independent Brownian motions, the COGARCH
model is driven by a single innovation process. Higher order COGARCH(p,q) processes
have been developed by Brockwell, Chadraa and Lindner (2006).
(xiv) Copula GARCH
Any joint distribution function may be expressed in terms of its marginal distribution
functions and a copula function linking these. The class of copula GARCH models
builds on this idea in the formulation of multivariate GARCH models by linking
univariate GARCH models through a sequence of possibly time-varying conditional copulas. Jondeau and Rockinger (2006) and Patton (2006) discuss at length the estimation of, and inference in, copula GARCH models.
(xv) CorrARCH (Correlated ARCH) Model
The bivariate CorrARCH model of Christodoulakis and Satchell (2002) parameterizes
the time-varying conditional correlations as a distributed lag of the product of the
standardized innovations from univariate GARCH models for each of the two series. A
Fisher transform is used to ensure that the resulting correlations always lie between -1
and 1.
(xvi) DAGARCH (Dynamic Asymmetric GARCH) Model
The DAGARCH model of Caporin and McAleer (2006) extends the GJR-GARCH
model to allow for multiple thresholds and time-varying asymmetric effects.
(xvii) DTARCH (Double Threshold ARCH) Model
The DTARCH model of Li and Li (1996) allows the parameters in both the conditional
mean and the conditional variance to change across regimes, with the m different regimes determined by a set of threshold parameters for some lag d ≥ 1 of the observed process y_t, say r_{j−1} < y_{t−d} ≤ r_j, where r_0 = −∞ < r_1 < ⋯ < r_m = ∞.

(xviii) EGARCH (Exponential GARCH) Model
The EGARCH model was developed by Nelson (1991). The model explicitly allows for
asymmetries in the relationship between return and volatility. In particular, let z_t ≡ ε_t σ_t⁻¹ denote the standardized innovations. The EGARCH(1,1) model may then be expressed as:
\[ \log\sigma_t^2 = \omega + \alpha(|z_{t-1}| - E|z_{t-1}|) + \gamma z_{t-1} + \beta\log\sigma_{t-1}^2 \]
For γ < 0 negative shocks will obviously have a bigger impact on future volatility than
positive shocks of the same magnitude. This effect, which is typically observed
empirically with equity index returns, is often referred to as a “leverage effect”, although
it is now widely agreed that the apparent asymmetry has little to do with actual financial
leverage. By parameterizing the logarithm of the conditional variance as opposed to the
conditional variance, the EGARCH model also avoids complications from having to
ensure that the process remains positive. Meanwhile, the logarithmic transformation
complicates the construction of unbiased forecasts for the level of future variance.
(xix) EVT-GARCH (Extreme Value Theory GARCH)
The EVT-GARCH approach, pioneered by McNeil and Frey (2000), relies on extreme
value theory for i.i.d. random variables and corresponding generalized Pareto
distributions for more accurately characterizing the tails of the distributions of the
standardized innovations from GARCH models. This idea may be used in the
calculation of low-probability quantile, or Value-at-Risk, type predictions.
(xx) F-ARCH (Factor ARCH) Model
The multivariate factor ARCH model developed by Diebold and Nerlove (1989) and
the factor GARCH model of Engle, Ng and Rothschild (1990) assume that the temporal variation in the N×N conditional covariance matrix for a set of N returns can be described by univariate GARCH models for a smaller set of K < N portfolios,
\[ \Omega_t = \Omega + \sum_{k=1}^{K}\lambda_k\lambda_k'\sigma_{k,t}^2 \]
where λ_k and σ²_{k,t} refer to the time-invariant N×1 vector of factor loadings and the time t conditional variance for the kth factor, respectively. More specifically, the single-factor F-GARCH(1,1) model may be expressed as:
\[ \Omega_t = \Omega + \lambda\lambda'\big(\alpha(w'\varepsilon_{t-1})^2 + \beta w'\Omega_{t-1}w\big) \]
where w denotes an N×1 vector, and α and β are both scalar parameters.
(xxi) FIAPARCH (Fractionally Integrated Power ARCH) Model
The FIAPARCH(p,d,q) model of Tse (1998) combines the FIGARCH(p,d,q) and the APARCH(p,q) models in parameterizing σ_t^δ as a fractionally integrated distributed lag of (|ε_t| − γε_t)^δ.

(xxii) FIGARCH (Fractionally Integrated GARCH) Model
The FIGARCH model proposed by Bollerslev and Mikkelsen (1996) relies on an
ARFIMA type representation to better capture the long-run dynamic dependencies in
the conditional variance. The model may be seen as a natural extension of the IGARCH model, allowing for fractional orders of integration in the autoregressive polynomial in the corresponding ARMA representation,
\[ \varphi(L)(1 - L)^d\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
where ν_t ≡ ε_t² − σ_t², 0 < d < 1, and the roots of φ(z) = 0 and 1 − β(z) = 0 are all outside the unit circle. For values of 0 < d < 1/2 the model implies an eventual slow hyperbolic decay in the autocorrelations for σ_t².
(xxiii) GQARCH (Generalized Quadratic ARCH) Model
The GQARCH(p,q) model of Sentana (1995) is defined by:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q}\psi_i\varepsilon_{t-i} + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + 2\sum_{i=1}^{q}\sum_{j>i}\alpha_{ij}\varepsilon_{t-i}\varepsilon_{t-j} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \]
The model simplifies to the linear GARCH(p,q) model if all of the ψ_i and the α_{ij} are equal to zero. Defining the q×1 vector e_{t−1} ≡ (ε_{t−1}, ε_{t−2}, …, ε_{t−q})′, the model may alternatively be expressed as:
\[ \sigma_t^2 = \omega + \Psi'e_{t-1} + e_{t-1}'A\,e_{t-1} + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \]
where Ψ denotes the q×1 vector of ψ_i coefficients and A refers to the q×q symmetric matrix of α_i and α_{ij} coefficients. Conditions on the parameters for the conditional variance to be positive almost surely and the model well-defined are given in Sentana (1995).
(xxiv) GRS-GARCH (Generalized Regime-Switching GARCH) Model
The GRS-GARCH model proposed by Gray (1996) allows the parameters in the
GARCH model to depend upon an unobservable latent state variable governed by a
first-order Markov process. By aggregating the conditional variances over all of the possible states at each point in time, the model is formulated in such a way that it breaks the path-dependence which complicates the estimation of the SWARCH models of Cai (1994) and Hamilton and Susmel (1994).
(xxv) HARCH (Heterogeneous ARCH) Model
The HARCH(n) model of Müller, Dacorogna, Davé, Olsen, Pictet and von Weizsäcker (1997) parameterizes the conditional variance as a function of the square of the sum of lagged innovations, or the squared lagged returns, over different horizons,
\[ \sigma_t^2 = \omega + \sum_{j=1}^{n}\gamma_j\Big(\sum_{i=1}^{j}\varepsilon_{t-i}\Big)^2 \]
The model is motivated as arising from the interaction of traders with different
investment horizons. The HARCH model may be interpreted as a restricted QARCH
model.
(xxvi) IGARCH (Integrated GARCH) Model
Estimates of the standard linear GARCH(p,q) model often result in the sum of the estimated α and β coefficients being close to unity. Rewriting the GARCH(p,q) model as an ARMA(max(p,q), p) model for the squared innovations:
\[ [1 - \alpha(L) - \beta(L)]\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
where ν_t ≡ ε_t² − σ_t², and α(L) and β(L) denote appropriately defined lag polynomials, the IGARCH model of Engle and Bollerslev (1986) imposes an exact unit root in the corresponding autoregressive polynomial, 1 − α(L) − β(L) = φ(L)(1 − L), so that the model may be written as:
\[ \varphi(L)(1 - L)\varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t \]
Even though the IGARCH model is not covariance stationary, it is still strictly stationary with a well-defined non-degenerate limiting distribution; see Nelson (1990a). Also, as shown
by Lee and Hansen (1994) and Lumsdaine (1996), standard inference procedures may
be applied in testing the hypothesis of a unit root, or α(1) + β(1) = 1.

(xxvii) LARCH (Linear ARCH) Model
The ARCH(∞) representation:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{\infty}\alpha_i\varepsilon_{t-i}^2 \]
is sometimes referred to as an LARCH model. This representation was first used by
Robinson (1991) in the derivation of general tests for conditional heteroscedasticity.
(xxviii) LMGARCH (Long Memory GARCH) Model
The LMGARCH(p,d,q) model is defined by:
\[ \sigma_t^2 = \omega + \big(\varphi(L)^{-1}[1 - \beta(L)](1 - L)^{-d} - 1\big)\nu_t \]
where ν_t ≡ ε_t² − σ_t² and 0 < d < 0.5. Provided that the fourth-order moment exists, the resulting process for ε_t² is covariance stationary and exhibits long memory.
(xxix) GJR (Glosten, Jagannathan and Runkle GARCH) Model
The GJR-GARCH, or just GJR, model of Glosten, Jagannathan and Runkle (1993)
allows the conditional variance to respond differently to the past negative and positive
innovations. The GJR(1,1) model may be expressed as:
\[ \sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \gamma\varepsilon_{t-1}^2 I(\varepsilon_{t-1} < 0) + \beta\sigma_{t-1}^2 \]
where I(.) denotes the indicator function. The model is also sometimes referred to as a
Sign-GARCH model. The GJR formulation is closely related to the Threshold GARCH,
or TGARCH model proposed independently by Zakoian (1994), and the Asymmetric
GARCH (AGARCH) model of Engle (1990). When estimating the GJR model with
equity index returns, γ is typically found to be positive, so that the volatility increases
proportionally more following negative than positive shocks. This asymmetry is
sometimes referred to in the literature as a “leverage effect”, although it is now widely
agreed that it has little to do with actual financial leverage.
(xxx) Miscellaneous other Versions
In addition to the above variations there are numerous other versions (in fact, more than 50) of ARCH/GARCH models available. For example, the PNP-ARCH (Partially Non-
Parametric ARCH) model, QTARCH (Qualitative Threshold ARCH) model,
REGARCH (Range EGARCH) model, RGARCH (Randomized GARCH) model, or
Robust GARCH model, S-GARCH (Simplified GARCH) model, SPARCH (Semi-
Parametric ARCH) model, Spline-GARCH model, STARCH (Structural ARCH)
model, Stdev-ARCH (Standard Deviation ARCH) model, etc.
3.3.3. Implied Volatility
The value of the volatility of the underlying asset that would equate an option’s price to
its fair value is called “implied volatility”. In other words, implied volatility is the volatility which is implicitly contained in option prices. The implied volatility
approach calculates volatility implied by the current market value of the option
contracts. This is undertaken by specifying the option price and calculating the volatility
that would be needed in a mathematical option pricing formula such as that given by
Black and Scholes (1973) to derive the specified market price as a fair value of the
option. One can take the Black and Scholes option pricing model or any other
acceptable option pricing model to extract the implied volatility from the given
parameters. Given an observed European call option price C^{ob} for a contract with strike price K and expiration date T, the implied volatility σ_iv is defined as the input value of the volatility parameter to, say, the Black and Scholes (BS) formula such that
\[ C^{BS}(S_t, t; K, T; \sigma_{iv}) = C^{ob} \]
where C^{BS} is the fair value of the option calculated from the BS model. The option
implied volatility is often interpreted as a market’s expectation of volatility over the
option’s maturity, i.e. the period from t to T. The implied volatilities from put and call
options of the same strike price and time to maturity are the same because of put-call
parity.
Suppose that the true (unconditional) volatility is σ over a period T. If the BS model is correct, then
\[ C^{BS}(S_t, t; K, T; \sigma_{iv}(K)) = C^{BS}(S_t, t; K, T; \sigma) \]
for all strike prices. That is, the function (or graph) of σ_iv(K) against K, for fixed t, S, T and r, observed from market option prices, should be a straight horizontal line. However, it is well known that the Black and Scholes σ_iv differs across strikes: there is plenty of documented empirical evidence that implied volatilities differ across options of different strikes, and when one plots the BS implied volatility σ_iv against the strike price K the shape is anything but a straight line; it often resembles a smile.
There are at least two theoretical explanations (viz. distributional assumptions and stochastic volatility) for the smile puzzle. Other explanations, based on
market microstructure and measurement errors (like liquidity, bid-ask spread and tick
size) and investor risk preference (like model risk, lottery premium and portfolio
insurance) have also been proposed. The BS model assumes that the stock price follows
a lognormal distribution or, equivalently, that logarithmic stock returns have a normal distribution.
There is widely documented empirical evidence that risky financial asset returns have
leptokurtic tails. In the case where the strike price is very high, the call option is deep-
out-of-the-money and the probability for this option to be exercised is very low.
Nevertheless, a leptokurtic right tail gives this option a higher probability than under a normal distribution that the terminal asset price will exceed the strike price and the call option will finish in-the-money. This higher probability leads to a higher call price and a
higher BS implied volatility at high strike.
Next comes the case of a low strike price. It is well known that option value has two
components: intrinsic value and time value. Intrinsic value reflects how deep an option is
in-the-money. Time value reflects the amount of uncertainty before the option expires;
hence it is most influenced by volatility. A deep-in-the-money call option has high
intrinsic value and little time value, and a small amount of bid-ask spread or transaction
tick size is sufficient to perturb the implied volatility estimation. One could, however,
make use of the previous argument and apply it to an out-of-the-money (OTM) put
option at low strike price. An OTM put price has a close to nil intrinsic value and the put
option price is due mainly to time value. Again because of the thicker tail on the left, one
expects the probability that the OTM put option finishes in-the-money to be higher than
that for a normal distribution. Hence the put option price (and hence the call option price
through put-call parity) should be greater than that predicted by the BS model. If the BS model is used to invert volatility estimates from these option prices, the BS implied volatility will be higher than the actual volatility. This results in a volatility smile, where implied
volatility is much higher at very low and very high strikes. The above argument applies readily to the currency market, where exchange rates exhibit thick-tailed distributions that are approximately symmetrical. In the stock market, the volatility skew (i.e. low implied at
high strike but high implied at low strike) is more common than volatility smile after the
October 1987 stock market crash. Since the distribution is skewed to the far left, the right
tail can be thinner than the normal distribution. In this case implied volatility at high
strike will be lower than that expected from a volatility smile.
3.3.4. Realized Volatility
Financial markets are the source of high frequency data. The original form of market
prices is tick-by-tick data: each “tick” is one logical unit of information, like a quote or
a transaction price. By nature, these data are irregularly spaced in time. High frequency data are nowadays the primary object of study for those who are interested in understanding financial markets, especially because practitioners base their decisions on high frequency or tick-by-tick data. Yet, most of
the studies published in the financial literature deal with low frequency or regularly
spaced data. There are two main reasons for this. Firstly, it is still costly and time
consuming to collect, collate, store, retrieve, and manipulate high frequency data. That
is why most of the available data are at daily or lower frequency. The second reason is
somewhat more subtle but still quite important: most of the statistical apparatus has been developed and designed for homogeneous (i.e. equally spaced in time) time series. There
is little work done to adapt the methods to data that arrive at random intervals.
Unfortunately in finance, regularly spaced data are not original data but artifacts derived
from the original market prices. Nowadays, with the development of computer
technology, data availability is becoming less and less of a problem. Slowly, high
frequency data are becoming a fantastic experimental bench for understanding market
microstructure and more generally for analyzing financial markets.
Accurate volatility estimates are vital in many areas of finance. As seen above,
historical and implied methods are two main approaches to modeling volatility. The
historical approach employs econometric time series analysis by fitting, for example, an
Autoregressive Moving Average (ARMA) model to an historical series of estimated
volatilities or by using an ARCH or Stochastic Volatility type model to estimate the
conditional second moment of the return series. Implied methods on the other hand,
make use of option pricing models such as the BS model in conjunction with market prices
of options. The volatility level that equates these observable prices to the ones produced
by the model is recovered typically via the use of a numerical algorithm such as
Newton-Raphson. An increasingly studied concept within the historical approach is the
idea of Realized Volatility or its square, the Realized Variance (RV). This departs from more traditional historical estimation methods by using high-frequency returns.
Generally speaking, “high frequency” can be defined as the use of measurement
intervals of less than one day (e.g. an hour, one minute, five seconds, etc). RV itself can
be loosely defined as the sum of squared intra-period returns.
The essential aspects of RV can be largely attributed to Merton (1980), who introduced the notion that an estimate of the variance for a given period can be obtained by summing the intra-period squared returns from a Gaussian diffusion process. The key
powerful result is that this estimator can be made arbitrarily accurate, such that as the
interval over which the returns are calculated becomes infinitesimally small, one is able
to treat volatility as essentially observable. Early empirical work in this area, such as
that by French, Schwert, and Stambaugh (1987), typically involved the construction of
monthly volatility figures from the aggregation of squared daily returns. More recently,
as high frequency data have become more widely and cheaply available, the literature
has increasingly concentrated on the use of intraday data to produce daily RV measures.
Interest in the early work was reignited by Andersen and Bollerslev (1998b), who
criticize the use of sampling frequencies of one day, one week, one month, etc., because
the additional information held within intraday returns is lost. They build on the earlier
theory and establish the properties of consistency and efficiency for the RV estimator
when log-prices follow a semi-martingale process. Andersen et al. (2001a) utilized 5-minute intraday return series to estimate daily volatility and state that “by summing intraday returns sufficiently frequently, the
realized volatility can be made arbitrarily close to the underlying integrated volatility”.
They further state that “for practical purposes we can treat the daily volatility as
observed, which enables us to examine its properties directly”. By treating daily
volatility as an observation one has a powerful tool to analyze a broad range of issues in
financial economics, both within and beyond the realm of market microstructure.
A number of authors have used the notion of realized volatility to estimate daily realized
volatility from intraday data. Beckers (1983), Anderson (1995), Parkinson (1980) and
Rogers and Satchell (1991) all propose estimators of daily volatility based upon high,
low, opening and closing prices. Schwert (1990), Hsieh (1991), Andersen and Bollerslev
(1998a) propose efficient unconditional daily volatility estimators based upon the intraday
return series. The practical value of the concept of daily realized volatility has also been
extensively demonstrated in the literature. Andersen and Bollerslev (1997, 1998a) show
how high frequency intraday returns contain valuable information for the measurement of
volatility at the daily level. Moosa and Bollen (2001) utilize realized volatility to test the
relationships between volatility and time to maturity in futures markets. Moosa and
Bollen (2002) also employ the concept of realized volatility to test for bias in Value at
Risk estimates. Using daily data, Canina and Figlewski (1993) examine the information
content of an option’s implied volatility using realized volatility as the benchmark.
Putting to one side the evidence cited above that supports the empirical efficacy of RV-based estimates and forecasts, there are several other appealing qualities of RV. Firstly, RV has a solid theoretical foundation. Secondly, compared with complex models which are highly parametric and often difficult to calibrate (e.g. SV models), RV is non-parametric or ‘model-free’, and one therefore avoids the risk of model misspecification. Linked with this, and on a more practical level, RVs are extremely simple and straightforward to calculate: all that is needed are market prices. Thirdly, in
contrast to implied volatilities, which can only be calculated for securities that have
options written on them, the universe of securities and instruments that have sufficient
data needed for the RV calculation is much wider. Finally, the RV approach does not
require information outside the estimation interval; the volatility for a period such as 24
hours can be derived entirely from the intraday returns within that period. In this
respect, RV can be likened to estimators such as the one proposed by Parkinson (1980),
who shows that a security’s intraday high and low price can be analytically transformed
into its volatility when prices move in geometric Brownian motion (see diBartolomeo
(2002) for a good discussion). This can be contrasted with more elementary (and
commonly used) methods, such as estimating the daily volatility as the standard
deviation of daily returns over some longer period (e.g. one month).
While the theoretical results are only strictly valid as the return sampling frequency
goes to infinity, in reality prices are not observed continuously and one is therefore
constrained by the number of transactions that actually occur. For example, the
calculation of RV using 5-minute returns on an asset that usually only trades every two
days is not possible. Another important potential drawback of RV relates to market
microstructure effects, such as bid-ask bounce and non-synchronous trading that
‘contaminate’ the data. There is much literature in this area, and Owens and Steigerwald
(2005) provide a good general overview. Microstructure effects are often not apparent
at lower return frequencies (e.g. daily or weekly) but are usually very evident in
intraday data and manifest as autocorrelation within the time series of returns. Oomen
(2004) makes the interesting point that such autocorrelation does not necessarily
contradict the efficient market hypothesis, which would initially seem to imply that
autocorrelation should not persist for any significant period of time. His reason is that the serial correlation is of statistical relevance (i.e. it results from microstructure and transaction cost effects) rather than of economic relevance. Moreover, he shows that autocorrelation
causes the RV estimator to be biased. There is general agreement in the literature (see
for example, Fang 1996; Oomen 2002, etc) that these microstructure effects will have a
greater adverse effect upon the statistical performance of RV, the lower the trading
liquidity of the security and the higher the frequency at which returns are sampled.
3.3.5. Stochastic Volatility Models
Stochastic volatility is another important concept, which is used in financial economics
and mathematical finance to deal with the time-varying volatility in evaluating
derivatives, such as options. In the broadest sense, any model that deals with non-
constant volatility can be seen as a stochastic volatility model and this explanation of
stochastic volatility models is commonly accepted among market participants.
However, in academic literature, stochastic volatility models refer to models that treat
the volatility as a random process governed by state variables, such as the price of
underlying securities. The fact that the latent volatility process is not measurable with respect to the observable filtration distinguishes stochastic volatility models from ARCH models. The
Heston (1993) model is one of the most commonly used stochastic volatility models, in
which the volatility itself is used as the state variable. Compared to ARCH models, the
outcomes of stochastic volatility models are more difficult to estimate due to the latent
character of some of the variables in these models. Commonly used estimation methods
for stochastic volatility models include the Generalized Method of Moments (GMM)
approach and Monte Carlo simulation.
Autoregressive conditional heteroscedasticity (ARCH) processes are often described as
SV, but they are different in nature. The essential feature of ARCH models is that they
explicitly model the conditional variance of returns given past returns observed by the econometrician. This one-step-ahead prediction approach to volatility modeling is very
powerful, particularly in the field of risk management. It is convenient from an
econometric viewpoint as it immediately delivers the likelihood function as the product
of one-step-ahead predictive densities. In the SV approach, the predictive distribution of
returns is specified indirectly, via the structure of the model, rather than explicitly. For a
small number of SV models this predictive distribution can be calculated explicitly but,
invariably, for empirically realistic representations, it has to be computed numerically.
This move away from direct one-step-ahead predictions has some advantages. In
particular, in continuous time it is more convenient, and perhaps more natural, to model
directly the volatility of asset prices as having its own stochastic process without
worrying about the implied one-step-ahead distribution of returns recorded over an
arbitrary time interval convenient for the econometrician, such as a day or a month.
This does, however, raise some difficulties as the likelihood function for SV models is
not directly available, much to the frustration of econometricians in the late 1980s and
early 1990s. Since the mid-1980s continuous-time SV has dominated the option pricing
literature but early on econometricians struggled with the difficulties of estimating and
testing these models. Only in the 1990s were novel simulation strategies developed to
efficiently estimate SV models. These computationally intensive methods enable us,
given enough coding and computing time, to efficiently estimate a broad range of fully
parametric SV models. This has led to refinements of the models, with many earlier
tractable models being rejected from an empirical viewpoint. The resulting enriched SV
literature has brought us much closer to the empirical realities we face in financial
markets.
Academics often study the continuous-time SV models within the context of option
pricing, with one of the most well known papers in this area being that by Hull and
White (1987), which considers a diffusion volatility model with leverage effects.
Wiggins (1987) generalized this by allowing the correlation between the two Brownian motions to be nonzero. Scott (1987) considered a model in which the logarithm of the volatility is an Ornstein-Uhlenbeck (OU) process, Stein and Stein (1991) modeled the volatility itself as an OU process, and Heston (1993) proposed a SV model in which the variance follows a square-root diffusion. More variants of these models have been derived in an attempt to
capture phenomena documented in empirical findings. For instance, the fractional stochastic volatility model of Comte, Coutin and Renault (2003) was intended to reflect the long memory property of volatility, while Cont and Tankov (2004) tried to model jumps in the volatility process. Discrete-time SV models center on the Mixture-of-Distributions Hypothesis (MDH), under which returns are governed by a latent information arrival process.
The actual parameterizations of most discrete-time SV models are often based on
specific continuous-time SV models. Similar to the GARCH class of models, most
discrete-time SV models in related literature rely on the autoregressive formulation of a
latent volatility process. One of the two main types of discrete-time SV models is the lognormal stochastic autoregressive volatility model of Taylor (1986) and Harvey, Ruiz and Shephard (1994). The square-root stochastic autoregressive volatility model of Meddahi and Renault (2004) is the other main class of discrete-time SV models.
3.4. CONCLUSIONS
The present chapter is a short description of the concept of volatility and of how it can be measured. It presents the concept of volatility as defined by various authors and as accepted in the field of research, as well as the various concepts of returns and their bearing on results and inferences. In the last section, five categories of models that can be used to measure the volatility of a time series of returns on different assets in the stock market are explained. These include: historical volatility models, the ARCH/GARCH class of models, implied volatility models, realized volatility models and, lastly, stochastic volatility models. In each category, only the most popular models are explained in detail, as the list of volatility models in each category is a long one.