LONG-TERM ASSET ALLOCATION BASED ON STOCHASTIC MULTISTAGE MULTI-OBJECTIVE PORTFOLIO OPTIMIZATION
Guido Chagas (EESP – FGV and Banco Itaú Unibanco)
Pedro L. Valls Pereira (EESP – FGV and CEQEF – FGV)
ABSTRACT
Multi-Period Stochastic Optimization (MSO) offers an appealing approach to identify
optimal portfolios, particularly over longer investment horizons, because it is inherently
suited to handle uncertainty. Moreover, it provides flexibility to accommodate coherent
risk measures, market frictions and, most importantly, major stylized facts such as volatility
clustering, heavy tails, leverage effects and tail co-dependence. However, to achieve
satisfactory results, an MSO model relies on representative and arbitrage-free scenarios
of the pertaining multivariate financial series. Only after we have constructed such
scenarios can we exploit them, using suitable risk measures, to achieve robust portfolio
allocations. In this thesis, we discuss a comprehensive framework to accomplish that.
First, we construct joint scenarios based on a combined GJR-GARCH + EVT-GPD +
t-Copula approach. Then, we reduce the original scenario tree and remove arbitrage
opportunities using a method based on Optimal Discretization and Process Distances.
Lastly, using the approximated scenario tree
we perform a multi-period Mean-Variance-CVaR optimization taking into account
market frictions such as transaction costs and liquidity restrictions. The proposed
framework is particularly valuable to real applications because it handles various key
features of real markets that are often dismissed by more common optimization
approaches.
Keywords: Multiperiod Asset Allocation, Scenario Tree Generation and Reduction,
Optimal Discretization, Mean-Variance-CVaR Optimization
JEL Codes: G10, G11, G17
1 Introduction
Most, if not all, activities in financial markets pursue risk-adjusted returns.
They look for investment opportunities that offer attractive returns within acceptable
risk limits. However, this seemingly simple goal entails great challenges. First, we
need to choose a proper way to weigh risks. But what is a good risk measure? Should
we consider just one measure when comparing opportunities? Secondly, we seldom
have exact information regarding the future performance of potential investments. In
fact, we must take into account data uncertainty to make optimal decisions. Lastly, real
markets are not perfect. They have restrictions, costs and limitations, known as market
frictions, that often affect our decisions.
Furthermore, when considering a portfolio of such investment opportunities,
we face an additional challenge: their performances are usually mutually dependent.
Therefore, to find an optimal portfolio of investment opportunities (or, at least, one that
is not dominated by any other), we also need to understand their dependence structure
and use it to our advantage.
These insights suggest that, to perform a realistic portfolio optimization, we
need a model that considers the (actual) individual and joint behaviour of the
opportunities’ expected returns, takes into account suitable risk measures and
evaluates the impacts of market frictions. This thesis proposes an integrated framework
to accomplish that.
We start by modelling the individual and joint behaviour of the investment
opportunities. These opportunities usually involve instruments whose series of returns
(i) usually have particular statistical properties, (ii) are seldom independent and
identically distributed and (iii) often display nonlinear joint behaviour. The more
accurate our ability to explain these empirical features, the better we can appraise and
allocate such opportunities.
These features, which are persistent across different times, markets and
instruments, are known in the literature as stylized facts. Because of their importance,
many empirical financial studies have suggested models to explain and, more
interestingly, simulate multivariate financial time series that are capable of
accommodating such key (individual and joint) empirical statistical facts. In fact, some
essential financial activities as risk management and portfolio optimization depend on
elaborated models to account properly for these stylized facts in order to provide
reliable conclusions.
Unfortunately, as we discuss later, there are many stylized facts and, to the
best of our knowledge, most models are capable of explaining simultaneously, at most,
three of them. For instance, McNeil and Frey [213] suggested a combined approach
using GARCH models and EVT to model extreme stock returns. Rockinger and
Jondeau [268] proposed a different approach employing GARCH models and copulas
to fit multivariate stock returns. Recent works, like Huang, et al. [149] and Wang et al.
[310], combine these previous studies into a GARCH-EVT-Copula approach to study
exchange rate markets.
In this thesis, we follow a similar framework and propose a unified approach
to model four of the most significant facts: heavy tails, volatility clustering, leverage
effects and tail co-dependence, which are particularly relevant to daily financial series
(e.g., daily log returns).
Heavy tailed distributions are distributions whose tails exhibit seemingly
polynomial decay. That means that extreme (absolute) returns are more common than
those predicted by thin-tailed distributions, like the popular Gaussian distribution. For
obvious reasons, the probability of occurrence of extreme observations is essential
information for reliable risk management. Unfortunately, conventional econometric
approaches are not suited to handle extreme returns because they are relatively rare.
To do so, we need an alternative such as EVT, discussed in later sections.
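To make the polynomial-decay idea concrete, the sketch below implements the Hill estimator, a standard EVT tool for the tail index. The sample is a deterministic grid of Pareto quantiles, used purely for illustration; the estimator, not the data, is the point here.

```python
import math

def hill_estimator(sample, k):
    """Hill estimator of the tail index alpha from the k largest observations."""
    xs = sorted(sample, reverse=True)          # descending order statistics
    threshold = xs[k]                          # the (k+1)-th largest value
    # average log-excess of the k largest observations over the threshold
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess

# Deterministic Pareto(alpha = 3) quantile grid: heavy-tailed by construction
alpha = 3.0
sample = [(1 - (i + 0.5) / 5000) ** (-1 / alpha) for i in range(5000)]
estimate = hill_estimator(sample, k=500)       # close to the true alpha = 3
```

Smaller estimates indicate heavier tails; thin-tailed data such as Gaussian samples have no finite tail index in this sense.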
Volatility clustering is another key property. Many statistical tools and
techniques, like EVT for instance, rely on the assumption that series are independent
and identically distributed (i.i.d.). Although daily raw returns do not show evidence of
significant serial correlation, daily absolute or squared returns clearly exhibit strong
serial correlation. In other words, large (absolute) returns are often followed by large
returns while small (absolute) returns are typically followed by small returns, creating
clusters of volatility. This empirical fact plainly invalidates the assumption of i.i.d. series.
Conveniently, GARCH or SV models allow us to handle these volatility clusters and
use (approximately) i.i.d. filtered standardized residuals to perform subsequent
analyses and modelling.
Leverage effects are also particularly relevant because they explain how “bad
news” (i.e., negative shocks) has a greater impact on volatility than “good news” (i.e.,
positive shocks). This behaviour can be accommodated using, for instance,
asymmetric extensions of the GARCH models such as EGARCH, GJR-GARCH or
TGARCH. These models are rather popular because they can reasonably explain the
volatility clustering, the leverage effects and, to some extent, the heavy tails observed
in financial daily series.
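The GJR-GARCH(1,1) variance recursion just mentioned can be sketched in a few lines. The parameter values below are illustrative, not estimated from data; the indicator term is what lets negative returns raise next-period variance more than positive ones.

```python
def gjr_garch_filter(returns, omega, alpha, gamma, beta):
    """GJR-GARCH(1,1) recursion: returns (conditional variances, std. residuals).

    sigma2_t = omega + (alpha + gamma * 1{r_{t-1} < 0}) * r_{t-1}^2 + beta * sigma2_{t-1}
    """
    # start at the unconditional variance (the leverage term counts half)
    sigma2 = omega / (1.0 - alpha - gamma / 2.0 - beta)
    variances, std_resid = [], []
    for r in returns:
        variances.append(sigma2)
        std_resid.append(r / sigma2 ** 0.5)   # approximately i.i.d. if the model fits
        leverage = gamma if r < 0 else 0.0    # "bad news" raises next-period variance
        sigma2 = omega + (alpha + leverage) * r * r + beta * sigma2
    return variances, std_resid

# Illustrative (not estimated) parameters on a tiny daily log-return series
variances, z = gjr_garch_filter([0.01, -0.03, 0.02, -0.01],
                                omega=1e-6, alpha=0.05, gamma=0.10, beta=0.85)
```

Standardizing each return by the filtered conditional volatility yields the approximately i.i.d. residuals used in the subsequent EVT and copula steps.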
Lastly, tail co-dependence is likely one of the most important stylized facts
because it describes how dependence among instruments and markets is stronger
(in absolute terms) during turbulent periods. It indicates how cross-correlations change
during these periods leading to increasing risk (or, similarly, decreasing diversification).
Although some multivariate GARCH models like DCC and TVC are able to describe
partially the joint behaviour of financial series, copulas offer a more flexible alternative
because they detach the marginal modelling from the dependence structure modelling.
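This separation of marginals from dependence can be illustrated with a minimal bivariate Gaussian copula sampler; a t copula would additionally divide the normal draws by an independent chi-square mixing variable, which is what produces tail dependence. Everything below is an illustrative stdlib-only sketch, not the fitted model used later.

```python
import math, random

def gaussian_copula_pair(rho, rng):
    """One draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)  # Cholesky step
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))        # standard normal CDF
    return cdf(z1), cdf(z2)                                           # uniforms on (0, 1)

rng = random.Random(42)
pairs = [gaussian_copula_pair(0.8, rng) for _ in range(2000)]
# The uniforms can now be pushed through any marginal quantile functions
# (e.g., semi-parametric GPD-tail marginals) without changing the dependence.
```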
Once we have modelled these stylized facts, we use the calibrated model to
simulate a sufficiently large number of paths and construct a discrete scenario tree that
appropriately describes the expected behaviour of the daily log returns. Compared to
other common scenario construction approaches, the method suggested in this thesis
provides a better fit to the individual and joint dynamics of the financial series. However,
the generated scenario tree is neither free of arbitrage opportunities nor tractable in
real multi-period stochastic portfolio optimizations. Therefore, our next step is to
construct a smaller but sufficiently similar (in some proper sense discussed later)
scenario tree that is both manageable and arbitrage-free. To accomplish that, we
extend the optimal discretization method based on process distances, proposed by
Pflug [243] and further developed by Pflug and Pichler [245], [246] and Kovacevic and
Pichler [186].
The process distance is a generalization of the Wasserstein distance. While
the latter is a distance function between probability distributions, the former is a
distance functional between stochastic processes. Recalling that scenario trees are
simply filtered discrete probability spaces that represent the stochastic process model
of the simulated data, we can use process distances to evaluate how similar (i.e., how
close) two scenario trees are. Moreover, we can combine them with discretization
methods to approximate the original scenario trees by smaller, more manageable
scenario trees.
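In one dimension, the first-order Wasserstein distance between two discrete distributions reduces to the area between their CDFs, which the short sketch below computes. The process distance used later adds the filtration structure on top of this idea.

```python
def wasserstein_1d(points_p, probs_p, points_q, probs_q):
    """Wasserstein-1 distance between two discrete 1-D distributions.

    Equals the integral of |F_P(x) - F_Q(x)| over the real line.
    """
    events = sorted(set(points_p) | set(points_q))
    cdf_p = cdf_q = 0.0
    distance = 0.0
    for x, x_next in zip(events, events[1:]):
        cdf_p += sum(p for pt, p in zip(points_p, probs_p) if pt == x)
        cdf_q += sum(q for pt, q in zip(points_q, probs_q) if pt == x)
        distance += abs(cdf_p - cdf_q) * (x_next - x)
    return distance

# All mass at 0 vs all mass at 1: the mass must travel a distance of 1
assert wasserstein_1d([0.0], [1.0], [1.0], [1.0]) == 1.0
```

The nested process distance additionally penalizes transport plans that violate the tree's filtration, which is why two trees with similar final distributions but different branching structures can still be far apart.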
These distances and their variants, which are inherently related to Optimal
Transport Theory, have been widely used in engineering applications, as pointed out
by Villani [307]. Conversely, finance applications have been scarce because usual
discretization methods based on process distances are not arbitrage-free. Therefore,
we describe in the next sections how we can extend the optimal discretization
approach to exclude arbitrage opportunities in order to construct a smaller but
reasonably similar scenario tree. We also briefly discuss other possibilities to construct
scenario trees for stochastic optimization such as conditional sampling methods,
moment-matching approaches and scenario clustering techniques, and point out why
we should exercise caution when using them in financial applications.
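As an illustration of why arbitrage removal matters, the sketch below implements only a simple per-asset necessary condition for one node of a one-period tree with a riskless asset: if every child scenario of some asset beats (or is beaten by) the riskless return, the node admits an arbitrage. It is a sketch of the idea, not the full condition: for several assets jointly, no-arbitrage requires a strictly positive state-price vector, which is what the optimal discretization extension discussed later enforces.

```python
def has_single_asset_arbitrage(gross_returns, riskless):
    """Necessary no-arbitrage check for one node of a one-period scenario tree.

    If every child-scenario gross return of some asset lies strictly above (or
    strictly below) the riskless gross return, then borrowing to buy the asset
    (or shorting it to lend) is a riskless profit: an arbitrage.
    """
    for asset in gross_returns:               # one list of child returns per asset
        if min(asset) > riskless or max(asset) < riskless:
            return True
    return False

# Two assets, three child scenarios each; riskless gross return 1.01
node = [[0.97, 1.02, 1.08],   # straddles 1.01 -> passes the check
        [1.02, 1.03, 1.05]]   # always beats 1.01 -> arbitrage
```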
Lastly, once we have constructed the approximated scenario tree, we perform
our concluding step: a multi-period multi-objective stochastic portfolio optimization. It
is particularly important because there is a wide literature regarding Multi-Period
Stochastic Optimization (MSO) and Multi-Objective Optimization (MOO) but very little
research combining them.
MSO has been extensively applied to capacity planning, transportation,
logistics, and long-term asset allocation over the last decades, as pointed out by
Wallace and Ziemba [309]. In these applications, future decisions depend on both
preceding decisions and observations. Moreover, any optimal solution (i.e., optimal
sequence of decisions) must take into account uncertainty. For financial applications
in particular, Multi-Period Stochastic Programming (MSP) provides a sound alternative
to handle such uncertainty, provided we can estimate the expected joint probability
distribution of the returns using scenario trees. Besides, MSP also offers flexibility to
accommodate market frictions and elaborated risk constraints.
MOO, on the other hand, has been a little less popular. Although intuitively
appealing, applications with multiple objectives have a formidable challenge: there is
no unique optimal solution. Instead, there are many (possibly infinitely many) Pareto optimal
solutions. The more complex and numerous the objective functions, the more difficult
it is to identify the non-dominated solutions. As discussed by Marler and Arora [210],
Ehrgott [87], and Abraham, Jain and Goldberg [1], there are various techniques to
perform a MOO such as weighted sum approaches, epsilon-constraint methods, goal
attainment techniques and genetic algorithms, which we discuss in the literature
review. Among them, the epsilon-constraint method is one of the few that can be
efficiently combined with MSP, provided we have a small number of objectives.
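The epsilon-constraint idea can be illustrated with a deliberately tiny example: maximize expected return subject to a variance cap ε, sweeping ε to trace Pareto points. The two-asset scenarios and the brute-force weight grid below are illustrative stand-ins for the full stochastic programming formulation.

```python
def epsilon_constraint_frontier(scenarios, epsilons, steps=100):
    """Trace Pareto points: maximise mean return s.t. portfolio variance <= eps.

    scenarios: equiprobable (r1, r2) scenario returns for two assets.
    Brute-force over a long-only, fully invested weight grid w in [0, 1].
    """
    frontier = []
    for eps in epsilons:
        best = None
        for i in range(steps + 1):
            w = i / steps
            rets = [w * r1 + (1 - w) * r2 for r1, r2 in scenarios]
            mean = sum(rets) / len(rets)
            var = sum((r - mean) ** 2 for r in rets) / len(rets)
            if var <= eps and (best is None or mean > best[1]):
                best = (w, mean, var)          # (weight on asset 1, return, variance)
        frontier.append(best)
    return frontier

# Asset 1: high mean, high variance; asset 2: low mean, low variance
scen = [(0.10, 0.02), (-0.04, 0.01), (0.12, 0.03), (-0.02, 0.00)]
frontier = epsilon_constraint_frontier(scen, epsilons=[0.0004, 0.002, 0.01])
```

Tightening the variance cap pushes the optimum toward the defensive asset; relaxing it recovers the unconstrained return maximizer. In the full model the same scalarization runs over the scenario tree with CVaR as a second epsilon-constrained objective.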
Thus, in this work we combine an MSP approach with the epsilon-constraint
method to perform our multi-period multi-objective stochastic portfolio optimization.
The objectives are simple: (i) maximize expected return, (ii) minimize variance and (iii)
minimize conditional value-at-risk. While the first two objectives are nearly mandatory
in most practical portfolio optimization problems, the last one is chosen to manage
market risk more accurately in real applications. Although the objectives’ selection is
arbitrary, it reflects some of the most common objectives observed in real asset
allocation applications, as remarked by Roman, Darby-Dowman and Mitra [269].
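For a finite set of equiprobable scenarios, conditional value-at-risk has a simple discrete form: the average of the worst (1 − α) fraction of losses. The sketch below uses that definition directly, with the tail size rounded up so at least one scenario is always included; it is a simplified illustration, not the Rockafellar–Uryasev linear-programming reformulation typically embedded in the optimization itself.

```python
import math

def cvar(losses, alpha=0.95):
    """Average of the worst (1 - alpha) fraction of equiprobable scenario losses."""
    ordered = sorted(losses, reverse=True)     # worst losses first
    # round before ceil so float fuzz (e.g. 20 * 0.05 = 1.0000000000000009) cannot
    # spuriously enlarge the tail; keep at least one scenario
    tail = max(1, math.ceil(round(len(losses) * (1.0 - alpha), 9)))
    return sum(ordered[:tail]) / tail

losses = [0.01 * i for i in range(1, 21)]      # 20 scenarios: losses 0.01 .. 0.20
tail_loss = cvar(losses, alpha=0.95)           # the single worst scenario, 0.20
```

Unlike variance, this measure looks only at the loss tail, which is why it complements the first two objectives rather than duplicating them.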
This thesis’ contributions to the literature are twofold: (i) it constructs arbitrage-free
scenario trees that, unlike in engineering applications (e.g., supply chain
management, inventory control, energy production), are a fundamental cornerstone of
financial applications and (ii) it provides an alternative framework for multi-period
multi-objective stochastic portfolio optimization. Although quite useful for real financial
applications, both subjects have been only mildly explored in the literature.
The remainder of this work is organized as follows. Section 2 presents a brief
review of the related literature. Section 3 discusses the theoretical background related
to the scenario tree generation (i.e., GARCH models, EVT and copulas), the scenario
tree approximation (i.e., optimal discretization using process distances) and finally the
multi-period multi-objective portfolio optimization (i.e., Mean-Variance-CVaR
optimization). Section 4 describes how to integrate these methods and theories into a
unified framework. Finally, Section 5 presents some empirical results and Section 6
concludes, discussing possible directions for further research.
2 Literature Review
Many peculiar statistical features of financial time series have been
comprehensively discussed in the literature. As stressed by Campbell, Lo, and
MacKinlay [195], returns are hardly random walks. This belief dates back to Mandelbrot
[202] [203], who was among the first to suggest that financial series are seldom
Gaussian, often displaying higher kurtosis and more peaked distributions. In fact,
financial returns exhibit a set of important empirical properties, known as stylized facts,
that are observed in different markets and instruments and that influence various
financial decision problems.
More precisely, Rydberg [274], Cont [58] and McNeil, Frey and Embrechts
[214] point out that, among the most studied stylized facts regarding daily financial time
series, we can underline:
(i) Presence of heavy tails
(ii) Presence of volatility clustering
(iii) Asymmetric distribution of raw returns
(iv) Lack of significant serial correlation in raw returns
(v) Strong serial correlation in squared and absolute returns
(vi) Negative correlation between volatility and returns (known as leverage
effect)
(vii) Serial correlations of powers of absolute returns are strongest at power
one (known as Taylor effect)
(viii) Considerable evidence of cross-correlation in squared and absolute
returns
(ix) Presence of tail co-dependence between series of raw returns
(x) Time-varying cross-correlations
(xi) Long-range dependence
As stressed by Cont [58], it is somewhat difficult, if not impossible, to model all
these features together. Keeping that in mind, in this brief review we focus mostly on
four key stylized facts: heavy tails, volatility clustering, leverage effects and tail
co-dependence, which we will accommodate in the framework proposed later. A more
detailed review of stylized facts is offered by Campbell, Lo and MacKinlay [42], Taylor
[295] [296], and Tsay [301].
Mandelbrot [202] [203] was one of the first who claimed that commodity and
stock extreme returns are rather more common than suggested by Gaussian
distributions. As an alternative, the author suggested stable Paretian distributions (i.e.,
distributions with tail index α < 2) to accommodate series with such extreme
observations and preserve the assumption of independent and identically distributed
returns. Fama [97] [98] also supported this hypothesis pointing out that further studies
would be required to use it properly.
Although Mandelbrot and Fama correctly suggested that the Gaussian
distribution was inadequate to model financial returns, the proposed stable distribution
presented some disadvantages such as infinite variance (since α < 2). Although there
are techniques to manage this family of distributions as discussed for instance by
Mittnik, Paolella and Rachev [222] [223] and Taleb [292], many studies like Hsu et al.
[147], Hagerman [122], Perry [239], Ghose and Kroner [110] and Campbell, Lo and
MacKinlay [42] advocate that there are more suitable approaches, with less
counterfactual implications (e.g., variance does not appear to increase as sample size
increases, long-horizon financial returns seem to follow Gaussian distributions) to
accomplish this objective.
Consequently, alternative approaches were proposed to handle extreme
returns properly. Blattberg and Gonedes [42] suggested the Student’s t-distribution as
a better alternative to capture the heavy-tail behaviour. Kon [180] questioned the
assumption that financial daily returns are stationary and pointed out that mixtures of
normal distributions are better suited than Student t distributions to describe such
leptokurtic distributions. Barndorff-Nielsen [16] introduced the flexible Generalized
Hyperbolic (GH) distributions whose particular cases were later applied to model
heavy-tails and, sometimes, asymmetric gains and losses. For example, Barndorff-
Nielsen [17] [18] recommended the Normal Inverse Gaussian (NIG) distribution.
Eberlein and Keller [85], and Eberlein, Keller and Prause [86] proposed a solution
based on hyperbolic distributions that was further generalized by Prause [250]. Hansen
[125] suggested a less common special case of the GH distribution: a skewed Student
t-distribution. These later studies appear to support Cont [58], who inferred that any
distribution capable of properly modelling the distribution of financial returns requires
at least four parameters: location, scale, shape and asymmetry.
All approaches mentioned previously focused on modelling the whole
distribution of returns. Other interesting methods available to handle extreme
observations, which have been extensively applied in hydrology, meteorology and
insurance, comprise modelling the distribution tail’s behaviour separately. These
methods are based on Extreme Value Theory (EVT). Some of the first applications of
EVT in finance were proposed by Jansen and Vries [154], Koedijk et al. [178][179],
Longin [197] [198], Tsay [300], and Dacorogna et al. [59]. As underlined by Cont [58],
various works such as Longin [197], Lux [200] and Hauksson et al. [130] indicate that
a Fréchet distribution with shape parameter 0.2 ≤ ξ < 0.4 (i.e., with tail index 2.5 < α ≤ 5)
seems suitable to model the distribution tails of daily financial series. Cont [58] also
points out that such shape parameter range suggests that the variance is indeed finite
and tails are not as heavy as those achieved by stable Paretian distributions. In the
next section we briefly discuss these methods and refer to Embrechts, Klüppelberg
and Mikosch [90], Coles [53], McNeil, Frey and Embrechts [214], and Finkenstädt
and Holger [102] for comprehensive reviews and applications of EVT.
Another major stylized fact in financial series is volatility clustering. Although
Mandelbrot [202] first remarked that large (small) returns tend to be followed by large
(small) returns, it was Engle [93] who introduced the Autoregressive Conditional
Heteroscedastic (ARCH) model, one of the first serious approaches trying to explain
this empirical property. Soon after, Bollerslev [34] proposed a more parsimonious
extension: the Generalized Autoregressive Conditional Heteroscedastic (GARCH)
model, which became a keystone for real-world volatility modelling problems.
GARCH models provided an appealing alternative to earlier approaches
because they were able to better explain relevant properties of empirical daily returns.
First, daily returns do not appear to be independent and identically distributed.
Secondly, squared returns appear to be time varying and exhibit strong evidence of
serial correlation. Lastly, extreme returns appear to cluster together, independently of
their sign.
Furthermore, many sophisticated extensions of the original GARCH have been
proposed over the last decades. For instance, there are various nonlinear
specifications such as the Nonlinear ARCH (NARCH) model developed by Engle and
Ng [96], the Asymmetric Power GARCH (APGARCH) model introduced by Ding et al.
[77], and some variations of the Smooth Transition GARCH (STGARCH) model
suggested by Hagerud [123], Gonzalez-Rivera [115] and Lubrano [199].
Other extensions focus on substantial evidence of long memory (i.e., very slow
decay) in the empirical serial correlations of absolute or squared daily returns as
discussed by Taylor [296], Ding et al. [77] and Ding and Granger [76]. The Fractionally
Integrated GARCH (FIGARCH) proposed by Baillie, Bollerslev and Mikkelsen [12] and
the Long Memory GARCH (LMGARCH) suggested by Karanasos et al. [167] are
examples.
Additionally, fitting traditional GARCH models to series with structural changes
may lead to potential misspecifications, as discussed by Hamilton and Susmel [124],
Cai [41], and Mikosch and Stărică [220]. Consequently, regime-switching models, such
as the Markov-Switching GARCH (MS-GARCH) models suggested by Gray [117],
Dueker [80] and Klaassen [172], were developed to take into account potential
structural breaks.
Besides GARCH approaches, there are alternatives based on stochastic
volatility such as Taylor [294] and Harvey and Shephard [128], and more recently, on
realized volatility like Andersen et al. [1] [6], and Barndorff-Nielsen and Shephard [19]
that also provide invaluable tools to properly explain some important stylized facts
discussed in the literature. We do not discuss them here and refer to Bauwens, Hafner
and Laurent [20], Andersen et al. [1] and Tsay [301] for further details regarding these
alternatives.
Despite these various enhancements and alternatives, the basic
univariate GARCH(1,1) formulation is still the most used according to Bauwens, Hafner
and Laurent [20], and McNeil, Frey and Embrechts [214]. The reasons are twofold:
they are easier to fit and they provide reasonable estimates, particularly when
combined with complementary techniques like extreme value theory and copulas. We
will discuss some details of this particular specification in the next section and refer to
Andersen et al. [7], Tsay [301], and Bauwens, Hafner and Laurent [20] for a
comprehensive review.
Moreover, straightforward extensions of the GARCH(1,1) formulation are also
commonly used to address other additional statistical features of daily financial series.
In particular, some of these modifications lead to asymmetric models that can
accommodate another key stylized fact: the leverage effect.
This feature was first discussed by Black [31] and Christie [52] who noticed
that declining asset prices (i.e., negative log returns) are often followed by increasing
volatility. Further research on this asymmetric response to shocks led, for example,
to the Exponential GARCH (EGARCH) model by Nelson [232], the GJR-GARCH model
by Glosten, Jagannathan and Runkle [111] and the Threshold GARCH (TGARCH) by
Rabemananjara and Zakoian [252] and Zakoian [312].
There are some differences among these extensions. For instance, moment
conditions for the EGARCH are less strict than those for GJR-GARCH and TGARCH,
and GJR-GARCH models the conditional variance while TGARCH focuses on the
conditional standard deviation. Nonetheless, to the best of our knowledge, there are
not many studies comparing their results. Engle and Ng [96] is one exception that
suggests that the GJR-GARCH model appears more appropriate to explain extreme
shocks than the EGARCH model. Another exception can be found in Tsay [301] who
provides a few examples suggesting that no model is statistically superior. Regarding
practical applications, parsimonious EGARCH(1,1) and GJR-GARCH(1,1)
specifications are often preferred according to Andersen et al. [7], and Bauwens, Hafner
and Laurent [20].
To this point we have discussed only univariate approaches but,
unsurprisingly, various multivariate specifications were proposed over the last
decades. Focusing on multivariate GARCH models for instance, Bollerslev et al. [37]
introduced the Diagonal VECH (DVECH) model. Although quite general, it has limited
practical applications because it requires too many parameters to be estimated. Baba,
Engle, Kraft and Kroner [11] and Engle and Kroner [95] suggested the BEKK model,
which further restricted the DVECH specification but still faced limitations with large
dimensions. To address this curse of dimensionality, Bollerslev [36] proposed the
Constant Conditional Correlation (CCC) model, whose strong constant-correlation
assumption was later relaxed in the Time-Varying Correlation (TVC) model of Tse and
Tsui [302] and the Dynamic Conditional Correlation (DCC) model suggested by Engle
[94]. More recent specifications of these dynamic correlation models, such as
Cappiello, Engle and Sheppard [44] and Hafner and Franses [121], also address
additional features of financial series like leverage effects and explain elaborate
joint behaviours of financial markets such as contagion and volatility spillover effects.
For further details, Bauwens, Laurent and Rombouts [21], and Silvennoinen and
Teräsvirta [281] provide an inclusive review of multivariate GARCH models.
Despite the appealing properties of multivariate GARCH models, some
works such as McNeil, Frey, and Embrechts [214] and Patton [238] suggest that
copulas, which were introduced by Sklar [282] and later promoted by Joe [160] and
Nelsen [231], may offer a compelling alternative because they separate the
dependence structure from the univariate marginal distributions allowing more
flexibility (a key aspect for many practical applications). However, depending on the
required degree of flexibility, accuracy and tractability, copula models may range from
straightforward static combinations of different univariate distributions, such as those
presented by Patton [235], to much more elaborated applications comprising dynamic
multivariate processes like those discussed in Patton [236].
Among the most common copula types we have the (i) elliptical copulas like
the Gaussian and t copulas, which are readily extended to multivariate settings but can
explain only simple dependence structures; (ii) Archimedean copulas such as Frank,
Gumbel and Clayton copulas that, conversely, are difficult to apply to high dimensional
problems but offer greater flexibility to explain the joint dependence; (iii) vine copulas,
which handle high dimensional dependence by combining flexible bivariate copula
families; and (iv) mixture copulas that are able to accommodate quite sophisticated
dependence structures. We recommend Embrechts, Lindskog and McNeil [91] for a
good review of elliptical and Archimedean families, Kurowicka and Joe [189] for a
detailed discussion of vine copulas, and Joe [161] for a comprehensive review,
including some mixture copula models.
Even though elaborated models like vine and mixture copulas provide greater
flexibility, practical financial applications (i.e., applications that comprise many
instruments) often favour more tractable approaches like elliptical copulas. Moreover,
tail co-dependence is often the most pressing concern in these applications. For that
reason, t copulas and their extensions are usually the preferred choice. See for
example Demarta and McNeil [64].
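One reason for this preference is worth spelling out: the Gaussian copula has zero tail dependence, while the bivariate t copula with ν degrees of freedom and correlation ρ has coefficient of (upper and lower) tail dependence

```latex
\lambda = 2\, t_{\nu+1}\!\left( -\sqrt{\frac{(\nu+1)(1-\rho)}{1+\rho}} \right),
```

where t_{ν+1} denotes the univariate Student t distribution function with ν + 1 degrees of freedom (see Demarta and McNeil [64]). This λ is strictly positive for any finite ν and vanishes as ν → ∞, recovering the Gaussian case.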
Additionally, many recent financial applications combine copulas with other
models to achieve better results. Important examples are GARCH-Copula
models, discussed by Rockinger and Jondeau [162][268], Huang et al. [150], and
Riccetti [263][264]. These models can be fitted using a two-step estimation approach,
conceptually similar to that used by McNeil and Frey [213] to combine EVT and
GARCH models.
As we elaborate in the next sections, we can integrate these two-step
estimation approaches into a single three-step GJR-GARCH + EVT (GPD) + t-Copula
framework to simulate multivariate trajectories (also known as paths) that have a
common starting point. This simulated multivariate scenario fan, which is a particular
case of a scenario tree, not only captures the uncertainty of the daily log returns but
also accommodates reasonably well the four key stylized facts mentioned previously:
heavy tails, volatility clustering, leverage effects and tail co-dependence.
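The three-step pipeline can be summarized in pseudocode; every helper name below is a placeholder for the corresponding step described in the text, not an actual implementation.

```python
def build_scenario_fan(log_returns, horizon, n_paths):
    """Pseudocode skeleton of the GJR-GARCH + EVT (GPD) + t-Copula pipeline.

    All helper names are placeholders for the steps described in the text.
    """
    # Step 1: fit a GJR-GARCH(1,1) to each series and keep the
    # (approximately i.i.d.) standardized residuals.
    garch_fits = [fit_gjr_garch(series) for series in log_returns]
    residuals = [fit.std_residuals for fit in garch_fits]

    # Step 2: model each residual semi-parametrically -- empirical body,
    # GPD tails above and below high thresholds (peaks over threshold).
    marginals = [fit_gpd_tails(z) for z in residuals]

    # Step 3: map residuals to uniforms via their marginals and fit a
    # t copula to capture the tail co-dependence.
    copula = fit_t_copula([m.cdf(z) for m, z in zip(marginals, residuals)])

    # Simulation: draw dependent uniforms, invert the marginals, and push the
    # innovations through each GARCH recursion; all paths share today's state,
    # so the result is a scenario fan.
    paths = []
    for _ in range(n_paths):
        us = copula.sample(horizon)
        innovations = [m.ppf(u) for m, u in zip(marginals, us)]
        paths.append([f.simulate(z) for f, z in zip(garch_fits, innovations)])
    return paths
```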
Similar works applied to exchange rates and commodities can be found in Wang,
Jin and Zhou [310] and Tang et al. [293]. Besides, other studies that jointly explain at
least three stylized facts can be found in Xiong and Idzorek [311], which accounts for
heavy tails, skewness, and asymmetric dependence, and Hu and Kercheval [148],
which proposes a GHD approach to explain asymmetric dependence, volatility
clustering and (partially) heavy tails. There are also a few elaborated works that
capture four stylized facts. Patton [236], and Viebig and Poddig [306] are examples.
So far, we have reviewed combined methods based exclusively on historical daily
log returns. Nevertheless, there are several alternative methods available to simulate
multivariate scenarios that use additional information such as economic variables or
forward-looking indicators. Boender [33], Cariño et al. [45], and Cariño, Myers and
Ziemba [46], for instance, construct scenarios using Vector Autoregressive (VAR)
models. Mulvey and Vladimirou [228], on the other hand, generate scenarios using
decomposition methods based on Principal Component Analysis (PCA). Mulvey [225],
and Mulvey and Thorlacius [227] employ dynamic stochastic simulation to forecast
both economic series and asset prices. For additional information regarding these
different scenario generation methods, we refer to Ziemba and Mulvey [314], and
Kouwenberg and Zenios [185].
These alternative methods based on economic variables or forward-looking
indicators are vastly used in practical applications. However, they are not particularly
tailored to accommodate the stylized facts mentioned earlier, which are especially
relevant to portfolio optimization and risk management applications. For that reason,
in this thesis we simulate the multivariate scenarios using the suggested GJR-GARCH
+ EVT (GPD) + t-Copula three-step approach, which can properly describe the
multivariate underlying stochastic process of the daily log returns (at the expense of
additional information that could prove useful in some cases).
This combined approach, however, exhibits two major limitations: the
generated multivariate scenario tree (i) is often too large (i.e., computationally
demanding) to be applied in practical finance applications and (ii) is not free of arbitrage
opportunities. Therefore, we require additional steps to handle these limitations.
Regarding the first shortcoming, numerous techniques have been proposed to
approximate the original scenario trees by more tractable ones, as highlighted by
Dupačová, Consigli and Wallace [82] and Kouwenberg [184]. Dupačová, Consigli and
Wallace [82], Høyland and Wallace [144] and Prékopa [251] also point out that the
features considered by each particular stochastic optimization problem define which
properties should be contemplated by a good scenario tree approximation (e.g., a
traditional Mean-Variance optimization only requires matching the first two moments
of the original multivariate distribution). Therefore, true convergence towards the
original scenario tree (based on the true underlying stochastic process) may be desired
but is not mandatory to achieve optimal solutions for Stochastic Programming
problems.
These approximation techniques seek to generate tractable scenario trees and
guarantee that the different approximated scenario trees lead to similar, stable and
optimal objective values, with small approximation errors. Following the reviews in
Dupačová, Consigli and Wallace [82], and Heitsch and Römisch [133], we discuss
some common techniques used in finance, energy and logistics problems, which are
usually grouped into four categories: sampling algorithms, moment-matching
approaches, clustering techniques and optimal discretization methods.
Sampling algorithms are likely the first approaches used to approximate
stochastic processes. They comprise different techniques that basically draw possible
values from a given distribution or stochastic process for each node of a predefined
scenario tree. Among these sampling techniques, bootstrap sampling, random
sampling, importance sampling, rejection sampling, conditional sampling and Markov
Chain Monte Carlo (MCMC) sampling are the most used in practical applications.
Traditional or adjusted random sampling, also known as Monte Carlo
sampling, is the simplest, and often the least efficient, of these methods. Essentially,
they draw values randomly and associate them to probabilities using the predefined
distribution. Similarly, bootstrapping sampling methods are just random sampling
methods, with replacement, that draw values from the sample itself. Cariño et al. [45],
Consigli [54], Consigli and Dempster [55], and Consigli, Ianquita and Moriggia [56],
for instance, use these methods to construct their scenario trees.
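A minimal sketch of plain bootstrap scenario generation: i.i.d. resampling with replacement from the historical sample. The works cited above use considerably richer constructions, and a block bootstrap would be needed to preserve serial dependence.

```python
import random

def bootstrap_scenarios(returns, horizon, n_scenarios, seed=0):
    """Build scenario paths by drawing historical returns with replacement.
    Plain i.i.d. bootstrap: any serial dependence in the data is lost."""
    rng = random.Random(seed)
    return [[rng.choice(returns) for _ in range(horizon)]
            for _ in range(n_scenarios)]

# hypothetical historical daily returns
hist = [0.01, -0.02, 0.005, 0.03, -0.01]
paths = bootstrap_scenarios(hist, horizon=3, n_scenarios=4)
```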
Despite their popularity, random sampling techniques usually require too many
draws to be reliable. Therefore, we often consider other, more efficient, techniques
such as (sequential) importance sampling methods. These methods restrict (or give
more importance to) the random draws to relevant regions of alternative distributions
that are easier to handle. Dupačová, Consigli and Wallace [82], Dempster and
Thompson [71], and Dempster [67] offer interesting applications of these methods.
Moreover, Dupačová, Consigli and Wallace [82] also combine the sequential
importance sampling with the moment matching and clustering approaches (described
later) to achieve more reliable scenario approximations. Likewise, rejection sampling
follows an approach similar to importance sampling but trades simplicity for efficiency,
as discussed by Pflug and Pichler [247].
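The variance-reduction idea behind importance sampling can be illustrated with a toy tail-probability estimate: sampling from a proposal shifted into the region of interest and reweighting by the likelihood ratio. The shift value below is a hypothetical choice.

```python
import math
import random

def tail_prob_importance(threshold, n, shift, seed=1):
    """Estimate P(Z < threshold) for Z ~ N(0, 1) by drawing from the shifted
    proposal N(shift, 1) and reweighting each draw with the likelihood ratio
    phi(y) / phi(y - shift)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = rng.gauss(shift, 1.0)
        if y < threshold:
            # likelihood ratio of target N(0,1) over proposal N(shift,1)
            total += math.exp(-0.5 * y * y + 0.5 * (y - shift) ** 2)
    return total / n

# centre the proposal on the rare region: estimate P(Z < -3) ~ 0.00135
est = tail_prob_importance(-3.0, 20000, shift=-3.0)
```

With the proposal centred on the rare event, roughly half the draws land in the region of interest, whereas naive Monte Carlo would waste almost all of its draws.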
One important drawback of most of the previous sampling techniques is that
current draws do not depend on previous draws. Consequently, they preclude any type
of temporal dependence, which is quite relevant for numerous financial applications.
Conditional sampling methods remove this shortcoming by keeping track of
previously drawn values. The popular Sample Average Approximation method is an
example of conditional sampling extensively discussed by Kleywegt, Shapiro and
Homem-de-Mello [176] and Shapiro [278]. More recently, MCMC algorithms (such as
Gibbs sampling and Metropolis-Hastings-Green) have been mixed with some of the
previously mentioned methods like importance sampling, leading to advances in
multidimensional optimization problems. We refer to Ekin, Polson and Soyer [88], and
Parpas et al. [234] for recent examples.
Unfortunately, although sampling methods exhibit very nice asymptotic
properties, large scenario trees are often intractable computationally, particularly in
high dimensional spaces (except for some very sophisticated MCMC sampling
models), hindering most practical applications. Likewise, small scenario trees based
on sampling methods usually lead to poor accuracy (i.e., large approximation errors
and instability). For those reasons, alternative approximation techniques are often
considered in practice.
Moment-matching (or, more generally, property-matching) approaches are
among these alternatives. As highlighted by Dupačová, Consigli and Wallace [82],
these methods do not require full knowledge of the true underlying stochastic process
and, consequently, are particularly convenient when we have limited information about
its probability distribution. Instead, property-matching approaches need only a few
statistical properties of the distribution (e.g., moments, co-moments, percentiles,
principal components, serial correlations) to simulate an approximated scenario tree,
i.e., a discrete probability distribution that is consistent with the given properties.
Besides requiring less information (i.e., just some statistical properties of the
underlying stochastic process), moment-matching approaches can match most
properties using relatively few scenarios when compared to traditional sampling
methods. They can also easily accommodate more elaborate dependency structures
such as serial correlations.
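A deliberately minimal property-matching sketch: an affine rescaling that makes an equally weighted scenario set match a target mean and standard deviation exactly. Full moment-matching methods also target skewness, kurtosis and cross-correlations, typically via nonlinear optimization.

```python
def match_two_moments(scenarios, target_mean, target_std):
    """Affinely rescale equally weighted scenario values so the sample mean
    and (population) standard deviation hit the targets exactly."""
    n = len(scenarios)
    m = sum(scenarios) / n
    s = (sum((x - m) ** 2 for x in scenarios) / n) ** 0.5
    return [target_mean + target_std * (x - m) / s for x in scenarios]

# hypothetical raw scenarios rescaled to mean 1% and std 2%
adj = match_two_moments([0.02, -0.01, 0.05, -0.03], target_mean=0.01, target_std=0.02)
```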
However, they also present relevant shortcomings. First, as discussed by
Dupačová, Consigli and Wallace [82], all relevant statistical properties for the
optimization problem are assumed to be known before the optimization problem is
solved, but that might not always be true. Second, Hochreiter and Pflug [141], and
Pflug and Pichler [246] show that completely distinct approximated scenario trees (i.e.,
quite different stochastic processes) might lead to the desired statistical properties,
possibly leading to unreliable solutions. Lastly, Kaut and Wallace [168] emphasise
that, differently from sampling methods, property-matching approaches do not
guarantee convergence by simply increasing the number of simulated scenarios, which
may produce instability and large optimality gaps.
We suggest Høyland and Wallace [144], Høyland, Kaut and Wallace [145] and
Dupačová, Consigli and Wallace [82] as well-known references regarding moment-
matching methods and, more recently, Date, Mamon and Jalen [62], Consiglio, Carollo
and Zenios [57], and Beraldi, Simone and Violi [26] as more elaborate applications of
such approximation methods. We also recommend King and Wallace [171] for a
comprehensive review about the subject.
Scenario clustering techniques have also been employed as alternatives to
traditional sampling methods. They use different methods of cluster analysis to bundle
similar scenarios together to approximate the original scenario trees by smaller ones.
However, as pointed out in Aggarwal [1], clustering techniques for time-series are more
challenging because they fall within the class of contextual representation, i.e.,
techniques that should consider both behavioural (e.g., price) and contextual (e.g.,
time) attributes. In that regard, Liao [190], and Rani and Sikka [258] offer interesting
surveys about time-series clustering techniques. They underline that among the many
clustering categories available, three have been used (directly or indirectly) to handle
time-series clustering: partitioning methods, hierarchical methods, and model-based
methods.
Partitioning clustering methods construct (crisp or fuzzy) partitions of the
data, where each partition represents a cluster containing at least one element. As a
popular example in this category, we can point out the k-means algorithms, in which
each cluster is characterized by the mean value of its elements. Pranevičius and
Šutienė [290][291], and Šutienė, Makackas, and Pranevičius [289] applied elaborate
variations of the k-means algorithm to generate approximated scenario trees that
preserve important statistical features of the underlying stochastic process such as
intertemporal dependency structures. In addition, Beraldi and Bruni [25], and Gulpinar,
Rustem and Settergren [120] discuss interesting variations of the k-means algorithm
applied to stochastic optimization problems. However, as explained by Liao [190],
basic partitioning methods perform well with spherical-shaped clusters and small or
medium data sets but do not properly handle clusters with non-spherical or complex
shapes, which are quite common in financial applications.
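A bare-bones illustration of k-means scenario reduction (Lloyd's algorithm with a deterministic initialisation): each centroid becomes a representative scenario whose probability is the fraction of original scenarios assigned to it.

```python
def kmeans_reduce(points, k, iters=20):
    """Reduce scenario vectors to k representatives with plain k-means.
    Returns the centroids and the probability mass assigned to each one."""
    centroids = points[:k]  # deterministic initialisation for reproducibility
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each scenario to its nearest centroid (squared distance)
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        # recompute centroids; keep the old one if a cluster goes empty
        centroids = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    probs = [len(cl) / len(points) for cl in clusters]
    return centroids, probs

# four hypothetical two-asset return scenarios forming two obvious groups
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
cents, probs = kmeans_reduce(pts, k=2)
```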
Hierarchical clustering methods construct a multilevel cluster tree by grouping
the data using agglomerative or divisive algorithms. Agglomerative (bottom-up)
clustering puts each element in its own cluster and then merges clusters into larger
ones until predefined conditions (e.g., a desired number of clusters) are achieved.
Divisive (top-down) clustering, on the other hand, starts at the top with all elements in
a single cluster and then splits it recursively until the predefined conditions are met.
Unfortunately, plain hierarchical methods are also poorly equipped to handle complex
scenarios as discussed by Liao [190], and Pranevičius and Šutienė [290] because they
cannot make adjustments once a merge or split action has been performed, which may
lead to poor scenario structures. On the other hand, as demonstrated by Musmeci,
Aste and Di Matteo [229], and Tumminello, Lillo and Mantegna [303], some elaborate
extensions of hierarchical methods can achieve good results, at the expense of
tractability.
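For contrast with the partitioning approach above, a minimal single-linkage agglomerative (bottom-up) sketch on one-dimensional scenario values; the input values are hypothetical.

```python
def agglomerate(points, n_clusters):
    """Bottom-up clustering with single linkage: start with one cluster per
    scenario value and repeatedly merge the two closest clusters until the
    desired number of clusters remains."""
    clusters = [[p] for p in points]

    def linkage(c1, c2):
        # single linkage: distance between the closest pair of members
        return min(abs(a - b) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

groups = agglomerate([0.01, 0.02, 0.50, 0.52, 0.49], n_clusters=2)
```

As the text notes, once two clusters are merged the decision cannot be undone, which is the basic weakness of plain hierarchical schemes.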
Model-based clustering methods assume a model for each of the clusters and
attempt to find the best fit for the data. Statistics-based methods fall into this category.
Although fairly robust due to the use of more sophisticated frameworks (e.g., Hidden
Markov Models) and similarity measures (e.g., Dynamic Time Warping, Longest
Common Subsequence), these methods are usually rather complex and rare in the
financial literature. As examples, we refer to Knab et al. [177], and Dias, Vermunt and
Ramos [73][74] for sophisticated applications involving Hidden Markov Models (HMM)
to construct clusters of financial time-series.
Lastly, optimal discretization methods have also been used as alternative
approaches to approximate scenario trees, according to Pflug [241][243], Römisch
[270], Pflug and Römisch [248], and Heitsch and Römisch [132][133][134][135][137].
As stressed by Pflug [241][243], although more elaborate, these methods offer a
sound theoretical basis for approximating the true (and often intractable) underlying
probability distribution by a manageable and suitable probability measure with finite
support. In simple terms, such an approximation is an optimization that seeks to
minimize an appropriate probability distance metric between the original set of
scenarios and the reduced set of scenarios.
In the literature, this approach is also known as the Monge-Kantorovich mass
transport problem, which in the scenario tree context corresponds to the optimal
redistribution of the approximated scenario probabilities in order to minimize a
probability distance between the original and the approximated scenario tree. We refer
to Villani [307] and Rachev and Rüschendorf [255][256] for detailed discussions about
mass transportation theory and to Gröwe-Kuska, Heitsch and Römisch [118],
Hochreiter, Pflug and Wozabal [142], and Heitsch and Römisch [133] for examples of
applications outside financial markets.
Optimal discretization theory relies deeply on the choice of a suitable
probability distance metric. As discussed by Rachev [253], Rachev, Stoyanov and
Fabozzi [257], and Gibbs and Su [112], numerous probability metrics are available but
most, such as the Kolmogorov metric or the total variation distance, are poorly
equipped to properly measure the distance between probability distributions. Among
the metrics that can accomplish that, even fewer are suited to multi-period stochastic
optimization problems, i.e., can be extended from probability distribution distances to
stochastic process distances. The most common candidates in the literature are the
Fortet-Mourier and Wasserstein distances.
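For discrete distributions on the real line, the Wasserstein-1 distance reduces to the area between the two step CDFs, which a few lines of code can compute exactly:

```python
def wasserstein_1d(xs, ps, ys, qs):
    """Wasserstein-1 distance between two discrete distributions on the real
    line (atoms xs with weights ps, atoms ys with weights qs), computed as
    the integral of |F - G| between the two step CDFs."""
    points = sorted(set(xs) | set(ys))

    def cdf(atoms, weights, t):
        return sum(w for a, w in zip(atoms, weights) if a <= t)

    dist = 0.0
    for left, right in zip(points, points[1:]):
        # the CDFs are constant on [left, right), so the integral is exact
        dist += abs(cdf(xs, ps, left) - cdf(ys, qs, left)) * (right - left)
    return dist

# moving half of the mass a distance of 1.0 costs 0.5
d = wasserstein_1d([0.0, 1.0], [0.5, 0.5], [0.0], [1.0])
```

The multi-period Process (Nested) distance discussed below generalizes this idea by additionally respecting the filtration structure of the scenario tree.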
Römisch [270], Heitsch and Römisch [132] and Dupačová, Gröwe-Kuska, and
Römisch [83] are renowned examples of scenario reduction using the Fortet-Mourier
metric. Likewise, Heitsch and Römisch [133], Hochreiter and Pflug [141], Pflug
[241][243], Pflug and Pichler [244][245][246][247], and Kovacevic and Pichler [186] are
popular examples of scenario approximation using the Wasserstein metric. Moreover,
Dempster, Medova and Sook [68] provide an interesting discussion about the
statistical properties of the Wasserstein distance and its application in optimal
discretization for stochastic programming.
In particular, Pflug and Pichler [244][245][246], and Kovacevic and Pichler
[186] also explore in detail the Process (or Nested) distance, which is an extension of
the Wasserstein distance for multi-period problems (i.e., for problems where we should
use some type of filtration distance, as explained by Heitsch and Römisch [132], and
Heitsch, Römisch and Strugarek [138]). They show that, in such problems, we need to
consider both the probability measures and the value-and-information structure
(represented by a proper filtration or scenario tree) during the discretization process of
a stochastic optimization problem. In this thesis, we follow this approach as well and
explain it in detail in later sections.
Concerning the second limitation of the GJR-GARCH + EVT (GPD) + t-Copula
framework to generate scenario trees, arbitrage-free scenarios are mandatory in real
finance applications to avoid biased solutions, as stressed by Klaassen
[173][174][175], Sodhi [286], Geyer, Hanke and Weissensteiner [107][108] and
Consigli, Carollo, and Zenios [57]. They also underline that most of the scenario
approximation methods do not take that into account. Accordingly, the literature
regarding arbitrage-free approximation methods is, at best, limited.
Klaassen [173], for instance, shows how arbitrage opportunities can lead to
biased solutions (i.e., biased investment strategies) in stochastic programming
optimization problems. Klaassen [174] goes further and discusses a two-step process:
one to construct an arbitrage-free scenario tree and a second to reduce the tree size
by state and time aggregation while preserving the absence of arbitrage opportunities.
Unfortunately, such an approach does not guarantee that other statistical features (e.g.,
kurtosis, covariance or serial correlation) are preserved. Klaassen [175], on the other
hand, extends Høyland and Wallace [144]'s moment-matching method, which preserves
some predetermined statistical features of the underlying stochastic process but does
not preclude arbitrage opportunities. Klaassen [175] describes how we can identify,
ex-post, such opportunities and suggests how we can avoid them, ex-ante, by adding
no-arbitrage constraints to Høyland and Wallace [144]’s moment-matching
approximation method.
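In the spirit of such ex-post checks, the sketch below detects one-period arbitrage in the simplest possible setting: a single risky asset plus a riskless account. The general multi-asset test requires verifying the existence of strictly positive state prices (e.g., via linear programming) and is not attempted here; the numbers are hypothetical.

```python
def has_one_period_arbitrage(gross_returns, riskless):
    """One risky asset, one riskless account, one period: the scenario fan is
    arbitrage-free iff the risky gross returns fall on both sides of (1 + r),
    or all equal it (in which case the asset replicates cash)."""
    rf = 1.0 + riskless
    above = any(g > rf for g in gross_returns)
    below = any(g < rf for g in gross_returns)
    # arbitrage iff the asset (weakly) dominates cash or is dominated by it
    return not ((above and below) or (not above and not below))

ok_fan = has_one_period_arbitrage([1.08, 0.95, 1.01], 0.02)   # straddles cash
bad_fan = has_one_period_arbitrage([1.05, 1.03, 1.08], 0.02)  # dominates cash
```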
Klaassen [175] also suggests that we can rerun the scenario approximation
method until an arbitrage-free scenario tree is achieved. However, Geyer, Hanke and
Weissensteiner [107][108] show that simply rerunning, or even randomly enlarging
the branching scheme of the scenario tree, does not guarantee an arbitrage-free scenario
tree, but it does make the optimization problem less tractable. Adding constraints to
exclude arbitrage opportunities leads to non-convex optimization problems, which
usually require a global optimization approach. Consigli, Carollo, and Zenios [57], for
instance, propose a refinement of the moment-matching method with no-arbitrage
constraints that uses such an approach (based on lower bounding techniques).
Geyer, Hanke and Weissensteiner [107][108][109] also point out that the
absence of arbitrage implies a lower bound on the branching scheme of the scenario
tree. The number of possible child scenarios at each node (i.e., each state of nature,
characterized by a particular realization of prices or returns) of the scenario tree usually
must be greater than the number of non-redundant instruments considered in the
optimization problem so that some statistical properties of the underlying stochastic
process can be properly preserved. Note that this condition is more stringent than
Harrison and Kreps [127]’s no-arbitrage condition.
Additionally, Jobst and Zenios [159], Ferstl and Weissensteiner [100], and
Geyer, Hanke and Weissensteiner [107] stress that multi-period portfolio optimization
problems should use objective (i.e., real-world) probability measures rather than risk-
neutral probability measures to account properly for the (estimated) market price of
risk.
Finally, assuming that tractability and no-arbitrage conditions have been
met, we still need to evaluate the quality of the optimal solutions provided by these
techniques. As we mentioned earlier, such quality is typically evaluated in the literature
based on two key characteristics: optimality gap and stability.
The optimality gap, also known as the approximation error, between the
approximated and the original scenario tree, which represents the true underlying
stochastic process, is a major aspect of the approximation quality. Since the optimal
objective values or policies for the true underlying stochastic problem are hardly
available (otherwise we would not need approximations in the first place), we must find
indirect methods to estimate such a gap. Many studies like Kaut and Wallace [168],
Römisch [270], Heitsch and Römisch [136], and Pflug [241] debate possible methods
to control it as well as their implications for the discretization process. Pflug [241], in
particular, provides a good review of the subject and proposes an important lemma that
establishes an upper bound on the optimality gap, which we discuss in later sections.
This result is paramount to our optimal discretization approach based on the
Wasserstein and Process distances, as we will see later.
Stability is another major aspect related to the quality of the approximated
scenario tree. According to Kaut and Wallace [168], King and Wallace [171], and
Dempster, Medova and Sook [69], stability analyses usually focus on both the in-sample
and out-of-sample stability of the objective function rather than the stability of the
solutions, which may vary significantly due to the high nonlinearity and non-convexity
of the optimization problem.
Definitions of in-sample and out-of-sample stability vary in the related
literature. In this thesis, we follow King and Wallace [171]’s definitions. In-sample
stability implies that optimal solutions do not vary significantly if we generate new
scenario trees with similar cardinality, using the same method. The concept of in-
sample stability implicitly assumes that the entertained scenario approximation method
does not produce exactly the same scenario tree for the same input data. Out-of-
sample stability, on the other hand, guarantees that the generated optimal solutions
lead to optimal objective function values that are close to the true objective value.
Clearly, out-of-sample stability is achieved if the optimality gap is sufficiently small.
Unfortunately, since the true objective function value is typically unknown in practical
applications, out-of-sample stability must be evaluated approximately by alternative
tests, as we discuss later.
Kaut and Wallace [168], and King and Wallace [171] also highlight that in-
sample stability does not imply out-of-sample stability and vice-versa, but both help to
establish a lower bound on the cardinality of the scenario tree. Various works in the
literature evaluate the stability of different methods to approximate scenario trees. For
a more comprehensive review of the subject, we refer to Dupačová, Gröwe-Kuska and
Römisch [83], Rachev and Römisch [254], Heitsch and Römisch [136], and Heitsch,
Römisch and Strugarek [138].
The various concepts and techniques mentioned previously suggest that, in
general, no particular scenario construction approach is superior in all circumstances.
They often depend heavily on the context and problem at hand. Dempster, Medova
and Sook [68], and Löhndorf [196], for example, compare sampling algorithms,
moment-matching approaches, clustering techniques and optimal discretization
methods, and conclude that differences are minimal, except for simple sampling
approaches, which are outperformed.
It is also worth mentioning that some methods perform the scenario tree
generation and reduction steps simultaneously, rather than generating a large scenario
tree and then approximating it by a tractable one. In this thesis, we focus on the latter,
which seems to be more common in the recent literature. For studies and applications
regarding the former, we suggest Cariño et al. [45], Mulvey [225], Dempster and
Thorlacius [72], and Gulpinar, Rustem and Settergren [120].
Lastly, provided a manageable, sufficiently accurate and arbitrage-free
scenario tree is available, we proceed to the portfolio optimization step. Since
Markowitz’s seminal work [208], portfolio optimization has been a major subject in
financial markets. Although elegant, his single-period Mean-Variance approach entails
various simplifications that are hardly satisfied in real markets. For example, variance
is not a coherent risk measure; actual financial problems commonly require decisions
involving time-varying (i.e., uncertain) variables over multiple periods (e.g., weeks,
months or years); portfolio rebalances incur transaction costs and, sometimes,
liquidity issues; and taxes and regulatory restrictions often influence portfolio
performance.
A myopic (i.e., single-period) decision is optimal only in very particular
circumstances. Typically, as pointed out by Mulvey, Pauling and Madey [226] and
Brandt [40], (i) it fails to benefit or hedge against the presence of dynamic patterns and
relationships among financial instruments, (ii) it does not account for market liquidity
and transaction costs and (iii) it deals poorly with uncertainty. To accommodate such
elaborate market dynamics and frictions, various alternatives and extensions were
suggested. We refer to Guastaroba [119], Jarraya [155], Mansini, Ogryczak, and
Speranza [205] [206], and Kolm, Tütüncü and Fabozzi [166] for recent surveys. Among
these various alternatives, Multi-Period Stochastic Optimization (MSO) approaches
were particularly appealing because of their flexibility to handle key features of real
markets.
One key difference of multi-period approaches from single-period ones is that
their optimal solutions do not consist of a single allocation decision. Instead, they
comprise a whole sequence of successive allocation decisions (i.e., rebalances),
known as allocation policy or strategy. Although an optimal policy rarely offers the best
outcome for any particular scenario, it does suggest a noninferior course of action (in
the Pareto sense) that considers the impacts and probabilities of all possible scenarios,
i.e., it suggests an allocation strategy that takes into account uncertainty. Shapiro,
Dentcheva and Ruszczyński [279], Dupačová, Hurt and Štěpán [84], Kall and Wallace
[165], and King and Wallace [171] provide numerous examples that discuss this
difference and emphasize the advantages of the multi-period stochastic approach.
There are many approaches to multi-period optimization such as dynamic
stochastic control, multi-period stochastic programming, sampling-based methods
(e.g., acceptance-rejection sampling, and importance sampling) and metaheuristic
methods (e.g., genetic algorithms, simulated annealing, and particle swarm
optimization). Early models like Mossin [224], Samuelson [276] and Merton [215], [216]
focused on manageable approaches that could be solved analytically, often based on
continuous-time frameworks with unrealistic simplifying assumptions. Conversely,
recent advances in processing power have greatly improved the tractability of discrete-
time Multi-Period Stochastic Programming (MSP) models, which offer greater flexibility
in financial applications as demonstrated, for instance, by Mulvey and Vladimirou [228],
Dantzig and Infanger [60], and Dupačová, Hurt and Štěpán [84]. Ziemba [313] goes
further and states that scenario-based discrete multi-period stochastic programming
usually offers a superior alternative to simulation, control theory and continuous-time
approaches.
Such advantage, however, relies on a critical and rather challenging step: the
construction and approximation of (often) multivariate scenarios. Stability, tractability
and accuracy of MSP’s allocation policies or strategies depend heavily on the quality
and number of the simulated scenarios, commonly represented as scenario trees.
Therefore, the steps mentioned earlier, and explained in detail in the next sections, are
essential to provide a reliable and manageable scenario tree that can be used by a
proper MSP approach. For further details regarding discrete-time stochastic decision
models, we suggest Uryasev and Pardalos [305], and Dupačová, Hurt and Štěpán [84].
In this thesis, a proper MSP approach refers to a stochastic programming
setup that satisfactorily handles, in the sense of Artzner et al. [9] and Föllmer and
Schied [105], the risk described by the approximated scenario tree, as discussed, for
instance, by Ruszczyński and Shapiro [273] and Rockafellar [265]. Markowitz's original
Mean-Variance approach is certainly tractable and elegant but it does not account
properly for actual market risk. To accomplish that, various risk measures have been
proposed in the literature but few have achieved wide acceptance among academics
and practitioners, as discussed by Nawrocki [230].
The Lower Partial Moment (LPM) family of risk measures (or risk functionals),
proposed and explored by Bawa [20], Bawa and Lindenberg [21] and Fishburn [95],
includes some of these measures. Such LPM measures are often called safety risk
measures in contrast to the more traditional dispersion risk measures, such as variance
or absolute deviation. As discussed by Nawrocki [230], the advantages of LPM risk
measures are twofold: they can handle non-normal returns and accommodate a myriad
of different investors’ utility functions (i.e., risk seeking, risk-neutral or risk averse
behaviours).
These properties are particularly relevant for practical financial applications
because: (i) non-normality of financial returns is commonly observed in real markets,
as pointed out by Campbell, Lo and MacKinlay [42], and Taylor [295] [296]; and (ii)
derivation of a precise (and unique) utility function is challenging, if not impossible,
as stressed by Roy [271], and Kahneman and Tversky [163].
Therefore, LPM provides more realistic, albeit not perfect (e.g., usually difficult
to optimize or extend to standard portfolio theory), risk measures that are consistent
with stochastic dominance (for positive exponents) and focus on what typically matters
most for investors: the downside risk. We refer to Roy [271], Markowitz [209], Mao
[207], and Fishburn [103] for the original arguments regarding these characteristics
and to Unser [304], Bertsimas, Lauprete and Samarov [28], and Pflug and Römisch
[248] for recent discussions about LPM risk measures.
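With equally weighted scenarios, the LPM of order n around a target τ is simply the average of (τ − r)ⁿ over the scenarios that fall below τ; order 0 recovers the shortfall probability and order 2 the target semivariance. A minimal sketch with hypothetical scenario returns:

```python
def lpm(returns, target, order):
    """Lower partial moment of a given order around a target return:
    the average of (target - r)**order over scenarios falling below target."""
    return sum((target - r) ** order for r in returns if r < target) / len(returns)

scen = [0.05, -0.02, 0.01, -0.04]
p_short = lpm(scen, target=0.0, order=0)  # order 0: shortfall probability
sv = lpm(scen, target=0.0, order=2)       # order 2: target semivariance
```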
Among the most renowned downside risk measures, we underline the
Semivariance (SV), suggested by Markowitz [209], the Semi-Absolute Deviation
(SAD), proposed by Speranza [288], the Conditional Value-at-Risk (CVaR), proposed
by Rockafellar and Uryasev [266][267] and the Conditional Drawdown-at-Risk (CDaR),
suggested by Chekhlov, Uryasev and Zabarankin [49][50]. In particular, the CVaR, also
known as Average Value-at-Risk (AVaR) or Expected Shortfall (ES), has been quite
popular in recent practical applications according to Xiong and Idzorek [311],
Krokhmal, Palmquist and Uryasev [187], and Krokhmal, Zabarankin and Uryasev
[188], because it captures tail risks reasonably well and is computationally easy to optimize.
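For equally weighted loss scenarios, the empirical CVaR at level α is just the average of the worst (1 − α) fraction of losses, as in this sketch (the loss values are hypothetical; the Rockafellar-Uryasev formulation additionally makes CVaR minimizable by linear programming):

```python
def cvar(losses, alpha):
    """Empirical CVaR (expected shortfall) at level alpha for equally
    weighted loss scenarios: the mean of the worst (1 - alpha) fraction."""
    srt = sorted(losses, reverse=True)
    k = max(1, int(round(len(srt) * (1.0 - alpha))))
    return sum(srt[:k]) / k

c = cvar([-0.02, 0.01, 0.05, 0.10, -0.01, 0.03, 0.07, 0.00, 0.04, 0.12], alpha=0.8)
```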
A comprehensive review of these various risk measures, their strengths and
limitations can be found in Dowd [79], Pflug and Römisch [248], Dempster, Pflug and
Mitra [70], and Rachev, Stoyanov, and Fabozzi [257]. Pflug and Römisch [248], in
particular, describe in detail the optimization of two types of risk measures:
acceptability-type and deviation-type functionals, which are especially suited for MSP.
As we discuss in later sections, we use the same approach to accommodate the risk
measures in the multi-period portfolio optimization step.
Unfortunately, a single risk measure is rarely enough to adequately accommodate
all of an investor's goals in most practical portfolio optimization problems, as
pointed out by Roman, Darby-Dowman and Mitra [269]. As an example, variance (or,
correspondingly, standard deviation) is not a coherent risk measure but it is usually a
compulsory objective, from most investors’ perspectives. In such cases, a second (and
coherent) risk measure could be considered in order to manage risk properly. These
real-world cases, where multiple objectives must be evaluated simultaneously, often
lead to Multi-Objective Optimization (MOO) problems.
However, differently from the MSP literature, the MOO literature is somewhat more
recent. There are a few early examples like Jean [156] and Konno, Shirakawa and
Yamazaki [182] but, only recently, improvements in processing power made it possible
to evaluate and explore new avenues and ideas in practical applications. As examples
of promising applications for portfolio optimization, we can mention the Multi-Objective
Ant Colony Algorithm (MOACO), applied by Doerner et al. [78], the Non-Dominated
Sorting Genetic Algorithm (NSGA-II), used by Lin, Wang and Yan [191], and the
Multiple Objective Evolutionary Algorithms (MOEA), entertained by Fieldsend, Matatko
and Peng [101]. We suggest Abraham, Jain and Goldberg [1] and Ehrgott [87] for a
comprehensive review on multicriteria optimization and Jarraya [155] for a recent
survey of MOO applications in asset allocation.
The recent advances in MOO are very appealing but most of them lack a key
feature required for many practical financial applications: they cannot handle multi-
period settings. In fact, the financial literature has barely explored approaches that can
solve stochastic multistage multi-objective problems. Roman, Darby-Dowman and
Mitra [269] and Abdelaziz [1] are among the few examples. Most studies combining
multi-period and multi-objective optimization, such as Fazlollahi and Maréchal [99],
are found in engineering. To contribute on this front, in this thesis we discuss an
additional, but narrowly explored, advantage of MSP approaches: the flexibility to
accommodate multi-objective functions in multi-period portfolio optimization problems.
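As a toy illustration of such a multi-objective trade-off, the sketch below scalarises mean, variance and CVaR into a single weighted-sum cost and searches a coarse grid of two-asset weights. The scenario values and penalty weights are hypothetical, and the thesis's actual model is multi-period and solved by stochastic programming rather than by grid search.

```python
def portfolio_objective(weights, scenarios, lam_var=1.0, lam_cvar=1.0, alpha=0.9):
    """Weighted-sum scalarisation of a single-period Mean-Variance-CVaR
    trade-off: the cost lam_cvar*CVaR + lam_var*variance - mean (lower is better)."""
    rets = [sum(w * r for w, r in zip(weights, s)) for s in scenarios]
    n = len(rets)
    mean = sum(rets) / n
    var = sum((r - mean) ** 2 for r in rets) / n
    losses = sorted((-r for r in rets), reverse=True)
    k = max(1, int(round(n * (1.0 - alpha))))  # worst (1 - alpha) tail
    cvar = sum(losses[:k]) / k
    return lam_cvar * cvar + lam_var * var - mean

# hypothetical two-asset scenario fan: one volatile asset, one stable asset
scenarios = [(0.05, 0.01), (-0.03, 0.02), (0.10, 0.00), (-0.08, 0.015)]
grid = [i / 10 for i in range(11)]
best = min(((w, 1.0 - w) for w in grid),
           key=lambda ws: portfolio_objective(ws, scenarios))
```

Different choices of the penalty weights trace out different noninferior (Pareto) solutions, which is precisely the multi-objective structure discussed above.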
In the next section, we discuss in detail the theoretical background required to
accomplish that.
3 Theoretical Background
In this section we discuss the theoretical framework required to:
(i) Estimate a multivariate probability distribution of daily log returns based
on a GJR-GARCH + EVT (GPD) + t-Copula combined approach that is
able to explain four important stylized facts: heavy-tails, volatility
clustering, leverage effect and tail co-dependence.
(ii) Simulate a sufficiently large tree of multivariate scenarios using the
combined approach, which is assumed to be close to the true underlying
stochastic process governing the multivariate price evolution of the
respective financial instruments.
(iii) Approximate the original large scenario tree by a smaller one with
tractable cardinality and no arbitrage opportunities.
(iv) Optimize a Mean-Variance-CVaR portfolio using Stochastic
Programming together with the approximated scenario tree, which is
assumed to describe the parameter uncertainty properly.
Figure 1 – Stochastic Portfolio Optimization Process Overview
Following McNeil, Frey and Embrechts [214], King and Wallace [171], and
Pflug and Pichler [246], before we elaborate on each one of these steps, we start with
some fundamental concepts and definitions.
Definition 1 A σ-algebra (sigma-algebra) $\mathcal{F}$ on a scenario set $\Omega$ is a collection
of subsets of $\Omega$ such that:
(i) $\Omega \in \mathcal{F}$
(ii) $\mathcal{F}$ is closed under complementation (i.e., if $A \in \mathcal{F}$ then $A^{c} \in \mathcal{F}$)
(iii) $\mathcal{F}$ is closed under countable unions (i.e., if $A_1, A_2, \cdots \in \mathcal{F}$ then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$)
(iv) $\mathcal{F}$ is closed under countable intersections (i.e., if $A_1, A_2, \cdots \in \mathcal{F}$ then $\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$)
The elements of $\mathcal{F}$ are also called events of $\mathcal{F}$. Two important σ-algebras on $\Omega$
are the trivial σ-algebra $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and the complete σ-algebra $\mathcal{F}_C = 2^{\Omega}$, which
comprises the power set (i.e., the collection of all subsets) of $\Omega$. Additionally, given any
collection $\mathcal{C}$ of subsets of $\Omega$, the smallest σ-algebra containing all subsets of $\mathcal{C}$, denoted $\sigma(\mathcal{C})$, is called the σ-algebra generated by $\mathcal{C}$.
Definition 2 Given a scenario set $\Omega$ and a σ-algebra $\mathcal{F}$ on $\Omega$, a measure is a
mapping $\mu$ from $\mathcal{F}$ to the extended real line $\overline{\mathbb{R}} = [-\infty, +\infty]$ such that:
(i) $\mu(\emptyset) = 0$
(ii) $\mu(A) \geq 0, \ \forall A \in \mathcal{F}$
(iii) $\mu$ is countably additive (i.e., if $A_1, A_2, \cdots \in \mathcal{F}$ are pairwise disjoint sets in $\mathcal{F}$
then $\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i)$)
Definition 3 Given a scenario set $\Omega$ and a σ-algebra $\mathcal{F}$ on $\Omega$, the pair $(\Omega, \mathcal{F})$ forms a measurable space. A measurable space is simply a set $\Omega$ equipped with a σ-
algebra $\mathcal{F}$, whose elements are called measurable sets.
Definition 4 Given a scenario set $\Omega$, a σ-algebra $\mathcal{F}$ on $\Omega$ and a measure $\mu$, the
triple $(\Omega, \mathcal{F}, \mu)$ forms a measure space.
Definition 5 Given a measure space $(\Omega, \mathcal{F}, \mu)$, the measure $\mu$ is a probability
measure if:
(i) $\mu(A) \in [0,1], \ \forall A \in \mathcal{F}$
(ii) $\mu(\Omega) = 1$
Definition 6 A probability space is a measure space $(\Omega, \mathcal{F}, P)$ such that the
measure $P$ is a probability measure.
Definition 7 Given a probability space $(\Omega, \mathcal{F}, P)$ such that $\Omega$ is a scenario
space, $\mathcal{F}$ is a σ-algebra on $\Omega$ and $P$ is a probability measure, a (real-valued) random
variable $X$ and a (real-valued) random vector $\mathbf{X} = (X_1, \cdots, X_n)$ correspond, respectively,
to the mappings:
$X: \Omega \to \mathbb{R}$ ( 1 )
$\mathbf{X}: \Omega \to \mathbb{R}^n$ ( 2 )
Such that $X$ and $\mathbf{X}$ are $\mathcal{F}$-measurable.
Definition 8 Given a random variable $X$, $F$ and $f$ are, respectively, the
cumulative distribution function and the probability density function of $X$, provided the
derivative of $F$ exists, such that:
$F: \mathbb{R} \to [0,1]$ ( 3 )
$F(x) = \int_{-\infty}^{x} f(t)\,dt = P(X \leq x)$ ( 4 )
Definition 9 Given a random vector $\mathbf{X} = (X_1, \cdots, X_n)$, the multivariate
distribution function $F$ of $X_1, \cdots, X_n$ corresponds to:
$F(x_1, \cdots, x_n) = P(X_1 \leq x_1, \cdots, X_n \leq x_n)$ ( 5 )
Such that the marginal distributions of $F$ are:
$F_i(x_i) = \lim_{x_1, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n \to \infty} F(x_1, \cdots, x_n), \quad i = 1, \cdots, n$ ( 6 )
3.1 Scenario Tree Generation
As we discuss later in detail, an optimization based on stochastic programming
assumes that the underlying stochastic process or probability distribution of the
parameters is known. However, except for very unrealistic cases (i.e., cases that rely
on very strong simplifying assumptions), we cannot easily solve multiperiod stochastic
optimization problems using continuous processes or distributions because they are
functional optimization problems (i.e., solutions are functions, not values).
In practice, we often need to approximate the continuous stochastic process
or probability distribution by a similar (in some statistical sense) process or distribution
that has finitely many discrete scenarios like a scenario tree structure. Scenario trees
offer an intuitive way to represent the underlying stochastic process or probability
distribution, providing an easy structure to accommodate the uncertainty of the relevant
parameters and, in the former case, the way that information is revealed over time. In
other words, we can represent a stochastic process $\xi = (\xi_0, \xi_1, \cdots, \xi_T)$ or a
probability distribution associated to $\xi_T$ with a discrete scenario process represented
by a tree structure.
Definition 10 Scenario trees are acyclic directed graphs comprising nodes
and leaves (i.e., nodes without descendants) that represent discrete conditional
realizations of a particular random n-vector at a given decision stage t. Nodes may
have siblings but no more than one parent. Consequently, scenario trees have only a
single root node. Each node is connected to its descendants by arcs (or branches) and
each arc is associated to a conditional probability of occurrence. Each possible
scenario path (or trajectory) from the root node to a leaf node is also associated to a
(unconditional) probability, which is given by the product of the arc probabilities
pertaining to that path. The number of stages and the branching scheme determine
the topology of the scenario tree. The n-vector values of the root node, which defines
the beginning of the optimization horizon, are assumed to be known.
For instance, following Pflug and Pichler [246] notation, let us assume a
particular scenario tree with $N + 1 = 11$ nodes labelled $\{0, \cdots, 10\}$ and $T = 2$ stages $\{\mathcal{N}_0, \mathcal{N}_1, \mathcal{N}_2\}$ such that:
Figure 2 – Scenario Tree Structure Example
Note that each scenario corresponds to a specific scenario path (or trajectory):
$\omega_1$ → Path 0, 1 and 4
$\omega_2$ → Path 0, 1 and 5
$\omega_3$ → Path 0, 2 and 6
$\omega_4$ → Path 0, 2 and 7
$\omega_5$ → Path 0, 2 and 8
$\omega_6$ → Path 0, 3 and 9
$\omega_7$ → Path 0, 3 and 10
In this example, besides the root node stage $\mathcal{N}_0 = \{0\}$ we have two other
stages: $\mathcal{N}_1 = \{1,2,3\}$ and $\mathcal{N}_2 = \{4,5,6,7,8,9,10\}$. Moreover, for a given node $n \in \{0, \cdots, 10\}$, we define that:
$n \in \mathcal{N}_t \Rightarrow \mathrm{pred}(n) = n_- \in \mathcal{N}_{t-1}$ (parent node) ( 7 )
$n \in \mathcal{N}_t,\ m \in \mathcal{N}_{t+1},\ \mathrm{pred}(m) = n \Rightarrow m \in n_+$ (children nodes) ( 8 )
$n \in \mathcal{N}_t,\ m \in \mathcal{N}_s,\ s < t \Rightarrow \mathrm{pred}^{s}(n) = m \Rightarrow n \succ m$ (all descendant nodes) ( 9 )
Each leaf node $n \in \mathcal{N}_T$ is associated with a possible scenario $\omega_i \in \mathbb{R}^d$, $i = 1, \cdots, 7$, which has an unconditional probability $P(n) = P(\omega_i)$. Consequently, each non-
leaf node $m \in \mathcal{N}_t$, $t < T$, has probability:
$P(m) = \sum_{n \succ m} P(n)$ ( 10 )
Moreover, given the conditional probability $P(n \mid n_-)$ between nodes $n$ and $n_-$
(its parent node), we have:
$P(n \mid n_-) = \dfrac{P(n, n_-)}{P(n_-)} = \dfrac{P(n)}{P(n_-)}$ ( 11 )
$P(n) = \prod_{m:\, n \succ m} P(m \mid m_-), \quad n \in \mathcal{N}_T$ ( 12 )
$P(n = 0) = 1$ (root node is known) ( 13 )
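To make equations ( 10 )–( 13 ) concrete, the short Python sketch below computes the unconditional node probabilities for the 11-node example tree; the conditional arc probabilities are hypothetical values chosen only for illustration.

```python
# Sketch of equations (10)-(13) on the 11-node example tree; the arc
# probabilities below are hypothetical, chosen only for illustration.
parent = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 3, 10: 3}
cond_p = {1: 0.5, 2: 0.3, 3: 0.2,          # arcs leaving the root
          4: 0.6, 5: 0.4,                  # arcs leaving node 1
          6: 0.5, 7: 0.3, 8: 0.2,          # arcs leaving node 2
          9: 0.7, 10: 0.3}                 # arcs leaving node 3

def path_prob(leaf):
    """Unconditional probability of a leaf: the product of the arc
    probabilities along the path back to the root (equation 12)."""
    p = 1.0
    while leaf != 0:                       # P(root) = 1 (equation 13)
        p *= cond_p[leaf]
        leaf = parent[leaf]
    return p

leaves = [4, 5, 6, 7, 8, 9, 10]
P = {n: path_prob(n) for n in leaves}
# Equation (10): a non-leaf node's probability is the sum over its leaves
P[1] = P[4] + P[5]
assert abs(sum(P[n] for n in leaves) - 1.0) < 1e-12
```

Note that the leaf probabilities necessarily sum to one, since each stage's conditional probabilities do.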
Many simulation approaches applied in financial applications, such as the one used
in this thesis, often generate a scenario fan, which is a particular case of
a general scenario tree, rather than a traditional scenario tree. Scenario fans are usually easier
to generate but less efficient (in the sense that they require too many nodes). Once a
scenario fan is simulated, we can approximate it by a smaller scenario tree using
optimal discretization, as we discuss in the next section.
Definition 11 A scenario fan is a scenario tree whose nodes, except for the
root and the leaf nodes, have a single descendant node.
Figure 3 – Scenario Fan Example
A key aspect of any scenario tree structure is that its topology governs how
information is revealed at each stage. Consequently, the tree is implicitly linked to the
natural filtration associated to the underlying stochastic process and, as pointed out by
Pflug and Pichler [246], it can be assumed to be equivalent to the pertaining filtered
probability space.
Definition 12 Given a probability space $(\Omega, \mathcal{F}, P)$, an increasing sequence of
σ-algebras $\mathcal{F} = (\mathcal{F}_0, \mathcal{F}_1, \cdots, \mathcal{F}_T)$ on $\Omega$ is a natural filtration if $\mathcal{F}_t \subseteq \mathcal{F}_{t+1} \subseteq \mathcal{F}$ for $t = 0, 1, \cdots, T-1$. A filtration represents the evolution of the available information up to time $t$. Additionally, the probability space $(\Omega, \mathcal{F}, P)$ endowed with a filtration $\mathcal{F}$ is called a
filtered probability space.
Definition 13 A stochastic process $X = (X_0, X_1, \cdots, X_T)$ on $(\Omega, \mathcal{F}, P)$ is said to
be adapted to the filtration $\mathcal{F}$ if $X_t$ is $\mathcal{F}_t$-measurable for $t \geq 0$. It means that the
stochastic process at time $t$ depends only on the past information imposed by $\mathcal{F}_t$. This
property is known as a nonanticipativity constraint. Moreover, any stochastic process $X$ is adapted to its natural filtration (i.e., its history process).
For instance, in the previous example suppose that the σ-algebras $\mathcal{F}_t$ are
generated by the possible scenarios (i.e., atoms) associated with the nodes pertaining
to each stage of the tree. Consequently, the filtration $\mathcal{F} = (\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2)$ comprises:
$\mathcal{F}_0 = \sigma(\{\omega_1, \omega_2, \omega_3, \omega_4, \omega_5, \omega_6, \omega_7\}) = \sigma(\Omega)$
$\mathcal{F}_1 = \sigma(\{\omega_1, \omega_2\}, \{\omega_3, \omega_4, \omega_5\}, \{\omega_6, \omega_7\})$
$\mathcal{F}_2 = \sigma(\{\omega_1\}, \{\omega_2\}, \{\omega_3\}, \{\omega_4\}, \{\omega_5\}, \{\omega_6\}, \{\omega_7\}) = 2^{\Omega}$
Figure 4 – Link between a Filtration and a Scenario Tree
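The stage-by-stage refinement of the atoms can be checked mechanically; the sketch below encodes each $\mathcal{F}_t$ of the example by the partition of the seven scenarios that generates it (scenario indices are used in place of the $\omega_i$ symbols).

```python
# Sketch of the filtration example: each stage partitions the seven
# scenarios into atoms, and later partitions refine earlier ones.
partitions = [
    [{1, 2, 3, 4, 5, 6, 7}],                    # F_0 = sigma(Omega)
    [{1, 2}, {3, 4, 5}, {6, 7}],                # F_1: one atom per node
    [{1}, {2}, {3}, {4}, {5}, {6}, {7}],        # F_2 = power set
]

def refines(finer, coarser):
    """True if every atom of the finer partition sits inside one coarser atom."""
    return all(any(a <= b for b in coarser) for a in finer)

assert refines(partitions[1], partitions[0])
assert refines(partitions[2], partitions[1])
```

The refinement property is exactly the inclusion $\mathcal{F}_t \subseteq \mathcal{F}_{t+1}$ of Definition 12, expressed at the level of atoms.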
As $t$ increases, additional information is revealed according to the structure
imposed by the scenario tree and the natural filtration of the underlying stochastic
process. There is a one-to-one relationship between them.
For multistage problems in particular, this information structure is crucial.
Although two stochastic processes may exhibit the same discrete distribution at
the final stage (i.e., the same scenarios and unconditional probabilities), they might not
reveal information in the same way, which may lead to different optimal solutions. To
see that, let us consider another example. Suppose we have the following scenario
trees A and B:
Figure 5 – Scenario Tree Comparison – Tree A
Figure 6 – Scenario Tree Comparison – Tree B
Both scenario trees lead to the same multivariate discrete distribution (i.e.,
same realizations with identical unconditional probabilities) at the final stage $T = 2$.
However, the underlying stochastic processes related to trees A and B are different.
To see that, just consider the node $n = 1$ in both scenario trees. On scenario tree A,
node 1 leads to one single outcome and, consequently, there is no uncertainty
regarding the expected scenario at stage $T$ (i.e., the realized scenario is
acknowledged before stage $T$). Conversely, on scenario tree B, node 1 leads to three
possible scenarios. Therefore, the realized scenario is not known until an event
(conditional on node 1) at stage $T$ materializes. In that case, the optimal solutions for
the underlying stochastic processes represented by scenario trees A and B are likely
to be different (depending on the values sitting on each node). Hence, any suitable optimal
discretization process should take into account the (dis)similarity between the
information structures (i.e., the natural filtrations) pertaining to the original and
approximated scenarios (i.e., discrete stochastic processes).
Another famous example in the literature highlighting the importance of the
information structure (i.e., the way information is revealed as the process evolves)
involves a fair coin tossed three times. In this example, one dollar is paid if (A) a head
appears at least two times or if (B) a head appears at the last toss. The following scenario
trees represent these payoffs.
Figure 7 – Scenario Tree for Coin Bet A
Figure 8 – Scenario Tree for Coin Bet B
After the last toss, the payoff distributions for both bets are equal, but the
payoff for bet A can be anticipated (i.e., can be revealed before the last toss) for some
scenarios. Therefore, earlier tosses might provide additional (and relevant) information
for bet A. Accordingly, the information structures of these bets are not the same and
any optimal decision based on them should consider this dissimilarity.
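The coin-toss example can be verified by brute force; the Python sketch below enumerates the eight equally likely paths and confirms that the two bets share the same terminal payoff distribution while only bet A can be resolved early on some paths.

```python
from itertools import product

# Sketch of the coin-toss example: same terminal payoff distribution,
# different information structure.
outcomes = list(product("HT", repeat=3))                   # 8 equally likely paths

payoff_a = {w: int(w.count("H") >= 2) for w in outcomes}   # bet A: >= 2 heads
payoff_b = {w: int(w[-1] == "H") for w in outcomes}        # bet B: last toss is a head

# Identical unconditional payoff distributions at the final stage:
assert sorted(payoff_a.values()) == sorted(payoff_b.values())

# But bet A is decided after two tosses on some paths:
# after HH the payoff is 1 whatever the last toss; after TT it is 0.
assert {payoff_a[("H", "H", x)] for x in "HT"} == {1}
assert {payoff_a[("T", "T", x)] for x in "HT"} == {0}
# Bet B is never resolved before the last toss:
assert all(payoff_b[w[:2] + ("H",)] != payoff_b[w[:2] + ("T",)] for w in outcomes)
```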
These examples show that the evolution of information disclosure is
paramount to any decision process. Obviously, we also require the content (i.e., the
values) of the information disclosed since decisions are based on both the possible
values and their (conditional) probabilities.
Fortunately, a scenario tree can describe both the information process and the
value process. The information (or tree) process $Y$, related to the filtration $\mathcal{F}$,
corresponds to the tree structure that establishes how information is revealed. The
discrete value process $X$ corresponds to the values or vectors sitting in the tree nodes
that comprise the revealed content. The pair $(X, Y)$ is called a (scenario) value-and-
information pair.
Definition 14 A tree process is a stochastic process $Y = (Y_0, Y_1, \cdots, Y_T)$ with
values on some (Polish) state space $\mathcal{N}_t$, $t = 0, \cdots, T$, such that:
(i) $\mathcal{N}_0$ is a singleton
(ii) the $\mathcal{N}_t$ are pairwise disjoint
(iii) $\sigma(Y_t) = \sigma(Y_0, \cdots, Y_t), \quad t = 0, \cdots, T$
(iv) $\mathcal{F} = \sigma(Y) = \left(\sigma(Y_0), \sigma(Y_1), \cdots, \sigma(Y_T)\right)$ is a filtration
A tree process $Y = (Y_0, Y_1, \cdots, Y_T)$ can be constructed using the history process
of any stochastic process $X = (X_0, X_1, \cdots, X_T)$:
$Y_0 = X_0$
$Y_1 = (X_0, X_1)$
$\vdots$
$Y_T = (X_0, X_1, \cdots, X_T)$
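The history-process construction can be sketched in a few lines of Python; the numeric trajectory below is arbitrary and serves only as an example.

```python
# Sketch: the history (tree) process of one trajectory of X is the tuple
# of values observed so far, Y_t = (X_0, ..., X_t); the path is arbitrary.
def history_process(path):
    return [tuple(path[: t + 1]) for t in range(len(path))]

Y = history_process([0.0, 1.2, -0.4])   # one hypothetical trajectory of X
```

Two trajectories sharing the same history up to time $t$ are mapped to the same value of $Y_t$, which is precisely how a scenario tree merges paths into nodes.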
Assuming that all state spaces $\mathcal{N}_t$ of the tree process $Y$ are Polish and given
general Borel sets $B_t$, $t = 1, \cdots, T$, Shiryaev [283] demonstrates that we can compute
the joint distribution $P$ of the tree process $Y = (Y_0, Y_1, \cdots, Y_T)$ using its conditional
probabilities:
$P_0(B_0) = P(Y_0 \in B_0)$
$P_1(B_1 \mid y_0) = P(Y_1 \in B_1 \mid Y_0 = y_0)$
$\vdots$
$P_t(B_t \mid y_0, \cdots, y_{t-1}) = P(Y_t \in B_t \mid Y_0 = y_0, Y_1 = y_1, \cdots, Y_{t-1} = y_{t-1})$
In addition, Shiryaev [283] also shows that given a filtration $\mathcal{F} = \sigma(Y)$, a
stochastic process $X$ adapted to $\mathcal{F}$ and measurable functions $f_t$, we have:
$X_t = f_t(Y_t)$ ( 14 )
Finally, a tree process induces a probability distribution $P$ on $\mathcal{N}_T$. Therefore, if
we choose $\Omega = \mathcal{N}_T$, we get a filtration $\mathcal{F} = \sigma(Y) = \left(\sigma(Y_0), \cdots, \sigma(Y_T)\right) = (\mathcal{F}_0, \cdots, \mathcal{F}_T)$ on $\Omega$, which leads to a filtered probability space $(\Omega, \mathcal{F}, P)$.
Definition 15 Given two tree processes $Y$ and $\tilde{Y}$ defined on possibly different
filtered probability spaces $(\Omega, \mathcal{F}, P)$ and $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$, Borel sets $B_t$, $t = 1, \cdots, T$, and
conditional probabilities such that:
$P_t(B_t \mid y_0, \cdots, y_{t-1}) = P(Y_t \in B_t \mid Y_0 = y_0, Y_1 = y_1, \cdots, Y_{t-1} = y_{t-1}), \quad t = 1, \cdots, T$ ( 15 )
$\tilde{P}_t(B_t \mid y_0, \cdots, y_{t-1}) = \tilde{P}(\tilde{Y}_t \in B_t \mid \tilde{Y}_0 = y_0, \tilde{Y}_1 = y_1, \cdots, \tilde{Y}_{t-1} = y_{t-1}), \quad t = 1, \cdots, T$ ( 16 )
The tree processes are equivalent if:
(i) There are bijective functions $k_1, \cdots, k_T$ mapping the state spaces $\mathcal{N}_t$ of $Y_t$ to the state spaces $\tilde{\mathcal{N}}_t$ of $\tilde{Y}_t$ such that the processes $(Y_1, \cdots, Y_T)$ and $\left(k_1^{-1}(\tilde{Y}_1), \cdots, k_T^{-1}(\tilde{Y}_T)\right)$ have the same distribution.
(ii) $\tilde{P}$ is the pushforward measure of $P$ for the measurable functions $k_t$:
$\tilde{P}_1(B_1) = P_1\left(k_1^{-1}(B_1)\right)$
$\tilde{P}_t(B_t \mid y_1, \cdots, y_{t-1}) = P_t\left(k_t^{-1}(B_t) \mid k_1^{-1}(y_1), \cdots, k_{t-1}^{-1}(y_{t-1})\right), \quad t = 2, \cdots, T$
Figure 9 – Pushforward Measure of P
Finally,
Definition 16 Given a tree process $Y$ and a value process $X$ such that $X$ is
adapted to the filtration $\mathcal{F}$ generated by $\sigma(Y)$, the pair $(X, Y)$ is called a scenario value-
and-information pair.
Definition 17 Given two scenario value-and-information pairs $(X, Y)$ and $(\tilde{X}, \tilde{Y})$
such that $X_t = f_t(Y_t)$ and $\tilde{X}_t = \tilde{f}_t(\tilde{Y}_t)$, the pairs are equivalent if:
(i) Tree processes $Y$ and $\tilde{Y}$ are equivalent (in the sense of Definition 15).
(ii) $\tilde{f}_t = f_t \circ k_t^{-1}$ a.s. for the equivalent mappings $k_t$.
These definitions indicate that scenario value-and-information pairs and
scenario trees are interchangeable. Consequently, given a multi-period stochastic
optimization problem and two equivalent scenario value-and-information pairs, we
should achieve similar solutions.
In addition, as we discuss later, a scenario value-and-information pair can also
be described by a nested distribution, a concept proposed by Pflug [243] and further
discussed by Pflug and Pichler [245]. The nested distribution and the related process
distance concepts are particularly useful when dealing with approximations of
stochastic processes rather than probability measures (i.e., approximations in multi-
period settings rather than single-period settings) because they do not require two separate
measures of dissimilarity (i.e., one for distributions and another for filtrations) to govern
the quality of the approximation.
However, before we delve into how we can perform such multi-period
approximations using process distances, in the next subsections we cover the
theoretical background required to simulate a (usually large) multivariate scenario tree
that properly describes real financial series. They discuss how to (i) elaborate and
calibrate a combined econometric model capable of simulating the four major financial
stylized facts discussed earlier: heavy-tails, volatility clustering, leverage effect and tail
co-dependence, and (ii) generate a large set of multivariate trajectories of log returns,
represented by a scenario fan, based on this combined econometric model.
3.1.1 GARCH Models
As we discussed in the previous section, Generalized Autoregressive
Conditional Heteroscedastic (GARCH) models are particularly appealing because they
can explain reasonably well key stylized facts of financial series like volatility clustering,
leverage effects and, to some extent, heavy-tails.
Despite some quite interesting nonlinear and regime-switching extensions
proposed recently, in this study we focus on simpler asymmetric GARCH models that
are more tractable and can still properly accommodate these key statistical features.
A simple GARCH(p,q) process entails:
$\varepsilon_t = r_t - E_{t-1}[r_t]$ ( 17 )
$\varepsilon_t = \eta_t \sigma_t$ ( 18 )
$E[\varepsilon_t] = 0, \quad E[\varepsilon_t \varepsilon_{t-s}] = 0 \text{ for } s \neq 0$
$\sigma_t^2 = \alpha_0 + \sum_{j=1}^{q} \alpha_j \varepsilon_{t-j}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2$ ( 19 )
$\alpha_0 > 0, \quad \alpha_j \geq 0, \ j = 1, \cdots, q, \quad \beta_i \geq 0, \ i = 1, \cdots, p$
$\sum_{j=1}^{q} \alpha_j + \sum_{i=1}^{p} \beta_i < 1$
Where $\{r_t\}$ is a series of daily log returns, $\{\eta_t\}$ is a white noise sequence and $\sigma_t^2$ is the conditional variance at time $t$. The restrictions in equation ( 19 ) ensure that
the conditional variance is positive and that the unconditional variance is finite:
$\mathrm{var}(\varepsilon_t) = E[\varepsilon_t^2] = \dfrac{\alpha_0}{1 - \left(\sum_{j=1}^{q} \alpha_j + \sum_{i=1}^{p} \beta_i\right)}$ ( 20 )
The conditional mean $E_{t-1}[r_t]$ is usually specified as a low-order ARMA model.
For daily log returns in particular, empirical evidence suggests that serial correlation in
raw returns is often statistically insignificant, leading to no relevant autoregressive term
in the conditional mean specification.
Using a basic GARCH(1,1) formulation as a reference:
$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$
$\alpha_0 > 0, \quad \alpha_1 > 0, \quad \beta_1 > 0$
$\alpha_1 + \beta_1 < 1$
( 21 )
We can see how volatility clustering and, to some extent, heavy tails can be
explained. According to Bauwens, Hafner and Laurent [20], most estimates of the
coefficients $\alpha_1$ and $\beta_1$ lie, respectively, in the ranges [0.02, 0.05] and [0.75, 0.98]. We
notice that for these values of $\alpha_1$ and $\beta_1$, a large value $\sigma_{t-1}^2$ prompts a large value $\sigma_t^2$
and a small value of $\sigma_{t-1}^2$ induces a small value $\sigma_t^2$, provided $\varepsilon_{t-1}^2$ is not too large.
Consequently, the volatility clustering observed in daily financial returns can be
reasonably explained by estimating appropriate values for $\alpha_1$ and $\beta_1$. Besides, provided $1 - 2\alpha_1^2 - (\alpha_1 + \beta_1)^2 > 0$ and assuming that $\eta_t$ follows a Gaussian distribution, He [131]
shows that
$\kappa_4 = \dfrac{E[\varepsilon_t^4]}{\left(E[\varepsilon_t^2]\right)^2} = \dfrac{3\left[1 - (\alpha_1 + \beta_1)^2\right]}{1 - (\alpha_1 + \beta_1)^2 - 2\alpha_1^2} > 3$ ( 22 )
That means that a GARCH(1,1) process also induces a leptokurtic
unconditional distribution, i.e., an unconditional distribution that exhibits heavier tails
than those suggested by a Gaussian distribution. Unfortunately, assuming Gaussian
innovations, the induced unconditional kurtosis is not large enough to fully explain the
tail behaviour of daily financial returns. There are, however, alternatives: (i) we can
assume innovations with Student t-distributions, as recommended by Bollerslev [35],
or Generalized Error Distributions (GED), as suggested by Nelson [232]; or (ii) we can
combine GARCH models with Extreme Value Theory (EVT) as proposed by McNeil
and Frey [213]. We shall explore the latter alternative in the next section.
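To make the clustering and kurtosis arguments concrete, the following pure-Python sketch simulates a GARCH(1,1) path with Gaussian innovations; the coefficients are hypothetical ($\alpha_1$ is set a bit above the typical range quoted above so the effects stand out), and the code is illustrative rather than the estimation procedure used in this thesis.

```python
import math
import random

# Illustrative GARCH(1,1) simulation (equation 21) with Gaussian
# innovations; coefficients are hypothetical, alpha1 chosen slightly
# above the typical [0.02, 0.05] range so the effects are visible.
def simulate_garch11(n, a0=1e-6, a1=0.10, b1=0.85, seed=42):
    rng = random.Random(seed)
    var = a0 / (1 - a1 - b1)                # start at the unconditional variance
    eps = []
    for _ in range(n):
        e = rng.gauss(0.0, math.sqrt(var))
        eps.append(e)
        var = a0 + a1 * e ** 2 + b1 * var   # conditional variance recursion
    return eps

eps = simulate_garch11(50_000)
m2 = sum(e ** 2 for e in eps) / len(eps)
m4 = sum(e ** 4 for e in eps) / len(eps)
kurtosis = m4 / m2 ** 2                     # equation (22) predicts > 3
acf1 = (sum((eps[i] ** 2 - m2) * (eps[i + 1] ** 2 - m2)
            for i in range(len(eps) - 1))
        / sum((e ** 2 - m2) ** 2 for e in eps))   # clustering in eps^2
```

Despite conditionally Gaussian innovations, the simulated series exhibits excess kurtosis and positively autocorrelated squared returns, i.e., volatility clustering.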
Parameters of GARCH models are often estimated using Maximum-Likelihood
Estimation (MLE) or Quasi-Maximum-Likelihood Estimation (QMLE), depending on the
choice of the innovation distribution. Unfortunately, as stressed in Andersen et al. [7],
GARCH log-likelihood functions are often ill-behaved and poor choices of starting
values can lead to convergence problems. Moreover, constraints such as positive
variance can compromise optimization performance. As a consequence, simpler
specifications are often preferred in practical applications.
These simpler specifications comprise, for example, early formulations of
asymmetric GARCH models like the EGARCH and GJR-GARCH models. They
are also widely used because they can explain a third important stylized fact: the
leverage effect.
The EGARCH(p,q) process substitutes equation ( 19 ) for:
$\ln \sigma_t^2 = \alpha_0 + \sum_{j=1}^{q} \left[ \alpha_j \left( \left| \dfrac{\varepsilon_{t-j}}{\sigma_{t-j}} \right| - E\left| \dfrac{\varepsilon_{t-j}}{\sigma_{t-j}} \right| \right) + \delta_j \dfrac{\varepsilon_{t-j}}{\sigma_{t-j}} \right] + \sum_{i=1}^{p} \beta_i \ln \sigma_{t-i}^2$ ( 23 )
The additional standardized terms can capture potential asymmetric impacts
of innovations. For instance, if $\delta_j$ is negative, bad news has a larger impact on volatility.
Besides, the logarithmic transformation guarantees a positive variance and relaxes the
positivity constraint requirements. On the other hand, EGARCH processes might lead
to infinite variance under non-Gaussian innovations (for example, with Student-t
innovations), as underlined by Nelson [232]. In addition, conditional variance
forecasts are biased according to Jensen's inequality:
$E[\sigma_t^2] \geq e^{E[\ln \sigma_t^2]}$
Alternatively, the GJR-GARCH formulation substitutes equation ( 19 ) for:
"Z� = i� + c i�|Z:������ + c ��"Z:���
��� + c 8��o|Z:� < 0p|Z:������ ( 24 )
i� > 0, i� ≥ 0, 8� ≥ 0, n = 1 ⋯ �, �� ≥ 0, C = 1 ⋯ W
Where $I(\cdot)$ is an indicator function such that:
$I\left(\varepsilon_{t-j} < 0\right) = \begin{cases} 1 & \text{if } \varepsilon_{t-j} < 0 \\ 0 & \text{if } \varepsilon_{t-j} \geq 0 \end{cases}$ ( 25 )
Notice that potential asymmetric impacts on the conditional variance are
explained by the last term. In this specification, we need to impose restrictions on the
coefficients to ensure a positive conditional variance. Moreover, GJR-GARCH
processes have more stringent stationarity conditions than EGARCH processes, as
discussed by Ling and McAleer [192].
For example, a GJR-GARCH(1,1) process is stationary and has finite variance
provided:
$\alpha_1 + \beta_1 + \dfrac{\delta_1}{2} < 1$ ( 26 )
Moreover, the fourth moment exists if:
$\beta_1^2 + 2\alpha_1\beta_1 + \kappa_\eta\alpha_1^2 + \beta_1\delta_1 + \kappa_\eta\left(\alpha_1\delta_1 + \dfrac{\delta_1^2}{2}\right) < 1$ ( 27 )
$\kappa_\eta = \begin{cases} 3 & \text{if } \eta_t \sim N(0,1) \\ \dfrac{3(\nu-2)}{\nu-4} & \text{if } \eta_t \sim t_\nu(0,1) \end{cases}$
To the best of our knowledge, there are few studies comparing the performance
of EGARCH and GJR-GARCH models. Engle and Ng [96], and Peters [240]
recommend the GJR-GARCH over the EGARCH specification while Alberg, Shalit and
Yosef [1] suggest otherwise. Without loss of generality, we follow the former and focus
on the GJR-GARCH formulation in our analysis.
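The GJR-GARCH(1,1) recursion and its leverage effect can be sketched directly from equations ( 24 )–( 26 ); the coefficients below are hypothetical, chosen only to satisfy the stationarity condition.

```python
# Sketch of the GJR-GARCH(1,1) recursion (equations 24-26); the
# coefficients are hypothetical, chosen only to satisfy the
# stationarity condition alpha1 + beta1 + delta1/2 < 1.
a0, a1, b1, d1 = 1e-6, 0.03, 0.90, 0.08
assert a1 + b1 + d1 / 2 < 1              # equation (26)

def next_variance(eps_prev, var_prev):
    """One step of equation (24): negative shocks add delta1 * eps^2."""
    leverage = d1 * eps_prev ** 2 if eps_prev < 0 else 0.0
    return a0 + a1 * eps_prev ** 2 + b1 * var_prev + leverage

v = a0 / (1 - a1 - b1 - d1 / 2)          # unconditional variance
# A negative shock raises tomorrow's variance more than a positive one
# of the same size: the leverage effect.
assert next_variance(-0.02, v) > next_variance(+0.02, v)
```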
3.1.2 Extreme Value Theory
Although some models of the GARCH family can adequately emulate
important key stylized facts such as volatility clustering and leverage effects, they are
often still unable to fully model heavy-tailed distributions. To accomplish that, an
additional technique such as Extreme Value Theory (EVT) may be used to handle tails
properly, where very few observations are usually available.
EVT encompasses two major approaches to model extreme returns: Block
Maxima (BM) and Peaks over Threshold (POT), which use, respectively, two fundamental
results from EVT. We briefly discuss both approaches next.
Block Maxima Method
The central result underlying block maxima methods is the Fisher-Tippett-
Gnedenko theorem, initially proposed by Fisher and Tippett [104], and fully proved by
Gnedenko [114]. It shows that, under certain conditions, the asymptotic distribution for
the series of maxima (minima) converges to the Weibull, Gumbel or Fréchet
distributions, independently of the distribution of the series.
Theorem 1 (Fisher-Tippett-Gnedenko theorem) Let $(X_1, \cdots, X_n)$ be a sequence
of independent and identically distributed random variables with common distribution
function $F$, $M_n = \max(X_1, \cdots, X_n)$ the maximum order statistic and $P(M_n \leq x) = P(X_1 \leq x, \cdots, X_n \leq x) = [F(x)]^n$. If there exist constants $\sigma_n, \mu_n \in \mathbb{R}$, $\sigma_n > 0$, and a non-
degenerate distribution function $H$ such that:
$\lim_{n \to \infty} P\left( \dfrac{M_n - \mu_n}{\sigma_n} \leq x \right) = \lim_{n \to \infty} \left[F(\sigma_n x + \mu_n)\right]^n = H(x)$ ( 28 )
Then $H$ converges in distribution to one of the three limiting families:
Gumbel (or Type I): $H(x) = e^{-e^{-x}}, \quad -\infty < x < +\infty$ ( 29 )
Fréchet (or Type II): $H(x) = \begin{cases} e^{-(1+\xi x)^{-1/\xi}} & \text{if } x > -\frac{1}{\xi} \\ 0 & \text{otherwise} \end{cases}$ ( 30 )
Weibull (or Type III): $H(x) = \begin{cases} e^{-(1+\xi x)^{-1/\xi}} & \text{if } x < -\frac{1}{\xi} \\ 1 & \text{otherwise} \end{cases}$ ( 31 )
In other words, the Fisher-Tippett-Gnedenko theorem shows that if $F$ belongs
to the maximum domain of attraction (MDA) of $H$ for some non-degenerate distribution
function $H$, then $H$ must converge weakly to one of these distributions of properly
normalized maxima. Moreover, these limiting distributions are particular cases of the
Generalized Extreme Value (GEV) distribution (Figure 10), introduced by von Mises
[221] and Jenkinson [157]:
$H_\xi(x) = \begin{cases} e^{-(1+\xi x)^{-1/\xi}} & \text{if } \xi \neq 0 \\ e^{-e^{-x}} & \text{if } \xi = 0 \end{cases} \quad \text{such that } 1 + \xi x > 0$ ( 32 )
Figure 10 – Generalized Extreme Value Distributions
The shape parameter $\xi$ determines the corresponding limiting distribution for
the tail and, therefore, its thickness. It is also common to refer to the tail behaviour
using the tail index $\alpha = \xi^{-1}$. If $\xi < 0$, $H_\xi$ converges to the Weibull family that includes
distributions with finite support like the beta distribution. If $\xi = 0$, $H_\xi$ converges to the
Gumbel family that encompasses distributions like the normal, log-normal and
hyperbolic whose right tails exhibit an exponential decay (i.e., thin-tailed behaviour).
Finally, if $\xi > 0$, $H_\xi$ converges to the Fréchet family that comprehends, for example,
Student t, Pareto and stable distributions whose right tails follow a polynomial decay
(i.e., heavy-tailed behaviour).
Given that daily financial series of returns typically display heavy tails as
empirical research suggests, the Fréchet distribution seems a suitable candidate to
model the asymptotic behaviour of their tails, even if we do not know, as is usually the
case, the actual distributions of the financial series.
As we do not know a priori the normalized maxima $x$ required by $H_\xi(x)$, we
need to identify, besides the shape parameter $\xi$, the scale $\sigma$ and location $\mu$ parameters.
Differently from the former, the latter two often depend on the distribution of the
financial series. Therefore, we use an alternative definition for the GEV distribution:
$H_{\xi,\sigma,\mu}(x) = \begin{cases} e^{-\left(1+\xi \frac{x-\mu}{\sigma}\right)^{-1/\xi}} & \text{if } \xi \neq 0 \\ e^{-e^{-\frac{x-\mu}{\sigma}}} & \text{if } \xi = 0 \end{cases} \quad \text{such that } 1 + \xi \dfrac{x-\mu}{\sigma} > 0$ ( 33 )
Then, we can apply a parametric approach like maximum likelihood to
estimate these parameters, as suggested by Embrechts, Klüppelberg and Mikosch [90]
and Tsay [301]. First, we divide the sample into $m$ non-overlapping subperiods of length $k$. Then we select the maximum return in each subperiod and construct a sequence of
maximal returns $\{M_i\}_{i=1,\cdots,m}$, assumed to have a $H_\xi(x)$ distribution. Supposing that the
sequence of maximal returns is independent and identically distributed, the likelihood
function is:
$L(M_1, \cdots, M_m \mid \xi, \sigma, \mu) = \prod_{i=1}^{m} h_{\xi,\sigma,\mu}(M_i)$ ( 34 )
$h_{\xi,\sigma,\mu}(M_i) = \begin{cases} \dfrac{1}{\sigma}\left(1+\xi\dfrac{M_i-\mu}{\sigma}\right)^{-\frac{1+\xi}{\xi}} e^{-\left(1+\xi\frac{M_i-\mu}{\sigma}\right)^{-1/\xi}} & \text{if } \xi \neq 0 \\ \dfrac{1}{\sigma}\, e^{-\frac{M_i-\mu}{\sigma} - e^{-\frac{M_i-\mu}{\sigma}}} & \text{if } \xi = 0 \end{cases}$ ( 35 )
And, consequently, the log-likelihood function is:
$\ell(M_1, \cdots, M_m \mid \xi, \sigma, \mu) = \sum_{i=1}^{m} \ln h_{\xi,\sigma,\mu}(M_i)$ ( 36 )
$\ln h_{\xi,\sigma,\mu}(M_i) = \begin{cases} -\ln \sigma - \dfrac{1+\xi}{\xi} \ln\left(1+\xi\dfrac{M_i-\mu}{\sigma}\right) - \left(1+\xi\dfrac{M_i-\mu}{\sigma}\right)^{-\frac{1}{\xi}} & \text{if } \xi \neq 0 \\ -\ln \sigma - \dfrac{M_i-\mu}{\sigma} - e^{-\frac{M_i-\mu}{\sigma}} & \text{if } \xi = 0 \end{cases}$ ( 37 )
Embrechts, Klüppelberg and Mikosch [90] and Hosking [143] discuss potential
numerical procedures to find the maximum-likelihood estimates for the shape, scale
and location parameters of the GEV distribution. Based on Smith [284], we can show
that these estimates are consistent and asymptotically efficient as long as $\xi > -1/2$.
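As an illustration of equations ( 36 )–( 37 ), the pure-Python sketch below evaluates the GEV log-likelihood ($\xi \neq 0$ branch) on synthetic maxima drawn from a known GEV distribution and recovers the shape parameter by a crude grid search; the sample size, the grid, and the fixed scale and location are all hypothetical choices made only for illustration.

```python
import math
import random

# Pure-Python sketch of the GEV log-likelihood (equations 36-37) and a
# crude grid search over the shape parameter; all numbers are synthetic.
def gev_sample(n, xi, sigma, mu, seed=1):
    rng = random.Random(seed)
    # Inverse-CDF sampling from H_{xi,sigma,mu} (xi != 0 branch)
    return [mu + sigma * ((-math.log(rng.random())) ** (-xi) - 1) / xi
            for _ in range(n)]

def log_lik(data, xi, sigma, mu):
    """Equation (36): sum of the log-densities in equation (37), xi != 0."""
    ll = 0.0
    for x in data:
        z = 1 + xi * (x - mu) / sigma
        if z <= 0:
            return -math.inf               # outside the support
        ll += -math.log(sigma) - (1 + 1 / xi) * math.log(z) - z ** (-1 / xi)
    return ll

maxima = gev_sample(2000, xi=0.25, sigma=1.0, mu=0.0)
grid = [i / 100 for i in range(5, 61, 5)]  # candidate shapes 0.05, ..., 0.60
xi_hat = max(grid, key=lambda s: log_lik(maxima, s, 1.0, 0.0))
```

With the scale and location fixed at their true values, the profile grid search recovers a shape estimate close to the true $\xi = 0.25$; a real application would optimize all three parameters jointly with a numerical optimizer.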
Although straightforward, the previous parametric approach relies on the
strong simplifying hypothesis that extreme returns follow precisely a GEV distribution.
A more realistic approach, for example, would be to assume that extreme returns follow
approximately a GEV distribution. Unfortunately, this alternative semi-parametric
method is more involved. We refer to Embrechts, Klüppelberg and Mikosch [90] for
further details regarding this approach.
The traditional extreme value approach based on the limiting GEV distribution
is rather simple and extensively used, but it presents some shortcomings, as pointed
out by Embrechts [89], Tsay [301] and Gilli and Këllezi [113]. First, it does not use all
available data efficiently, i.e., it uses only maxima from (possibly sizable) subsamples.
Second, estimation results might vary greatly depending on the number of subperiods $m$ and the subperiod sample size $k$, and it is not clear how to identify, supposing it
exists, a suitable trade-off between $m$ and $k$. Lastly, the Fisher-Tippett-Gnedenko
theorem holds if financial series are (almost) independent and identically distributed or
if the sample size $k$ is excessively large so that the block maxima sequence can be
assumed to be independent. So we need to handle these limitations before we can
apply this approach to financial series. The technique discussed next avoids the first
and second problems while the last one is handled in a later section, where we
combine GARCH, EVT and copula methods to properly simulate daily multivariate
return series.
Peaks over Threshold Method
As an alternative to the block maxima approach, Davison and Smith [63] and Smith
[285] introduced an approach focused on the exceedances beyond a given
(large) threshold, also known as Peaks over Threshold (POT), that does not exhibit
some of the shortcomings of the traditional approach.
Rather than focusing on the distribution function $F(x)$ of the independent and
identically distributed returns, this method focuses on the conditional excess
distribution $F_u(y)$ over a sufficiently large threshold $u$ (i.e., on the exceedances
distribution), where $y = x - u$, $x > u$. $F_u(y)$ is given by:
$F_u(y) = P(X \leq y + u \mid X > u) = \dfrac{P(X \leq y + u,\ X > u)}{P(X > u)} \Rightarrow$
$F_u(y) = \dfrac{P(u < X \leq y + u)}{1 - P(X \leq u)}$
$F_u(y) = \dfrac{F(y + u) - F(u)}{1 - F(u)}$ ( 38 )
Or, in terms of the tail distribution $\bar{F} = 1 - F$:
$P(X > y + u \mid X > u) = 1 - F_u(y) = 1 - \dfrac{F(y+u) - F(u)}{1 - F(u)} \Rightarrow$
$1 - F_u(y) = \dfrac{1 - F(u) - F(y+u) + F(u)}{1 - F(u)} \Rightarrow$
$1 - F_u(y) = \dfrac{1 - F(y+u)}{1 - F(u)} \Rightarrow$
$\bar{F}_u(y) = \dfrac{\bar{F}(y+u)}{\bar{F}(u)} \Rightarrow$
$\bar{F}(y+u) = \bar{F}(u)\,\bar{F}_u(y)$ ( 39 )
So, we can estimate the tail distribution $\bar{F}(y+u)$ by individually estimating $\bar{F}(u)$ and $\bar{F}_u(y)$. The fundamental result underlying the estimation of $\bar{F}_u(y)$ was
proposed by Pickands [249], and Balkema and de Haan [14]. Basically, it shows that
under certain conditions the conditional distribution $F_u(x)$ converges to a Generalized
Pareto Distribution (GPD) (Figure 11), where:
$G_{\xi,\sigma(u)}(y) = \begin{cases} 1 - \left(1 + \xi \dfrac{y}{\sigma(u)}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ 1 - e^{-\frac{y}{\sigma(u)}} & \text{if } \xi = 0 \end{cases} \quad \text{where } \sigma(u) > 0$ ( 40 )
Figure 11 – Generalized Pareto Distributions
With support $y \geq 0$ when $\xi \geq 0$ and $0 \leq y \leq -\sigma(u)/\xi$ when $\xi < 0$. Similarly to the
GEV distribution, the GPD can be defined by a shape parameter $\xi$, a scale parameter $\sigma$ and a location parameter $\mu$ (here assumed to be zero).
Theorem 2 (Pickands-Balkema-de Haan theorem) Let $(X_1, \cdots, X_n)$ be a
sequence of independent and identically distributed random variables with common
distribution function $F$ and common conditional excess distribution $F_u$ over a sufficiently large
threshold $u$. If $F \in \mathrm{MDA}(H_\xi)$, there is a positive measurable function $\sigma(u)$ such that:
$\lim_{u \to x_F} \sup_{0 \leq x \leq x_F - u} \left| F_u(x) - G_{\xi,\sigma(u)}(x) \right| = 0$ ( 41 )
Where $x_F$ is the right endpoint (finite or infinite) of the distribution $F$.
We observe that, while the GEV distribution $H_{\xi,\sigma,\mu}$ describes the limiting
distributions of normalized maxima of a distribution $F$ that belongs to the $\mathrm{MDA}(H_\xi)$, the
GPD $G_{\xi,\sigma(u)}$ describes the respective limiting distribution of exceedances beyond a
given sufficiently large threshold. Incidentally, the shape parameter $\xi$ is the same for
both distributions.
If $\xi < 0$, $G_{\xi,\sigma(u)}$ converges to distributions whose right tails are finite like the
beta distribution. If $\xi = 0$, $G_{\xi,\sigma(u)}$ converges to distributions whose right tails exhibit an
exponential decay like the normal. Lastly, if $\xi > 0$, $G_{\xi,\sigma(u)}$ converges to distributions
whose right tails follow a polynomial decay like the Student t or Pareto distributions.
Consequently, a GPD with positive shape parameter $\xi$ seems a likely candidate to
model the asymptotic behaviour of the conditional tails of daily return series.
To find $\bar{F}_u(y)$, we start by applying Theorem 1:
$\lim_{n \to \infty} \left[F(\sigma_n x + \mu_n)\right]^n = H_\xi(x) \Rightarrow$
$\lim_{n \to \infty} \left[1 - \bar{F}(\sigma_n x + \mu_n)\right]^n = H_\xi(x) \Rightarrow$
$\lim_{n \to \infty} n \ln\left[1 - \bar{F}(\sigma_n x + \mu_n)\right] = \ln H_\xi(x)$ ( 42 )
Defining a sequence of thresholds $u_n(x) = \sigma_n x + \mu_n$ and assuming, for large
values of $u_n(x)$ such that $F\left(u_n(x)\right) \approx 1$, a first-order Taylor series approximation:
$\ln\left[1 - \bar{F}\left(u_n(x)\right)\right] \approx -\bar{F}\left(u_n(x)\right) \Rightarrow$
$\lim_{n \to \infty} n \ln\left[1 - \bar{F}(\sigma_n x + \mu_n)\right] \approx \lim_{n \to \infty} n\left[-\bar{F}\left(u_n(x)\right)\right] \approx \ln H_\xi(x) \Rightarrow$
$\lim_{n \to \infty} n \bar{F}\left(u_n(x)\right) \approx -\ln H_\xi(x)$ ( 43 )
Finally, for $n$ sufficiently large:
$n \bar{F}(u) \approx -\ln H_\xi\left(\dfrac{u-\mu}{\sigma}\right) \Rightarrow$
$\bar{F}(u) \approx -\dfrac{1}{n} \ln H_\xi\left(\dfrac{u-\mu}{\sigma}\right) \Rightarrow$
$\bar{F}(u) \approx \begin{cases} -\dfrac{1}{n} \ln e^{-\left(1+\xi\frac{u-\mu}{\sigma}\right)^{-1/\xi}} & \text{if } \xi \neq 0 \\ -\dfrac{1}{n} \ln e^{-e^{-\frac{u-\mu}{\sigma}}} & \text{if } \xi = 0 \end{cases} \Rightarrow$
$\bar{F}(u) \approx \begin{cases} \dfrac{1}{n}\left(1+\xi\dfrac{u-\mu}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ \dfrac{1}{n}\, e^{-\frac{u-\mu}{\sigma}} & \text{if } \xi = 0 \end{cases}$ ( 44 )
$\bar{F}(y+u) \approx \begin{cases} \dfrac{1}{n}\left(1+\xi\dfrac{y+u-\mu}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ \dfrac{1}{n}\, e^{-\frac{y+u-\mu}{\sigma}} & \text{if } \xi = 0 \end{cases}$ ( 45 )
Using these results, we can estimate $\bar{F}_u(y)$ by recalling that:
$\bar{F}_u(y) = \dfrac{\bar{F}(y+u)}{\bar{F}(u)} \Rightarrow$
$\bar{F}_u(y) = \dfrac{\frac{1}{n}\left(1+\xi\frac{y+u-\mu}{\sigma}\right)^{-1/\xi}}{\frac{1}{n}\left(1+\xi\frac{u-\mu}{\sigma}\right)^{-1/\xi}} \Rightarrow$
$\bar{F}_u(y) = \left[\dfrac{1+\xi\frac{y+u-\mu}{\sigma}}{1+\xi\frac{u-\mu}{\sigma}}\right]^{-1/\xi} \Rightarrow$
$\bar{F}_u(y) = \left[\dfrac{\sigma+\xi(y+u-\mu)}{\sigma+\xi(u-\mu)}\right]^{-1/\xi} = \left[1+\dfrac{\xi y}{\sigma+\xi(u-\mu)}\right]^{-1/\xi}$ ( 46 )
Alternatively, we could find $\bar{F}_u(x)$ based on Theorem 2:

$\lim_{u \to x_F} \sup_{0 \le x < x_F - u} \left| F_u(x) - G_{\xi,\beta(u)}(x) \right| = 0 \;\Rightarrow\;$
$\lim_{u \to x_F} \sup_{0 \le x < x_F - u} \left| 1 - F_u(x) - 1 + G_{\xi,\beta(u)}(x) \right| = 0 \;\Rightarrow\;$
$\lim_{u \to x_F} \sup_{0 \le x < x_F - u} \left| \bar{F}_u(x) - \bar{G}_{\xi,\beta(u)}(x) \right| = 0$ ( 47 )

As a result, for $u$ sufficiently large, we can assume that:

$\bar{F}_u(x) \approx \bar{G}_{\xi,\beta(u)}(x)$ ( 48 )
Lastly, supposing that we have a sufficiently large number $N_u$ of observations greater than the threshold $u$, as proposed by Embrechts, Klüppelberg and Mikosch [90], we can consider the empirical approximation:

$\bar{F}(u) \approx \bar{F}_n(u) = \frac{N_u}{n}$ ( 49 )

Using these results, we can estimate $\bar{F}(x+u)$ by:

$\bar{F}(x+u) = \bar{F}_u(x)\, \bar{F}(u) \;\Rightarrow\; \bar{F}(x+u) = \frac{N_u}{n}\left(1 + \xi \frac{x}{\beta}\right)^{-1/\xi}, \quad \beta = a + \xi(u-b)$ ( 50 )

Then, we can use a maximum-likelihood approach to estimate the shape and scale parameters $\xi$ and $\beta$ of the tail distribution $\bar{F}(x+u)$, as discussed by Embrechts, Klüppelberg and Mikosch [90] and Hosking and Wallis [146].
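The tail estimator (50) is simple to implement once $\xi$ and $\beta$ are available. The sketch below is an illustrative, self-contained version that estimates the GPD parameters from threshold excesses by probability-weighted moments, in the spirit of Hosking and Wallis; the function names, the synthetic sample and the plotting position are our own assumptions, not part of the thesis framework:

```python
import random

def gpd_pwm(excesses):
    """Estimate GPD shape (xi) and scale (beta) from threshold excesses
    using probability-weighted moments (Hosking-Wallis style)."""
    y = sorted(excesses)
    n = len(y)
    w0 = sum(y) / n                                   # first PWM: sample mean
    # second PWM with the (i - 0.35)/n plotting position
    w1 = sum(yi * (1.0 - (i + 0.65) / n) for i, yi in enumerate(y)) / n
    xi = 2.0 - w0 / (w0 - 2.0 * w1)                   # shape parameter
    beta = w0 * (1.0 - xi)                            # scale parameter
    return xi, beta

def tail_prob(x, u, xi, beta, n_u, n):
    """POT tail estimator (50): F_bar(x + u) = (N_u/n)(1 + xi x / beta)^(-1/xi)."""
    return (n_u / n) * (1.0 + xi * x / beta) ** (-1.0 / xi)

# Illustrative check on synthetic excesses drawn from a known GPD
random.seed(42)
xi_true, beta_true = 0.25, 1.0
excesses = [beta_true / xi_true * ((1.0 - random.random()) ** -xi_true - 1.0)
            for _ in range(5000)]
xi_hat, beta_hat = gpd_pwm(excesses)
```

With a representative excess sample, the PWM estimates land close to the true shape and scale, and the estimated tail probability decreases in $x$ as expected.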
As pointed out by McNeil, Frey and Embrechts [92], the POT approach uses the data more efficiently than the BM approach, but it still presents two major challenges: (i) the appropriate choice of thresholds and (ii) unreliable results when return series exhibit serial correlation.
The first challenge refers to the trade-off between variance and bias, as discussed by McNeil [211] and McNeil and Saladin [212]. A large threshold $u$ means low bias because fewer observations outside the tail are selected to estimate the GPD parameters. Conversely, it also means higher variance because the tail itself contains fewer observations. Castillo and Hadi [48] further emphasize that the threshold $u$ must select significantly large quantiles (preferably above 0.99) so we can apply the Pickands-Balkema-de Haan theorem.
The two most common approaches to choosing suitable thresholds are the Hill estimator plot, discussed by Hill [139] and Resnick and Stărică [261], and the empirical mean excess function plot, recommended by Embrechts, Klüppelberg and Mikosch [41].
Following McNeil, Frey and Embrechts [214], using the order statistics $X_{n,n} \le \cdots \le X_{1,n}$ and assuming a large sample size $n$ and a sufficiently small number of extremes $k$ beyond the threshold $u$, we can employ the standard form of the Hill estimator:

$\hat{\alpha}_{k,n} = \left[\frac{1}{k} \sum_{j=1}^{k} \ln X_{j,n} - \ln X_{k,n}\right]^{-1}, \quad 2 \le k \le n$ ( 51 )

We then plot estimates for different sets of order statistics (i.e., different values of $k$) and look for a stable region in the Hill plot, where we can assume that $\hat{\alpha}^{-1} \approx \hat{\xi}$. Similarly, we can also use the empirical mean excess function plot, as described by McNeil, Frey and Embrechts [214], to identify appropriate thresholds.
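A minimal sketch of the Hill estimator in (51), evaluated on an exact Pareto sample for illustration (the function name and the synthetic data are our own assumptions):

```python
import math
import random

def hill_estimator(data, k):
    """Standard Hill estimator of the tail index alpha (equation 51),
    using the k largest order statistics X_(1) >= ... >= X_(k)."""
    x = sorted(data, reverse=True)            # descending order statistics
    mean_log_excess = sum(math.log(x[j]) - math.log(x[k - 1])
                          for j in range(k)) / k
    return 1.0 / mean_log_excess              # alpha_hat = xi_hat^(-1)

# Pareto sample with tail index alpha = 2 via inverse transform
random.seed(7)
sample = [random.random() ** (-1.0 / 2.0) for _ in range(5000)]
alphas = [hill_estimator(sample, k) for k in (100, 200, 400)]
```

Plotting `hill_estimator(sample, k)` against $k$ gives the Hill plot discussed above; for this synthetic sample the estimates hover around the true value 2 over a wide range of $k$.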
Embrechts, Klüppelberg and Mikosch [90] show that a key property of the GPD is that the mean excess function is linear in the threshold $u$, provided $u$ is sufficiently large. Thus, we can use the empirical estimator of the mean excess function:

$e_n(u) = \frac{\sum_{i=1}^{n} (X_i - u)\, \mathbb{1}_{\{X_i > u\}}}{\sum_{i=1}^{n} \mathbb{1}_{\{X_i > u\}}}, \qquad \mathbb{1}_{\{X_i > u\}} = \begin{cases} 1 & \text{if } X_i > u \\ 0 & \text{if } X_i \le u \end{cases}$ ( 52 )

We then plot estimates of the empirical mean excess function for different values of $u$. If the distribution tail follows a GPD with positive shape parameter $\xi$, the plot should exhibit a linear upward trend for sufficiently high threshold values.
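The empirical mean excess function (52) is equally direct to compute; the sketch below evaluates it on a synthetic heavy-tailed sample, where an upward trend is expected (names and data are illustrative assumptions):

```python
import random

def mean_excess(data, u):
    """Empirical mean excess function e_n(u) (equation 52):
    the average of (X_i - u) over observations exceeding the threshold u."""
    excesses = [x - u for x in data if x > u]
    return sum(excesses) / len(excesses)

# Pareto sample with alpha = 3 (GPD shape xi = 1/3): e(u) grows linearly in u
random.seed(11)
sample = [random.random() ** (-1.0 / 3.0) for _ in range(20000)]
e = [mean_excess(sample, u) for u in (1.5, 2.0, 2.5)]
```

Evaluating `mean_excess` on a grid of thresholds and plotting the results reproduces the mean excess plot described in the text; here the estimates increase roughly linearly, consistent with a positive shape parameter.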
The second challenge requires that we handle dependence before employing the POT approach. Resnick and Stărică [262], for instance, suggest using standardized residuals from ARMA processes to mitigate this problem. McNeil and Frey [213] extend the idea by using the standardized residuals from GARCH processes. In the next section, we develop the latter approach to account for leverage effects and tail co-dependence.
Lastly, we can also extend the POT approach previously described to a point
process framework, i.e., rather than handling the peaks over threshold and the
occurrence times as separate processes, we can combine them into a point process
of exceedances. We do not explore this alternative in this work and refer to Resnick
[260] for further details.
This brief review suggests that EVT provides an elegant way to deal with univariate extreme returns. However, it is not easily extended to a multivariate framework because there is no longer a unique definition of an extreme event (i.e., we may entertain different concepts of order). Fortunately, as we discuss next, copulas provide an interesting alternative to handle multivariate heavy-tailed distributions.
3.1.3 Copulas
Although Pearson’s correlation is likely one of the most widely applied measures of dependence in practical finance models, it is not always able to describe properly how closely assets move together. Financial series often display higher (in absolute terms) dependence during turbulent periods than in calmer times. In other words, linear measures such as Pearson’s correlation cannot deal appropriately with the nonlinear dependence structures exhibited by financial markets, particularly in the tails of the multivariate distribution.
As we mentioned earlier, some elaborate multivariate GARCH models, like the DCC proposed by Engle [94] and the TVC suggested by Tse and Tsui [302], are capable of accounting for time-varying dependence. However, they often impose restrictions that may reduce the flexibility to accommodate stylized facts of the individual series.
An alternative to explain and, more interestingly, simulate the dependence structure observed among financial series is the copula, proposed by Sklar [282]. In basic terms, copulas are links between multivariate distributions and their marginal distributions. They provide greater flexibility to (i) model particular properties of each univariate series and (ii) select a suitable dependence structure for their joint behaviour. Besides, they also offer a straightforward method to simulate draws from joint distributions, once we estimate the marginal distributions and the copula parameters.
According to Nelsen [231] and McNeil, Frey and Embrechts [214], an n-dimensional copula is simply a multivariate uniform distribution function $C(u_1, \dots, u_n)$ such that $C: [0,1]^n \to [0,1]$, which exhibits the following properties:

(i) $C(u_1, \dots, u_n)$ is grounded and n-increasing for all $u_i \in [0,1]$, $i = 1, \dots, n$.

(ii) $C(1, \dots, u_i, \dots, 1) = u_i$ for all $u_i \in [0,1]$, $i = 1, \dots, n$.

(iii) $C(u_1, \dots, u_{i-1}, 0, u_{i+1}, \dots, u_n) = 0$ for all $u_i \in [0,1]$, $i = 1, \dots, n$.
In addition, the key result underlying the copula approach is Sklar's theorem, which lets us decouple the modelling of the marginal distributions from the dependence structure. It states that we can either extract a copula from a multivariate distribution function or combine marginal distributions with a copula to construct a multivariate distribution function.

Theorem 3 (Sklar's theorem) Let $F(x_1, \dots, x_n)$ be an n-dimensional distribution function with continuous margins $F_1, \dots, F_n$. Then there exists a unique copula $C: [0,1]^n \to [0,1]$ with uniform margins such that

$F(x_1, \dots, x_n) = C\left(F_1(x_1), \dots, F_n(x_n)\right)$ ( 53 )
Another important result is the Fréchet–Hoeffding theorem, which defines the
lower and upper limits of a particular copula. This is important because some copulas
can describe only a certain range of dependence.
Theorem 4 (Fréchet–Hoeffding theorem) Let $C$ be an n-dimensional copula. Then for any $(u_1, \dots, u_n) \in [0,1]^n$ we have

$\max\left\{1 - n + \sum_{i=1}^{n} u_i \,;\; 0\right\} \le C(u_1, \dots, u_n) \le \min\{u_1, \dots, u_n\}$ ( 54 )
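The bounds in (54) can be checked numerically. A small sketch, using the independence copula $\Pi(u) = u_1 \cdots u_n$ as an arbitrary test case (the code and grid are purely illustrative):

```python
from itertools import product

def lower_bound(u):
    """Fréchet-Hoeffding lower bound W(u) = max(1 - n + sum(u), 0)."""
    return max(1.0 - len(u) + sum(u), 0.0)

def independence_copula(u):
    """Independence copula Pi(u) = u_1 * ... * u_n."""
    prod = 1.0
    for ui in u:
        prod *= ui
    return prod

# Verify W(u) <= Pi(u) <= M(u) = min(u) on a grid of the unit cube
grid = [0.1 * j for j in range(11)]
for u in product(grid, repeat=3):
    c = independence_copula(u)
    assert lower_bound(u) - 1e-12 <= c <= min(u) + 1e-12
```

The same check can be rerun with any other copula function in place of `independence_copula`; the bounds must hold for all of them.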
Given these results, the next step involves selecting an appropriate copula to describe the dependence structure. As we pointed out previously, in this study we use a Student t copula, from the elliptical family, to accommodate tail dependence. Although elliptical copulas impose tail symmetry and do not offer a closed form, they are tractable in higher dimensions. Conversely, other families like the Archimedean and vine copulas usually offer more flexibility at the cost of tractability. Even t copulas can be extended to more elaborate models such as the skewed t or grouped t copulas, as discussed by Demarta and McNeil [64], but these extensions are less tractable as well.
The Student t copula $C^{t}_{\nu,P}$ is easily defined in terms of an n-dimensional standardized multivariate Student t distribution function $T_{\nu,P}$ and a univariate standardized Student t distribution function $t_\nu$. Given the correlation matrix $P$, the degrees of freedom $\nu$ and the gamma function $\Gamma(\cdot)$, we have:

$T_{\nu,P}(x_1, \dots, x_n) = T_{\nu,P}(x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} \frac{\Gamma\!\left(\frac{\nu+n}{2}\right) |P|^{-\frac{1}{2}}}{\Gamma\!\left(\frac{\nu}{2}\right) (\nu\pi)^{\frac{n}{2}}} \left(1 + \frac{s^{\top} P^{-1} s}{\nu}\right)^{-\frac{\nu+n}{2}} ds$ ( 55 )

$t_\nu(x_i) = u_i, \quad u_i \in [0,1] \text{ and } i = 1, \dots, n$ ( 56 )

$C^{t}_{\nu,P}(u_1, \dots, u_n; P, \nu) = C^{t}_{\nu,P}(u; P, \nu) = T_{\nu,P}\!\left(t_\nu^{-1}(u_1), \dots, t_\nu^{-1}(u_n)\right) = T_{\nu,P}(x)$ ( 57 )

Additionally, we can express the corresponding density function $c^{t}_{\nu,P}$ as:

$c^{t}_{\nu,P}(u_1, \dots, u_n; P, \nu) = |P|^{-\frac{1}{2}}\, \frac{\Gamma\!\left(\frac{\nu+n}{2}\right) \left[\Gamma\!\left(\frac{\nu}{2}\right)\right]^{n}}{\left[\Gamma\!\left(\frac{\nu+1}{2}\right)\right]^{n} \Gamma\!\left(\frac{\nu}{2}\right)}\; \frac{\left(1 + \frac{x^{\top} P^{-1} x}{\nu}\right)^{-\frac{\nu+n}{2}}}{\prod_{i=1}^{n} \left(1 + \frac{\left[t_\nu^{-1}(u_i)\right]^2}{\nu}\right)^{-\frac{\nu+1}{2}}}$ ( 58 )
Following Bouyé et al. [39] and Cherubini, Luciano and Vecchiato [51], for a vector of parameters $\theta$, $T$ observations and a log-likelihood function $l(\theta)$ such that:

$l(\theta) = \sum_{k=1}^{T} \ln c^{t}_{\nu,P}\!\left(F_1(x_{1k}), \dots, F_n(x_{nk}); \theta\right) + \sum_{i=1}^{n} \sum_{k=1}^{T} \ln f_i(x_{ik}; \theta_i)$ ( 59 )

$l(\theta) = \sum_{k=1}^{T} \ln c^{t}_{\nu,P}\!\left(u_{1k}, \dots, u_{nk}; \theta\right)$ ( 60 )

$l(\theta) \propto -\frac{T}{2} \ln |P| - \frac{\nu+n}{2} \sum_{k=1}^{T} \ln\!\left(1 + \frac{x_k^{\top} P^{-1} x_k}{\nu}\right) + \frac{\nu+1}{2} \sum_{i=1}^{n} \sum_{k=1}^{T} \ln\!\left(1 + \frac{\left[t_\nu^{-1}(u_{ik})\right]^2}{\nu}\right)$ ( 61 )
We can use a parametric method such as the Exact Maximum Likelihood
(EML) to estimate jointly both the dependence structure and the marginal parameters.
Unfortunately, the EML method is computationally intensive for high dimensional
distributions.
Alternatively, we can use a two-step approach, proposed by Joe [160] and Genest, Ghoudi and Rivest [106], that first estimates the individual parameters of the marginal distributions and then estimates the copula parameters. Joe's [160] method, called Inference Functions for Margins (IFM), assumes that the univariate marginal distributions are known, so we can start by separately estimating their parameters $\hat{\theta}_i$:
$l(\theta) = \sum_{k=1}^{T} \ln c^{t}_{\nu,P}\!\left(F_1(x_{1k}; \theta_1), \dots, F_n(x_{nk}; \theta_n); \alpha\right) + \sum_{i=1}^{n} \sum_{k=1}^{T} \ln f_i(x_{ik}; \theta_i)$ ( 62 )

$\hat{\theta}_i = \arg\max_{\theta_i} \sum_{k=1}^{T} \ln f_i(x_{ik}; \theta_i)$ ( 63 )

Then, we use these values to estimate the parameters of the dependence structure:

$\hat{\alpha} = \arg\max_{\alpha} \sum_{k=1}^{T} \ln c^{t}_{\nu,P}\!\left(F_1(x_{1k}; \hat{\theta}_1), \dots, F_n(x_{nk}; \hat{\theta}_n); \alpha\right)$ ( 64 )
Conversely, the method of Genest, Ghoudi and Rivest [106], known as Canonical Maximum Likelihood (CML) or Pseudo Maximum Likelihood, does not assume parametric distributions for the margins. Instead, it relies on the univariate empirical cumulative distribution functions $\hat{F}_i$ to generate uniform variates $\hat{u}_{ik}$, which are subsequently used to estimate the copula parameters:

$\hat{\alpha} = \arg\max_{\alpha} \sum_{k=1}^{T} \ln c^{t}_{\nu,P}\!\left(\hat{u}_{1k}, \dots, \hat{u}_{nk}; \alpha\right)$ ( 65 )
For large samples, both methods yield similar results, but for a small number of observations Kim, Silvapulle and Silvapulle [170] strongly recommend the CML approach. Their study suggests that the IFM method is not robust against misspecifications of the marginal distributions and that, consequently, the CML method generally performs better in practical applications, where the margins are often unknown.
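As a small illustration of the first CML step, the sketch below builds pseudo-observations $\hat{u} = \text{rank}/(T+1)$ from a sample, a common rank-based convention; the data, the function name, and the assumption of no ties are our own illustrative choices:

```python
def pseudo_observations(series):
    """Rank-transform a sample (assumed free of ties) into approximately
    uniform variates u_hat = rank / (T + 1), as used by CML / pseudo-ML."""
    T = len(series)
    order = sorted(range(T), key=lambda i: series[i])
    u = [0.0] * T
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (T + 1)
    return u

# Hypothetical daily returns
returns = [0.012, -0.034, 0.005, 0.021, -0.010, 0.000, -0.027, 0.018]
u_hat = pseudo_observations(returns)
```

The resulting variates lie strictly inside (0, 1) and can be passed directly to a copula log-likelihood such as (65).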
3.1.4 Simulating a Multivariate Scenario Tree
Once we have estimated the copula parameters and the marginal distributions, we have a powerful tool to jointly simulate the daily financial series. Based on Sklar's theorem, given an n-dimensional copula $C^{t}_{\nu,P}(u_1, \dots, u_n)$ and a multivariate distribution $F(x_1, \dots, x_n)$ with margins $F_1(x_1), \dots, F_n(x_n)$, such that:

$F(x_1, \dots, x_n) = C^{t}_{\nu,P}\!\left(F_1(x_1), \dots, F_n(x_n)\right)$ ( 66 )

we have:

$u_i = F_i(x_i) \;\Rightarrow\; x_i = F_i^{-1}(u_i), \quad i = 1, \dots, n$ ( 67 )
Therefore, we simply need to generate n dependent uniform variates. According to Cherubini, Luciano and Vecchiato [51], the general iterative approach to accomplish that involves the conditional copula distribution function:

$C(u_i | u_1, \dots, u_{i-1}) = \int_0^{u_i} c(s | u_1, \dots, u_{i-1})\, ds$, where

$c(s | u_1, \dots, u_{i-1}) = \frac{c(u_1, \dots, u_i)}{c(u_1, \dots, u_{i-1}, 1)} = \frac{\partial^i C(u_1, \dots, u_i) / \partial u_1 \cdots \partial u_i}{\partial^{i-1} C(u_1, \dots, u_{i-1}, 1) / \partial u_1 \cdots \partial u_{i-1}} \;\Rightarrow\;$

$C(u_i | u_1, \dots, u_{i-1}) = \frac{\partial^{i-1} C(u_1, \dots, u_{i-1}, u_i) / \partial u_1 \cdots \partial u_{i-1}}{\partial^{i-1} C(u_1, \dots, u_{i-1}, 1) / \partial u_1 \cdots \partial u_{i-1}}$ ( 68 )

We begin by drawing $u_1$ from a standard uniform distribution and then use the inverse of the conditional copula distribution to compute the subsequent dependent uniform variates $u_2, \dots, u_n$:

$u_i = C^{-1}(v_i | u_1, \dots, u_{i-1}), \quad i = 2, \dots, n$ ( 69 )

where each $v_i$ is also drawn from a standard uniform distribution.
Fortunately, Cherubini, Luciano and Vecchiato [51] also point out that there is an easier method to generate dependent uniform variates from a t copula. Given a correlation matrix $P$ and the degrees of freedom $\nu$, we can:

(i) Find the Cholesky decomposition $L$ of $P$

(ii) Draw n random variates $z = (z_1, \dots, z_n)'$ from the standard normal distribution

(iii) Draw a random variate $s$ from the chi-squared distribution with $\nu$ degrees of freedom

(iv) Set $y = Lz$ and $x = y\sqrt{\nu / s}$

(v) Set $u_i = t_\nu(x_i)$, $i = 1, \dots, n$, where $t_\nu$ is the univariate Student t distribution with $\nu$ degrees of freedom
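Steps (i)-(v) above can be sketched as follows. This is an illustrative pure-Python version: the numerical Student t CDF, the function names and the example correlation are our own simplifications (in practice a statistical library routine would replace the hand-rolled integration):

```python
import math
import random

def t_cdf(x, nu):
    """Univariate Student t CDF via Simpson integration of the density
    (adequate for illustration only)."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    f = lambda t: c * (1.0 + t * t / nu) ** (-(nu + 1) / 2)
    n, a, b = 400, 0.0, abs(x)
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    half = s * h / 3.0
    return 0.5 + half if x >= 0 else 0.5 - half

def sample_t_copula(L, nu, m, seed=0):
    """Draw m dependent uniform vectors following steps (i)-(v):
    z ~ N(0, I), s ~ chi2(nu), y = L z, x = y sqrt(nu / s), u_i = t_nu(x_i)."""
    rng = random.Random(seed)
    n = len(L)
    draws = []
    for _ in range(m):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))  # chi2, integer nu
        y = [sum(L[i][j] * z[j] for j in range(n)) for i in range(n)]
        x = [yi * math.sqrt(nu / s) for yi in y]
        draws.append([t_cdf(xi, nu) for xi in x])
    return draws

# Bivariate example with correlation 0.6: Cholesky factor of [[1, .6], [.6, 1]]
L = [[1.0, 0.0], [0.6, 0.8]]
u = sample_t_copula(L, nu=5, m=2000, seed=3)
```

The generated pairs lie in the unit square and exhibit the positive dependence implied by the correlation matrix, as a scatter plot of the two coordinates quickly confirms.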
We use the method above to generate dependent multivariate random residuals. Then, we combine each series of residuals with its previously fitted GJR-GARCH + EVT (GPD) + t-Copula model to simulate fully dependent univariate trajectories for all daily financial series. For example, suppose we simulate $m$ multivariate scenarios (i.e., paths) for $n$ instruments over $T$ periods. We are basically generating the following scenario tree:

Figure 12 – Multivariate Scenario Fan Graph

Each multivariate scenario $i$ is simulated independently from the other scenarios and has probability $p(\omega_i) = p_i = 1/m$, $i = 1, \dots, m$. Moreover, following Pflug and Pichler [247], we can also represent the scenario tree as a nested distribution (further explained in later subsections):

Figure 13 – Multivariate Scenario Fan Structure
Note that in the structural scenario representation, we omit the root node because it is not stochastic (i.e., it is assumed to be known a priori). As a numerical example, suppose we have $m = 4$ scenarios, $n = 3$ instruments and $T = 2$ periods. Suppose also that the node values are:

$\omega_0 = (1.0, 1.0, 1.0)$

$\omega_{1,1} = (2.0, 1.3, 1.7)$, $\omega_{1,2} = (0.9, 1.4, 1.2)$, $\omega_{1,3} = (0.9, 1.4, 1.2)$, $\omega_{1,4} = (0.7, 1.1, 0.8)$

$\omega_{2,1} = (2.2, 1.7, 1.6)$, $\omega_{2,2} = (1.0, 1.3, 1.0)$, $\omega_{2,3} = (0.8, 1.2, 1.0)$, $\omega_{2,4} = (0.6, 1.0, 0.9)$

The respective scenario tree can be represented as:

Figure 14 – Multivariate Scenario Fan Graph Example (m=4, n=3 and T=2)

Figure 15 – Multivariate Scenario Fan Structure Example (m=4, n=3 and T=2)
As we discussed previously, this simulated multivariate scenario tree offers a
reasonably accurate way to represent a (continuous) underlying stochastic process by
a discrete approximation. However, in practical financial applications it faces two
important shortcomings: it is often too large (i.e., too computationally demanding to be
optimized within a reasonable amount of time) and it is not free of arbitrage
opportunities. Therefore, in the next section we introduce an additional step to address
these limitations.
3.2 Scenario Tree Approximation
Although scenario trees based on the combined GJR-GARCH + EVT (GPD) +
t-Copula approach provide an interesting framework for representing daily log returns,
they lead to two relevant shortcomings in the optimization process that must be
addressed before they can be used. First, stochastic portfolio optimizations based on
simulated scenarios are computationally demanding, particularly in multi-period
problems, because a huge number of scenarios are often required to achieve proper
accuracy. Second, there is no guarantee that the simulated scenarios preclude
arbitrage opportunities, which may induce unreliable optimization solutions.
An alternative to eliminate the first shortcoming is to approximate the original set of simulated scenarios by a much smaller set, with adjusted probabilities, using some metric distance to keep both sets reasonably similar. This approach is often known as optimal discretization. As we saw in the literature review, several authors have explored this approach to approximate the original scenario tree by a smaller (and, consequently, more tractable) tree.
Optimal discretization does not require knowledge of the corresponding stochastic optimization model, unlike, for example, moment-matching methods. This is particularly attractive for general portfolio optimization applications, where the structure of the constraints and objective functions (and, therefore, the relevant statistical properties of the optimization problem) differs from one investor to another. On the other hand, like most conditional sampling and clustering techniques, optimal discretization must be particularly careful to avoid biased and unstable (in-sample and out-of-sample) solutions.
In-sample stability is related to internal consistency. It requires that if we simulate many approximated scenario trees using the same discretization method, all of them must lead to similar objective function values. If the optimal values are not similar for slightly different scenario trees generated using the same process and data, the discretization method is not stable. Following Kaut and Wallace [168], most in-sample stability tests focus on the objective function values rather than on the solution values, because objective functions are often flat, which means that an optimal objective function value may be attained by quite different solutions, making solution stability very hard to achieve.

Hence, given an objective function $f(x, \omega)$, where $x$ is a solution and $\omega$ is the parameter uncertainty embodied by the approximated scenario tree, in-sample stability means that, for two approximated scenario trees generated by the same discretization method leading to $\omega_1$ and $\omega_2$, we should have:

$\min_{x \in \mathbb{X}} f(x, \omega_1) \approx \min_{x \in \mathbb{X}} f(x, \omega_2)$ ( 70 )
In addition to the potential instability issues, we also need to keep the optimality gap small. This optimality gap is simply the approximation error (or bias) between the optimal objective function values achieved using the true underlying stochastic process and the approximated scenario tree. Therefore, given a stochastic optimization problem such that $f(x, u)$ is the objective function, $G$ is the true underlying distribution function and $\tilde{G}$ is the approximated discrete distribution, we can define the approximation error $e(F, \tilde{F})$:

$\min_{x \in \mathbb{X}} F(x) = \min_{x \in \mathbb{X}} \int f(x, u)\, dG(u)$ ( 71 )

$\min_{x \in \mathbb{X}} \tilde{F}(x) = \min_{x \in \mathbb{X}} \int f(x, u)\, d\tilde{G}(u)$ ( 72 )

$e(F, \tilde{F}) = F\!\left(\arg\min_x \tilde{F}(x)\right) - F\!\left(\arg\min_x F(x)\right) \ge 0$ ( 73 )

$e(F, \tilde{F}) = F(\tilde{x}^*) - F(x^*) \ge 0$ ( 74 )
Pflug [241] shows that, although the true optimal objective value $F(x^*)$ is unavailable for most practical applications, we can define an upper bound on the approximation error $e(F, \tilde{F})$ based on the following lemma:

Lemma 1

$0 \le e(F, \tilde{F}) \le 2 \sup_x \left| F(x) - \tilde{F}(x) \right|$ ( 75 )

Proof. Let $x^* \in \arg\min F$, $\tilde{x}^* \in \arg\min \tilde{F}$ and $\epsilon = \sup_x |F(x) - \tilde{F}(x)|$. Let also $A = \{x: F(x) \le F(x^*) + 2\epsilon\}$ and suppose that $\tilde{x}^* \notin A$. Then:

$F(x^*) + 2\epsilon < F(\tilde{x}^*) \le \tilde{F}(\tilde{x}^*) + \epsilon \le \tilde{F}(x^*) + \epsilon \le F(x^*) + 2\epsilon$ ( 76 )

This contradiction shows that $\tilde{x}^* \in A$ and, consequently:

$e(F, \tilde{F}) = F(\tilde{x}^*) - F(x^*) \le 2\epsilon$ ( 77 )
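Lemma 1 can be illustrated numerically: for randomly perturbed objective values over a finite decision set, the realized optimality gap never exceeds twice the sup-norm distance between the two objectives. The toy setup below is purely illustrative:

```python
import random

# Toy check of Lemma 1: e(F, F_tilde) = F(x_tilde*) - F(x*) <= 2 sup|F - F_tilde|
random.seed(1)
X = range(50)                                    # finite decision set
for _ in range(200):
    F = {x: random.uniform(0.0, 10.0) for x in X}              # "true" objective
    Ft = {x: F[x] + random.uniform(-0.5, 0.5) for x in X}      # approximation
    eps = max(abs(F[x] - Ft[x]) for x in X)      # sup-norm distance
    x_star = min(X, key=F.get)                   # argmin of F
    xt_star = min(X, key=Ft.get)                 # argmin of F_tilde
    gap = F[xt_star] - F[x_star]                 # optimality gap e(F, F_tilde)
    assert 0.0 <= gap <= 2.0 * eps
```

The bound holds on every trial, as the proof guarantees; in practice the gap is usually much smaller than $2\epsilon$.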
The quality of a stochastic optimization solution relies heavily on the level of stability and the optimality gap. These levels, in turn, depend greatly on the metric distance chosen for the scenario approximation. Among the many available metrics, the Wasserstein or Kantorovich distance and its extensions seem particularly suited, according to various works such as Hochreiter and Pflug [141], Pflug [241][243], Pflug and Pichler [245][246], Heitsch and Römisch [270], and Römisch and Strugarek [138]. The Wasserstein distance seems to be particularly popular because it exhibits some interesting properties when applied to stochastic programming problems. For instance, (i) it metricizes convergence in distribution, (ii) it extends intuitively to stochastic processes and (iii) it is computationally straightforward to handle.
Unfortunately, most traditional discretization methods do not handle the second shortcoming. In fact, despite the recent advances in optimal discretization, there is very limited research concerning the prevention of arbitrage opportunities. To address that, we need to adjust the discretization approach to avoid scenarios that lead to these opportunities. Arbitrage-free scenarios might not be particularly essential for energy generation and logistics applications, as we pointed out in the literature review, but they are critical for financial applications. For that reason, additional constraints must be imposed on the optimal discretization approach to prevent scenario tree nodes that could lead to these opportunities.
In the following subsections, we discuss in detail the Wasserstein and Process
distance concepts, and describe a method to approximate a large scenario tree by a
tractable and arbitrage-free one. First, we describe how we can use the Wasserstein
and Process distances to compute, respectively, the distance between distributions
and stochastic processes. Then, we describe the approach proposed by Pflug and
Pichler [246], which performs optimal discretizations using the Process distance, and
extend it to prevent arbitrage opportunities.
3.2.1 Distributions Distance
Optimal discretization relies on the concept of probability metrics to evaluate
the dissimilarity between different probability distributions. Such discretization
approaches minimize an appropriately chosen probability metric to approximate the
original and often continuous distribution by a smaller but similar (in some statistical
sense) discrete one, which can be used in practical optimization problems.
Definition 18 Given a set of probability measures $\mathcal{P}$ on $\mathbb{R}^d$ and $P_1, P_2, P_3 \in \mathcal{P}$, a probability metric $d$ on $\mathcal{P} \times \mathcal{P}$ is a distance function between probability distributions such that:

(i) $d(P_1, P_2) = 0 \Leftrightarrow P_1 = P_2$ ( 78 )

(ii) $d(P_1, P_2) \ge 0$ ( 79 )

(iii) $d(P_1, P_2) = d(P_2, P_1)$ ( 80 )

(iv) $d(P_1, P_2) \le d(P_1, P_3) + d(P_3, P_2)$ ( 81 )

Definition 19 Given a set $\Xi$ and a probability metric $d: \Xi \times \Xi \to \mathbb{R}$, the pair $(\Xi, d)$ is called a metric space. A metric space $(\Xi, d)$ is said to be separable and complete if it has a countable dense subset and if every Cauchy sequence in $(\Xi, d)$ converges. A complete separable metric space is called a Polish space.
Various metrics or distances (e.g., Kolmogorov metric, Lévy-Prokhorov metric,
total variation distance) can be defined on a given set but few are suited to measure
distances between discrete probability distributions. Two metrics in particular, the
Fortet-Mourier and the Wasserstein distances, have been favoured in the optimal
discretization literature because of their appealing computational and statistical
properties, particularly in Polish spaces. As pointed out by Pflug [241], Pflug and
Pichler [244], and Kovacevic and Pichler [186], they are especially relevant for scenario
approximation because they metricize weak convergence and because discrete
probability measures are dense in the corresponding spaces of probability measures.
Definition 20 Given all continuous bounded functions $h$, a sequence of probability measures $P_n$ on $\mathbb{R}^d$ converges weakly as $n \to \infty$ if:

$\int h\, dP_n \to \int h\, dP$ ( 82 )

In other words, assuming a Polish space, a proper distance $d$ metricizes weak convergence from a discrete approximated distribution $P_n$ to the true underlying distribution $P$, i.e. $d(P_n, P) \to 0$ if and only if $P_n \Rightarrow P$. This property, exhibited by the Wasserstein distance, is paramount to the scenario approximation technique used in this thesis.
The Wasserstein distance is a generalization of the Kantorovich distance (also known as the Wasserstein distance of order $r = 1$) that can also be generalized from probability measures to stochastic processes, as we discuss in the next subsection. It is directly associated with the concept of optimal transportation cost between distributions, which is extensively applied in engineering.
Definition 21 Given two probability spaces $(\Omega, \mathcal{F}, P)$ and $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$ with elements $\omega \in \Omega$ and $\tilde{\omega} \in \tilde{\Omega}$, a joint probability measure $\pi(\omega, \tilde{\omega})$ on $\Omega \times \tilde{\Omega}$, and a (transportation) cost function $c(\omega, \tilde{\omega})$ between them, the optimal transportation cost is given by:

$\inf_\pi \int_{\Omega \times \tilde{\Omega}} c(\omega, \tilde{\omega})\, \pi(d\omega, d\tilde{\omega})$ ( 83 )

such that, for all measurable sets $A \in \mathcal{F}$ and $B \in \tilde{\mathcal{F}}$:

$\pi(A \times \tilde{\Omega}) = P(A)$ ( 84 )

$\pi(\Omega \times B) = \tilde{P}(B)$ ( 85 )

Definition 22 Given two probability spaces $(\Omega, \mathcal{F}, P)$ and $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$ and two $\mathbb{R}^d$-valued random variables $\xi: \Omega \to \mathbb{R}^d$ and $\tilde{\xi}: \tilde{\Omega} \to \mathbb{R}^d$, the inherited distance (from $\xi$ and $\tilde{\xi}$) between $P$ and $\tilde{P}$ is defined either as:

$d(\omega, \tilde{\omega}) = \| \xi(\omega) - \tilde{\xi}(\tilde{\omega}) \|$ or ( 86 )

$d\!\left(\xi(\omega), \tilde{\xi}(\tilde{\omega})\right) = \| \xi(\omega) - \tilde{\xi}(\tilde{\omega}) \|$ ( 87 )

for some norm $\|\cdot\|$ in $\mathbb{R}^d$.
Definition 23 Given an inherited distance $d(\omega, \tilde{\omega})$ between $P$ and $\tilde{P}$, the Wasserstein distance of order $r$ is defined either as:

$d_r(P, \tilde{P}) = \left[\inf_\pi \int_{\Omega \times \tilde{\Omega}} d(\omega, \tilde{\omega})^r\, \pi(d\omega, d\tilde{\omega})\right]^{\frac{1}{r}}$ ( 88 )

$d_r(P, \tilde{P}) = \left[\inf_\pi \int_{\mathbb{R}^d \times \mathbb{R}^d} d(u, v)^r\, \pi(du, dv)\right]^{\frac{1}{r}}$ ( 89 )
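For two equally weighted discrete samples of the same size on the real line, the infimum in (88) is attained by matching sorted order statistics, so the distance can be sketched in a few lines (the function name and examples are illustrative):

```python
def wasserstein_1d(xs, ys, r=1):
    """Wasserstein distance of order r between two equally weighted,
    equal-size discrete samples in one dimension: the optimal coupling
    matches the sorted order statistics."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    return (sum(abs(a - b) ** r for a, b in zip(xs, ys)) / n) ** (1.0 / r)

d1 = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])   # a pure shift by 1
d2 = wasserstein_1d([0.0, 0.0, 0.0], [0.0, 0.0, 3.0], r=2)
```

Shifting a sample by a constant shifts the distance by exactly that constant, which is one reason the Wasserstein distance behaves intuitively for scenario approximation; general multivariate cases require solving a small linear program instead.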
As we will discuss in detail later, multi-period stochastic programs depend on multivariate stochastic processes $(\xi_t)_{t=1}^{T}$, defined on some probability space $(\Omega, \mathcal{F}, P)$ such that $\xi_t \in \mathbb{R}^m$. In addition, a decision $x_t$ at time $t$ cannot depend on future realizations (i.e., $\xi_{t+k}$, $k \ge 1$), meaning that $x_t$ is measurable with respect to $\mathcal{F}_t$. This restriction is known as the nonanticipativity condition.
3.2.2 Stochastic Processes Distance
As we have underlined earlier, multi-period stochastic optimization problems often require a simpler (i.e., tractable) discrete stochastic process $\tilde{\xi}$ with some filtration $\tilde{\mathcal{F}}$ that approximates the original underlying stochastic process $\xi$ with its filtration $\mathcal{F}$. In other words, given two general multi-period stochastic optimization problems:

$\max\left\{ \mathbb{E}\left[Q(x_0, \xi_1, \dots, x_{T-1}, \xi_T)\right] : x \lhd \mathcal{F} \right\}$ ( 90 )

$\max\left\{ \mathbb{E}\left[Q(x_0, \tilde{\xi}_1, \dots, x_{T-1}, \tilde{\xi}_T)\right] : x \lhd \tilde{\mathcal{F}} \right\}$ ( 91 )

we need to identify $\tilde{\xi}$ and $\tilde{\mathcal{F}}$ such that the optimality gap (i.e., the approximation error) between the solutions of these optimization problems is stable and small.
However, an important detail cannot be overlooked: in practical applications, these processes and filtrations are rarely defined on the same probability space $(\Omega, \mathcal{F}, P)$. Therefore, the approximation metric used to measure the distance (i.e., the optimality gap) between these processes must take this into account. The process or nested distance, proposed by Pflug [243] and further detailed in Pflug and Pichler [244][245][246], is particularly well suited for this purpose.
3.2.3 Scenario Tree Reduction
The optimal discretization approach applied in this section follows the
techniques presented in Pflug [243], Pflug and Pichler [246], and Kovacevic and
Pichler [186], which are based on process distances.
3.2.4 No-Arbitrage Scenario Tree Reduction
Informally, an arbitrage opportunity consists of a riskless investment with a positive return; in real financial markets, such opportunities do not persist for long because investors quickly exploit them. Unfortunately, as discussed by Klassen [173][174][175], and Geyer, Hanke and Weissensteiner [107], portfolio optimization models are ill equipped to handle arbitrage opportunities, because these often lead to unbounded or biased solutions. For that reason, scenario trees must preclude such opportunities to achieve reliable solutions. Formally, following Ingersoll [151]:
Definition 24 Given non-redundant instruments $k = 1, \dots, K$, their allocations $\theta_{k,t}$ and prices $S^{n}_{k,t}$ at stage $t$ and state (i.e., scenario tree node) $n$, and their prices $S^{m}_{k,t+1}$ at stage $t+1$ and states $m$ such that $m \in n_+$ (i.e., the direct descendant nodes in the scenario tree), an arbitrage opportunity exists if either:

(i) It is possible to construct a zero net investment portfolio or investment strategy $\theta_t = (\theta_{1,t}, \dots, \theta_{K,t})$ that is profitable in at least one future state of nature (i.e., that delivers a positive return in at least one scenario arc) and never produces a loss, whatever the future state of nature (i.e., that does not deliver negative returns in any scenario arc):

$\sum_{k=1}^{K} \theta_{k,t} S^{n}_{k,t} = 0$ ( 92 )

$\sum_{k=1}^{K} \theta_{k,t} S^{m}_{k,t+1} \ge 0, \quad \text{for all } m \in n_+$ ( 93 )

$\sum_{k=1}^{K} \theta_{k,t} S^{m}_{k,t+1} > 0, \quad \text{for at least one } m \in n_+$ ( 94 )

(ii) It is possible to construct an investment portfolio or strategy $\theta_t = (\theta_{1,t}, \dots, \theta_{K,t})$ that has a positive cash flow today and generates a nonnegative return, whatever the future state of nature:

$\sum_{k=1}^{K} \theta_{k,t} S^{n}_{k,t} < 0$ ( 95 )

$\sum_{k=1}^{K} \theta_{k,t} S^{m}_{k,t+1} \ge 0, \quad \text{for all } m \in n_+$ ( 96 )
Putting it differently, as imposed by the Harrison and Kreps theorem, given $S^{n}_{k,t}$ and $S^{m}_{k,t+1}$, $m \in n_+$, the scenario tree does not exhibit arbitrage opportunities if and only if there is a strictly positive state price vector $\pi = (\pi_m)_{m \in n_+}$ such that:

$\sum_{m \in n_+} \pi_m S^{m}_{k,t+1} = S^{n}_{k,t}, \quad \text{for all } k = 1, \dots, K$ ( 97 )
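A minimal numerical illustration of condition (97): for a one-period tree with two instruments and two successor states, the state prices solve a 2x2 linear system, and strict positivity certifies the absence of arbitrage. The prices below are hypothetical:

```python
def state_prices_two_assets(S0, S1):
    """Solve sum_m pi_m S1[k][m] = S0[k] (condition 97) for a one-period
    tree with two instruments and two successor states, by Cramer's rule.
    The node is arbitrage-free iff both state prices are strictly positive."""
    (a, b), (c, d) = S1[0], S1[1]   # payoffs: rows = instruments, cols = states
    det = a * d - b * c
    pi1 = (S0[0] * d - b * S0[1]) / det
    pi2 = (a * S0[1] - S0[0] * c) / det
    return pi1, pi2

# Bond priced at 1 paying 1.01 in both states; stock at 10 paying 11 or 9
pi = state_prices_two_assets([1.0, 10.0], [[1.01, 1.01], [11.0, 9.0]])
arbitrage_free = all(p > 0 for p in pi)
```

For larger trees the same check becomes a linear feasibility problem (does a strictly positive solution of the pricing equations exist?), which is exactly the constraint added to the discretization step later in this chapter.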
However, as explained by Jobst and Zenios [159], Fersti and Weissensteiner [100], and Geyer, Hanke and Weissensteiner [107], portfolio optimization must be carried out under objective (i.e., real-world) probability measures rather than risk-neutral probability measures.
3.3 Stochastic Portfolio Optimization
Once we have constructed our tractable scenario tree, we need to choose an appropriate optimization method that properly explores the impacts and outcomes of the potential sequences of allocations in each scenario and, as we discussed earlier, Stochastic Programming (SP) is a suitable approach to accomplish that.
4 Methodology
4.1 Data Description
The combined GJR-GARCH + EVT (GPD) + t-Copula approach that we discuss next is applied to fit the joint behaviour of five global stock indexes: IBX, S&P 500, FTSE 100, DAX and Nikkei 225, from January 4, 2005 to August 15, 2015, comprising 2,673 trading days.

According to Rydberg [274], most daily financial series analyzed in the literature are based on log returns of closing prices. We adopt the same approach and use Dollar closing prices to compute the multivariate financial series of $N = 5$ assets and $T = 2{,}672$ daily log returns:

$r_{i,t} = \ln \frac{S_{i,t}}{S_{i,t-1}}, \quad i = 1, \dots, N; \; t = 1, \dots, T$ ( 98 )
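Equation (98) in code, as a trivial sketch with illustrative prices:

```python
import math

def log_returns(prices):
    """Daily log returns r_t = ln(S_t / S_{t-1}), as in equation (98)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

close = [100.0, 102.0, 99.5, 101.0]   # hypothetical closing prices
r = log_returns(close)
```

A convenient property of log returns is that they telescope: the sum of the daily returns equals the log return over the whole window.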
Table 1 – Univariate Statistics – Daily Log Returns
Unfortunately, there are only 2,379 trading days with no missing prices (Table 2 and Figure 16), and we need to handle the missing information before we can proceed. Standard techniques for missing data cannot be applied without distorting the multivariate series considerably. For example, row-wise removal decreases the sample size and creates irregular (i.e., non-daily) time series. Single imputation techniques, such as copying the last available price or interpolating the nearest prices, may introduce substantial biases.
Instead, we use a better alternative: an Expectation-Maximization (EM) algorithm for recovering multivariate normal missing data, proposed by Meucci [217]. It still induces some distortions, but they do not appear to be as severe (Figure 16 and Figure 17) as those usually caused by the standard methods.
IBX S&P 500 FTSE 100 DAX Nikkei 225
Mean 0.0262% 0.0207% 0.0039% 0.0277% 0.0151%
Standard Deviation 2.3230% 1.2639% 1.4560% 1.6460% 1.5018%
Skewness -0.2630 -0.3327 -0.0310 -0.0317 -0.4776
Kurtosis 10.1024 14.2830 13.2818 9.0508 8.8348
Median 0.097% 0.074% 0.067% 0.055% 0.048%
Mean Absolute Deviation 1.631% 0.804% 0.978% 1.133% 1.073%
Lower Semi Standard Deviation 1.679% 0.920% 1.050% 1.173% 1.098%
Upper Semi Standard Deviation 1.605% 0.866% 1.009% 1.154% 1.024%
Table 2 – Missing Return Patterns

Figure 16 – Missing Returns Histogram

In summary, 2,378 rows have no missing returns; 295 rows have one missing return; 53 rows have two; 18 rows have three; and 18 rows have four. The per-series totals of missing returns are DAX 62, FTSE 79, S&P 91, IBX 139 and Nikkei 156, i.e. 527 missing returns overall.
Figure 17 – Distributions with Adjusted Missing Data
4.2 Scenario Tree Generation
The framework proposed in this study combines the approaches proposed by Diebold, Schuermann and Stroughair [75], McNeil and Frey [213], and Rockinger and Jondeau [162][268]. We substitute an asymmetric GJR-GARCH model for the original GARCH model to encompass potential leverage effects, and expand the univariate analysis to a multivariate setting using a t-Copula to account for the dependence structure among the financial series.
Our combined approach comprises four major steps:
(i) Fit a GJR-GARCH model to each of the univariate series to generate
approximately i.i.d. standardized residuals
(ii) Fit GPD tails and a smoothed-kernel body to each series of standardized
residuals
(iii) Estimate a t-copula using the transformed residuals of the previous
margins
(iv) Simulate dependent paths using the combined models
Before the initial step, we examine the series looking for indications of the four
stylized facts: heavy tails, volatility clustering, leverage effects and tail co-dependence.
Cumulative returns and daily log returns can help us identify signs of these features.
Additionally, correlograms and tests such as the Augmented Dickey-Fuller test, Ljung-
Box Q test and Engle test for ARCH effects can further support our analysis.
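The Ljung-Box Q statistic and Engle's LM test mentioned above can be sketched directly from their definitions. This is a minimal Python illustration (the thesis relied on Matlab toolbox implementations; these hand-rolled versions are for exposition only):

```python
import numpy as np
from scipy import stats

def ljung_box(x, lags=8):
    """Ljung-Box Q statistic and p-value for serial correlation up to `lags`."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # sample autocorrelations at lags 1..lags
    acf = np.array([np.sum(x[k:] * x[:-k]) / np.sum(x * x)
                    for k in range(1, lags + 1)])
    q = n * (n + 2) * np.sum(acf ** 2 / (n - np.arange(1, lags + 1)))
    return q, stats.chi2.sf(q, df=lags)

def engle_arch_lm(x, lags=8):
    """Engle's LM test: regress the squared series on its own lags; LM = n * R^2."""
    e2 = np.asarray(x, dtype=float) ** 2
    y = e2[lags:]
    Z = np.column_stack([np.ones(len(y))]
                        + [e2[lags - k:-k] for k in range(1, lags + 1)])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1 - resid.var() / y.var()
    lm = len(y) * r2
    return lm, stats.chi2.sf(lm, df=lags)
```

Applied to raw returns, the Ljung-Box p-value should be large (little serial correlation), while the ARCH LM test applied to the same returns should strongly reject homoscedasticity when volatility clustering is present.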
Once we confirm that these stylized facts seem to be present, we can model
them. First, we take into account volatility clustering and leverage effects by fitting a
GJR-GARCH process with Student t innovations to each series of log daily returns.
This step constructs sequences of approximately i.i.d. standardized residuals with
heavy tails that we use in the subsequent steps.
As pointed out earlier, we selected a parsimonious GJR-GARCH (1,1) specification
to model the conditional variance because it is more tractable and still provides
adequate estimates. Conversely, we do not assume a priori that the conditional mean
has no significant autoregressive term. Instead, we compare low-order ARMA
specifications to verify that conditional raw returns indeed exhibit serial correlations
close to zero, as suggested by the literature.
Thus, we assume that:
\varepsilon_t = r_t - \mu_{t-1}(r_t) \qquad (99)

\mu_{t-1}(r_t) = c + \phi\, r_{t-1} + \theta\, \varepsilon_{t-1} \qquad (100)

\varepsilon_t = \sigma_t z_t \qquad (101)

\sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2 + \gamma\, \mathbb{1}\{\varepsilon_{t-1} < 0\}\, \varepsilon_{t-1}^2 \qquad (102)

\omega > 0, \quad \alpha \ge 0, \quad \beta \ge 0, \quad \gamma \ge 0

where \{r_t\} is the series of daily log returns, \mu_{t-1}(r_t) is the conditional mean return,
whose best specification is likely to omit autoregressive and moving-average terms,
\{z_t\} is an i.i.d. sequence with Student t distribution and \sigma_t^2 is the conditional variance at
time t. We then model each univariate series of daily log returns using the formulation
above and examine the residual diagnostics.
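The variance recursion in equations (101)-(102) can be sketched in a few lines; the parameter values used below are placeholders, not the estimates reported later in Table 7:

```python
import numpy as np

def gjr_garch_filter(eps, omega, alpha, beta, gamma):
    """Run the GJR-GARCH(1,1) variance recursion over residuals `eps`.

    Returns (sigma2, z): conditional variances and standardized residuals.
    """
    n = len(eps)
    sigma2 = np.empty(n)
    # start the recursion at the unconditional variance implied by the model
    # (for symmetric shocks the leverage term fires half of the time)
    sigma2[0] = omega / max(1e-12, 1 - alpha - beta - 0.5 * gamma)
    for t in range(1, n):
        neg = 1.0 if eps[t - 1] < 0 else 0.0   # indicator of a negative shock
        sigma2[t] = (omega + alpha * eps[t - 1] ** 2
                     + beta * sigma2[t - 1]
                     + gamma * neg * eps[t - 1] ** 2)
    z = eps / np.sqrt(sigma2)
    return sigma2, z
```

Negative shocks raise next-period variance by an extra `gamma * eps**2`, which is exactly the leverage effect the GJR term is meant to capture; the returned `z` series is the (approximately i.i.d.) standardized residual input of the next step.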
We perform a visual analysis of the daily standardized residuals to check how
reasonable the hypothesis of (approximately) i.i.d. standardized residuals is. We also
check for remaining signs of volatility clustering, leverage effects and heavy-tails.
Given the discussed literature, we expect only the latter to be present. Further checks
employ correlograms, Ljung-Box Q tests and Engle tests for ARCH effects of the raw
and squared standardized residuals to confirm that the former two stylized facts have
been adequately modelled.
Once we have the i.i.d. standardized residual series, we can improve the fit
by separating each series' distribution into three parts: the lower tail, the upper tail and
the central part of the distribution. First, we need to decide which particular distribution is
suitable for each one of these parts. Based on the Peaks over Threshold (POT) method
of Extreme Value Theory discussed earlier, it seems reasonable to assume that, given
a sufficiently large threshold u and shape parameter \xi > 0, the tails approximately follow
a GPD:

G_{\xi,\beta}(y) = \begin{cases} 1 - \left(1 + \xi\, \dfrac{y}{\beta}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\[4pt] 1 - e^{-y/\beta} & \text{if } \xi = 0 \end{cases}, \quad \text{where } \beta > 0 \qquad (103)
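The POT fit of equation (103) can be sketched with `scipy.stats.genpareto`. Choosing the threshold as a fixed empirical quantile is an illustrative simplification of the graphical threshold selection described in the text:

```python
import numpy as np
from scipy import stats

def fit_gpd_tail(z, q=0.90):
    """Fit a GPD to the exceedances over the empirical q-quantile of `z` (POT).

    Returns (threshold u, shape xi, scale beta).
    For the lower tail, pass -z, as done in the text for negative residuals.
    """
    z = np.asarray(z, dtype=float)
    u = np.quantile(z, q)
    excess = z[z > u] - u                 # exceedances y = z - u > 0
    # fix the location at 0 so only (shape, scale) are estimated
    xi, _, beta = stats.genpareto.fit(excess, floc=0)
    return u, xi, beta
```

The same routine serves both tails: the upper tail is fitted to the standardized residuals directly and the lower tail to their negatives.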
Note that for the upper tails (i.e., the positive extreme observations), we can fit
a GPD directly to the daily standardized residuals while for the lower tails (i.e., the
negative extreme observations), we need to fit a GPD to the negative standardized
residuals.
The first step in estimating each GPD is to define proper thresholds,
which separate the tails from the centre of the distributions. As we underlined in the
previous section, these thresholds are not easily identified. One alternative is to
analyze the quantile-quantile and empirical excess mean plots to obtain an
approximate idea of each tail's threshold. Once the lower and upper thresholds are
selected, we proceed with the GPD estimation and identify, for each standardized
residual series, the shape parameter \xi (or, equivalently, the tail index \alpha = 1/\xi) and the
scale parameter \beta of each tail.
Then, based on the selected thresholds, we use the central part of the standardized
residuals to fit the body of each distribution with a Gaussian kernel using a Kernel Density
Estimator (KDE). This approach assumes that a large number of observations is
available to properly describe the central part of the distribution.
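Stitching the GPD tails onto a kernel body yields a piecewise semi-parametric CDF. The sketch below assumes the tail probability masses are set to the empirical frequencies beyond the thresholds, and it ignores the small kernel mass leaking outside the body (all parameter names are illustrative):

```python
import numpy as np
from scipy import stats

def semiparametric_cdf(z, u_lo, u_hi, xi_lo, b_lo, xi_hi, b_hi):
    """Piecewise CDF: GPD below u_lo and above u_hi, kernel body in between.

    z: the fitted standardized-residual sample; returns a vectorized F(x).
    """
    z = np.asarray(z, dtype=float)
    p_lo = np.mean(z < u_lo)              # probability mass in the lower tail
    p_hi = np.mean(z > u_hi)              # probability mass in the upper tail
    kde = stats.gaussian_kde(z[(z >= u_lo) & (z <= u_hi)])

    def F(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        out = np.empty_like(x)
        lo, hi = x < u_lo, x > u_hi
        mid = ~(lo | hi)
        # lower tail: F(x) = p_lo * GPD survival of the distance below u_lo
        out[lo] = p_lo * stats.genpareto.sf(u_lo - x[lo], xi_lo, scale=b_lo)
        # upper tail: F(x) = 1 - p_hi * GPD survival of the exceedance
        out[hi] = 1 - p_hi * stats.genpareto.sf(x[hi] - u_hi, xi_hi, scale=b_hi)
        # body: kernel CDF rescaled to the central probability mass
        body = np.array([kde.integrate_box_1d(-np.inf, v) for v in x[mid]])
        out[mid] = p_lo + (1 - p_lo - p_hi) * body
        return out
    return F
```

This `F` (and its inverse) is what later maps standardized residuals to uniform variates for the copula step, and uniform variates back to residuals in the simulation step.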
After we have fitted the parametric tails and kernel body of each distribution of
standardized residuals, we evaluate the results. First, we compare the empirical upper
and lower tail distributions to the suggested GPD model. Secondly, we compare the
empirical cumulative distribution functions to the semi-parametric ones generated by
each combined GJR-GARCH + EVT (GPD) model. Lastly, we evaluate how well
each proposed semi-parametric distribution fits the empirical distribution using
quantile-quantile and probability density function plots.
Provided the combined GJR-GARCH + EVT(GPD) models are adequate, we
can proceed and assess the joint dependence structure by using the standardized
residuals to fit a t-copula. The approach is straightforward: employing the GJR-
GARCH + EVT (GPD) cumulative distribution functions, we transform the residuals into
uniform variates that are used to fit the t-copula using the CML method described
earlier. As pointed out in the literature, we expect to explain some of the tail co-
dependence between the margins observed in the bivariate scatter and density plots.
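A compact sketch of the CML step: the correlation matrix is obtained by Kendall's tau inversion and the degrees of freedom by profiling the copula likelihood. This is a common simplification for exposition; the exact routine used in the thesis is Matlab's, and the function name here is illustrative:

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

def fit_t_copula(U):
    """CML sketch for a t-copula. U: (n, d) pseudo-uniform observations."""
    U = np.clip(np.asarray(U, dtype=float), 1e-6, 1 - 1e-6)
    n, d = U.shape
    # Kendall's tau -> correlation: rho = sin(pi * tau / 2)
    R = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = stats.kendalltau(U[:, i], U[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2)

    def neg_loglik(nu):
        X = stats.t.ppf(U, df=nu)                       # map uniforms to t-scores
        mvt = stats.multivariate_t(loc=np.zeros(d), shape=R, df=nu)
        # copula log-density: joint t log-density minus the marginal ones
        return -(mvt.logpdf(X).sum() - stats.t.logpdf(X, df=nu).sum())

    res = minimize_scalar(neg_loglik, bounds=(2.1, 60.0), method='bounded')
    return R, res.x
```

A low fitted degrees-of-freedom value signals strong tail co-dependence between the margins; as the value grows the t-copula approaches the Gaussian copula, which has none.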
As the final step, we use the combined GJR-GARCH + EVT (GPD) + t Copula
model to simulate dependent samples (or trajectories) of the daily log returns. As
described in the previous section, we first generate samples of dependent uniform
variates using the fitted t copula. Then, we transform back these variates into
standardized residuals using the inverse of their distributions based on calibrated GPD
tails and kernel-smoothed bodies. Lastly, we filter the standardized residuals through
the ARMA + GJR-GARCH models to simulate the dependent daily log returns.
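The simulation step can be sketched as follows. The empirical quantile stands in for the semi-parametric inverse CDF of the fitted margins, and all parameter values are placeholders:

```python
import numpy as np
from scipy import stats

def simulate_t_copula(R, nu, n, rng=None):
    """Draw n dependent uniform vectors from a t-copula (correlation R, df nu)."""
    rng = np.random.default_rng(rng)
    d = R.shape[0]
    L = np.linalg.cholesky(R)
    Z = rng.standard_normal((n, d)) @ L.T           # correlated normals
    W = rng.chisquare(nu, size=(n, 1)) / nu         # common chi-square mixing
    X = Z / np.sqrt(W)                              # multivariate t samples
    return stats.t.cdf(X, df=nu)                    # map the margins to U(0,1)

def simulate_returns(u, z_sample, mu, omega, alpha, beta, gamma):
    """Map one margin's uniforms to standardized residuals (via the empirical
    quantile of `z_sample`, a stand-in for the semi-parametric inverse CDF),
    then run the GJR-GARCH recursion forward to build a return path."""
    z = np.quantile(z_sample, u)
    sigma2 = omega / (1 - alpha - beta - 0.5 * gamma)   # unconditional variance
    r = np.empty(len(z))
    for t in range(len(z)):
        eps = np.sqrt(sigma2) * z[t]
        r[t] = mu + eps
        neg = 1.0 if eps < 0 else 0.0
        sigma2 = omega + (alpha + gamma * neg) * eps ** 2 + beta * sigma2
    return r
```

Because every margin is driven by the same copula draw, the simulated paths inherit both the cross-sectional dependence (from the t-copula) and the serial dependence in volatility (from the GARCH filter).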
This combined approach generates multivariate samples that exhibit the four
key stylized facts discussed previously: heavy tails, volatility clustering, leverage
effects and tail co-dependence. As a practical application, we perform a multi-objective
Mean-Variance-CVaR portfolio optimization using samples simulated from a multivariate
Gaussian distribution and from the proposed GJR-GARCH + EVT (GPD) + t Copula
approach. We present the results and analyses in the next section.
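For a single period, the Mean-CVaR part of such an optimization reduces to the Rockafellar-Uryasev linear program over scenarios. The sketch below shows that LP with `scipy.optimize.linprog`; the scenario matrix and target return are toy inputs, and the multi-period, multi-objective extensions developed in the thesis are omitted:

```python
import numpy as np
from scipy.optimize import linprog

def min_cvar_portfolio(R, beta=0.95, target=None):
    """Minimize CVaR_beta of the portfolio loss over return scenarios.

    R: (S, n) matrix of scenario returns. Decision variables are the weights
    w (n), the VaR level a (1), and the scenario excess losses z (S).
    """
    S, n = R.shape
    # objective: a + (1 / ((1 - beta) S)) * sum(z)
    c = np.concatenate([np.zeros(n), [1.0], np.full(S, 1.0 / ((1 - beta) * S))])
    # z_s >= -R_s.w - a   rewritten as   -R_s.w - a - z_s <= 0
    A_ub = np.hstack([-R, -np.ones((S, 1)), -np.eye(S)])
    b_ub = np.zeros(S)
    # budget constraint: weights sum to one
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, S + 1))])
    b_eq = np.array([1.0])
    if target is not None:
        # expected return >= target   rewritten as   -mean(R).w <= -target
        row = np.concatenate([-R.mean(axis=0), np.zeros(S + 1)])
        A_ub = np.vstack([A_ub, row])
        b_ub = np.append(b_ub, -target)
    bounds = [(0, None)] * n + [(None, None)] + [(0, None)] * S
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method='highs')
    return res.x[:n], res.fun          # optimal weights and minimized CVaR
```

Sweeping the return target traces out the Mean-CVaR efficient frontier of Figure 38; transaction-cost and liquidity constraints would enter as additional linear rows in the same program.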
5 Empirical Results
In this section, we present our empirical findings and comment on some of the
results. All analyses were performed using Matlab R2015a (version 8.5), except for the
missing data analysis discussed in the previous section, which was partially
implemented using R 3.2.3.
We begin with an initial inspection of the cumulative log returns of these financial
series (Figure 18), which suggests some interesting features such as time-varying
volatilities, erratic extreme variations and some degree of co-dependence.
Figure 18 – Stock Indexes’ Cumulative Returns
We first examine the univariate behaviour of each series. A closer look at the
daily log returns (Figure 21) seems to confirm the presence of volatility clustering,
heavy tails and some leverage effects in all series.
Figure 19 – ACF and PACF of Raw Returns
Figure 20 – ACF and PACF of Squared Returns
Figure 21 – Daily Log Returns
Augmented Dickey-Fuller tests (Table 3) at the 5% significance level reject
the hypothesis of unit roots in these series. Additionally, as suggested by previous
research, the sample Autocorrelation Functions (ACFs) (Figure 19) and sample
Partial Autocorrelation Functions (PACFs) (Figure 20), as well as the Ljung-Box Q-tests
(Table 4), show little evidence of serial correlation in raw returns. They do, however,
suggest strong evidence of serial correlation in squared returns (with apparently very
slow decay) and consequently reject the assumption of i.i.d. series. Similar analyses
of the absolute returns also corroborate these findings.
Table 3 – ADF Tests

ADF Test (Critical Value = -1.94)
             p-Value    t-Statistic
IBX          < 0.001    -49.91
S&P 500      < 0.001    -57.85
FTSE 100     < 0.001    -52.33
DAX          < 0.001    -51.15
Nikkei 225   < 0.001    -59.68

Following Tsay [301], we set the number of lags to ln T = ln 2672 ≅ 7.89, rounded up to
the next integer, to increase the performance of the Ljung-Box Q test.

Table 4 – Ljung-Box Q-test on Squared Residuals

Q-Statistic
Lag             1        2        3        4        5        6        7        8
IBX          177.30   736.70   947.32  1230.76  1783.94  1981.57  2506.86  2596.46
S&P 500      122.33   539.67   643.83   854.88  1193.11  1448.50  1761.37  1962.63
FTSE 100      81.20   229.19   401.58   558.66  1005.58  1117.65  1221.16  1305.16
DAX           66.57   219.65   412.30   505.39   736.86   794.04   929.37   988.69
Nikkei 225   143.81   585.49   670.36  1005.09  1088.10  1285.66  1373.98  1532.22
Critical
Values         3.84     5.99     7.81     9.49    11.07    12.59    14.07    15.51

Table 5 – ARCH Engle test for Residual Heteroscedasticity

F-Statistic
Lag             1        2        3        4        5        6        7        8
IBX          177.05   614.10   651.65   675.28   887.74   889.99   939.63   951.04
S&P 500      122.15   463.30   476.97   510.91   662.00   705.58   738.34   751.14
FTSE 100      81.09   196.56   294.74   353.44   599.38   608.99   609.21   610.24
DAX           66.47   192.20   312.01   334.02   423.16   423.40   441.99   441.78
Nikkei 225   143.60   494.23   497.49   594.37   596.66   607.92   613.54   619.85
Critical
Values         3.84     5.99     7.81     9.49    11.07    12.59    14.07    15.51

Fitting an ARMA (p,q) + GJR-GARCH (1,1) model to each one of the univariate
series, we observe that most of them are best explained by parsimonious ARMA (0,0)
+ GJR-GARCH (1,1) processes (Table 6), which suggests that the conditional means
are constant and, in these cases, not statistically different from zero (Table 7). The
exceptions are the IBX and S&P 500 series. However, the autoregressive term is not
statistically relevant in the former (Table 8) and neither is the moving average term in
the latter (Table 9). Therefore, in this study we also assume ARMA (0,0) + GJR-
GARCH (1,1) specifications for these series (Table 7).
Table 6 – BIC Comparison of ARMA (P,Q) + GJR-GARCH (1,1) Models

Bayesian Information Criterion (rows: P, columns: Q)

IBX
P \ Q        0           1           2
0        -13,561.9   -13,562.2   -13,554.4
1        -13,562.2   -13,554.4   -13,550.7
2        -13,554.4   -13,550.1   -13,542.9

S&P 500
P \ Q        0           1           2
0        -17,559.9   -17,560.1   -17,554.8
1        -17,559.6   -17,560.1   -17,546.9
2        -17,554.5   -17,546.7   -17,544.6

FTSE 100
P \ Q        0           1           2
0        -16,407.0   -16,399.1   -16,391.2
1        -16,399.1   -16,391.6   -16,383.7
2        -16,391.2   -16,386.3   -16,377.5

DAX
P \ Q        0           1           2
0        -15,433.0   -15,425.3   -15,417.6
1        -15,425.3   -15,418.4   -15,410.8
2        -15,417.6   -15,410.8   -15,404.2

Nikkei 225
P \ Q        0           1           2
0        -15,786.9   -15,731.8   -15,779.1
1        -15,784.4   -15,779.1   -15,773.2
2        -15,779.3   -15,773.1   -15,765.5
Table 7 – ARMA (0,0) + GJR-GARCH (1,1) Specifications

(columns: Value, Std. Error, t Stat)

IBX
ARMA(0,0) Model
Constant      0.0003    0.0003       0.91
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       5.66
GARCH{1}      0.9194    0.0105       87.65
ARCH{1}       0.0061    0.0108       0.57
Leverage{1}   0.1106    0.0156       7.09
Student t Innovations
DoF           12.7322   2.8893       4.41

S&P 500
ARMA(0,0) Model
Constant      0.0004    0.0001       3.13
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       3.02
GARCH{1}      0.8864    0.0136       64.99
ARCH{1}       0.0000    0.0168       0.00
Leverage{1}   0.1953    0.0256       7.63
Student t Innovations
DoF           6.0985    0.8277       7.37

FTSE 100
ARMA(0,0) Model
Constant      0.0002    0.0002       0.88
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       2.75
GARCH{1}      0.9080    0.0108       83.90
ARCH{1}       0.0092    0.0105       0.88
Leverage{1}   0.1444    0.0191       7.57
Student t Innovations
DoF           9.4403    1.5830       5.96

DAX
ARMA(0,0) Model
Constant      0.0005    0.0002       2.35
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       3.54
GARCH{1}      0.9078    0.0122       74.49
ARCH{1}       0.0019    0.0122       0.16
Leverage{1}   0.1493    0.0199       7.51
Student t Innovations
DoF           8.3243    1.3512       6.16

Nikkei 225
ARMA(0,0) Model
Constant      0.0002    0.0002       0.95
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       4.05
GARCH{1}      0.8802    0.0155       56.66
ARCH{1}       0.0296    0.0159       1.86
Leverage{1}   0.1154    0.0224       5.15
Student t Innovations
DoF           9.2877    1.4095       6.59
Table 8 – ARMA (1,0) + GJR-GARCH (1,1) Specification – IBX

(columns: Value, Std. Error, t Stat)
ARMA(1,0) Model
AR{1}         0.0002    0.0003       0.66
Constant      0.0567    0.0207       2.73
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       5.72
GARCH{1}      0.9185    0.0105       87.86
ARCH{1}       0.0040    0.0108       0.37
Leverage{1}   0.1168    0.0163       7.15
Student t Innovations
DoF           12.9052   2.9286       4.41

Table 9 – ARMA (0,1) + GJR-GARCH (1,1) Specification – S&P 500

(columns: Value, Std. Error, t Stat)
ARMA(0,1) Model
MA{1}         0.0005    0.0003       1.47
Constant     -0.0559    0.0211      -2.66
GJR-GARCH(1,1) Model
Constant      0.0000    0.0000       2.95
GARCH{1}      0.8899    0.0134       66.43
ARCH{1}       0.0000    0.0164       0.00
Leverage{1}   0.1863    0.0244       7.63
Student t Innovations
DoF           6.0824    0.8226       7.39

We notice that there is strong evidence of volatility clustering and leverage
effects in all series. On the other hand, we also notice that the ARCH terms are not
statistically different from zero, which might have led to identification problems and
compromised the quasi-maximum likelihood estimates, as underlined in Andersen et
al. [7]. Fortunately, the residual diagnostics suggest this is not the case.

A visual examination of the daily standardized residuals appears to support
that they are nearly i.i.d. (Figure 23). Analyses of the ACFs and PACFs further
corroborate this assumption (Figure 24 and Figure 25). Lastly, Quantile-Quantile (QQ)
plots against Gaussian and Student t distributions indicate that we can improve the
fitting of the tails (Figure 26), except maybe for the FTSE 100 series. Although Gaussian
distributions are too thin-tailed to accommodate the extreme observations, Student t
distributions do not seem to achieve good results regarding tail behaviour either, possibly
because they are fitted to the whole distributions rather than to the tails individually.
Figure 22 – Filtered Residuals and Filtered Conditional Volatility
Figure 23 – Standardized Residuals
Figure 24 – ACF and PACF of Raw Standardized Residuals
Figure 25 – ACF and PACF of Squared Standardized Residuals
Figure 26 – QQ Plots of the Standardized Residuals vs Normal and Student t Distributions
As we saw previously, an interesting alternative to improve the tail fitting is to use
the Peaks over Threshold (POT) method to fit each tail of the standardized residuals
individually. First, we use the QQ plots (Figure 26) and the Empirical Excess Mean
(EEM) plots (Figure 27 and Figure 28) to select an appropriate threshold for each tail
(Table 10).
Figure 27 – Empirical Mean Excess Plots – Lower Tails
Figure 28 – Empirical Mean Excess Plots – Upper Tails
Table 10 – GPD Thresholds

GPD Thresholds
             Lower Tail   Upper Tail
IBX          -3.0         2.2
S&P 500      -2.5         2.0
FTSE 100     -2.5         2.0
DAX          -3.2         2.6
Nikkei 225   -2.5         2.1

Given the selected lower and upper thresholds, we estimate the GPD
parameters of each tail, fit the kernel-smoothed bodies and evaluate the results (Figure
29 to Figure 33). We see that the lower tails of the IBX and DAX distributions and the
upper tail of the FTSE 100 distribution have finite variance but not finite fourth moments
(Table 11). We also notice that the distribution tails do not seem to be symmetric.
Nonetheless, an overall analysis of the cumulative function, density function and QQ
plots suggests that the proposed GJR-GARCH + EVT (GPD) approach provides a good
fit to the heavy tails, volatility clustering and leverage effects observed in the empirical
distributions (Figure 31, Figure 32 and Figure 33).

Table 11 – Tail Fitting – GPD Parameters

GPD Parameters
             Lower Tail                 Upper Tail
             Tail Index   Scale         Tail Index   Scale
                          Parameter                  Parameter
IBX          2.8416       0.2710        5.0607       0.2591
S&P 500      6.3026       0.4457        56.5416      0.3355
FTSE 100     34.8760      0.5119        2.5257       0.3028
DAX          3.3312       0.4719        257.8469     0.2498
Nikkei 225   4.2835       0.4692        28.5293      0.3123
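The moment conditions read off Table 11 follow from the GPD property that a tail with index α (= 1/ξ) has finite k-th moment only for k < α; a quick check against the reported estimates:

```python
def finite_moments(alpha):
    """For a GPD tail index alpha, return (finite variance, finite 4th moment)."""
    return alpha > 2, alpha > 4

# Tail-index estimates taken from Table 11
heavy_tails = {'IBX lower': 2.8416, 'DAX lower': 3.3312, 'FTSE 100 upper': 2.5257}
flags = {name: finite_moments(a) for name, a in heavy_tails.items()}
# each of these tails has a finite second but infinite fourth moment
```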
Figure 29 – Lower Tail Fitting – Standardized Residuals
Figure 30 – Upper Tail Fitting – Standardized Residuals
Figure 31 – Empirical Cumulative Distribution Functions
Figure 32 – QQ Plots of the Standardized Residuals vs GJR-GARCH + GPD Distributions
Figure 33 – Standardized Residuals Fitting
Therefore, assuming that the GJR-GARCH + EVT (GPD) models are suited to
model the marginal distributions, we use them to fit a t-copula that captures their
dependence structure. As we described previously, the approach is simple: we use the
GJR-GARCH + EVT (GPD) cumulative distribution functions to transform the residuals
into uniform variates. Then we fit the t-copula to the uniform variates using the CML
method. As we observe in the bivariate scatter and density plots (Figure 34 to Figure
37), all series, except for the Nikkei 225, appear to exhibit some degree of tail co-
dependence. The estimated t copula parameters (Table 12) also indicate a high
correlation between most of the series.
Table 12 – t-Copula Parameters

t-Copula Correlations
             IBX    S&P 500   FTSE 100   DAX    Nikkei 225
IBX          1.00   0.63      0.60       0.57   0.11
S&P 500      0.63   1.00      0.57       0.59   0.00
FTSE 100     0.60   0.57      1.00       0.86   0.22
DAX          0.57   0.59      0.86       1.00   0.20
Nikkei 225   0.11   0.00      0.22       0.20   1.00

Degrees of Freedom: 13.41

Figure 34 – Dependence between Series – Part 1
Figure 35 – Dependence between Series – Part 2
Figure 36 – Bivariate Histograms – Part 1
Figure 37 – Bivariate Histograms – Part 2
Lastly, we use the complete combined GJR-GARCH + EVT (GPD) + t Copula
approach to simulate 500,000 cross-sectionally and serially dependent samples of the
daily series. We begin by constructing dependent uniform variates using the fitted
t copula, as explained in the previous section.
Figure 38 – Mean-CVaR Efficient Frontiers
Figure 39 – Mean-Variance Efficient Frontiers
6 Conclusions and Further Research
In this thesis, we discuss three key steps to perform a long-term multistage
multi-objective portfolio optimization based on stochastic programming. First, we
develop and estimate a combined GJR-GARCH + EVT-GPD + t-Copula econometric
model to simulate a large number of discrete multivariate trajectories, which is
represented by a large scenario tree. Secondly, we approximate the original scenario
tree by a smaller one that is both tractable and arbitrage-free. Lastly, we use the
approximated scenario tree to optimize a Mean-Variance-CVaR portfolio.
The first step combines different econometric models, extending approaches
similar to those suggested by McNeil and Frey [213], and Jondeau and Rockinger
[162], to simulate four major stylized facts: volatility clustering, heavy tails, leverage
effects and tail co-dependence, which are typically indispensable to proper risk
management and portfolio optimization in financial markets. The reason is
straightforward: models that avoid unrealistic simplifying assumptions (i.e., models
that can handle real-world multivariate distributions) often achieve superior results
because the simulated scenario tree (or, more precisely, scenario fan) better
describes key empirical features of the financial series.
However, the original scenario tree is neither tractable nor arbitrage-free.
Therefore, we need a second step that (i) approximates the original tree by a smaller
one (i.e., reduces its size) and (ii) removes potential arbitrage opportunities.
Otherwise, practical portfolio optimization problems are likely to be either unsolvable
(within a reasonable amount of time) or biased. To accomplish both objectives, this
step extends the optimal discretization approach, proposed by Pflug [203] and
developed by Pflug and Pichler [205][206], and Kovacevic and Pichler [155], to
preclude arbitrage opportunities.
Finally, the last step exploits the approximated scenario tree to perform a multi-
objective multi-period stochastic portfolio Mean-Variance-CVaR optimization that
considers market frictions like transaction costs and liquidity restrictions. Maximizing
expected return and minimizing variance are virtually mandatory objectives in most
real-world portfolio optimization problems. On the other hand, minimizing the
Conditional Value-at-Risk is incorporated to handle market risk more accurately in
practical asset allocation applications. This final step extends the approach discussed
by Roman, Darby-Dowman and Mitra [269] to our multi-period setting.
Results discussed in the previous section suggest that these combined steps
offer a thorough and reliable framework to optimize such portfolios. The absence of
questionable assumptions (e.g., no market frictions or liquidity restrictions) and better
uncertainty modelling (e.g., co-dependence and heavy tails), combined with a more
realistic set of objectives, make the framework better equipped to manage practical
portfolios.
Although some of these portfolio optimization enhancements have been proposed in
the literature, to the best of our knowledge they have never been combined into a single
framework.
Such improvements have only been possible because of the recent advances
in computing power, which proved invaluable to implement approaches like multi-
period stochastic programming. Such approaches are instrumental to real applications
because they handle key features of real markets that are often dismissed by common
optimization approaches.
Furthermore, the techniques and models used in the proposed framework can
be replaced by newer extensions that have been recently explored and/or further
developed. The Student t distribution, for instance, can be replaced by Hansen’s
skewed-t distribution [125] to accommodate also potential gain and loss asymmetries
in daily returns, a fifth and potentially relevant stylized fact. Bayesian Stochastic
Volatility models, based on Bayesian Markov Chain Monte Carlo (MCMC) algorithms,
can substitute the GJR-GARCH models to mitigate potential model
misspecifications, as discussed by Jacquier, Polson and Rossi [152][153], and
Bauwens, Hafner and Laurent [20]. The t copulas can be replaced by vine copulas, as
suggested by Kurowicka and Joe [189], and Patton [237][238], which are able to
explain more elaborate dependence structures and capture some additional
multivariate financial properties like tail dependence asymmetries. Lastly, we assumed that the
true probability distribution was fully known. In practice, however, this is hardly true. In
financial applications, the probability model is usually unknown and can only be
estimated. This model “ambiguity”, as discussed by Pflug and Pichler [246], should also
be accounted for in the stochastic optimization problems to produce solutions that are
more robust.
References
[1] Abdelaziz, F., 2012. Solution Approaches for the Multiobjective Stochastic
Programming. European Journal of Operational Research, Volume 216, Issue
1, Pages 1–16.
[2] Abraham, A., Jain, L., Goldberg, R., 2005. Evolutionary Multiobjective
Optimization. Springer-Verlag, London.
[3] Aggarwal, C., 2014. Data Classification – Algorithms and Applications.
Chapman and Hall/CRC, Boca Raton.
[4] Alberg, D., Shalit, H., Yosef, R., 2008. Estimating Stock Market Volatility
using Asymmetric GARCH Models. Applied Financial Economics, Volume 18,
Issue 15, Pages 1201-1208.
[5] Andersen, T., Bollerslev, T., Diebold, F., Ebens, H., 2001. The Distribution
of Realized Stock Return Volatility. Journal of Financial Economics, Volume
61, Issue 1, Pages 43-67.
[6] Andersen, T., Bollerslev, T., Diebold, F., Labys, P., 2001a. The Distribution
of Realized Exchange Rate Volatility. Journal of the American Statistical
Association, Volume 96, Issue 453, Pages 42–55.
[7] Andersen, T., et al., 2009. Handbook of Financial Time Series. Springer-
Verlag. Heidelberg, Berlin.
[8] Artzner, P., Delbaen, F., Heath, D., 1997. Thinking Coherently. Risk, Volume
10, Issue 11, Pages 68–71.
[9] Artzner, P., Delbaen, F., Eber, J., Heath, D., 1999. Coherent Measures of
Risk. Mathematical Finance, Volume 9, Issue 3, Pages 203–228.
[10] Artzner, P., Delbaen, F., Eber, J., Heath, D., Ku, H., 2007. Coherent
Multiperiod Risk Adjusted Values and Bellman Principle. Annals of Operations
Research, Volume 152, Issue 1, Pages 5-22.
[11] Baba, Y., Engle, R., Kraft, D., Kroner F., 1990. Multivariate Simultaneous
Generalized ARCH. Working Paper. Department of Economics, University of
California, San Diego.
[12] Baillie, R., Bollerslev, T., Mikkelsen, H., 1996a. Fractionally integrated
Generalized Autoregressive Conditional Heteroskedasticity. Journal of
Econometrics, Volume 74, Issue 1, Pages 3–30.
[13] Baillie, R., Chung, C., Tieslau, M., 1996b. Analyzing Inflation by the
Fractionally Integrated ARFIMA-GARCH Model. Journal of Applied
Econometrics, Volume 11, Issue 1, Pages 23–40.
[14] Balkema, A., de Haan, L., 1974. Residual Life Time at Great Age. Annals of
Probability, Volume 2, Issue 5, Pages 792-804.
[15] Balzer, L., 1995. Measuring Investment Risk: A Review. Journal of Investing,
Volume 4, Issue 3, Pages 5-16.
[16] Barndorff-Nielsen, O. 1977. Exponentially Decreasing Distributions for the
Logarithm of Particle Size. Proceedings of the Royal Society of London A,
Volume 353, Issue 1674, Pages 401–419.
[17] Barndorff-Nielsen, O., 1995. Normal Inverse Gaussian Processes and the
Modelling of Stock Returns. Research Report Issue 300, Department of
Theoretical Statistics, University of Aarhus.
[18] Barndorff-Nielsen, O., 1997. Normal Inverse Gaussian Distributions and
Stochastic Volatility Modelling. Scandinavian Journal of Statistics, Volume
24, Issue 1, Pages 1-13.
[19] Barndorff-Nielsen, O., Shephard, N. 2002. Econometric Analysis of
Realised Volatility and its use in Estimating Stochastic Volatility Models.
Journal of the Royal Statistical Society, Series B, Volume 64, Issue 2, Pages
253–280.
[20] Bauwens, L., Hafner, C., Laurent, S., 2012. Handbook of Volatility Models
and their Applications. John Wiley & Sons. Hoboken, New Jersey.
[21] Bauwens, L., Laurent, S., Rombouts, J., 2006. Multivariate GARCH
Models: A Survey. Journal of Applied Econometrics, Volume 21, Issue 1,
Pages 79-109.
[22] Bawa, V., 1975. Optimal Rules for Ordering Uncertain Prospects. Journal of
Financial Economics, Volume 2, Issue 1, Pages 95-121.
[23] Bawa, V., Lindenberg, E., 1977. Capital Market Equilibrium in a Mean-
Lower Partial Moment Framework. Journal of Financial Economics, Volume
5, Issue 2, Pages 189-200.
[24] Ben-Tal, A., Ghaoui, L., Nemirovski, A., 2009. Robust Optimization.
Princeton University Press, New Jersey.
[25] Beraldi, P., Bruni, M., 2014. A Clustering Approach for Scenario Tree
Reduction: An Application to a Stochastic Programming Portfolio
Optimization Problem. TOP, Volume 22, Issue 3, Pages 934-949.
[26] Beraldi, P., Simone, F., Violi, A., 2010. Generating Scenario Trees: A
Parallel Integrated Simulation-Optimization Approach. Journal of
Computational and Applied Mathematics, Volume 233, Issue 9, Pages 2322-
2331.
[27] Berman, M., 1964. Limiting Theorems for the Maximum Term in Stationary
Sequences. Annals of Mathematical Statistics, Volume 35, Issue 2, Pages
502-516.
[28] Bertsimas, D., Lauprete, G., Samarov, A., 2004. Shortfall as a Risk
Measure: Properties, Optimization and Applications. Journal of Economic
Dynamics & Control, Volume 28, Issue 7, Pages 1353-1381.
[29] Bertocchi, M., Consigli, G., Dempster, M., 2011. Stochastic Optimization
Methods in Finance and Energy. Springer, New York.
[30] Birge, J., Louveaux, F., 2011. Introduction to Stochastic Programming,
Second Edition. Springer, New York.
[31] Black, F. 1976. Studies of Stock Price Volatility Changes. Proceedings of the
1976 Meetings of the American Statistical Association, Business, and
Economics Section, Pages 177-181.
[32] Blattberg, R., Gonedes, N., 1974. A Comparison of the Stable and Student
Distributions as Statistical Models for Stock Prices. Journal of Business,
Volume 47, Issue 2, Pages 244-280.
[33] Boender, G., 1997. A Hybrid Simulation/Optimisation Scenario Model for
Asset/Liability Management. European Journal of Operational Research,
Volume 99, Issue 1, Pages 126–135.
[34] Bollerslev, T., 1986. Generalized Autoregressive Conditional
Heteroscedasticity. Journal of Econometrics, Volume 31, Issue 3, Pages 307-
327.
[35] Bollerslev, T., 1987. A Conditional Heteroskedastic Time Series Model for
Speculative Prices and Rates of Return. Review of Economics and Statistics,
Volume 69, Issue 3, Pages 542–547.
[36] Bollerslev, T., 1990. Modelling the Coherence in Short-Run Nominal
Exchange Rates: A Multivariate Generalized ARCH Model. Review of
Economics and Statistics, Volume 72, Issue 3, Pages 498-505.
[37] Bollerslev, T., Engle, R., Wooldridge, J., 1988. A Capital Asset Pricing
Model with Time-Varying Covariances. Journal of Political Economy, Volume
96, Issue 1, Pages 116-131.
[38] Bollerslev, T., Wooldridge, J., 1992. Quasi-Maximum Likelihood Estimation
and Inference in Dynamic Models with Time-Varying Covariances.
Econometric Reviews, Volume 11, Issue 2, Pages 143-172.
[39] Bouyé, E., et al., 2000. Copulas for Finance - A Reading Guide and Some
Applications. Working Paper, City University Business School – Financial
Econometrics Research Center. London.
[40] Brandt, M., 2010. Portfolio Choice Problems. In: Aït-Sahalia, Y., and
Hansen, L., 2010. Handbook of Financial Econometrics: Tools and
Techniques. Elsevier B.V., Pages 269-336.
[41] Cai, J., 1994. A Markov Model of Switching-Regime ARCH. Journal of
Business and Economic Statistics, Volume 12, Issue 3, Pages 309-316.
[42] Campbell, J., Lo, A., MacKinlay, C. 1997. The Econometrics of Financial
Markets. Princeton University Press. Princeton, New Jersey.
[43] Campbell, J., Vieira, L., 2002. Strategic Asset Allocation: Portfolio Choice
for Long-Term Investors. Oxford University Press. Oxford.
[44] Cappiello, L. Engle, R., Sheppard, K., 2006. Asymmetric Dynamics in the
Correlations of Global Equity and Bond Returns. Journal of Financial
Econometrics, Volume 4, Issue 4, Pages 537-572.
[45] Cariño, D., et al., 1994. The Russell-Yasuda Kasai Model: An Asset Liability
Model for a Japanese Insurance Company using Multi-Stage Stochastic
Programming. Operations Research, Volume 24, Issue 1, Pages 29-49.
[46] Cariño, D., Myers, D., Ziemba, W., 1998. Concepts, Technical Issues, and
Uses of the Russell-Yasuda Kasai Financial Planning Model. Operations
Research, Volume 46, Issue 4, Pages 450-462.
[47] Cariño, D., Turner, A., 1998. Multiperiod Asset Allocation with Derivative
Assets. In: Ziemba, W., Mulvey, J.. Worldwide Asset and Liability Modeling.
Cambridge University Press, Cambridge.
[48] Castillo, E., Hadi, A., 1997. Fitting the Generalized Pareto Distribution to
Data. Journal of the American Statistical Association, Volume 92, Issue 440,
Pages 1609-1620.
[49] Chekhlov, A., Uryasev, S., Zabarankin, M., 2000. Portfolio Optimization
with Drawdown Constraints. Research Report 2000-5, Center for Applied
Optimization, Department of Industrial and Systems Engineering, University
of Florida. Available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=223323
[50] Chekhlov, A., Uryasev, S., Zabarankin, M., 2005. Drawdown Measure in
Portfolio Optimization. International Journal of Theoretical and Applied
Finance, Volume 8, Issue 1, Pages 13-58.
[51] Cherubini, U., Luciano, E., Vecchiato, W., 2004. Copulas Methods in
Finance. John Wiley & Sons. Chichester, UK.
[52] Christie, A., 1982. The Stochastic Behavior of Common Stock Variances:
Value, Leverage and Interest Rate Effects. Journal of Financial Economics,
Volume 10, Issue 4, Pages 407–432.
[53] Coles, S., 2001. An Introduction to Statistical Modeling of Extreme Values.
Springer Series in Statistics. London.
[54] Consigli, G., 1997. Dynamic Stochastic Programming for Asset and Liability
Management. Doctoral Thesis, University of Essex, London.
[55] Consigli, G., Dempster, M., 1998. Dynamic Stochastic Programming for
Asset and Liability Management. Annals of Operations Research, Volume 81,
Issue 0, Pages 131-162.
[56] Consigli, G., Iaquinta, G., Moriggia, V., 2010. Path-Dependent Scenario
Trees for Multistage Stochastic Programmes in Finance. Quantitative
Finance, Volume 12, Issue 8, Pages 1265-1281.
[57] Consiglio, A., Carollo, A., Zenios, S., 2015. A Parsimonious Model for
Generating Arbitrage-Free Scenario Trees. Quantitative Finance.
Forthcoming. Available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2362014
[58] Cont, R., 2001. Empirical Properties of Asset Returns: Stylized Facts and
Statistical Issues. Quantitative Finance, Volume 1, Issue 2, Pages 223-236.
[59] Dacorogna, M., Müller, U., Pictet, O., Vries, C., 2001. Extremal Forex
Returns in Extremely Large Data Sets. Extremes, Volume 4, Issue 2, Pages
105–127.
[60] Dantzig, G., Infanger, G., 1991. Large-Scale Stochastic Linear Programs:
Importance Sampling and Benders Decomposition. Technical Report SOL
91-4, Department of Operations Research, Stanford University.
[61] Dantzig, G., Infanger, G., 1993. Multi-Stage Stochastic Linear Programs for
Portfolio Optimization. Annals of Operations Research, Volume 45, Issue 1,
Pages 59-76.
[62] Date, P., Mamon, R., Jalen, L., 2008. A New Moment Matching Algorithm for
Sampling from Partially Specified Symmetric Distributions. Operations
Research Letters, Volume 36, Issue 6, Pages 669–672.
[63] Davison, A., Smith, R., 1990. Models for Exceedances over High
Thresholds. Journal of the Royal Statistical Society – Series B. Volume 52,
Issue 3, Pages 393-442.
[64] Demarta, S., McNeil, A., 2005. The t Copula and Related Copulas.
International Statistical Review, Volume 73, Issue 1, Pages 111-129.
[65] DeMiguel, V., Garlappi, L., Uppal, R., 2009. Optimal versus Naive
Diversification: How Inefficient is the 1/N Portfolio Strategy? Review of
Financial Studies. Volume 22, Issue 5, Pages 1915-1953.
[66] DeMiguel, V., Plyakha, Y., Uppal, R., Vilkov, G., 2009. Improving Portfolio
Selection using Option-Implied Volatility and Skewness. Journal of Financial
and Quantitative Analysis, Volume 48, Issue 6, Pages 1813-1845.
[67] Dempster, M., 2006. Sequential Importance Sampling Algorithms for
Dynamic Stochastic Programming. Journal of Mathematical Sciences,
Volume 133, Issue 4, Pages 1422-1444.
[68] Dempster, M., Medova, E., Sook, Y., 2011. Comparison of Sampling
Methods for Dynamic Stochastic Programming. In: Bertocchi, M., Consigli,
G., and Dempster, M.. Stochastic Optimization Methods in Finance and
Energy. Springer, New York.
[69] Dempster, M., Medova, E., Sook, Y., 2015. Stabilizing Implementable
Decisions in Dynamic Stochastic Programming. Available at:
http://ssrn.com/abstract=2545992
[70] Dempster, M., Pflug, G., Mitra, G., 2008. Quantitative Fund Management.
Chapman and Hall/CRC, Boca Raton.
[71] Dempster, M., Thompson, R., 1999. EVPI-Based Importance Sampling
Solution Procedures for Multistage Stochastic Linear Programmes on Parallel
MIMD Architectures, Annals of Operations Research, Volume 90, Issue 0,
Pages 161-184.
[72] Dempster, M., Thorlacius, A., 1998. Stochastic Simulation of Internal
Economic Variables and Asset Returns: The Falcon Asset Model.
Proceedings of the 8th International AFIR Colloquium, London Institute of
Actuaries, Pages 29–45.
[73] Dias, J., Vermunt, J., Ramos, S., 2009. Mixture Hidden Markov Models in
Finance Research. In: Fink, A., et al.. Advances in Data Analysis, Data
Handling and Business Intelligence. Springer, New York.
[74] Dias, J., Vermunt, J., Ramos, S., 2015. Clustering Financial Time Series:
New Insights from an Extended Hidden Markov Model. European Journal of
Operational Research, Volume 243, Issue 3, Pages 852-864.
[75] Diebold, F., Schuermann, T., Stroughair, J., 1998. Pitfalls and
Opportunities in the Use of Extreme Value Theory in Risk Management. In:
Advances in Computational Finance, Volume 2, Pages 3-12, Kluwer
Academic Publishers, Amsterdam.
[76] Ding, Z., Granger, C., 1996. Modeling Volatility Persistence of Speculative
Returns: A New Approach. Journal of Econometrics, Volume 73, Issue 1,
Pages 185–215.
[77] Ding, Z., Granger, C., Engle, R., 1993. A Long Memory Property of Stock
Market Returns and a New Model. Journal of Empirical Finance, Volume 1,
Issue 1, Pages 83–106.
[78] Doerner, K., et al., 2004. Pareto Ant Colony Optimization: A Metaheuristic
Approach to Multiobjective Portfolio Selection. Annals of Operations
Research, Volume 131, Issue 1, Pages 79-99.
[79] Dowd, K., 2002. Measuring Market Risk. John Wiley & Sons, Hoboken, New
Jersey.
[80] Dueker, M., 1997. Markov Switching in GARCH Process and Mean-
Reverting Stock-Market Volatility. Journal of Business and Economic
Statistics, Volume 15, Issue 1, Pages 26-34.
[81] Dupačová, J., 1999. Portfolio Optimization via Stochastic Programming:
Methods of Output Analysis. Mathematical Methods of Operational Research,
Volume 50, Pages 245-270.
[82] Dupačová, J., Consigli, G., Wallace, S., 2000. Scenarios for Multistage
Stochastic Programs. Annals of Operations Research, Volume 100, Issue 1,
Pages 25-53.
[83] Dupačová, J., Gröwe-Kuska, N., Römisch, W., 2003. Scenario Reduction
in Stochastic Programming. Mathematical Programming, Volume 95, Issue 3,
Pages 493-511.
[84] Dupačová, J., Hurt, J., Štěpán, J., 2003. Stochastic Modeling in Economics
and Finance. Kluwer Academic Publishers, New York.
[85] Eberlein, E., Keller, U., 1995. Hyperbolic Distributions in Finance. Bernoulli,
Volume 1, Issue 3, Pages 281-299.
[86] Eberlein, E., Keller, U., Prause, K., 1998. New Insights into Smile,
Mispricing, and Value at Risk – The Hyperbolic Model. Journal of Business,
Volume 71, Issue 3, Pages 371-405.
[87] Ehrgott, M., 2005. Multicriteria Optimization, Second Edition. Springer-
Verlag, Heidelberg.
[88] Ekin, T., Polson, N., Soyer, R., 2014. Augmented Markov Chain Monte
Carlo Simulation for Two-Stage Stochastic Programs with Recourse.
Decision Analysis, Volume 11, Issue 4, Pages 250-264.
[89] Embrechts, P., 2000. Extreme Value Theory: Potential and Limitations as an
Integrated Risk Management Tool. Derivatives Use, Trading and
Regulation, Volume 6, Issue 1, Pages 449-456.
[90] Embrechts, P., Klüppelberg, C., Mikosch, T., 1997. Modelling Extremal
Events. Springer, Berlin.
[91] Embrechts, P., Lindskog, F., McNeil, A., 2001. Modelling Dependence with
Copulas and Applications to Risk Management. In: Handbook of Heavy
Tailed Distributions in Finance, Elsevier, North Holland.
[92] Embrechts, P., McNeil, A., Straumann, D., 2002. Correlation and
Dependence in Risk Management: Properties and Pitfalls. In: Risk
Management: Value at Risk and Beyond, Pages 176-223, Cambridge
University Press, Cambridge.
[93] Engle, R., 1982. Autoregressive Conditional Heteroscedasticity with
Estimates of the Variance of United Kingdom Inflation. Econometrica,
Volume 50, Issue 4, Pages 987-1007.
[94] Engle, R., 2002a. Dynamic Conditional Correlation: A Simple Class of
Multivariate Generalized Autoregressive Conditional Heteroskedasticity
Models. Journal of Business and Economic Statistics, Volume 20, Issue 3,
Pages 339-350.
[95] Engle, R., Kroner, F., 1995. Multivariate Simultaneous Generalized ARCH.
Economic Theory, Volume 11, Pages 122-150.
[96] Engle, R., Ng, V., 1993. Measuring and Testing the Impact of News on
Volatility. Journal of Finance, Volume 48, Issue 5, Pages 1749-1778.
[97] Fama, E., 1963. Mandelbrot and the Stable Paretian Hypothesis. Journal of
Business, Volume 36, Issue 4, Pages 420-429.
[98] Fama, E., 1965. Portfolio Analysis in a Stable Paretian Market. Management
Science, Volume 11, Issue 3, Pages 404-419.
[99] Fazlollahi, S., Maréchal, F., 2013. Multi-Objective, Multi-Period Optimization
of Biomass Conversion Technologies using Evolutionary Algorithms and
Mixed Integer Linear Programming (MILP). Applied Thermal Engineering,
Volume 50, Issue 2, Pages 1504–1513.
[100] Ferstl, R., Weissensteiner, A., 2010. Cash Management using Multi-Stage
Stochastic Programming. Quantitative Finance, Volume 10, Issue 2, Pages
209-219.
[101] Fieldsend, J., Matatko, J., Peng, M., 2003. Cardinality Constrained Portfolio
Optimisation Problem. Proceedings of the Fifth International Conference on
Intelligent Data Engineering and Automated Learning, Exeter.
[102] Finkenstädt, B., Holger, R. 2003. Extreme Values in Finance,
Telecommunications, and the Environment. Chapman and Hall/CRC.
London, UK.
[103] Fishburn, P., 1977. Mean-Risk Analysis with Risk Associated with Below-
Target Returns. American Economic Review, Volume 67, Issue 2, Pages
116-126.
[104] Fisher, R., Tippett, L., 1928. Limiting Forms of the Frequency Distribution of
the Largest or Smallest Member of a Sample. Proceedings of the Cambridge
Philosophical Society, Volume 24, Issue 2, Pages 180-190.
[105] Föllmer, H., Schied, A., 2002. Convex Measures of Risk and Trading
Constraints. Finance and Stochastics, Volume 6, Issue 4, Pages 429-447.
[106] Genest, C., Ghoudi, K., Rivest, L., 1995. A Semiparametric Estimation
Procedure of Dependence Parameters in Multivariate Families of
Distributions. Biometrika, Volume 82, Issue 3, Pages 543-552.
[107] Geyer, A., Hanke, M., Weissensteiner, A., 2010. No-Arbitrage Conditions,
Scenario Trees, and Multi-Asset Financial Optimization. European Journal of
Operational Research, Volume 206, Issue 3, Pages 609-613.
[108] Geyer, A., Hanke, M., Weissensteiner, A., 2013. Scenario Tree Generation
and Multi-Asset Financial Optimization Problems. Operations Research
Letters, Volume 41, Issue 5, Pages 494–498.
[109] Geyer, A., Hanke, M., Weissensteiner, A., 2014. No-Arbitrage Bounds for
Financial Scenarios. European Journal of Operational Research, Volume
236, Issue 2, Pages 657–663.
[110] Ghose, D., Kroner, K., 1995. The Relationship between GARCH and
Symmetric Stable Processes - Finding the Source of Fat Tails in Financial
Data. Journal of Empirical Finance, Volume 2, Issue 3, Pages 225-251.
[111] Glosten, L., Jagannathan, R., Runkle, D., 1993. On the Relation between the
Expected Value and the Volatility of the Nominal Excess Return on Stocks.
Journal of Finance, Volume 48, Issue 5, Pages 1779–1801.
[112] Gibbs, A., Su, F., 2002. On Choosing and Bounding Probability Metrics.
International Statistical Review, Volume 70, Issue 3, Pages 419-435.
[113] Gilli, M., Këllezi, E., 2006. An Application of Extreme Value Theory for
Measuring Financial Risk. Computational Economics, Volume 27, Issue 1,
Pages 1-23.
[114] Gnedenko, B., 1943. Sur la Distribution Limite du Terme Maximum d’une
Série Aléatoire. The Annals of Mathematics, Volume 44, Issue 3, Pages 423–
453.
[115] Gonzalez-Rivera, G., 1998. Smooth Transition GARCH Models. Studies in
Nonlinear Dynamics and Econometrics, Volume 3, Issue 2, Pages 161-178.
[116] Granger, C., Spear, S., Ding, Z., 2000. Stylized Facts on the Temporal and
Distributional Properties of Absolute Returns: An Update. In: Statistics and
Finance: An Interface. Imperial College Press, London.
[117] Gray, S., 1996. Modeling the Conditional Distribution of Interest Rates as a
Regime-Switching Process. Journal of Financial Economics, Volume 42,
Issue 1, Pages 27-62.
[118] Gröwe-Kuska, N., Heitsch, H., Römisch, W., 2003. Scenario Reduction
and Scenario Tree Construction for Power Management Problems.
Proceedings of the IEEE Bologna Power Tech Conference. Bologna, Italy.
[119] Guastaroba, G., 2010. Portfolio Optimization: Scenario Generation, Models
and Algorithms. Doctoral Thesis, University of Bergamo, Bergamo.
[120] Gulpinar, N., Rustem, B., Settergren, R., 2004. Simulation and
Optimization Approaches to Scenario Tree Generation. Journal of Economic
Dynamics & Control, Volume 28, Issue 7, Pages 1291-1315.
[121] Hafner, C., Franses, P., 2009. A Generalized Dynamic Conditional
Correlation Model: Simulation and Application to Many Assets. Econometric
Reviews, Volume 28, Issue 6, Pages 612-631.
[122] Hagerman, R., 1978. More Evidence on the Distribution of Security Returns.
Journal of Finance, Volume 33, Issue 4, Pages 1213-1221.
[123] Hagerud, G., 1997. A New Non-Linear GARCH Model. Doctoral Thesis,
Stockholm School of Economics, Stockholm.
[124] Hamilton, J., Susmel, R., 1994. Autoregressive Conditional
Heteroskedasticity and Changes in Regime. Journal of Econometrics,
Volume 64, Issues 1-2, Pages 307-333.
[125] Hansen, B., 1994. Autoregressive Conditional Density Estimation.
International Economic Review, Volume 35, Issue 3, Pages 705-730.
[126] Hansen, P., Lunde, A., 2005. A Forecast Comparison of Volatility Models:
Does Anything beat a GARCH (1,1)? Journal of Applied Econometrics,
Volume 20, Issue 7, Pages 873–889.
[127] Harrison, M., Kreps, D., 1979. Martingales and Arbitrage in Multiperiod
Securities Markets. Journal of Economic Theory, Volume 20, Issue 3, Pages
381–408.
[128] Harvey, A., Shephard, N., 1996. Estimation of an Asymmetric Stochastic
Volatility Model for Asset Returns. Journal of Business and Economic
Statistics, Volume 14, Issue 4, Pages 429-434.
[129] Hansen, B., 1994. Autoregressive Conditional Density Estimation.
International Economic Review, Volume 35, Issue 3, Pages 705–730
[130] Hauksson, A., Dacorogna, M., Domenig, T., Müller, U., Samorodnitsky,
G., 2001. Multivariate Extremes, Aggregation and Risk Estimation.
Quantitative Finance, Volume 1, Issue 1, Pages 79–95.
[131] He, C., 1997. Statistical Properties of GARCH Processes. Doctoral Thesis,
Stockholm School of Economics, Stockholm.
[132] Heitsch, H., Römisch, W., 2003. Scenario Reduction in Stochastic
Programming. Computational Optimization and Applications, Volume 24,
Issue 2, Pages 187-206.
[133] Heitsch, H., Römisch, W., 2005. Generation of Multivariate Scenario Trees
to model Stochasticity in Power Management. IEEE St. Petersburg
PowerTech Proceedings, St. Petersburg.
[134] Heitsch, H., Römisch, W., 2007. Scenario Tree Modelling for Multistage
Stochastic Programs. Mathematical Programming, Volume 118, Issue 2,
Pages 371-406.
[135] Heitsch, H., Römisch, W., 2008. Scenario Tree Reduction for Multistage
Stochastic Programs. Computational Management Science, Volume 6, Issue
2, Pages 117-133.
[136] Heitsch, H., Römisch, W., 2011. Stability and Scenario Trees for Multistage
Stochastic Programs. In: Infanger, G.. Stochastic Programming. Springer,
New York.
[137] Heitsch, H., Römisch, W., 2011. Scenario Tree Generation for Multi-Stage
Stochastic Programs. In: Bertocchi, M., Consigli, G., Dempster, M..
Stochastic Optimization Methods in Finance and Energy. Springer, New
York.
[138] Heitsch, H., Römisch, W., Strugarek, C., 2006. Stability of Multistage
Stochastic Programs. SIAM Journal on Optimization, Volume 17, Issue 2,
Pages 511-525.
[139] Hill, B., 1975. A Simple General Approach to Inference about the Tail of a
Distribution. Annals of Statistics, Volume 3, Issue 5, Pages 1163–1174.
[140] Hochreiter, R., 2014. Modeling Multi-Stage Decision Optimization Problems.
In: Fonseca, R., Weber, G., Telhada, J.. Computational Management
Science: State of the Art 2014. Pages 209-214. Springer, New York.
[141] Hochreiter, R., Pflug, G., 2006. Financial Scenario Generation for
Stochastic Multi-Stage Decision Processes as Facility Location Problems.
Annals of Operations Research, Volume 152, Issue 1, Pages 257-272.
[142] Hochreiter, R., Pflug, G., Wozabal, D., 2006. Multi-Stage Stochastic
Electricity Portfolio Optimization in Liberalized Energy Markets.
[143] Hosking, J., 1985. Maximum Likelihood Estimation of the Parameters of the
Generalized Extreme-Value Distribution. Applied Statistics, Volume 34, Issue
3, Pages 301–310.
[144] Høyland, K., Wallace, S., 2001. Generating Scenario Trees for Multistage
Decision Problems. Management Science, Volume 47, Issue 2, Pages 295-
307.
[145] Høyland, K., Kaut, M., Wallace, S., 2003. A Heuristic for Moment-Matching
Scenario Generation. Computational Optimization and Applications, Volume
24, Issue 2, Pages 169-185.
[146] Hosking, J., Wallis, J., 1987. Parameter and Quantile Estimation for the
Generalized Pareto Distribution. Technometrics, Volume 29, Issue 3, Pages
339–349.
[147] Hsu, D., Miller, R., Wichern, D., 1974. On the Stable Paretian Behavior of
Stock-Market Prices. Journal of American Statistical Association, Volume 69,
Issue 345, Pages 108-113.
[148] Hu, W., Kercheval, A., 2007. Risk Management with Generalized Hyperbolic
Distributions. Proceedings of the Fourth IASTED International Conference on
Financial Engineering and Applications, Pages 19-24. Berkeley, California.
[149] Huang, S., Chien, Y., Wang, R., 2011. Applying GARCH-EVT-Copula
Models for Portfolio Value-at-Risk on G7 Currency Markets. International
Research Journal of Finance and Economics, Issue 74, Pages 136-151.
[150] Huang, J., Lee, K., Liang, H., Lin, W., 2009. Estimating Value at Risk of
Portfolio by Conditional Copula-GARCH Method. Insurance: Mathematics
and Economics, Volume 45, Issue 3, Pages 315–324.
[151] Ingersoll, J., 1987. Theory of Financial Decision Making. Rowman &
Littlefield Publishers, Lanham.
[152] Jacquier, E., Polson, N., Rossi, P., 1994. Bayesian Analysis of Stochastic
Volatility Models. Journal of Business & Economic Statistics, Volume 12,
Issue 4, Pages 371–389.
[153] Jacquier, E., Polson, N., Rossi, P., 2004. Bayesian Analysis of Stochastic
Volatility Models with Fat-Tails and Correlated Errors. Journal of
Econometrics, Volume 122, Issue 1, Pages 185–212.
[154] Jansen, D., Vries, C., 1991. On the Frequency of Large Stock Market
Returns: Putting Booms and Busts into Perspective. Review of Economics
and Statistics, Volume 73, Issue 1, Pages 18-24.
[155] Jarraya, B., 2013. Asset Allocation and Portfolio Optimization Problems with
Metaheuristics: A Literature Survey. Business Excellence and Management,
Volume 3, Issue 4, Pages 38-56.
[156] Jean, W., 1971. The Extension of Portfolio Analysis to Three or More
Parameters. Journal of Financial and Quantitative Analysis, Volume 6, Issue
1, Pages 505-515.
[157] Jenkinson, A., 1955. The Frequency Distribution of the Annual Maximum (or
Minimum) Values of Meteorological Elements. Quarterly Journal of the Royal
Meteorological Society, Volume 81, Issue 348, Pages 158–171.
[158] Jenkinson, A., 1969. Statistics of Extremes. In: Estimation of Maximum
Floods. World Meteorological Organization. Technical Note 98, Pages 183-
228.
[159] Jobst, N., Zenios, S., 2005. On the Simulation of Portfolios of Interest Rate
and Credit Risk Sensitive Securities. European Journal of Operational
Research, Volume 161, Issue 2, Pages 298-324.
[160] Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman &
Hall, London, UK.
[161] Joe, H., 2015. Dependence Modelling With Copulas. CRC Press, Taylor &
Francis Group, Boca Raton.
[162] Jondeau, E., Rockinger, M., 2006. A Copula-GARCH Model of Conditional
Dependencies: An International Stock Market Application. Journal of
International Money and Finance, Volume 25, Issue 5, Pages 827-853.
[163] Kahneman, D., Tversky, A., 1979. Prospect Theory: An Analysis of Decision
under Risk. Econometrica, Volume 47, Issue 2, Pages 263-291.
[164] Kall, P., Mayer, J., 2011. Stochastic Linear Programming: Models, Theory
and Computation, Second Edition. Springer, New York.
[165] Kall, P., Wallace, S., 2003. Stochastic Programming. John Wiley & Sons.
Hoboken, New Jersey.
[166] Kolm, P., Tütüncü, R., Fabozzi, F., 2014. 60 Years of Portfolio Optimization:
Practical Challenges and Current Trends. European Journal of Operational
Research, Volume 234, Issue 2, Pages 356-371.
[167] Karanasos, M., Karanassou, M., Fountas, S., 2004. Analyzing US Inflation
by a GARCH Model with Simultaneous Feedback. WSEAS Transactions on
Information Science and Applications, Volume 2, Issue 1, Pages 767–772.
[168] Kaut, M., Wallace, S., 2007. Evaluation of Scenario Generation Methods for
Stochastic Programming. Pacific Journal of Optimization, Volume 3, Issue 2,
Pages 257-271.
[169] Keefer, D., 1994. Certainty Equivalents for Three-Point Discrete-Distribution
Approximations. Management Science, Volume 40, Issue 6,
Pages 760-773.
[170] Kim, G., Silvapulle, M., Silvapulle, P., 2007. Comparison of Semiparametric
and Parametric Methods for Estimating Copulas. Computational Statistics &
Data Analysis, Volume 51, Issue 6, Pages 2836-2850.
[171] King, A., Wallace, S., 2012. Modeling with Stochastic Programming.
Springer, New York.
[172] Klaassen, F., 2002. Improving GARCH Volatility Forecasts with Regime-
Switching GARCH. Empirical Economics, Volume 27, Issue 2, Pages 363-
394.
[173] Klaassen, P., 1997. Discretized Reality and Spurious Profits in Stochastic
Programming Models in Asset/Liability Management. European Journal of
Operational Research, Volume 44, Issue 1, Pages 31-48.
[174] Klaassen, P., 1998. Financial Asset-Pricing Theory and Stochastic
Programming Models for Asset/Liability Management: A Synthesis.
Management Science, Volume 44, Issue 1, Pages 31-48.
[175] Klaassen, P., 2002. Comment on “Generating Scenario Trees for Multistage
Decision Problems”. Management Science, Volume 48, Issue 11, Pages
1512-1516.
[176] Kleywegt, A., Shapiro, A., Homem-de-Mello, T., 2002. The Sample
Average Approximation Method for Stochastic Discrete Optimization. SIAM
Journal on Optimization, Volume 12, Issue 2, Pages 479-502.
[177] Knab, B., et al., 2003. Model-Based Clustering with Hidden Markov Models
and its Applications to Financial Time-Series Data. In: Schader, M., Gaul, W.,
Vichi, M.. Between Data Science and Applied Data Analysis. Springer, New
York.
[178] Koedijk, G., Schafgans, M., Vries, C., 1990. The Tail Index of Exchange
Rate Returns. Journal of International Economics, Volume 29, Issues 1-2,
Pages 93-108.
[179] Koedijk, G., Stork, P., Vries, C., 1992. Differences between Foreign
Exchange Rate Regimes: The View from the Tails. Journal of International
Money and Finance, Volume 11, Issue 5, Pages 462-473.
[180] Kon, S., 1984. Models of Stock Returns – A Comparison. Journal of Finance,
Volume 39, Issue 1, Pages 147-165.
[181] Konno, H., Yamazaki, H., 1991. Mean-Absolute Deviation Portfolio
Optimization Model and its Applications to Tokyo Stock Market. Management
Science, Volume 37, Issue 5, Pages 519-531.
[182] Konno, H., Shirakawa, H., Yamazaki, H., 1993. A Mean-Absolute Deviation-
Skewness Portfolio Optimization Model. Annals of Operations Research,
Volume 45, Issue 1, Pages 205-220.
[183] Konno, H., Waki, H., Yuuki, A., 2002. Portfolio Optimization under Lower
Partial Risk Measures. Asian-Pacific Financial Markets, Volume 9, Issue 2,
Pages 127-140.
[184] Kouwenberg, R., 2001. Scenario Generation and Stochastic Programming
Models for Asset Liability Management. European Journal of Operational
Research, Volume 134, Issue 2, Pages 279-292.
[185] Kouwenberg, R., Zenios, S., 2001. Stochastic Programming Models for
Asset Liability Management. In: Zenios, S., Ziemba, W., 2001. Handbook of
Asset and Liability Management: Applications and Case Studies. Elsevier,
Amsterdam.
[186] Kovacevic, R., Pichler, A., 2015. Tree Approximation for Discrete Time
Stochastic Processes: A Process Distance Approach. Annals of Operations
Research, Volume 235, Issue 1, Pages 395-421.
[187] Krokhmal, P., Palmquist, J., Uryasev, S., 2002. Portfolio Optimization with
Conditional Value-at-Risk Objective and Constraints. Journal of Risk, Volume
4, Pages 43-68.
[188] Krokhmal, P., Zabarankin, M., Uryasev, S., 2011. Modeling and
Optimization of Risk. Surveys in Operations Research and Management
Science, Volume 16, Pages 49-66.
[189] Kurowicka, D., Joe, H., 2011. Dependence Modelling: Vine Copula
Handbook. World Scientific Publishing, Singapore.
[190] Liao, W., 2005. Clustering of Time Series Data – A Survey. Pattern
Recognition, Volume 38, Issue 11, Pages 1857-1874.
[191] Lin, D., Wang, S., Yan, H., 2001. A Multiobjective Genetic Algorithm for
Portfolio Selection Problem. Proceedings of ICOTA, Hong Kong.
[192] Ling, S., McAleer, M., 2002. Stationarity and the Existence of Moments of a
Family of GARCH Processes. Journal of Econometrics, Volume 106, Issue 1,
Pages 109-117.
[193] Lo, A., MacKinlay, A., 1988. Stock Market Prices do not follow Random
Walks: Evidence from a Simple Specification Test. Review of Financial
Studies, Volume 1, Issue 1, Pages 41–66.
[194] Lo, A., MacKinlay, A., 1990. When are Contrarian Profits due to Stock
Market Overreaction? Review of Financial Studies, Volume 3, Issue 2, Pages
175–205.
[195] Lo, A., MacKinlay, A., 1999. A Non-Random Walk Down Wall Street.
Princeton University Press, Princeton, New Jersey.
[196] Löhndorf, N., 2016. An Empirical Analysis of Scenario Generation Methods
for Stochastic Optimization. Working Paper. Vienna University of Economics
and Business. Available at: http://www.optimization-online.org/DB_HTML/2016/02/5319.html, 2016-03-03.
[197] Longin, F., 1996. The Asymptotic Distribution of Extreme Stock Market
Returns. Journal of Business, Volume 69, Issue 3, Pages 383–408.
[198] Longin, F., 1999. Optimal Margin Level in Futures Markets: Extreme Price
Movements. Journal of Futures Markets, Volume 19, Issue 2, Pages 127–
152.
[199] Lubrano, M., 2001. Smooth Transition GARCH Models: A Bayesian
Perspective. Recherches Economiques de Louvain, Louvain Economic
Review, Volume 67, Issue 3, Pages 257-287.
[200] Lux, T., 2001. The Limiting Extremal Behaviour of Speculative Returns: An
Analysis of Intra-Daily Data from the Frankfurt Stock Exchange. Applied
Financial Economics, Volume 11, Issue 3, Pages 299-315.
[201] Malmsten, H., Teräsvirta, T., 2010. Stylized Facts of Financial Series and
Three Popular Models of Volatility. European Journal of Pure and Applied
Mathematics, Volume 3, Issue 3, Pages 443-477.
[202] Mandelbrot, B., 1963. The Variation of Certain Speculative Prices. Journal
of Business, Volume 36, Issue 4, Pages 394-419.
[203] Mandelbrot, B., 1967. The Variation of Some Other Speculative Prices.
Journal of Business, Volume 40, Issue 4, Pages 393-413.
[204] Mandelbrot, B., Taylor, H., 1967. On the Distribution of Stock Price
Differences. Operations Research. Volume 15, Issue 6, Pages 1057-1062.
[205] Mansini, R., Ogryczak, W., Speranza, M., 2014. Twenty Years of Linear
Programming based Portfolio Optimization. European Journal of Operational
Research, Volume 234, Issue 2, Pages 518-
535.
[206] Mansini, R., Ogryczak, W., Speranza, M., 2015. Linear and Mixed Integer
Programming for Portfolio Optimization. Springer International Publishing,
Switzerland.
[207] Mao, J., 1970. Models of Capital Budgeting, E-V vs E-S. Journal of Financial
and Quantitative Analysis, Volume 4, Issue 5, Pages 657-675.
[208] Markowitz, H., 1952. Portfolio Selection, Journal of Finance, Volume 7,
Issue 1, Pages 77-91.
[209] Markowitz, H., 1959. Portfolio Selection: Efficient Diversification of
Investments, Second Edition. John Wiley & Sons, New York.
[210] Marler, R., Arora, J., 2004. Survey of Multi-Objective Methods for
Engineering. Structural and Multidisciplinary Optimization, Volume 26, Issue
6, Pages 369-395.
[211] McNeil, A., 1997. Estimating the Tails of Loss Severity Distributions using
Extreme Value Theory. ASTIN Bulletin, Volume 27, Issue 1, Pages 117-137.
[212] McNeil, A., Saladin, T., 1997. The Peak over Threshold Method for
Estimating High Quantiles of Loss Distributions. In: Proceedings of 28th
International ASTIN Colloquium.
[213] McNeil, A., Frey, R., 2000. Estimation of Tail-Related Risk Measures for
Heteroscedastic Financial Time Series: An Extreme Value Approach. Journal
of Empirical Finance, Volume 7, Issues 3-4, Pages 271-300.
[214] McNeil, A., Frey, R., Embrechts, P., 2005. Quantitative Risk Management.
Princeton University Press. Princeton, New Jersey.
[215] Merton, R., 1969. Lifetime Portfolio Selection under Uncertainty: The
Continuous-Time Case. Review of Economics and Statistics. Volume 51,
Issue 3, Pages 247-257.
[216] Merton, R., 1971. Optimum Consumption and Portfolio Rules in a
Continuous-Time Model. Journal of Economic Theory. Volume 3, Issue 4,
Pages 373-413.
[217] Meucci, A., 2005. Risk and Asset Allocation. Springer-Verlag. Heidelberg,
Berlin.
[218] Mikosch, T., 2003. Modeling Dependence and Tails of Financial Time
Series. In: Extreme Values in Finance, Telecommunications, and the
Environment. Chapman & Hall, London.
[219] Mikosch, T., Stărică, C., 2000. Limit Theory for the Sample Autocorrelations
and Extremes of a GARCH (1,1) Process. Annals of Statistics, Volume 28,
Issue 5, Pages 1427–1451.
[220] Mikosch, T., Stărică, C., 2004. Nonstationarities in Financial Time Series,
the Long-Range Dependence, and the IGARCH effects. Review of
Economics and Statistics, Volume 86, Issue 1, Pages 378-390.
[221] Mises, R., 1936. La distribution de la Plus Grande de N Valeurs. Revue
Mathématique de l'Union Interbalkanique, Volume 1, Pages 141-160.
[222] Mittnik, S., Rachev, S., Paolella, M., 1998. Stable Paretian Modelling in
Finance: Some Empirical and Theoretical Aspects. In: A Practical Guide to
Heavy Tails: Statistical Techniques and Applications. Birkhäuser. Pages 79–
110.
[223] Mittnik, S., Paolella, M., Rachev, S., 2000. Diagnosing and Treating the Fat
Tails in Financial Returns Data. Journal of Empirical Finance, Volume 7,
Issues 3–4, Pages 389–416.
[224] Mossin, J., 1968. Optimal Multiperiod Portfolio Policies. Journal of Business,
Volume 41, Issue 2, Pages 215-229.
[225] Mulvey, J., 1996. Generating Scenarios for the Towers Perrin Investment
System. Interfaces, Volume 26, Issue 2, Pages 1-15.
[226] Mulvey, J., Pauling, W., Madey, R., 2003. Advantages of Multiperiod
Portfolio Models. Journal of Portfolio Management, Volume 29, Issue 2,
Pages 35-45.
[227] Mulvey, J., Thorlacius, A., 1998. The Towers Perrin Global Capital Market
Scenario Generation System. In: World Asset and Liability Management,
Cambridge University Press, Cambridge.
[228] Mulvey, J., Vladimirou H., 1989. Stochastic Network Optimization Models
for Investment Planning. Annals of Operations Research. Volume 20, Issue
1, Pages 187-217.
[229] Musmeci, N., Aste, T., Di Matteo, T., 2014. Relation between Financial
Market Structure and the Real Economy: Comparison between Clustering
Methods. PLOS One, Volume 10, Issue 4, Pages 1-24.
[230] Nawrocki, D., 1999. A Brief History of Downside Risk Measures. Journal of
Investing, Volume 8, Issue 3, Pages 9-25.
[231] Nelsen, R., 2006. An Introduction to Copulas, Second Edition. Springer, New
York.
[232] Nelson, D., 1991. Conditional Heteroskedasticity in Asset Returns: A New
Approach. Econometrica, Volume 59, Issue 2, Pages 347-370.
[233] Nystrom, K., Skoglund, J., 2002. Univariate Extreme Value Theory,
GARCH and Measures of Risk. Preprint, Swedbank.
[234] Parpas, P., et al., 2015. Importance Sampling in Stochastic Programming: A
Markov Chain Monte Carlo Approach. INFORMS Journal on Computing,
Volume 27, Issue 2, Pages 358-377.
[235] Patton, A., 2002. Applications of Copula Theory in Financial Econometrics.
PhD Dissertation. University of California, San Diego.
[236] Patton, A., 2004. On the Out-of-Sample Importance of Skewness and
Asymmetric Dependence for Asset Allocation. Journal of Financial of
Econometrics, Volume 2, Issue 1, Pages 130-168.
[237] Patton, A., 2011. Copula Methods for Forecasting Multivariate Time Series.
In: Handbook of Economic Forecasting, Volume 2, Elsevier, Oxford.
[238] Patton, A., 2012. A Review of Copula Models for Economic Time Series.
Journal of Multivariate Analysis, Volume 110, Pages 4-18.
[239] Perry, P., 1983. More Evidence on the Nature of Distribution of Security
Returns. Journal of Financial and Quantitative Analysis, Volume 18, Issue 2,
Pages 211-221.
[240] Peters, J., 2001. Estimating and Forecasting Volatility of Stock Indices using
Asymmetric GARCH Models and (Skewed) Student-t Densities. Preprint.
University of Liege, Belgium.
[241] Pflug, G., 1996. Optimization of Stochastic Models: The Interface between
Simulation and Optimization. Kluwer Academic Publishers, New York.
[242] Pflug, G., 2001. Scenario Tree Generation for Multiperiod Financial
Optimization by Optimal Discretization. Mathematical Programming, Volume
89, Issue 2, Pages 251-271.
[243] Pflug, G., 2009. Version-Independence and Nested Distributions in
Multistage Stochastic Optimization. SIAM Journal on Optimization, Volume
20, Issue 3, Pages 1406-1420.
[244] Pflug, G., Pichler, A., 2011. Approximations for Probability Distributions and
Stochastic Optimization Problems. In: Bertocchi, M., Consigli, G., and
Dempster, M.. Stochastic Optimization Methods in Finance and Energy.
Springer, New York.
[245] Pflug, G., Pichler, A., 2012. A Distance for Multistage Stochastic
Optimization Models. SIAM Journal on Optimization, Volume 22, Issue 1,
Pages 1-23.
[246] Pflug, G., Pichler, A., 2014. Multistage Stochastic Optimization. Springer
International Publishing, Switzerland.
[247] Pflug, G., Pichler, A., 2015. Dynamic Generation of Scenario Trees.
Computational Optimization and Applications, Volume 62, Issue 3, Pages
641-668.
[248] Pflug, G., Römisch, W., 2007. Modeling, Measuring and Managing Risk.
World Scientific Publishing Company, Singapore.
[249] Pickands, J., 1975. Statistical Inference using Extreme Value Order
Statistics. Annals of Statistics, Volume 3, Issue 1, Pages 119-131.
[250] Prause, K., 1999. The Generalized Hyperbolic Model: Estimation, Financial
Derivatives, and Risk Measures. PhD Dissertation. University of Freiburg,
Freiburg im Breisgau.
[251] Prékopa, A., 1995. Stochastic Programming. Springer, Netherlands.
[252] Rabemananjara, R., Zakoian, J., 1993. Threshold ARCH Models and
Asymmetries in Volatility. Journal of Applied Econometrics, Volume 8, Issue
1, Pages 31-49.
[253] Rachev, S., 1991. Probability Metrics and the Stability of Stochastic Models.
John Wiley & Sons. Chichester, UK.
[254] Rachev, S., Römisch, W., 2002. Quantitative Stability in Stochastic
Programming: The Method of Probability Metrics. Mathematics of Operations
Research, Volume 27, Issue 4, Pages 792-818.
[255] Rachev, S., Rüschendorf, L., 1998. Mass Transportation Problems, Volume
I: Theory. Springer Verlag, New York.
[256] Rachev, S., Rüschendorf, L., 1998. Mass Transportation Problems, Volume
II: Applications. Springer Verlag, New York.
[257] Rachev, S., Stoyanov, S., Fabozzi, F., 2011. A Probability Metrics
Approach to Financial Risk Measures. Wiley-Blackwell, Chichester, UK.
[258] Rani, S., Sikka, G., 2012. Recent Techniques of Clustering of Time Series
Data: A Survey. International Journal of Computer Applications, Volume 52,
Issue 15, Pages 1-9.
[259] Reiss, R., Thomas, M., 2007. Statistical Analysis of Extreme Value with
Applications to Insurance, Finance, Hydrology and Other Fields, 3rd Edition.
Birkhäuser, Basel.
[260] Resnick, S., 1987. Extreme Values, Regular Variation, and Point Processes.
Springer Verlag, New York.
[261] Resnick, S., Stărică, C., 1995. Consistency of Hill's Estimator for Dependent
Data. Journal of Applied Probability, Volume 32, Issue 1, Pages 139-167.
[262] Resnick, S., Stărică, C., 1998. Tail Index Estimation for Dependent Data.
Annals of Applied Probability, Volume 8, Issue 4, Pages 1156-1183.
[263] Riccetti, L., 2010. The Use of Copulas in Asset Allocation: When and How a
Copula Model can be Useful? Lambert Academic Publishing, Saarbrücken.
[264] Riccetti, L., 2013. A Copula–GARCH Model for Macro Asset Allocation of a
Portfolio with Commodities. Empirical Economics, Volume 44, Issue 3, Pages
1315-1336.
[265] Rockafellar, T., 2007. Coherent Approaches to Risk in Optimization under
Uncertainty. In: Klastorin, T., Gray, P., Greenberg, H., 2007. INFORMS
Tutorials in Operations Research: OR Tools and Applications – Glimpses of
Future Technologies, Chapter 3, Pages 38-61.
[266] Rockafellar, T., Uryasev, S., 2000. Optimization of Conditional Value-at-
Risk. Journal of Risk, Volume 2, Pages 21-42.
[267] Rockafellar, T., Uryasev, S., 2002. Conditional Value-at-Risk for General
Loss Distributions. Journal of Banking & Finance, Volume 26, Issue 7, Pages
1443-1471.
[268] Rockinger, M., Jondeau, E., 2001. Conditional Dependency of Financial
Series: An Application of Copulas. Working Paper Nº 82, Banque de France.
[269] Roman, D., Darby-Dowman, K., Mitra, G., 2008. Mean-Risk Models using
Two Risk Measures: A Multi-Objective Approach. In: Dempster, M., Pflug, G.,
Mitra, G., 2008. Quantitative Fund Management. Chapman and Hall/CRC,
Boca Raton.
[270] Römisch, W., 2009. Scenario Reduction Techniques in Stochastic
Programming. In: Watanabe, O., Zeugmann, T.. Stochastic Algorithms:
Foundations and Applications. Springer Berlin Heidelberg. Pages 1-14.
[271] Roy, A., 1952. Safety First and the Holding of Assets. Econometrica, Volume
20, Issue 3, Pages 431-449.
[272] Ruszczyński, A., Shapiro, A., 2002. Handbooks in Operations Research
and Management Science - Stochastic Programming, Volume 10. Elsevier.
[273] Ruszczyński, A., Shapiro, A., 2006. Optimization of Risk Measures. In:
Calafiore, G., Dabbene, F., 2006. Probabilistic and Randomized Methods for
Design and Uncertainty. Springer-Verlag, London.
[274] Rydberg, T., 2000. Realistic Statistical Modelling of Financial Data.
International Statistical Review, Volume 68, Issue 3, Pages 233-258.
[275] Salvadori, G., Michele, C., Kottegoda, N., Rosso, R., 2007. Multivariate
Extreme Value Theory. In: Extremes in Nature, Volume 56, Pages 113-129.
[276] Samuelson, P., 1969. Lifetime Portfolio Selection by Dynamic Stochastic
Programming. Review of Economics and Statistics, Volume 51, Issue 3,
Pages 239-246.
[277] Schilling, R., 2005. Measures, Integrals and Martingales. Cambridge
University Press, Cambridge.
[278] Shapiro, A., 2013. Sample Average Approximation. In: Gass, S., Fu, M..
Encyclopedia of Operations Research and Management Science, 3rd Edition.
Springer, New York.
[279] Shapiro, A., Dentcheva, D., Ruszczyński, A., 2009. Lectures on Stochastic
Programming: Modeling and Theory. SIAM Series on Optimization.
[280] Sigurbjörnsson, S., 2008. Scenario Tree Generation by Optimal
Discretization. Master Dissertation. Technical University of Denmark,
Denmark.
[281] Silvennoinen, A., Teräsvirta, T., 2009. Modeling Multivariate
Autoregressive Conditional Heteroskedasticity with the Double Smooth
Transition Conditional Correlation GARCH Model. Journal of Financial
Econometrics, Volume 7, Issue 4, Pages 373-411.
[282] Sklar, A., 1959. Fonctions de Répartition à n Dimensions et Leurs Marges.
Publication de l’Institut de Statistique de l’Université de Paris, Volume 8,
Pages 229-231.
[283] Shiryaev, A., 1996. Probability, Second Edition. Springer, New York.
[284] Smith, R., 1985. Maximum Likelihood Estimation in a Class of Nonregular
Cases. Biometrika, Volume 72, Issue 1, Pages 67-90.
[285] Smith, R., 1989. Extreme Value Analysis of Environmental Time Series: An
Application to Trend Detection in Ground-Level Ozone. Statistical Science,
Volume 4, Issue 4, Pages 367-377.
[286] Sodhi, M., 2005. LP Modeling for Asset-Liability Management: A Survey of
Choices and Simplifications. Operations Research, Volume 53, Issue 2,
Pages 181-196.
[287] Spall, J., Hill, S., Stark, D., 2006. Theoretical Framework for Comparing
Several Stochastic Optimization Approaches. In: Calafiore, G., Dabbene, F.,
2006. Probabilistic and Randomized Methods for Design and Uncertainty.
Springer-Verlag, London.
[289] Speranza, M., 1993. Linear Programming Models for Portfolio Optimization.
Finance, Volume 14, Pages 107-123.
[289] Šutienė, K., Makackas, D., Pranevičius, H., 2010. Multistage K-Means
Clustering for Scenario Tree Construction. Informatica, Volume 21, Issue 1,
Pages 123-138.
[290] Šutienė, K., Pranevičius, H., 2007. Scenario Tree Generation by Clustering
the Simulated Data Paths. Proceedings of the 21st European Conference on
Modelling and Simulation. Prague, Czech Republic.
[291] Šutienė, K., Pranevičius, H., 2008. Copula Effect on Scenario Tree.
International Journal of Applied Mathematics, Volume 37, Issue 2, Pages
112-127.
[292] Taleb, N., 2009. Finiteness of Variance is Irrelevant in the Practice of
Quantitative Finance. Complexity, Volume 14, Issue 3, Pages 66-76.
[293] Tang, J., Zhou, C., Yuan, X., Sriboonchitta, S., 2015. Estimating Risk of
Natural Gas Portfolios by Using GARCH-EVT-Copula Model. Scientific World
Journal, Volume 2015, Article ID 125958, Pages 1-7.
[294] Taylor, S., 1982. Financial Returns Modelled by the Product of Two
Stochastic Processes: A Study of Daily Sugar Prices 1961-79. In: Time
Series Analysis: Theory and Practice 1. North-Holland, Amsterdam.
[295] Taylor, S., 2005. Asset Price Dynamics, Volatility, and Prediction. Princeton
University Press. Princeton, New Jersey.
[296] Taylor, S., 2007. Modelling Financial Time Series, Second Edition. World
Scientific Publishing. New York.
[297] Teräsvirta, T., 1996. Two Stylized Facts and the GARCH(1,1) Model.
Working Paper Series in Economics and Finance, Stockholm School of
Economics, Nº 96.
[298] Teräsvirta, T., Zhao, Z., 2007. Stylized Facts of Return Series, Robust
Estimates, and Three Popular Models of Volatility. Working Paper Series in
Economics and Finance, Stockholm School of Economics, No. 662.
[299] Timonina, A., 2015. Multi-Stage Stochastic Optimization: The Distance
between Stochastic Scenario Processes. Computational Management Science,
Volume 12, Issue 1, Pages 171-195.
[300] Tsay, R., 1999. Extreme Value Analysis of Financial Data. Working paper.
Graduate School of Business, University of Chicago.
[301] Tsay, R., 2010. Analysis of Financial Time Series, 3rd Edition. John Wiley & Sons.
Hoboken, New Jersey.
[302] Tse, Y., Tsui, A., 2002. A Multivariate Generalized Autoregressive
Conditional Heteroscedasticity Model with Time-Varying Correlations. Journal of
Business and Economic Statistics, Volume 20, Issue 3, Pages 351-362.
[303] Tumminello, M., Lillo, F., Mantegna, R., 2010. Correlation, Hierarchies, and
Networks in Financial Markets. Journal of Economic Behavior &
Organization, Volume 75, Issue 1, Pages 40-58.
[304] Unser, M., 2000. Lower Partial Moments as Measures of Perceived Risk: An
Experimental Study. Journal of Economic Psychology, Volume 21, Issue 3,
Pages 253-280.
[305] Uryasev, S., Pardalos, P., 2001. Stochastic Optimization: Algorithms and
Applications. Springer, New York.
[306] Viebig, J., Poddig, T., 2010. Modeling Extreme Returns and Asymmetric
Dependence Structures of Hedge Fund Strategies using Extreme Value
Theory and Copula Theory. Journal of Risk, Volume 13, Issue 2, Pages 23-
55.
[307] Villani, C., 2009. Optimal Transport, Old and New. Springer-Verlag,
Heidelberg, Berlin.
[308] Vladimirou, H., Zenios, S., 2012. Parallel Algorithms for Large-Scale
Stochastic Programming. In: Migdalas, A., Pardalos, P., and Storøy, S..
Parallel Computing in Optimization, Springer Publishing Company, London.
[309] Wallace, S., Ziemba, W., 2005. Applications of Stochastic Programming.
SIAM Series on Optimization.
[310] Wang, Z., Jin, Y., Zhou, Y., 2010. Estimating Portfolio Risk Using GARCH-
EVT-Copula Model: An Empirical Study on Exchange Rate Market. Advances
in Neural Network Research and Applications, Volume 67, Pages 65-72.
[311] Xiong, J., Idzorek, T., 2011. The Impact of Skewness and Fat Tails on the
Asset Allocation Decision. Financial Analysts Journal, Volume 67, Issue 2,
Pages 23-35.
[312] Zakoian, J., 1994. Threshold Heteroskedastic Models. Journal of Economic
Dynamics and Control, Volume 18, Issue 5, Pages 931-955.
[313] Ziemba, W., 2003. The Stochastic Programming Approach to Asset, Liability,
and Wealth Management. Research Foundation of AIMR, Charlottesville,
Virginia.
[314] Ziemba, W., Mulvey, J., 1998. World Asset and Liability Management.
Cambridge University Press, Cambridge.