Firp Paper


8/2/2019 Firp Paper

http://slidepdf.com/reader/full/firp-paper 1/19


Modeling securities distribution for a better measure of VaR 

and Expected Shortfall

Saket Anand - 61010063

Abstract

Value at Risk (VaR) and Expected Shortfall (ES) are widely used risk measures for portfolios. When computing VaR and ES, it is often assumed that historical returns are normally distributed. This study shows that the normality assumption does not hold even for highly diversified indices such as the BSE-Sensex and the S&P 500. The paper presents a two-step method for computing more precise measures of VaR and Expected Shortfall. First, the entire historical distribution is modeled to obtain a precise measure of VaR. Thereafter, the tail is modeled separately to obtain an accurate measure of Expected Shortfall.


I. Introduction

It is well known that security returns are not normally distributed. However, one might expect the returns of a diversified index to be normally distributed because of the central limit theorem. A quick look at the daily returns of the BSE-Sensex over a ten-year period reveals that the normal distribution not only underestimates the chances of extreme events (Figure 1 and Figure 2) but also fails to capture the shape of the empirically observed distribution around the core (i.e. around the mean).

Figure 1: Normal distribution underestimates the probability of extreme events.

[Plot: "Returns time series". x-axis: date (3/11/1997 to 7/6/2009); y-axis: returns; series: Returns, mu + 2 sigma, mu - 2 sigma.]

Figure 2: Normal distribution underestimates the probability of extreme events.


[Plot: "Empirical CDF vs Normal CDF". x-axis: returns; y-axis: CDF; series: empirical, normal.]

Figure 3: Normal distribution does not fit the core well

In order to compute VaR and Expected Shortfall it is important to match the core, because only if we model the core precisely will we be able to model the tail. For example, to compute Expected Shortfall (ES) we first need an accurate measure of VaR. Clearly, if we use the normal distribution to estimate VaR, our estimates will not match the VaR suggested by the empirical data. Moreover, it is much easier to match the core than the tails.

Section III presents the reasons why the normal distribution fails to capture the entire empirical distribution. However, the normal distribution has certain desirable properties because of which it is still used (Section IV). From Section V to Section VII, the new method of computing VaR and ES is developed. In these sections, the method is applied to 10 years of daily returns of the BSE-Sensex as a proof of concept. Finally, in Section VIII, the new approach to computing VaR and ES is applied to 10 years of daily returns of the S&P 500.

Let us begin by listing some properties of a probability density function (pdf) (Section II).

II. Properties of Probability Density Functions f(x) 

Property 1: f(x) should be bounded for all values of x: $f(x) \le 1$

Property 2: f(x) should be non-negative for all values of x: $f(x) \ge 0$


Property 3: f(x) should be continuous for all values of x.

$$\lim_{h \to 0} f(x+h) = \lim_{h \to 0} f(x-h) = f(x)$$

Property 4: The area under the curve f(x) from -infinity to +infinity should be 1.

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$

Property 5: f(x) should tend to zero at -infinity and +infinity.

$$\lim_{x \to -\infty} f(x) = \lim_{x \to +\infty} f(x) = 0$$

III. Why does the normal fail to capture the empirical distribution? 

The normal distribution has the following functional form:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Although the normal distribution is the most widely used distribution, it has certain built-in features that make it difficult to fit to real data.

Firstly, the normal distribution is symmetric around the mean, whereas most empirical distributions are right skewed or left skewed. Therefore, to capture the empirical distribution we need a pdf whose functional form allows asymmetry.

Secondly, a normal distribution has only two degrees of freedom, i.e. the mean and the standard deviation. With only two degrees of freedom it becomes difficult to capture the core of the distribution (data points around the mean) and the tails simultaneously. We need more knobs for fitting the distribution.
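The two shortcomings above can be checked on any return series with two summary statistics: skewness (zero for a symmetric distribution) and excess kurtosis (zero for a normal distribution). The following is a minimal sketch using only the Python standard library. The series is synthetic and is only a stand-in, since the BSE-Sensex data is not reproduced here; the name `sample_moments` and all parameter values are assumptions made for illustration.

```python
import math
import random

def sample_moments(xs):
    """Return (mean, std, skewness, excess kurtosis) of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    skew = sum(((x - mean) / std) ** 3 for x in xs) / n
    excess_kurt = sum(((x - mean) / std) ** 4 for x in xs) / n - 3.0
    return mean, std, skew, excess_kurt

# Synthetic stand-in for daily returns (not the BSE-Sensex data):
# mostly calm days plus occasional turbulent days with a negative drift.
rng = random.Random(0)
returns = [
    rng.gauss(0.1, 1.0) if rng.random() < 0.9 else rng.gauss(-0.8, 3.5)
    for _ in range(20000)
]

mean, std, skew, excess_kurt = sample_moments(returns)
print(f"skewness={skew:.2f}, excess kurtosis={excess_kurt:.2f}")
# A sample drawn from a single normal would give values near 0 for both.
```

A two-parameter normal fit can match only the mean and the standard deviation, so any non-zero skewness or excess kurtosis in the data is left unexplained by construction.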

IV. Why is the normal distribution still preferred?

In spite of its disadvantages, the normal distribution has certain properties that are still desirable. For example, any pdf should approach zero at +infinity and -infinity (Property 5).

In the literature, power law functions are used to handle such scenarios. A power law function can, in general, be described by the following functional form:

$$f(x) = a_0 + \sum_{i=1}^{N} \frac{a_i}{(x - b_i)^{i}}$$

A power law function, although tractable and flexible, suffers from singularities at the location parameters $\{b_i\}$. Therefore, it cannot be used to fit the whole distribution, but it can be used to model the tail. In fact, the generalized Pareto distribution (GPD), which is used to model the tail in this paper, is a special form of power law distribution.

There are other distributions, such as the Cauchy distribution, that are polynomial in nature and do not have the problem of singularity.


$$f(x) = \frac{c}{a^2 + (x-b)^2}; \quad a > 0; \quad c = \frac{a}{\pi}$$

However, the Cauchy distribution also has a built-in symmetry assumption around the location parameter "b". Moreover, the Cauchy distribution has only two degrees of freedom, just like the normal distribution. Therefore, it is not clear that it will provide a better fit to the empirical distribution than the normal distribution does.

It is difficult to think of a function that satisfies all five properties of a pdf, is flexible and is yet tractable. For example, let us consider a highly flexible pdf which has the following form:

$$f(x) = c\, e^{-a_0 - a_1 (x-b_1)^2 - a_3 (x-b_2)^4 - a_4 (x-b_3)^6}, \quad \text{all } a_i > 0$$

It is easy to see that the above functional form satisfies Property 1 (since $e^{-g(x)}$ is always less than 1 if g(x) > 0 for all x). Property 2 is also satisfied if c > 0. Property 3, i.e. the continuity property, is also satisfied (since $e^{-g(x)}$ is continuous for all x if g(x) is continuous). Property 5 is satisfied if $a_4 > 0$. To satisfy Property 4 we have to adjust "c" so that the area under the pdf sums to one.

V. New Approach 

As discussed in Section IV, the normal distribution has certain desirable properties, such as continuity and tractability. However, it also carries implicit assumptions, such as symmetry, which create problems when fitting the whole distribution. We want to retain the tractability and continuity of the normal distribution but at the same time do away with the symmetry. Moreover, we need to increase the degrees of freedom, i.e. we need more than two knobs for fitting the pdf.

I propose to use a functional form which is a linear combination of two normal pdfs:

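$$f_1(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2}; \quad f_2(x) = \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^2}$$

$$f(x) = w f_1(x) + (1-w) f_2(x), \quad 0 \le w \le 1$$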

Let us now examine whether all the properties of a pdf are satisfied by the proposed distribution.


Property 1: f(x) should be bounded for all values of x: $f(x) \le 1$

Since $f_1(x) \le 1$ and $f_2(x) \le 1$, it follows that $f(x) \le 1$: f(x) is a convex combination of $f_1(x)$ and $f_2(x)$ and therefore lies between them.

Property 2: f(x) should be non-negative for all values of x: $f(x) \ge 0$

Since $f_1(x) \ge 0$ and $f_2(x) \ge 0$, it follows that $f(x) \ge 0$: f(x) is a convex combination of $f_1(x)$ and $f_2(x)$ and therefore lies between them.

Property 3: f(x) should be continuous for all values of x.

Since $f_1(x)$ and $f_2(x)$ are both continuous, f(x) is continuous because it is a linear combination of $f_1(x)$ and $f_2(x)$.

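Property 4: The area under the curve f(x) from -infinity to +infinity should be 1.

$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{+\infty} \left( w f_1(x) + (1-w) f_2(x) \right) dx = w \int_{-\infty}^{+\infty} f_1(x)\,dx + (1-w) \int_{-\infty}^{+\infty} f_2(x)\,dx$$

$$= w(1) + (1-w)(1) = 1 \quad \left( \because \int_{-\infty}^{+\infty} f_1(x)\,dx = 1 \text{ and } \int_{-\infty}^{+\infty} f_2(x)\,dx = 1 \right)$$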

Property 5: The pdf should tend to zero at -infinity and +infinity.

$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} \left[ w f_1(x) + (1-w) f_2(x) \right] = w \lim_{x \to \infty} f_1(x) + (1-w) \lim_{x \to \infty} f_2(x) = w(0) + (1-w)(0) = 0$$

Similarly,

$$\lim_{x \to -\infty} f(x) = 0$$

We now argue that the proposed functional form will provide a fit that is at least as good as the

normal distribution. Let us take a look at the proposed functional form of the p.d.f.

$$f(x) = w f_1(x) + (1-w) f_2(x), \quad 0 \le w \le 1$$

If we set w = 1 we get a standalone normal distribution. Hence, if we use the maximum likelihood method to estimate the parameters, we know for sure that there is at least one solution with the following parameters:

$$w = 1; \quad \mu_1 = \frac{\sum x}{n}; \quad \sigma_1^2 = \frac{\sum (x - \mu_1)^2}{n}; \quad \mu_2 \in R,\ \sigma_2 \in R$$


It is worth noting that with this simple arrangement we have overcome the symmetry assumption. In addition, the degrees of freedom, or the knobs, have gone up from 2 to 5. The additional parameters are $w$, $\mu_2$ and $\sigma_2$.
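The claims made so far about the mixture can be checked numerically. The sketch below, which uses only the Python standard library, evaluates the proposed pdf for one hypothetical parameter set (the values are chosen for illustration and are not the fitted BSE-Sensex values), confirms that the area under the curve is close to 1, that w = 1 reproduces a single normal, and that the density is no longer symmetric around its mean. The helper names `mixture_pdf` and `integrate` are assumptions.

```python
from statistics import NormalDist

def mixture_pdf(x, w, mu1, sigma1, mu2, sigma2):
    """f(x) = w*f1(x) + (1-w)*f2(x), with f1 and f2 normal densities."""
    f1 = NormalDist(mu1, sigma1).pdf(x)
    f2 = NormalDist(mu2, sigma2).pdf(x)
    return w * f1 + (1 - w) * f2

def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return total * h

# Hypothetical parameter values, chosen only for illustration.
params = dict(w=0.8, mu1=0.2, sigma1=1.0, mu2=-0.5, sigma2=3.0)

# Property 4: the area under the mixture pdf should be 1.
area = integrate(lambda x: mixture_pdf(x, **params), -40.0, 40.0)

# With w = 1 the mixture collapses to the single normal f1.
collapsed = mixture_pdf(0.7, 1.0, 0.2, 1.0, -0.5, 3.0)
single = NormalDist(0.2, 1.0).pdf(0.7)

# Asymmetry: density at equal distances on either side of the mixture mean.
mean = params["w"] * params["mu1"] + (1 - params["w"]) * params["mu2"]
left = mixture_pdf(mean - 2.0, **params)
right = mixture_pdf(mean + 2.0, **params)

print(f"area={area:.6f}, left={left:.4f}, right={right:.4f}")
```

The asymmetry appears only when the two components have different means; with equal means the mixture stays symmetric and gains only heavier tails.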

The Maximum Likelihood Estimation (MLE) method is used to estimate the parameters of the proposed distribution:

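$$\max_{w,\,\mu_1,\,\sigma_1,\,\mu_2,\,\sigma_2} \; \sum_i \ln\left[ w\,\frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_i-\mu_1}{\sigma_1}\right)^2} + (1-w)\,\frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_i-\mu_2}{\sigma_2}\right)^2} \right]$$

such that

$$0 \le w \le 1; \quad \sigma_1 \ge 0; \quad \sigma_2 \ge 0$$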
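The paper does not state which optimizer was used to solve this maximization. As one possible illustration, the sketch below climbs the same log-likelihood with expectation-maximization (EM) iterations, a standard technique for normal mixtures that is substituted here and is not claimed to be the author's procedure. It uses only the Python standard library and runs on synthetic data; the function names, starting values and data-generating parameters are all assumptions.

```python
import math
import random
from statistics import NormalDist

def log_likelihood(xs, w, mu1, s1, mu2, s2):
    """Sum over the sample of ln[w*f1(x) + (1-w)*f2(x)]."""
    f1, f2 = NormalDist(mu1, s1), NormalDist(mu2, s2)
    return sum(math.log(w * f1.pdf(x) + (1 - w) * f2.pdf(x)) for x in xs)

def em_fit(xs, iters=100):
    """EM iterations for a two-normal mixture (finds a local maximum)."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    # Start from one narrow and one wide component around the sample mean.
    w, mu1, s1, mu2, s2 = 0.5, mean, 0.5 * std, mean, 2.0 * std
    for _ in range(iters):
        f1, f2 = NormalDist(mu1, s1), NormalDist(mu2, s2)
        # E-step: probability that each observation came from component 1.
        r = []
        for x in xs:
            p1 = w * f1.pdf(x)
            p2 = (1 - w) * f2.pdf(x)
            r.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted weight, means and deviations.
        n1 = sum(r)
        n2 = n - n1
        w = n1 / n
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / n2
        s1 = math.sqrt(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, xs)) / n1)
        s2 = math.sqrt(sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, xs)) / n2)
    return w, mu1, s1, mu2, s2

# Synthetic returns (not the BSE-Sensex data), drawn from a known mixture.
rng = random.Random(1)
data = [
    rng.gauss(0.1, 1.0) if rng.random() < 0.8 else rng.gauss(-0.3, 3.0)
    for _ in range(2000)
]

w, mu1, s1, mu2, s2 = em_fit(data)

# Single-normal MLE for comparison: the w = 1 special case.
m = sum(data) / len(data)
s = math.sqrt(sum((x - m) ** 2 for x in data) / len(data))
ll_mixture = log_likelihood(data, w, mu1, s1, mu2, s2)
ll_normal = log_likelihood(data, 1.0, m, s, m, s)
print(f"w={w:.2f}, s1={s1:.2f}, s2={s2:.2f}, gain={ll_mixture - ll_normal:.1f}")
```

Comparing `ll_mixture` with `ll_normal` is a direct check of the argument above: since the single normal is the w = 1 special case, a properly maximized mixture likelihood cannot be lower.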

We now test our solution on 10 years of daily returns of the BSE-Sensex. Using the method described, we estimate the parameters of the proposed distribution (Figure 4).

Figure 4: Parameter values of proposed distribution for 10 year daily-return data of BSE-sensex

Figure 5 shows the result of fitting the proposed distribution to ten years of returns of the BSE-Sensex. The proposed distribution almost overlaps the empirical distribution and fits it clearly better than the normal distribution.

A similar analysis was done for the Infosys stock (a constituent of the BSE-Sensex) and the Praj Industries stock (which is not part of the BSE-Sensex). The results are presented in Figure 6 and Figure 7 respectively. In all cases, the proposed distribution fits the empirical distribution much better than the normal distribution does. Having fitted the distribution to the index returns, I now turn my attention to the distribution of returns on options, which is believed to be highly skewed.

A hypothetical call option on the BSE-Sensex with an expiry of 1/1/2011 and a strike of 15000 is considered. In addition, the risk-free rate is assumed to be fixed at 5% and the volatility at 35%. The option return in this case depends only on the movement of the underlying and the time decay. Even in this case, the proposed distribution matches the empirical distribution well (Figure 8).

[Plot: "BSE-Sensex". x-axis: x (returns); y-axis: probability; series: Empirical CDF, Proposed CDF, Normal CDF.]

Figure 5: Results of fitting the proposed pdf to BSE-Sensex returns

[Plot: "Infosys". y-axis: cdf; series: empirical cdf, fitted cdf, normal cdf.]

Figure 6: Results of fitting the proposed distribution to Infosys returns

[Plot: "Praj Industries". x-axis: returns; y-axis: cdf; series: empirical, fitted, normal.]

Figure 7: Results of fitting the proposed distribution to Praj Industries returns

[Plot: "Call Option Analysis". x-axis: x; y-axis: cdf; series: empirical, fitted.]

Figure 8: Results of fitting the proposed distribution to call option returns: Strike=15000, Expiry=


VI. VaR 

In Section V, we saw that the proposed pdf fits the empirical distribution much better than the normal distribution. Now we need to see the impact of the proposed distribution on the value of VaR.

We arrive at the VaR numbers as follows:

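For the proposed distribution:

$$f_1(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2}; \quad f_2(x) = \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^2}$$

$$f(x) = w f_1(x) + (1-w) f_2(x)$$

$$1 - \text{VaR Percentile} = \int_{-\infty}^{VaR} f(x)\,dx$$

For the normal distribution:

$$1 - \text{VaR Percentile} = \int_{-\infty}^{VaR} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$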

Figure 7: VaR Numbers for BSE-Sensex

It is clear that, overall, the VaR numbers are closer to the historical VaR numbers when the proposed distribution is used. This is the VaR number we will use for computing Expected Shortfall.
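For the proposed distribution the VaR equation has no closed-form solution, but the mixture CDF is continuous and increasing, so the VaR can be found by a one-dimensional root search. The sketch below uses bisection and only the Python standard library; the paper does not say which numerical method was used, and the parameter values are hypothetical rather than the fitted ones (which appear in a figure not reproduced here).

```python
import math
from statistics import NormalDist

def mixture_cdf(x, w, mu1, s1, mu2, s2):
    """F(x) = w*F1(x) + (1-w)*F2(x) for the two-normal mixture."""
    return w * NormalDist(mu1, s1).cdf(x) + (1 - w) * NormalDist(mu2, s2).cdf(x)

def mixture_quantile(p, w, mu1, s1, mu2, s2, tol=1e-10):
    """Solve F(x) = p by bisection; F is continuous and increasing."""
    lo = min(mu1, mu2) - 20 * max(s1, s2)
    hi = max(mu1, mu2) + 20 * max(s1, s2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, w, mu1, s1, mu2, s2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters, for illustration only.
params = dict(w=0.8, mu1=0.1, s1=1.0, mu2=-0.3, s2=3.0)

# 95% VaR: the return level with 1 - 0.95 = 5% probability to its left.
var_95 = mixture_quantile(0.05, **params)

# Normal with the same mean and variance as the mixture, for comparison.
w, mu1, s1, mu2, s2 = (params[k] for k in ("w", "mu1", "s1", "mu2", "s2"))
mean = w * mu1 + (1 - w) * mu2
variance = w * (s1**2 + mu1**2) + (1 - w) * (s2**2 + mu2**2) - mean**2
normal_var_95 = NormalDist(mean, math.sqrt(variance)).inv_cdf(0.05)

print(f"mixture VaR(95%)={var_95:.3f}, normal VaR(95%)={normal_var_95:.3f}")
```

Repeating the search for several percentiles gives the kind of side-by-side comparison of proposed, normal and historical VaR that the section describes.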

VII. Expected Shortfall

Although the proposed distribution gives a much better measure of VaR, we need to see whether this distribution can be used for extreme values as well. The conditional cumulative probabilities given by the empirical distribution, the normal distribution and the proposed distribution are plotted on the same graph to check the fit at the tails (Figure 9). The conditional cumulative probabilities are defined as:

$$F_{VaR}(y) = \Pr(VaR < x < VaR + y \mid x > VaR) = \frac{F(VaR + y) - F(VaR)}{1 - F(VaR)}$$


In the expression above, F is the cumulative distribution function of the negative of the returns. Since we are concerned with VaR, i.e. the loss, it is acceptable to multiply the whole time series by -1 and focus on the right tail instead of the left tail. Therefore, at the 95% probability level, F(VaR) = 95%.
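The conditional probability can be computed from any model CDF and compared with its empirical counterpart, which is what the tail plot does. The following is a minimal sketch in standard-library Python. The losses are synthetic normal draws, so here the model and the data agree by construction; with real returns the gap between the two curves is what reveals a poor tail fit. The function names are assumptions.

```python
import random
from statistics import NormalDist

def conditional_excess_cdf(cdf, var, y):
    """F_VaR(y) = (F(VaR + y) - F(VaR)) / (1 - F(VaR)) for y >= 0."""
    return (cdf(var + y) - cdf(var)) / (1.0 - cdf(var))

def empirical_excess_cdf(losses, var, y):
    """Share of the losses above VaR that exceed it by less than y."""
    tail = [x for x in losses if x > var]
    return sum(1 for x in tail if x < var + y) / len(tail)

# Synthetic losses (negative returns), not the paper's data.
rng = random.Random(2)
model = NormalDist(0.0, 1.5)
losses = [rng.gauss(0.0, 1.5) for _ in range(20000)]

var_95 = model.inv_cdf(0.95)  # F(VaR) = 95% on the loss scale
model_prob = conditional_excess_cdf(model.cdf, var_95, 1.0)
empirical_prob = empirical_excess_cdf(losses, var_95, 1.0)
print(f"model={model_prob:.3f}, empirical={empirical_prob:.3f}")
```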

[Plot: "tails". x-axis: u; y-axis: conditional probabilities; series: Cond/Emp, Cond/Fitted, Cond/Norm.]

Figure 9: Conditional probability distribution for tails

Clearly, neither the normal distribution nor the proposed distribution explains the tail satisfactorily. Therefore, we need to fit the tail separately to estimate the Expected Shortfall number.

To do that, I fit a Generalized Pareto Distribution (GPD) to the tail. The GPD is defined as:

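$$G(y) = \begin{cases} 1 - e^{-y/\sigma} & \text{if } \xi = 0 \\ 1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi} & \text{if } \xi > 0 \end{cases}$$

where y is the excess over VaR, $\xi$ is the shape parameter and $\sigma$ is the scale parameter.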


We again use the maximum likelihood method to estimate the shape and scale parameters.

There are a few points worth mentioning about the GPD:

1.  The GPD is a conditional distribution.

2.  The GPD is a cumulative distribution.

3.  y is the excess value over the threshold, and y > 0.

4.  $\xi$ can never be negative because, if it were, the conditional distribution would approach infinity as y tends to infinity.

5.  $\sigma$ can never be negative because $\xi$ is non-negative and $1 + \frac{\xi y}{\sigma} > 0$ must hold for all y > 0.
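The GPD definition and its maximum likelihood fit can be sketched as follows, using only the Python standard library. The paper does not describe its optimizer, so a crude grid search over candidate (shape, scale) pairs is used here purely for illustration; the excesses are synthetic exponential draws (the shape = 0 case), and the function names and grid ranges are assumptions.

```python
import math
import random

def gpd_cdf(y, xi, sigma):
    """G(y) for an excess y >= 0, shape xi >= 0 and scale sigma > 0."""
    if xi == 0.0:
        return 1.0 - math.exp(-y / sigma)
    return 1.0 - (1.0 + xi * y / sigma) ** (-1.0 / xi)

def gpd_neg_log_likelihood(excesses, xi, sigma):
    """Negative log-likelihood; the density is the derivative of G(y)."""
    n = len(excesses)
    if xi == 0.0:
        return n * math.log(sigma) + sum(excesses) / sigma
    log_terms = sum(math.log1p(xi * y / sigma) for y in excesses)
    return n * math.log(sigma) + (1.0 / xi + 1.0) * log_terms

def fit_gpd_grid(excesses, xis, sigmas):
    """Crude grid-search MLE over candidate (xi, sigma) pairs."""
    candidates = ((xi, s) for xi in xis for s in sigmas)
    return min(candidates, key=lambda p: gpd_neg_log_likelihood(excesses, *p))

# Synthetic excesses over the threshold (not the BSE-Sensex tail):
# exponential with mean 1.25, i.e. a GPD with xi = 0 and sigma = 1.25.
rng = random.Random(3)
excesses = [rng.expovariate(1.0 / 1.25) for _ in range(2000)]

xis = [0.05 * k for k in range(11)]           # 0.00, 0.05, ..., 0.50
sigmas = [0.5 + 0.05 * k for k in range(41)]  # 0.50, 0.55, ..., 2.50
xi_hat, sigma_hat = fit_gpd_grid(excesses, xis, sigmas)
print(f"xi_hat={xi_hat:.2f}, sigma_hat={sigma_hat:.2f}")
```

When the shape is exactly zero, the likelihood is maximized by setting the scale equal to the average excess, which gives a quick sanity check on any fitted scale.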

The result of fitting GPD on BSE-sensex tail is shown in Figure 10:

[Plot: "Tail Distribution". x-axis: u; y-axis: conditional probabilities; series: Cond/Emp, Cond/Fitted, Cond/Norm, Cond/Pareto.]

Figure 10: The Generalized Pareto Distribution fits the tail much better than either the normal or the proposed distribution.


Expected Shortfall (ES) is defined as

$$ES = \frac{\int_{VaR}^{\infty} x f(x)\,dx}{\int_{VaR}^{\infty} f(x)\,dx}$$

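For the GPD it is possible to obtain a closed-form solution for ES:

$$ES = \begin{cases} VaR + \sigma & \text{if } \xi = 0 \\ VaR + \dfrac{\sigma}{1 - \xi} & \text{if } \xi > 0 \end{cases}$$

For the BSE-Sensex data the estimated values of the shape and scale parameters are $\xi = 0$ and $\sigma = 1.243706$.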
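The closed form can be cross-checked numerically, because the mean excess over the threshold equals the integral of the survival function 1 - G(y). The sketch below, in standard-library Python, does this for the reported scale of 1.243706 with shape 0. The VaR value is a hypothetical placeholder, since the paper's VaR table is not reproduced here, and the function names are assumptions. The closed form is only valid for a shape parameter below 1, where the GPD mean is finite.

```python
import math

def gpd_expected_shortfall(var, xi, sigma):
    """ES = VaR + sigma / (1 - xi); for xi = 0 this is VaR + sigma."""
    if not 0.0 <= xi < 1.0:
        raise ValueError("closed form needs 0 <= xi < 1")
    return var + sigma / (1.0 - xi)

def gpd_mean_excess_numeric(xi, sigma, upper=1000.0, n=200000):
    """Mean excess as the integral of 1 - G(y), by the trapezoidal rule."""
    def survival(y):
        if xi == 0.0:
            return math.exp(-y / sigma)
        return (1.0 + xi * y / sigma) ** (-1.0 / xi)
    h = upper / n
    inner = sum(survival(i * h) for i in range(1, n))
    return (0.5 * (survival(0.0) + survival(upper)) + inner) * h

sigma_bse = 1.243706    # scale reported above for the BSE-Sensex tail (xi = 0)
hypothetical_var = 2.5  # placeholder value, NOT taken from the paper

es_bse = gpd_expected_shortfall(hypothetical_var, 0.0, sigma_bse)
numeric_mean_excess = gpd_mean_excess_numeric(0.0, sigma_bse)
print(f"ES={es_bse:.6f}, numeric mean excess={numeric_mean_excess:.6f}")
```

With shape 0 the tail is exponential, so the Expected Shortfall is simply the VaR plus the scale parameter, whatever the VaR level.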

Figure 11: The Expected Shortfall measure at the 95% VaR level.

The ES number for the generalized Pareto distribution lies between the ES predicted by the normal distribution and the ES predicted by the proposed distribution. This is to be expected from what we see in Figure 9: the normal distribution underestimates the values of extreme events, whereas the proposed distribution overestimates them.

VIII. Analysis for S&P 500

First we fit the mixed (proposed) distribution and the normal distribution to 10 years of daily returns of the S&P 500.

[Plot: "Distributions - SNP". x-axis: returns; y-axis: cdf; series: Empirical, Normal, Fitted.]

Figure 12: For the S&P 500 the proposed distribution does a much better job


The parameter estimates for the proposed distribution are:

Figure 13: Parameters of the proposed distribution for the S&P 500

Then we calculate VaR estimates for various percentiles:

Figure 14: The VaR numbers are closer to those of the empirical distribution if we use the proposed distribution.

Finally, we fit a Pareto distribution to the tail of the daily returns of the S&P 500 (Figure 14).

[Plot: "Tail-S&P". x-axis: excess return over VaR; y-axis: conditional probabilities; series: conditional empirical, conditional normal, conditional fitted, Pareto.]

Figure 14: The Pareto distribution fits the tail much better for the S&P 500


Figure 15: The Expected Shortfall number from fitting the Pareto, normal and proposed distributions

IX. Conclusion:

The method of taking a weighted average of pdfs fits the entire empirical distribution much better than the normal distribution. Although I have taken the pdf to be a mixture of two normal pdfs, there is no restriction on what kind of pdfs or how many pdfs can be used. We could have used three normal distributions, or two normal distributions and one Cauchy distribution. For the purpose of this study the weighted sum of two normal distributions was sufficient. We used this distribution to approximate the VaR number and showed that the VaR computed using the proposed distribution was much closer to the VaR predicted by the empirical data. Next we examined the behavior at the tail and found that the proposed distribution does not satisfactorily explain it. This is going to be the case irrespective of the distribution we use: if we try to fit a single curve to the whole distribution, it will fit the core better than the tails. Therefore, we need to model the tail separately. To model the distribution of the tail we fitted a Generalized Pareto Distribution (GPD) to the tail alone. From the GPD, we extract the value of Expected Shortfall (ES).


References:

1.  Wo-Chiang Lee, Applying Generalized Pareto Distribution to the Risk Management of Commerce Fire Insurance.

2.  http://www.autonlab.org/tutorials, Maximum likelihood.

3.  Ramazan Gençay, Faruk Selçuk, Abdurrahman Ulugülyağci, High volatility, thick tails and extreme value theory in value-at-risk estimation.

4.  Alexander J. McNeil, Rüdiger Frey, Quantitative Risk Management.