On the analysis of cross-correlations in South African market data

ARTICLE IN PRESS

Physica A 344 (2004) 294–298

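0378-4371/$ - see front matter © 2004 Elsevier B.V. All rights reserved.

doi:10.1016/j.physa.2004.06.138

*Corresponding author. Tel.: +27-21-650-3191; fax: +27-21-650-2334.

E-mail addresses: [email protected] (D. Wilcox), [email protected] (T. Gebbie).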

www.elsevier.com/locate/physa

On the analysis of cross-correlations in South African market data

Diane Wilcox (a,*), Tim Gebbie (b)

(a) Department of Mathematics & Applied Mathematics, University of Cape Town, Rondebosch, Cape Town, 7700, South Africa

(b) FutureGrowth, 3rd Floor Great Westerford, Main Rd, Rondebosch, 7700, South Africa

Received 12 December 2003

Available online 21 July 2004

Abstract

We report briefly on an application of random matrix theory to the analysis of SA financial market data (An analysis of cross-correlations in South African financial market data, e-print cond-mat/0402389). Correlation matrices C are constructed from 10 years of daily data for stocks listed on the Johannesburg Stock Exchange from January 1993 to December 2002. Spectral properties of C are tested against random matrix predictions. We highlight some quantitative differences which arise when treating prices as existing only when measured, as opposed to interpolating missing or illiquid trading days with a zero-order hold.

© 2004 Elsevier B.V. All rights reserved.

PACS: 02.10.Yn; 05.40.Ca; 05.45.Tp; 87.23.Ge

Keywords: Random matrices; Cross-correlations; Finance; Emerging markets

1. Introduction

We consider how the treatment of missing data and thin trading (no price changes for a stock over several time periods) affects the computation of cross-correlations in an emerging market. Our investigation is based on 10 years of daily data for traded shares listed on the JSE main board from January 1993 to December 2002 [1].

Properties of random matrices first became known with Wigner's seminal work in the 1950s for application in nuclear physics [2,3]. More recently, random matrix theory has been applied to calibrate and reduce the effects of noise in financial time series. The extent to which properties of the estimated correlation matrices deviate from random matrix theory (RMT) predictions clarifies the status of the information derived from the computed matrices (cf. Refs. [4–8]).

Let $A$ denote an $N \times L$ matrix whose entries are i.i.d. Gaussian random variables with zero mean and unit variance. The matrix $W = \frac{1}{L} A A^{T}$ is referred to as a Wishart matrix. Letting $N, L \to \infty$ while $Q = L/N$ is kept fixed, we summarise universal properties to be applied in our analysis, namely the Wishart distribution for eigenvalues [9,10], the Wigner surmise for eigenvalue spacings [2,3], and the Porter–Thomas distribution and the inverse participation ratio (IPR) for eigenvector components [3,7,8]:

Universal properties of Wishart matrices

Density of eigenvalues:

$$p(\lambda) = \begin{cases} \dfrac{Q}{2\pi\lambda}\sqrt{(\lambda_{\max}-\lambda)(\lambda-\lambda_{\min})}, & \lambda \in [\lambda_{\min}, \lambda_{\max}], \\ 0, & \text{otherwise}, \end{cases} \qquad \text{where } \lambda_{\max/\min} = 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}. \tag{1}$$

Density of spacings of ordered eigenvalues:

$$p(s) = \frac{s\pi}{2}\exp\left(-\frac{s^{2}\pi}{4}\right), \qquad \text{where } s = (\lambda_{i+1}-\lambda_{i})/d \text{ and } d = \sum(\lambda_{i+1}-\lambda_{i}). \tag{2}$$

Density of eigenvector components:

$$p(u) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{u^{2}}{2}\right). \tag{3}$$

Expected IPR:

$$E[I_a] = 3/N. \tag{4}$$
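
The eigenvalue density of Eq. (1) and the expected IPR of Eq. (4) can be checked numerically against a simulated Wishart matrix. The following is a minimal sketch, assuming NumPy is available; the function names (`wishart_edges`, `wishart_density`, `inverse_participation_ratios`, `simulate_wishart`) and the sizes used in the usage lines are illustrative choices, not taken from the paper.

```python
import numpy as np


def wishart_edges(Q):
    """Spectrum edges (lambda_min, lambda_max) of Eq. (1) for Q = L/N."""
    centre = 1.0 + 1.0 / Q
    half_width = 2.0 * np.sqrt(1.0 / Q)
    return centre - half_width, centre + half_width


def wishart_density(lam, Q):
    """Eigenvalue density p(lambda) of Eq. (1); zero outside the edges."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lam_min, lam_max = wishart_edges(Q)
    density = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    x = lam[inside]
    density[inside] = Q / (2.0 * np.pi * x) * np.sqrt((lam_max - x) * (x - lam_min))
    return density


def inverse_participation_ratios(eigenvectors):
    """IPR of each column: the sum of the fourth powers of its components."""
    return np.sum(eigenvectors ** 4, axis=0)


def simulate_wishart(N, L, seed=0):
    """Eigen-decomposition of W = A A^T / L with i.i.d. N(0, 1) entries in A."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, L))
    W = A @ A.T / L
    # eigh returns ascending eigenvalues and unit-norm eigenvectors as columns.
    return np.linalg.eigh(W)


if __name__ == "__main__":
    N, L = 200, 800  # illustrative sizes only, giving Q = 4
    eigenvalues, eigenvectors = simulate_wishart(N, L)
    print("RMT edges:", wishart_edges(L / N))
    print("sample range:", eigenvalues.min(), eigenvalues.max())
    print("mean IPR * N:", inverse_participation_ratios(eigenvectors).mean() * N)
```

For finite $N$ the sample eigenvalues only approximately respect the edges of Eq. (1), so a comparison with empirical correlation matrices should allow for some spill-over near $\lambda_{\max}$.
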
2. Cross-correlations in JSE data

We summarise some of the known qualitative features of the JSE, one of the 20 largest national markets in the world. Although many of the main board JSE shares are illiquid, the market as a whole is a fairly liquid one. About six dominant shares account for almost a third of the index and have a bias towards resources. The resources sector in turn is strongly correlated with the rand-dollar exchange rate. Next, since shares are not always listed at the same time, different shares do not always trade on the same day. Lastly, some shares which trade infrequently occasionally trade in large volumes for several days.

The data set used incorporated a zero-order hold for prices when there was no trading. This accounts for sequences of zero-valued returns in the return time series even though no measurements occurred. The 10 years of data were windowed to create six sets of 5 years of price data consisting of 253, 293, 321, 330, 335 and 341 shares, respectively, ranging from 1993–1997 to 1998–2002. Each block was screened to remove shares which were de-listed or which traded very infrequently.

Investigating the effect of different treatments of measurements for prices, our approach favours the notions that (1) no trading implies no price measurement, and (2) share cross-correlations can only be computed when there are measurements on the same day. We find the returns $r_i(t) = \ln[S_i(t+\Delta t)/S_i(t)]$, where $S_i(t)$ denotes the price of asset $i \in \{1, \ldots, N\}$ at time $t$. We compute the usual cross-correlation matrix, $C_{ij} = (\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle)/(\sigma_i \sigma_j)$, for idealized data (non-zero price fluctuations, no missing data), where $\langle \ldots \rangle$ denotes the average over the period studied and $\sigma_i^2 := \langle r_i^2 \rangle - \langle r_i \rangle^2$ is the variance of the price changes of asset $i$. We apply the measured-data cross-correlation matrix, $\hat{C}_{ij} = (\langle \hat{r}_i \hat{r}_j \rangle - \langle \hat{r}_i \rangle \langle \hat{r}_j \rangle)/(\hat{\sigma}_i \hat{\sigma}_j)$, for the case when there is missing data in the returns time series, where $\hat{r}_i$ and $\hat{r}_j$ denote subseries of $r_i$ and $r_j$ such that for each $i$–$j$ pair there exists measured data for both $r_i$ and $r_j$ at every time period in the subseries, and $\hat{\sigma}_i^2 := \langle \hat{r}_i^2 \rangle - \langle \hat{r}_i \rangle^2$. We compute covariance matrices for the 6 sets of data in three different ways.
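
The measured-data cross-correlation defined above amounts to a pairwise-complete estimate: each pair of shares is correlated over the days on which both were measured. A minimal sketch follows, assuming NumPy, that missing returns are encoded as NaN, and that the variances are taken over the common subseries of each pair (as the definition of the subseries suggests); the function name `measured_data_correlation` is hypothetical.

```python
import numpy as np


def measured_data_correlation(returns):
    """Pairwise-complete correlation matrix.

    returns: array of shape (N, T), one row per share, with NaN wherever
    no price measurement (and hence no return) exists. This encoding is
    an assumption of the sketch, not something specified in the paper.
    """
    returns = np.asarray(returns, dtype=float)
    n_assets = returns.shape[0]
    measured = ~np.isnan(returns)
    C = np.full((n_assets, n_assets), np.nan)
    for i in range(n_assets):
        for j in range(i, n_assets):
            both = measured[i] & measured[j]  # days on which both shares have data
            if both.sum() < 2:
                continue  # too little overlap: leave the entry as NaN
            ri, rj = returns[i, both], returns[j, both]
            cov = np.mean(ri * rj) - np.mean(ri) * np.mean(rj)
            var_i = np.mean(ri ** 2) - np.mean(ri) ** 2
            var_j = np.mean(rj ** 2) - np.mean(rj) ** 2
            if var_i > 0.0 and var_j > 0.0:
                C[i, j] = C[j, i] = cov / np.sqrt(var_i * var_j)
    return C
```

When no data are missing this reduces to the usual correlation matrix. Because every entry may be estimated over a different set of days, a matrix assembled this way is not guaranteed to be positive semi-definite, which is worth keeping in mind when its spectrum is compared with RMT predictions.
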

Case 1: We assign the value of zero whenever there is no measured data and compute the covariance matrix in the usual way.

Case 2: We compute the measured-data covariance matrix to overcome missing measurements.

Case 3: We address the problem of no trading, i.e., zero price fluctuations, for several time periods in succession. To do so, in the event of 2 or more successive zero-valued price fluctuations we delete the measured value $r_i(t)$, effectively turning the zero-valued information into missing data. We then compute the measured-data covariance matrix.

Once the matrices are computed we apply random matrix methods to analyse their spectral properties. We consider the temporal stability of the correlation matrices for the 6 epochs by computing overlap matrices, and also investigate the characteristics of the eigenvectors, determined by their components, particularly those corresponding to large eigenvalues (see Ref. [1]).

Figs. 1 and 2 summarise some of our analysis. Fig. 1 gives a comparison of some of the spectral properties for two of the three different covariance matrices computed as described above, alongside RMT predictions. Fig. 2 depicts the autocorrelation of eigenvectors. Some notable differences in properties of the matrices are exposed. As in preceding analyses of financial market data, in both cases reported here, a significant part of the distribution of eigenvalues falls within the range of RMT predictions. However, it was found that by computing measured-data covariance matrices, a far less substantial part of the spectrum falls within the range of the Wishart distribution. In contrast, our results suggest that using zero-values when there are no measurements tends to yield a distribution of eigenvalues which is more dominated by Gaussian noise (Case 1). Further analysis, including a closer examination of the distribution of eigenvector components and characteristics, is given in Ref. [1].

[Figure 1, shown for Cases 1 and 3. Panels: (a) "Eigenvalue PDF's", P(λ) against eigenvalue λ; (b) "Nearest-neighbour spacings", density of spacings against spacing ξ_{k+1} − ξ_k, with sample and RMT curves; (c) "Inverse Participation Ratios", localization against eigenvalue λ on logarithmic axes.]

Fig. 1. Spectral properties of the covariance matrices computed for daily returns of shares on the JSE Main Board from 1998 to 2002 for Cases 1 and 3. (a) The eigenvalue densities with circles highlighting distinct eigenvalues greater than RMT predicted maximum values; Inset: the Wishart distribution, Eq. (1), is superimposed on the distribution of small eigenvalues ($\lambda < 5$). (b) The nearest-neighbour spacing distributions of the folded eigenvalues [3,8] with a plot of the Wigner surmise, Eq. (2). (c) The IPRs, $I_a = \sum_{i=1}^{N} (v_i^a)^4$, where $v_i^a$ is the $i$th component of eigenvector $v^a$; Eq. (4) gives the RMT prediction for $E[I_a]$.

[Figure 2. Panels: (a) "Overlap delay 0"; (b) "Case 1: Overlap delay 1"; (c) "Case 3: Overlap delay 1". Both axes are eigenvector number; the colour scale is the correlation of eigenvectors, from −1 to 1.]

Fig. 2. Overlap matrices are computed using 6 sets of 5-year windows of daily returns. (a) The overlap matrix with zero delay: there is perfect overlap with ones on the diagonal and zeros elsewhere for the autocorrelation of eigenvectors from period 1st January 1998 to 31st December 2002. (b) and (c) The first temporal overlaps for Cases 1 and 3, respectively, i.e., the correlations of the eigenvectors of the 14 largest eigenvalues computed for the epoch 1998–2002 with those for the period 1997–2001.
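
The data treatments behind Cases 1 and 3 of Section 2 can be sketched as two small preprocessing steps applied to a matrix of returns before any covariance matrix is computed. This is only an illustration under stated assumptions: NumPy is used, missing returns are encoded as NaN, the whole run of successive zeros is discarded in Case 3 (the text does not say whether the first zero of a run is kept), and a missing value is taken to interrupt a run. The function names are hypothetical.

```python
import numpy as np


def case1_zero_fill(returns):
    """Case 1: assign zero wherever there is no measured return (NaN)."""
    returns = np.asarray(returns, dtype=float)
    return np.where(np.isnan(returns), 0.0, returns)


def case3_drop_zero_runs(returns, min_run=2):
    """Case 3: turn runs of `min_run` or more successive zero returns into
    missing data (NaN). Expects a 2-D array with one row per share."""
    out = np.array(returns, dtype=float)  # work on a copy of the input
    n_periods = out.shape[1]
    for row in out:  # each row is a view, so edits land in `out`
        is_zero = row == 0.0  # NaN compares False, so it ends a run
        start = None
        for t in range(n_periods + 1):
            if t < n_periods and is_zero[t]:
                if start is None:
                    start = t  # a new run of zeros begins here
            else:
                if start is not None and t - start >= min_run:
                    row[start:t] = np.nan  # discard the whole run
                start = None
    return out
```

Under these assumptions, Case 1 corresponds to the usual covariance computed on `case1_zero_fill(returns)`, Case 2 to the measured-data covariance on the returns with missing values left in place, and Case 3 to the measured-data covariance on `case3_drop_zero_runs(returns)`.
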

References

[1] D. Wilcox, T. Gebbie, An analysis of cross-correlations in South African financial market data, e-print cond-mat/0402389.

[2] M.L. Mehta, Random Matrices, second ed., Academic Press, New York, 1991.

[3] T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Random matrix theories in quantum physics: common concepts, Phys. Rep. 299 (1998) 190.

[4] Z. Burda, J. Jurkiewicz, M.A. Nowak, G. Papp, I. Zahed, Free Lévy matrices and financial correlations, e-print cond-mat/0103109.

[5] S. Drożdż, J. Kwapień, J. Speth, M. Wójcik, Identifying complexity by means of matrices, Physica A 314 (2002) 355–361, e-print cond-mat/0112271.

[6] S. Galluccio, J.-P. Bouchaud, M. Potters, Rational decisions, random matrices and spin glasses, Physica A 259 (1998) 449–456.

[7] L. Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters, Noise dressing of financial correlation matrices, Phys. Rev. Lett. 1467–1470.

[8] V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, T. Guhr, H.E. Stanley, Random matrix approach to cross correlations in financial data, Phys. Rev. E 65 (2001) 066126.

[9] Z.D. Bai, Methodologies in spectral analysis of large dimensional random matrices, a review, Statist. Sinica 9 (1999) 611–677.

[10] A.M. Sengupta, P.P. Mitra, Distribution of singular values for some random matrices, Phys. Rev. E 60 (1999) 3389.
