Comparative tests of Fama-French Three and Five-Factor models using Principal Component Analysis on the Chinese Stock Market

Name: Eric Lai

Degree: BSc Investment and Financial Risk Management

Cass Business School

Supervisor Name: John Hatgioannides

Submission Date: 8th April 2016

"I certify that I have complied with the guidelines on plagiarism

outlined in the Course Handbook in the production of this

dissertation and that it is my own, unaided work".

Signature...................................................................................................

Acknowledgements

I offer my gratitude and thanks to my supervisor John Hatgioannides who has

piqued my interest in financial engineering for asset pricing models and guided me

throughout the project.

I also thank my friends and family for their continued support in making this project possible.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
1. Introduction
1.1. Chinese Stock Market Overview
1.2. Outline of the paper
2. Literature Review
2.1. Harry Markowitz Portfolio Theory
2.2. Capital Asset Pricing Model
2.3. Time-varying beta
2.4. Early Tests
2.5. Alternative variables and higher moment models
3. Empirical Framework
3.1. Fama and French Three-Factor Model
3.2. Principal Component Analysis
3.3. Fama and French Five-Factor Model
4. Data
5. Methodology
5.1. Fama-French Three-Factor Model
5.2. Principal Component Regression
5.3. Fama and French Five-Factor Model
6. Results
7. Diagnostics
7.1. Multicollinearity
7.2. Normality
7.3. Heteroscedasticity
7.4. Autocorrelation
7.5. Stationarity
8. Empirical Analysis
8.1. Summary Statistics for factor returns
8.2. Regression Analysis
8.3. Principal Component Analysis
8.4. Model Evaluation
9. Conclusion
References

List of Tables

Table 1: Total index weight of the top 100 market constituents listed in the Shanghai Stock Exchange from the period June 2004 to June 2015
Table 2: Construction of Size and BE/ME factors
Table 3: Construction of Size, BE/ME, profitability, and investment factors
Table 4: Average monthly percent returns for portfolios formed on Size and BE/ME, Size and OP, Size and Inv; June 2004 to June 2015, 132 months
Table 5: Summary statistics for average monthly factor returns
Table 6: Fama and French 3-Factor model regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015
Table 7: Principal component regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015
Table 8: Fama and French 5-factor 2 x 2 sort regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015
Table 9: Fama and French 5-factor 2 x 2 x 2 x 2 sort regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015
Table 10: Correlation matrix of Fama and French 3-factor model factors
Table 11: Correlation matrix of Fama and French 5-factor 2 x 2 sort model factors
Table 12: Correlation matrix of Fama and French 5-factor 2 x 2 x 2 x 2 sort model factors
Table 13: Jarque-Bera test of Fama and French 3-factor model
Table 14: Jarque-Bera test of Fama and French 5-factor 2 x 2 sort model
Table 15: Jarque-Bera test of Fama and French 5-factor 2 x 2 x 2 x 2 sort model
Table 16: White's Heteroscedastic test of Fama and French 3-factor model
Table 17: White's Heteroscedastic test of Fama and French 5-factor 2 x 2 sort model
Table 18: White's Heteroscedastic test of Fama and French 5-factor 2 x 2 sort model
Table 19: Breusch-Godfrey test of Fama and French 3-factor model
Table 20: Breusch-Godfrey test of Fama and French 5-factor 2 x 2 sort model
Table 21: Breusch-Godfrey test of Fama and French 5-factor 2 x 2 x 2 x 2 sort model
Table 22: Augmented Dickey-Fuller test of all nine portfolio log returns series (SL, SN, SH, ML, MN, MH, BL, BN, and BH)
Table 23: Correlation of PCA factors and factors of the Fama-French three-factor model
Table 24: Weighting coefficients of principal components of average monthly returns for nine portfolios sorted on Size-BE/ME from June 2004 to June 2015, 132 months
Table 25: Summarised alpha and R² of regressions on nine portfolios sorted on Size-BE/ME from June 2004 to June 2015, 132 months

List of Figures

Figure 1: Graphical orthogonal transformation of principal component analysis
Figure 2: Scree plot of ordered 9 eigenvalues and Kaiser-Guttman rule

Abstract

The Fama-French (1993) three-factor model, directed at capturing size and value patterns in average stock returns, is comparatively tested using principal component analysis. Motivated by the variation in average returns that the three-factor model misses, the model is augmented with profitability and investment factors constructed from the Fama-French (2015) 2 x 2 and 2 x 2 x 2 x 2 sorts. Size, value, and profitability effects are found to be similar to those reported by Fama and French (2015). In contrast, the opposite investment effect is observed: firms that invest aggressively yield higher average returns than firms that invest conservatively. Moreover, the value factor is not found to be redundant, nor is it absorbed by the slopes on the profitability and investment factors.

1. Introduction

Following the central role of Markowitz's (1952) portfolio theory, an abundant amount of research on fundamental factors and tests of asset pricing models has been produced and widely debated among practitioners and academics. Building on the Markowitz (1952) mean-variance model of optimal portfolio construction, the Capital Asset Pricing Model (CAPM) of Sharpe (1964), Lintner (1965), and Black (1972) has long shaped the perception of risk and return. With growing recognition of asset pricing modelling, researchers (see Ross, 1976; Stattman, 1980; Banz, 1981; Basu, 1983; Rosenberg, Reid and Lanstein, 1985; and Chan, Hamao and Lakonishok, 1991) have since incorporated alternative theories and factors, such as macroeconomic indices, debt-to-equity, earnings-to-price, book-to-market and size, to model cross-sectional excess portfolio returns more accurately.

Tests within the study follow the Fama and French (1993, 2015) three-factor and five-factor empirical models, which are directed at capturing the size, value, profitability and investment patterns in average stock returns. The three-factor model is evaluated against a principal component regression constructed from the first three components. Evidence shows that the three-factor model does not fully capture average cross-sectional returns in the Chinese stock market, which prompts a performance assessment and the augmentation of the model to accommodate profitability and investment factors.

Since the economic reforms of China in 1978, emerging markets have gradually repealed government controls in favour of economies more open to domestic markets and foreign investment. Spurred by the historically large returns of emerging markets, investors take on the inherent sources of higher risk associated with the higher returns in these markets. The motivation of the study lies in these additional sources of risk, and the study tests whether these risks and premiums are sufficiently priced by the Fama-French three-factor and five-factor models.

1.1. Chinese Stock Market Overview

Stocks listed on the Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE) are classified as A-shares and B-shares. Due to government restrictions, A-shares are not available to foreign investors except under the strict regulations known as the Qualified Foreign Institutional Investor (QFII) system. In contrast, B-shares are eligible for foreign investment. The focus of the paper is the constituents of the Shanghai Stock Exchange Composite Index, which tracks the daily performance of all A-shares and B-shares listed on the Shanghai Stock Exchange. Through long-term reform efforts, the People's Republic of China plans for a more open economy and uniform investment by combining the two share classes. This transition, albeit slow, has seen the conversion of listed B-shares into H-shares, the share class of the neighbouring Hong Kong market. As a result of persistently tight regulations, B-shares experienced poor liquidity, making sizeable investments difficult. Conversion to H-shares, in an environment with more standardised regulations overall, served to attract higher foreign investment.

Under China's stringent political system, the government plays an important role in the economy through control and ownership of the country's largest monopolies, known as the state-owned enterprise (SOE) system. These state-controlled enterprises extend throughout sectors such as finance, chemicals, transport and construction. SOEs dominate sectors such as the oil industry, where the state-owned portion of shares exceeds 70 percent, whereas the tobacco industry is completely state-owned. However, the transition to a more open economy made progress in late 2014, when Sinopec, one of China's largest monopolies alongside PetroChina Co., sold a $17.4 billion stake to domestic investors. Despite these efforts, the political and regulatory environment in China still operates with a heavy hand, as in a transition economy. Additionally, more market does not necessarily mean less government; rather, it calls for a different kind of government, in which enterprises such as banks must be separated to become more competitive in a free market.

Asset pricing models assume normally distributed returns, an assumption that generally performs well in developed markets even when returns depart slightly from it. In contrast, emerging markets such as China are more likely to experience shocks induced by political, regulatory and other issues. These complications indicate that stock returns may exhibit different skewness and kurtosis, and higher volatility. Thus, treating emerging markets in the same manner as developed markets could be erroneous. This study aims to provide insight into how asset pricing models perform in emerging markets and what implications emerging markets have for financial theory.

1.2. Outline of the paper

The remainder of the study is organised as follows. Section 2 provides the background literature review. Section 3 defines the empirical framework. Section 4 discusses the data used. Section 5 sets out the methodology, and the results and diagnostic tests are presented in Sections 6 and 7 respectively. Section 8 analyses the empirical results and Section 9 concludes.

2. Literature Review

Portfolio theory has been heavily researched and debated over the past five decades. The purpose of the literature review is to follow the landmarks in the key issues concerning wealth allocation and to discuss their pivotal implications for, and contributions to, financial theory.

2.1. Harry Markowitz Portfolio Theory

Harry Markowitz’s (1952) research on portfolio theory was the first to clearly outline, and pave the way for, modern portfolio theory. Markowitz’s paper “Portfolio Selection”, which has proven to be one of the most pivotal contributions to financial theory, introduces the “mean-variance” model (E-V rule), which constructs the optimal portfolio from a number of risky assets according to the risk-averse nature of an investor. The theory defines two parameters, expected return and variance, and Markowitz asserts that there is a rate at which investors can gain expected return by taking on variance, or reduce variance by giving up expected return. Markowitz builds the model on the following assumptions:

I. Investors are risk averse such that, given an expected return, they will minimise portfolio variance.

II. Investors will maximise portfolio expected return given a portfolio variance.

III. Investors maximise an expected utility curve that demonstrates the diminishing marginal utility of wealth.

IV. Investors make decisions and estimates solely on the basis of expected return and risk.

V. Markets are perfectly efficient, such that all assets are liquid and there are no transaction costs or taxes.

VI. Short selling is not permitted, such that the portfolio weights satisfy $w_i \geq 0$ for all securities i.

Under these assumptions, the efficient portfolio is defined by considering the

portfolio expected return and variance parameters. These parameters are derived using

the following formulas,

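expected portfolio return,

$$E(R_p) = \sum_{i=1}^{N} w_i E(R_i)$$

the variance of portfolio returns,

$$\sigma_p^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij}$$

and the covariance between ith and jth assets in the portfolio,

$$\sigma_{ij} = E\left[ (R_i - E(R_i)) (R_j - E(R_j)) \right].$$

Here $E(R_p)$ is the expected return of the portfolio, $w_i$ and $w_j$ are the percentage weights of securities i and j, and $E(R_i)$ is the expected return of security i. $\sigma_p^2$ is the variance of the portfolio, calculated using the term $\sigma_{ij}$, which is the covariance between the ith and jth assets in the portfolio.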
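As an illustration of the two portfolio parameters, the following minimal Python sketch computes the portfolio expected return as the weighted sum of the securities' expected returns and the portfolio variance as the double sum of weight products and covariances. The weights, expected returns and covariance matrix are hypothetical values chosen purely for illustration; they are not taken from the data of this study.

```python
def portfolio_return(weights, expected_returns):
    """Weighted sum of the securities' expected returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Double sum of w_i * w_j * sigma_ij over all pairs (i, j)."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical two-asset inputs (illustration only, not the study's data).
weights = [0.6, 0.4]
expected_returns = [0.08, 0.12]
cov = [[0.04, 0.006],    # variance of asset 1, covariance of assets 1 and 2
       [0.006, 0.09]]    # covariance of assets 2 and 1, variance of asset 2

mu_p = portfolio_return(weights, expected_returns)
var_p = portfolio_variance(weights, cov)
print(f"expected return: {mu_p:.4f}, variance: {var_p:.5f}")
```

With these inputs the portfolio variance (0.03168) is lower than the variance of either asset (0.04 and 0.09), which illustrates the diversification benefit of imperfectly correlated securities.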

Markowitz's theory emphasised the importance of selecting a portfolio with regard to the imperfect correlations and co-movements among all the securities in it. Because of these, diversification can be achieved: combining two imperfectly correlated securities yields a portfolio variance lower than the weighted average of the two securities' variances. A portfolio constructed in this way has the same expected return but lower risk than one constructed ignoring the interactions between securities (Elton and Gruber, 1997). Under these assumptions, Markowitz developed the concept of the efficient frontier, the set of portfolios for which no other portfolio has a higher expected return at the same standard deviation of return (E-V rule).

Furthermore, there have been many alternative portfolio theories, such as that of Kraus and Litzenberger (1976), which incorporated the effect of skewness on valuation to explain the distribution of returns more realistically. Since then, researchers (see Ross, 1976; Stattman, 1980; Banz, 1981; Basu, 1983; Rosenberg, Reid and Lanstein, 1985; and Chan, Hamao and Lakonishok, 1991) have considered additional factors such as macroeconomic indices, debt-to-equity, earnings-to-price, book-to-market and size to model cross-sectional excess portfolio returns more accurately. Much of this research marks milestones in the development of portfolio theory and is discussed further below. Nevertheless, Markowitz’s mean-variance model has remained the cornerstone of modern portfolio theory despite the abundant research and alternative theories of the past five decades.

2.2. Capital Asset Pricing Model

The problem remained of how to evaluate the relationship between variance and expected return, and how to discern what proportion of risk comes from the market. Subsequently, a tremendous amount of research was dedicated to estimating the inputs to Markowitz’s portfolio theory: variance, correlation and covariance. The main tool for estimating covariance was derived from single index models, which Sharpe (1964) developed and popularised,

$$R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it}$$

where $R_{it}$ is the return of security i at time t, $R_{mt}$ is the return of the market at time t and the error term $\varepsilon_{it}$ is assumed to have a mean of zero and variance $\sigma_{\varepsilon_i}^2$. $\beta_i$ is the sensitivity of the security to movements in the market: $\beta_i > 1$ indicates that the security is more sensitive than the market, and $\beta_i < 1$ shows that the security is less sensitive than the market.
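To make the role of beta in a single index model concrete, the sketch below estimates the intercept and the beta of a security by ordinary least squares, where beta equals the covariance of the security with the market divided by the variance of the market. The return series are hypothetical and the security is constructed without noise, so the estimates recover the values used to build it; real return data would of course include an error term.

```python
def mean(values):
    return sum(values) / len(values)


def estimate_alpha_beta(security, market):
    """OLS estimates for R_i = alpha + beta * R_m + error."""
    m_bar, s_bar = mean(market), mean(security)
    cov = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, security))
    var = sum((m - m_bar) ** 2 for m in market)
    beta = cov / var                  # sensitivity to market movements
    alpha = s_bar - beta * m_bar      # return component independent of the market
    return alpha, beta


# Hypothetical market returns; the security is built with alpha = 0.001, beta = 1.5.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
security = [0.001 + 1.5 * m for m in market]

alpha, beta = estimate_alpha_beta(security, market)
print(f"alpha = {alpha:.4f}, beta = {beta:.2f}")
```

A beta above one, as in this constructed example, corresponds to a security that is more sensitive than the market.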

Following the portfolio theory developed by Markowitz (1952), William Sharpe (1964), John Lintner (1965) and Jan Mossin (1966) formed the Sharpe-Lintner-Mossin Capital Asset Pricing Model, more commonly known as the CAPM. The CAPM asserts that the equilibrium return of a portfolio of risky assets is a linear function of the portfolio's covariance with the market. The market beta is denoted $\beta_i$. The CAPM postulates that the market portfolio m is mean-variance efficient under assumptions similar to those made by Markowitz (1952). The CAPM is formed under the following assumptions:

I. Perfect markets in which all securities are priced correctly reflecting freely

available perfect information with a large number of buyers and sellers in the

market.

II. There are no transaction costs or taxes.

III. Investors are all risk averse and desire to maximise an expected utility that demonstrates the diminishing marginal utility of wealth.

IV. Investors hold diversified portfolios and require return only for the systematic risk of the portfolio; idiosyncratic risk is therefore ignored.

V. Investors can freely borrow and lend at the risk-free rate of return.

VI. Investors are risk averse and aim to maximise portfolio return.

The CAPM equation is written as follows:

$$E(R_i) = R_f + \beta_i \left[ E(R_m) - R_f \right]$$

where $E(R_i)$ is the expected return of security or portfolio i, $E(R_m)$ is the expected return of the market and $R_f$ is the risk-free rate of lending and borrowing. The $\beta_i$ coefficient of the CAPM is assumed to be constant over all risky securities in the portfolio and stable over time. Thus, the CAPM models a linear, positively sloped relationship between expected portfolio returns and beta. Because of its simplicity and ease of use, the CAPM remains widely used and is the most popular model among US companies for estimating the cost of capital (Graham and Harvey, 2001). Black, Jensen and Scholes (1972) and Fama and MacBeth (1973) support the CAPM, finding that during the sample period prior to 1969 there exists a simple positive relationship between beta and expected portfolio returns. However, in response to the over-simplistic assumptions made by the CAPM, many authors (see Fama and French, 1992; Jagannathan and Wang, 1996; Fabozzi and Francis, 1978; Lakonishok and Shapiro, 1986; among others) argue that the standard CAPM, which uses the market index as a proxy, performs poorly in explaining cross-sectional stock returns.
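The linear relationship between expected return and beta that the CAPM implies can be sketched in a few lines of Python. The function below simply evaluates the risk-free rate plus beta times the market risk premium; the risk-free rate, market return and betas are hypothetical inputs chosen for illustration.

```python
def capm_expected_return(risk_free, beta, market_return):
    """Expected return implied by the CAPM: R_f + beta * (E(R_m) - R_f)."""
    return risk_free + beta * (market_return - risk_free)


# Hypothetical annual inputs (illustration only).
risk_free = 0.03
market_return = 0.08

for beta in (0.5, 1.0, 1.5):
    expected = capm_expected_return(risk_free, beta, market_return)
    print(f"beta = {beta:.1f} -> expected return = {expected:.3f}")
```

Under the CAPM each unit of beta adds the same market risk premium to the expected return, which is the constant, positively sloped relationship that the early tests examined.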

2.3. Time-varying beta

Jagannathan and Wang (1996) assert that the systematic risk measured by the CAPM varies over time. Empirical evidence showed that during recessionary periods the leverage of troubled firms increased, which also caused their beta to increase. Thus, the constant beta assumed by the CAPM does not accurately capture real-world returns, which are inherently dynamic rather than static. Fabozzi and Francis (1978) classify the factors that lead to a time-varying beta into four categories: first, microeconomic variables such as leverage, dividend and management changes; second, macroeconomic influences such as inflation and business cycles; third, political factors such as war, labour legislation, elections and pollution; and finally, market factors such as market conditions, disintermediation and credit crunches.

2.4. Early Tests

Many of the early tests of the CAPM (see Lintner, 1965; Douglas, 1968; Black, Jensen and Scholes, 1972) employed time series and cross-sectional regressions to test its performance. However, Lintner (1965) found the variance of the error term of the time series regression to be significant, which suggests that the CAPM has inadequate explanatory power. Similar results were found by Douglas (1968) in a different study using the same method as Lintner.

2.5. Alternative variables and higher moment models

In response to the early tests and the poor performance of the standard CAPM, alternative variables were examined beyond the mean and variance parameters, which allowed models to incorporate higher moments. The abundance of early tests and investigations of the CAPM marked a pivotal point in financial theory, as it led to an enormous amount of research in which a variety of variables were tested to explain portfolio characteristics and asset returns. Of the numerous variables studied, some of the most prominent are explored in depth: size, book-to-market, earnings-price, debt-to-equity and macroeconomic variables.

Banz (1981) tested the size effect of firms in explaining the residual variation of average cross-sectional returns. Banz finds that average returns for small stocks are too high given their beta estimates, and average returns for large stocks are too low. Banz further asserts that there is a size premium, the empirical finding that firms with small market capitalisation exhibit higher returns than large firms. This initial evidence showed that the size effect is not linear in market value and is most pronounced for small firms. Chan et al. (1985) also found that small firms are more sensitive than larger firms to macroeconomic factors such as production risks and changes in the risk premium, which may explain why the size factor is significant in capturing cross-sectional returns. Researchers have since tried to attribute the size premium anomaly to illiquidity, greater transaction costs, less available information and unreliable beta estimates. However, these explanations do not establish with certainty that the size factor is insignificant. Thus, the simplistic standard CAPM does not seem to fully capture the size effect.

Additional early evidence of relevant alternative factors was found in the book-to-market equity (BE/ME) effect. Stattman (1980) and Rosenberg et al. (1985) found a value premium in US stocks, whereby firms with high book value relative to market value were undervalued by the market and would become profitable if held long as prices increased. This positive relation between average returns and a firm’s book-to-market ratio was also confirmed by Chan et al. (1991) in the Japanese market.

Prior to the 1990s, the additional factors that were researched were used only to identify the insufficient explanatory power of beta and the anomalies of the CAPM. Fama and French (1992) evaluated the significance of market beta, size, book-to-market, earnings-price (E/P) and debt-to-equity on NYSE, AMEX and NASDAQ stocks and found that beta alone was not a comprehensive measure and was statistically insignificant in explaining the cross-section of average returns. Basu (1983) showed that the earnings-price ratio (E/P) of US stocks helps explain the cross-section of average returns. Stocks that exhibited higher risk and return were more likely to have a higher earnings-price ratio. Bhandari (1988) found that common stock returns were positively related to the debt-to-equity ratio (DER), but stocks with a higher DER did not necessarily signal higher risk. However, the DER still served as a natural proxy for the risk of common equity and can be used as an additional variable to explain cross-sectional returns.

Fama and French (1992), in their 1963 to 1990 sample period, evaluated each of the variables and found that each had significant power in explaining the cross-section of average returns. However, when the variables were tested jointly, size and book-to-market seemed to absorb the explanatory roles of E/P and DER. Thus, Fama and French proposed their three-factor model with the variables SMB (small-minus-big) and HML (high-minus-low), such that systematic risk is characterised not by market beta alone but also by the size and value premiums. The research asserted that multidimensional risks should be considered in order to justify the price of an asset.

Lakonishok and Shapiro (1986) found the beta of the single-factor CAPM to be statistically weak over the 50-year period from 1941 to 1990. Fama and French (1996) further test their three-factor model against the CAPM and find that the CAPM beta becomes insignificant, and that the intercepts of the three-factor model are closer to zero, when the two extra factors are accounted for. Hence, the combination of size and BE/ME in addition to the CAPM market factor has superior performance and greater explanatory power for cross-sectional average returns.

In the 1970s, there existed a gap between the theoretical importance of systematic variables and their exogenous influences. Ross (1976) developed and introduced the arbitrage pricing theory (APT) to explain the pricing of assets. The author asserts that the mean-variance efficiency of the market portfolio, and underlying assumptions such as normality of returns, are difficult to justify. The arbitrage pricing model estimates expected returns as a linear function of various fundamental macroeconomic variables, where the sensitivity to each factor is captured by a factor-specific beta coefficient. The arbitrage pricing model is expressed as,

$$R_i = a_i + \sum_{k=1}^{K} b_{ik} F_k + \varepsilon_i$$

where $a_i$ is a constant for asset i, $F_k$ is the systematic kth factor, $b_{ik}$ is the loading of asset i on factor k and $\varepsilon_i$ is the idiosyncratic component of the security's return. If the return structure holds, then according to arbitrage pricing theory the expected return of the ith security is,

$$E(R_i) = R_f + \sum_{k=1}^{K} b_{ik} \lambda_k$$

where $\lambda_k$ is the risk premium associated with factor k and $R_f$ is the risk-free rate. The main assumptions of arbitrage pricing theory are that markets are perfectly competitive, that idiosyncratic risk is diversified away, and that the factor model explains the relationship between the risk and return of a security. Chen, Roll and Ross (1986) propose a set of relevant “state variables” and test whether exposure to these systematic macroeconomic variables explains expected returns. The paper tests the individual systematic effect of each variable on market returns and finds that the spread between long and short interest rates, inflation, industrial production and the spread between high- and low-grade bonds are influential sources of risk that are significantly priced in stock returns.
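As a small worked illustration of the arbitrage pricing relation, the sketch below computes an expected return as the risk-free rate plus the sum of factor loadings multiplied by their risk premiums. The three factors, their loadings and their premiums are hypothetical numbers and are not estimates from Chen, Roll and Ross (1986) or from this study.

```python
def apt_expected_return(risk_free, loadings, premiums):
    """Risk-free rate plus the sum over factors of loading times risk premium."""
    if len(loadings) != len(premiums):
        raise ValueError("one risk premium is required per factor loading")
    return risk_free + sum(b * lam for b, lam in zip(loadings, premiums))


# Hypothetical three-factor example (illustration only).
risk_free = 0.03
loadings = [0.8, -0.3, 1.2]      # sensitivities of the asset to each factor
premiums = [0.02, 0.01, 0.015]   # risk premium earned per unit of each loading

expected = apt_expected_return(risk_free, loadings, premiums)
print(f"APT expected return: {expected:.3f}")
```

A negative loading, as on the second hypothetical factor, lowers the expected return when that factor carries a positive premium.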

3. Empirical Framework

3.1. Fama and French Three-Factor Model

The Fama and French (1993) three-factor model was developed in response to the

increasing empirical evidence that the CAPM beta proxy was not sufficient in explaining

variations in returns. Leaning on the research of Banz (1981), Stattman (1980),

Rosenberg, Reid and Lanstein (1985), Fama and French found empirical evidence that

small size and high BE/ME firms on average exhibit higher returns than of large firms

and low BE/ME stocks. In addition to the CAPM beta, the size and value factors seemed

to significantly capture the majority of returns of non-financial firms in the NYSE,

AMEX, and NASDAQ market. Thus, Fama and French (1993) create diversified

portfolios sorted on size and book-to-market equity using NYSE median breakpoints as

intersections of portfolios. The diversified portfolios are exposed to combinations of state variables and are used to form the SMB factor (small minus big, the difference in average returns between small stocks and big stocks) and the HML factor (high minus low, the difference in average returns between high BE/ME stocks and low BE/ME stocks), which augment the CAPM to identify the size and BE/ME effects. The established three-factor model

explained approximately 95% of the variation of excess returns from 1963 to 1990. The

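model is as follows,

$$R_{it} - R_{ft} = a_i + b_i (R_{Mt} - R_{ft}) + s_i SMB_t + h_i HML_t + e_{it}$$

where $R_{it}$ is the return of a security or portfolio i over period t, $R_{Mt}$ is the return of the market portfolio, $R_{ft}$ is the risk-free rate of return, and $e_{it}$ is the zero-mean residual. The intercept $a_i$ is equal to zero if $b_i$, $s_i$, and $h_i$ completely capture the variation of cross-sectional returns.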
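As an illustration of how the three-factor time-series regression can be estimated, the following is a minimal sketch using ordinary least squares in NumPy. The data is simulated and the function name `fit_three_factor` is hypothetical; it is not the estimation code used in this study.

```python
import numpy as np

def fit_three_factor(excess_ret, mkt_excess, smb, hml):
    """OLS estimate of (a, b, s, h) and the regression R-squared."""
    # Design matrix: a column of ones for the intercept a, then the three factors.
    X = np.column_stack([np.ones_like(mkt_excess), mkt_excess, smb, hml])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    resid = excess_ret - X @ coef
    r2 = 1.0 - resid.var() / excess_ret.var()
    return coef, r2

# Simulated (hypothetical) monthly factor returns over 132 months.
rng = np.random.default_rng(0)
T = 132
mkt = rng.normal(0.006, 0.087, T)
smb = rng.normal(0.004, 0.038, T)
hml = rng.normal(0.000, 0.060, T)

# Simulated portfolio excess return with known loadings and small noise.
true_coef = np.array([0.0, 1.05, 0.68, 0.46])
excess = (true_coef[0] + true_coef[1] * mkt + true_coef[2] * smb
          + true_coef[3] * hml + rng.normal(0.0, 0.01, T))

coef, r2 = fit_three_factor(excess, mkt, smb, hml)
print("a, b, s, h:", coef)
print("R-squared:", r2)
```

With an intercept close to zero and a high R-squared, the simulated portfolio is well described by the three factors, which is the pattern the model tests for.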

There have been various studies regarding the validity and power of the size and

value factors. Because the unknown state variables are proxied by mimicking portfolios and the state variables themselves are not identified, Black (1993) and Kothari et al. (1995) argue that a substantial part of the premiums may be the result of data snooping and survivor bias. Sorting portfolios on book-to-market equity also includes firms that are financially distressed, which typically exhibit higher returns and disproportionately high BE/ME ratios. Studies suggest that distress risk is priced into the premiums because the samples include the distressed firms that survive, but exclude the distressed firms that fail. Other arguments include the

illiquidity of small stocks, which incur greater transaction costs; given the lower trading frequency of these stocks, the parameters and beta coefficients may be unreliable.

In contrast, further research by Fama and French (1995) and Lakonishok et al. (1994) finds empirical evidence supporting the validity of the size and BE/ME factors in capturing the variation of cross-sectional returns. Barber and Lyon (1997) analyse the survivor bias effects of the Fama and French (1992) sample by including firms outside of the holdout sample. The evidence showed that the inclusion of stocks excluded from the sample had no significant effect on the SMB and HML estimates.

3.2. Principal Component Analysis

Principal component analysis (PCA) was first defined by Pearson (1901) and

later developed by Hotelling (1933). With the increase of computational power, principal component analysis has seen application in many fields such as finance, biology, ecology, health and architecture. Principal component analysis is a statistical procedure that reduces the dimensionality of an original dataset of linearly correlated variables through an orthogonal transformation. The transformed variables, called principal components, are ordered so that each retains the highest variance possible. Principal components are mutually uncorrelated linear combinations of the original variables, each of which is standardised to a mean of zero and unit variance. The number of principal components is therefore less than or equal to the number of original variables. Figure 1 illustrates the orthogonal transformation of the original dataset to the principal components.

Figure 1:

Graphical orthogonal transformation of principal component analysis for a two-

dimensional data set.

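The reduced dimensions of the x1 and x2 data set are rotated by $\theta$ onto the first principal component axis z1 and second principal component axis z2. The variance of the first eigenvalue dominates the fractional contribution of explained variances by maximising the variability of returns for z1 and z2, where

$$\mathrm{Var}(z_1) \geq \mathrm{Var}(z_2).$$

Mathematically, the two orthogonal transformed projections in terms of the return vector can be expressed as,

$$z_1 = \cos(\theta)\,x_1 + \sin(\theta)\,x_2$$

$$z_2 = -\sin(\theta)\,x_1 + \cos(\theta)\,x_2$$

which can be expressed in matrix form,

$$\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

denoted as $z = Ax$, where

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}$$

and $x$ is the return vector.

The orthogonal transformation is demonstrated by the zero values of the anti-diagonal of the identity matrix,

$$AA^{\top} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I.$$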
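The orthogonality of the two-dimensional rotation can be checked numerically. The sketch below assumes the common sign convention in which the second row of the rotation matrix is (-sin θ, cos θ); the angle and data point are arbitrary illustrative values.

```python
import math

def rotation_matrix(theta):
    """2 x 2 rotation matrix A for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def rotate(x1, x2, theta):
    """Project the point (x1, x2) onto the rotated axes z1 and z2."""
    A = rotation_matrix(theta)
    z1 = A[0][0] * x1 + A[0][1] * x2
    z2 = A[1][0] * x1 + A[1][1] * x2
    return z1, z2

def times_own_transpose(A):
    """Return the product A A^T for a 2 x 2 matrix A."""
    return [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta = math.radians(30)           # arbitrary rotation angle
A = rotation_matrix(theta)
product = times_own_transpose(A)   # should equal the identity matrix
z1, z2 = rotate(3.0, 4.0, theta)   # arbitrary data point

print(product)
print(z1, z2)
```

Because the transformation is a pure rotation, the product of A with its transpose is the identity matrix and the length of the data vector is unchanged.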

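Moreover, the principal component analysis used in this study is defined by higher dimensional vectors. Thus, suppose for p portfolios or variables, the principal components for $x_1, x_2, \ldots, x_p$ are expressed as,

$$z_1 = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p$$

$$z_2 = a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p$$

$$\vdots$$

$$z_p = a_{p1}x_1 + a_{p2}x_2 + \cdots + a_{pp}x_p$$

which is written in matrix form as,

$$\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_p \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix}$$

where $z = (z_1, \ldots, z_p)^{\top}$ and $x = (x_1, \ldots, x_p)^{\top}$. Generally,

$$z_j = a_j^{\top} x$$

where j is the jth principal component and $a_j = (a_{j1}, \ldots, a_{jp})^{\top}$ is the vector of coefficients.

Thus the variance of the principal components and its expected value for the sample of p is,

$$\mathrm{Var}(z_j) = a_j^{\top} S a_j, \qquad E(z_j) = a_j^{\top} \bar{x}$$

where $S$ denotes the sample covariance matrix and $\bar{x}$ is the expected sample return.

Assuming that the returns are normally distributed, it follows that,

$$z_j \sim N\left(a_j^{\top}\bar{x},\; a_j^{\top} S a_j\right).$$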

The variability of PC1 is maximised using the Lagrangian method of constrained optimisation for the following function,

$$L(a_1, \lambda) = a_1^{\top} S a_1 - \lambda\left(a_1^{\top} a_1 - 1\right)$$

such that the first principal component is constrained subject to $a_1^{\top} a_1 = 1$. Additionally, subsequent principal components are maximised subject to $a_j^{\top} a_j = 1$ and $a_j^{\top} a_i = 0$ for $i < j$; this constraint shows the orthogonal relationship of principal components. Taking the first derivative of the Lagrangian with respect to $a_1$ and setting it equal to zero gives,

$$(S - \lambda I)\,a_1 = 0$$

which solves for the eigenvectors, and the determinant, $|S - \lambda I| = 0$, solves for the eigenvalues. If S is a p x p matrix, then the above is a pth order polynomial where the p roots are,

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p \geq 0.$$
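In practice the eigenvalue problem is solved numerically rather than through the characteristic polynomial. The following minimal sketch uses NumPy's symmetric eigen-solver on simulated data with four hypothetical portfolios, and confirms that the variance of the first principal component equals the largest eigenvalue. It is an illustration only, not the procedure used to produce the results in this study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated (hypothetical) data: 132 monthly returns on 4 correlated portfolios.
common = rng.normal(0.0, 0.08, size=(132, 1))
X = common + rng.normal(0.0, 0.03, size=(132, 4))

# Standardised covariance (correlation) matrix of the portfolios.
S = np.corrcoef(X, rowvar=False)

# eigh is designed for symmetric matrices and returns eigenvalues in
# ascending order, so they are re-ordered from largest to smallest.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

a1 = eigvecs[:, 0]        # coefficients of the first principal component
var_pc1 = a1 @ S @ a1     # a1' S a1, the variance of PC1

print("eigenvalues:", eigvals)
print("variance of PC1:", var_pc1)
```

The eigenvectors returned are of unit length and mutually orthogonal, which corresponds to the two constraints imposed in the Lagrangian.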

Choosing the number of factors is generally agreed to be one of the most important decisions in factor analysis (Jackson, 1993). Accordingly, there is an abundance of literature proposing methods for determining the number of factors for analysis. Peres-Neto, Jackson and Somers (2005) investigate 20 stopping rules for determining the number of principal components. The authors conclude that Bartlett's (1954) test of sphericity, which tests whether the data follows a multivariate normal distribution with zero covariances, seemed to be the best method as it maximised the detection of non-trivial components.

One commonly used method is to retain as many factors as required such that the sum of their eigenvalues exceeds a certain threshold fraction of the total variance,

$$\frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i} \geq \text{threshold}$$

where $m$ is the number of retained eigenvalues and $p$ is the total number of eigenvalues. Researchers typically retain principal components comprising 95% of total variance (Jackson, 1993).

By far the most popular method is the Kaiser-Guttman rule (Kaiser, 1960), where the principal components retained as significant are those with eigenvalues greater than 1 ($\lambda > 1$).
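The two stopping rules described above can be sketched in a few lines. The eigenvalues below are hypothetical values chosen for illustration (they sum to 5, as the eigenvalues of a 5 by 5 correlation matrix would), and the function names are not taken from any library.

```python
def retained_by_variance(eigenvalues, threshold=0.95):
    """Smallest number of components whose eigenvalues reach the threshold share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for m, lam in enumerate(ordered, start=1):
        cumulative += lam
        if cumulative / total >= threshold:
            return m
    return len(ordered)

def retained_by_kaiser(eigenvalues):
    """Kaiser-Guttman rule: count the eigenvalues greater than 1."""
    return sum(1 for lam in eigenvalues if lam > 1.0)

# Hypothetical eigenvalues of a 5 x 5 correlation matrix.
eigenvalues = [3.2, 1.1, 0.4, 0.2, 0.1]

m_variance = retained_by_variance(eigenvalues, threshold=0.95)
m_kaiser = retained_by_kaiser(eigenvalues)
print(m_variance, m_kaiser)
```

The two rules need not agree: for these values the variance threshold retains more components than the Kaiser-Guttman rule, which is why the choice of stopping rule matters.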

3.3. Fama and French Five-Factor Model

Extending from the evidence that average stock returns are related to book-to-market equity (BE/ME), the dividend discount model illustrates that profitability and investment add to the description of average returns provided by BE/ME. The dividend discount model defines a stock's market price as the discounted value of future dividends per share,

$$m_t = \sum_{\tau=1}^{\infty} \frac{E(d_{t+\tau})}{(1+r)^{\tau}}.$$

In the dividend discount model, $m_t$ is the share price at time t, $E(d_{t+\tau})$ is the expected dividend per share at period $t+\tau$, and r is the internal rate of return of

expected dividends. If two firms at time t have the same expected dividends but different prices, the firm with the lower price has a higher expected return. Assuming that the pricing is rational, the firm with a lower price and higher expected return must therefore have a higher risk. However, the implications drawn from the dividend discount model, here and below, are the same whether pricing is rational or irrational.
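The claim that a lower price implies a higher return for the same expected dividends can be illustrated numerically. The sketch below truncates the infinite dividend stream to 30 periods, which is a simplifying assumption, and solves for the internal rate of return by bisection; the prices and dividends are hypothetical.

```python
def present_value(dividends, r):
    """Discounted value of a finite stream of expected dividends."""
    return sum(d / (1.0 + r) ** tau for tau, d in enumerate(dividends, start=1))

def implied_return(price, dividends, lo=0.0, hi=1.0, tol=1e-10):
    """Internal rate of return r that equates present value and price.

    Bisection works because the present value falls as r rises.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if present_value(dividends, mid) > price:
            lo = mid   # discount rate too low, present value still above price
        else:
            hi = mid
    return (lo + hi) / 2.0

# Two hypothetical firms with identical expected dividends but different prices.
dividends = [5.0] * 30
r_low_price = implied_return(40.0, dividends)
r_high_price = implied_return(60.0, dividends)

print(r_low_price, r_high_price)
```

The firm priced at 40 has the higher implied return, consistent with the argument in the text.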

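Miller and Modigliani (1961) manipulate the dividend discount model to show the relations of expected returns and expected profitability, expected investment, and BE/ME. The market value of a stock at time t is given by,

$$M_t = \sum_{\tau=1}^{\infty} \frac{E(Y_{t+\tau} - dB_{t+\tau})}{(1+r)^{\tau}}.$$

In this equation, $Y_{t+\tau}$ is the total equity earnings for the period $t+\tau$ and $dB_{t+\tau} = B_{t+\tau} - B_{t+\tau-1}$ is the change in total book equity. Dividing the whole equation by book equity gives,

$$\frac{M_t}{B_t} = \frac{\sum_{\tau=1}^{\infty} E(Y_{t+\tau} - dB_{t+\tau}) / (1+r)^{\tau}}{B_t}.$$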

Three statements are made about the expected stock return in relation to BE/ME, earnings, and investment. Firstly, keeping all else constant except the market value of a stock, $M_t$, and the expected return, r, a lower market value $M_t$, or equivalently a higher book-to-market ratio $B_t/M_t$, implies a higher expected return. Secondly, fixing market value, $M_t$, and all other values except future earnings, $Y_{t+\tau}$, and the expected return, r, higher expected earnings imply a higher expected return. Lastly, fixing market value, $M_t$, and all other values except investment, $dB_{t+\tau}$, and the expected return, r, higher growth in book equity, or investment, implies a lower expected return.

Motivated by the empirical evidence of Novy-Marx (2013) and Titman, Wei, and Xie (2004), which shows that the three-factor model fails to capture the variation of average returns related to profitability and investment, Fama and French (2015) add profitability and investment factors to the three-factor model,

$$R_{it} - R_{ft} = a_i + b_i (R_{Mt} - R_{ft}) + s_i SMB_t + h_i HML_t + r_i RMW_t + c_i CMA_t + e_{it}.$$

In the augmented model, the profitability premium $RMW_t$ (robust minus weak) mimics the risk factors related to profitability and is the difference between the average returns of robust profitability portfolios and the average returns of weak profitability portfolios. The profitability measure (OP) is measured as revenues minus cost of goods sold, minus selling, general, and administrative expenses, minus interest expense, all divided by book equity.

Additionally, the investment premium $CMA_t$ (conservative minus aggressive) mimics the risk-return factors related to investment. $CMA_t$ is calculated as the average returns of conservative investment portfolios minus the average returns of aggressive investment portfolios. The investment measure (Inv) is measured as the change in total assets from the fiscal year ending $t-1$ to $t$, divided by $t-1$ total assets.

4. Data

Stock return data used in this research covers the period from June 2004 to June 2015. These returns are collected from constituents of the Shanghai Stock Exchange (SSE) and sourced from Bloomberg. Accordingly, the monthly market portfolio returns used in this study are calculated from the Shanghai Stock Exchange Composite Index1. The data collected also includes bond prices, earnings-before-tax, book equity, total assets and the risk-free rate. The maturity profile of Chinese government bond issuance ranges from 3 months to 50 years. Over time, many securities were added and others discontinued. For example, the 15 year bond was issued only from 2001 to 2009, and the 8 year bond was issued only once, in 1999. Thus, to accurately reflect the Chinese monthly risk-free rate, a short term government bond with a life spanning the data period is used. However, as the China 1 month government bond issue is not reported by Bloomberg, the next shortest available government security serves as a suitable replacement. The risk-free rate used in the report is the 3 Month Central Bank Bill obtained from DataStream, expressed as a decomposed effective monthly rate.
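One way to decompose an annualised yield into an effective monthly rate is by compounding, as sketched below. This assumes the quoted yield is an effective annual rate; the 3% input is a hypothetical value, and the exact convention applied to the DataStream series is not stated in the text.

```python
def effective_monthly_rate(annual_rate):
    """Monthly rate that compounds to the given effective annual rate."""
    return (1.0 + annual_rate) ** (1.0 / 12.0) - 1.0

annual_yield = 0.03   # hypothetical annualised yield of 3%
monthly_rate = effective_monthly_rate(annual_yield)
print(monthly_rate)
```

Compounding the monthly rate over twelve months recovers the annual yield, and the result is slightly lower than the simple division of the annual yield by twelve.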

Constituents of the Shanghai Stock Exchange are chosen to create portfolios following stock sample selection criteria similar to those of Fama and French (1993). Firstly, stocks must be traded and have monthly returns within the 12 months preceding July of year t, where June of year t is the size cut-off point for portfolio construction. Secondly, stocks must report a positive book value at fiscal year-end in order to achieve a positive book-to-market ratio. Lastly, stocks that do not satisfy these conditions are excluded and replaced by the next successive stock.

Fama and French (1993) allocate 10 size portfolios of NYSE, AMEX, and

NASDAQ stocks based on the NYSE breakpoints. This process, however, is difficult to

replicate in China's emerging and relatively new market environment. In June 2004, the

constituents of the SSE numbered approximately 857 and rapidly grew to 1108

constituents in June 2015. Therefore, the SSE constituents belonging to the bottom size

group referred to as 'microcaps' by Fama and French (2015), would be highly dynamic

and may be unrepresentative of microcap portfolio performance. Additionally, these

newly listed firms that are actively traded within the last 12 months are inherently

more complicated due to unreported values and missing data.

1 Bloomberg Ticker: SHCOMP:IND. The Shanghai Stock Exchange Composite Index is a capitalisation-

weighted index that tracks the daily price performance of all A-shares and B-shares listed on the Shanghai

Stock Exchange developed on 19th December 1990.

Consideration must also be given to China's oversized monopolies, which are majority state-owned or state-controlled enterprises throughout sectors such as finance, chemicals, transport and construction. A prime example is the largest state-owned integrated oil and gas company, China Petroleum and Chemical Corporation, which accounted for 11.9% of the entire Shanghai Stock Exchange market capitalisation in June 2004 and 16.84% in June 2008. The portfolios constructed in this research therefore consist of the top 100 firms by market capitalisation in each year. Table 1 shows the total index weight of the top 100 stocks in each year.

Table 1:

Total index weight of the top 100 market constituents listed in the Shanghai Stock

Exchange from the period June 2004 to June 2015.

2004 2005 2006 2007 2008 2009 2010

57.71% 46.71% 62.00% 76.03% 81.41% 78.44% 72.74%

2011 2012 2013 2014 2015 Average

69.21% 69.50% 69.09% 65.04% 58.79% 66.69%

Due to the monopolistic nature of China's market, selecting a sample of the top 100 stocks captures the majority of the market. On average during the period from June 2004 to June 2015, the top 100 stocks by market capitalisation capture 66.69% of the market index. Thus, the sample is sufficiently large to represent the market and also avoids the data complications of dynamic microcaps.

By construction, selecting the top 100 firms in each year may concentrate the sample on firms that exhibit survivor bias and could possibly yield inaccurate conclusions. However, in an extension of the sample used in Fama and French (1993), Barber and Lyon (1997) analyse the returns of financial firms, which Fama and French excluded from their analysis, and show that the "survivor bias" element does not have a significant effect on the size and value premium estimates. Thus, the portfolio samples used in this research for factor estimation lean on the findings of Barber and Lyon (1997) and assume that the data outside of the holdout sample does not significantly affect the factor premium estimates.

Under the pricing model assumption of normally distributed returns, monthly returns data is used instead of daily returns to reduce fluctuations by averaging out the "noise". Whether these noise-induced biases become insignificant when returns are observed at a monthly frequency depends on the simplifying assumption of normality in returns. However, if the pricing model captures the significant factors of cross-sectional returns and is representative of the market overall, the model is still useful for making rational portfolio decisions even with small departures from the simplified assumption of normally distributed monthly returns.

Following these normality assumptions, the prices of stocks are also assumed to be log-normally distributed, which enables the use of logarithmic returns. Logarithmic returns allow factors to be compared on a common metric: compound returns are additive, such that the return over n periods is the logarithmic difference between the final and initial prices.

Thus, the return of asset i at period t is expressed as,

$$r_{i,t} = \ln\left(\frac{P_t}{P_{t-1}}\right)$$

where $r_{i,t}$ is the return of asset i for month t, $P_t$ denotes the price of the stock at time t and $P_{t-1}$ is the price of the stock one month before period t.
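The additivity of logarithmic returns can be shown with a short example. The price series below is hypothetical and the helper name `log_returns` is chosen for illustration.

```python
import math

def log_returns(prices):
    """Logarithmic return between each pair of consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

prices = [10.0, 10.5, 9.8, 11.2]   # hypothetical month-end prices
monthly = log_returns(prices)
total = sum(monthly)               # return over the whole window

print(monthly)
print(total, math.log(prices[-1] / prices[0]))
```

The sum of the monthly logarithmic returns equals the logarithmic difference between the final and initial prices, which does not hold for simple percentage returns.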

5. Methodology

5.1. Fama-French Three-Factor Model

To study the economic fundamentals highlighted in the Fama and French (1993)

model and its performance in the Chinese stock market, six portfolios are formed from

sorts of size and book-to-market equity (BE/ME). The six portfolios formed are exposed

to different levels of size and book-to-market equity effects and therefore mimic the

underlying risk factors in returns related to size and book-to-market. In June of each

year t from 2004 to 2015, constituents of the Shanghai Stock Exchange are ranked

according to size (price times shares or constituent weight) and the top 100 firms are

then chosen. Stocks are then assigned into two size groups (Small (S) and Big (B)) where

the size breakpoint is defined as the size median of the top 100 stocks.

Within the small and big size groups, portfolios are further sorted into three

book-to-market equity groups based on the breakpoints for the top 30% (High), middle

40% (Neutral), and bottom 30% (Low) book-to-market equity (BE/ME) stocks. Moreover,

the decision to sort portfolios into three groups on BE/ME follows the findings of Fama

and French (1992) where book-to-market equity plays a stronger role than size in

capturing the variation of returns. The portfolios are re-formed in June of each year

from 2005 to 2015.

The intersections of two size groups and three book-to-market equity groups form

six portfolios denoted as BH, BN, BL, SH, SN, and SL. The first letter categorises the

portfolios into small (S) or big (B) size groups, and the second letter denotes the

portfolios in different BE/ME groups, high (H), neutral (N), and low (L). For example,

the group denoted as SL represents the portfolio that contains small stocks that have a

low book-to-market equity.
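The sorting procedure described above can be sketched as follows. The stock universe is randomly generated for illustration, the function name is hypothetical, and the BE/ME breakpoints are applied within each size group as described in the text.

```python
import random

def sort_two_by_three(stocks):
    """Assign stocks to the six Size-BE/ME portfolios.

    stocks: list of (ticker, size, be_me) tuples.
    Returns a dict mapping labels such as "SH" to lists of tickers.
    """
    # Size split at the median: the larger half is Big, the smaller half Small.
    by_size = sorted(stocks, key=lambda s: s[1], reverse=True)
    half = len(by_size) // 2
    size_groups = {"B": by_size[:half], "S": by_size[half:]}

    portfolios = {}
    for size_label, members in size_groups.items():
        # Within each size group: top 30% High, middle 40% Neutral, bottom 30% Low.
        ranked = sorted(members, key=lambda s: s[2], reverse=True)
        n = len(ranked)
        cut = (3 * n) // 10
        portfolios[size_label + "H"] = [s[0] for s in ranked[:cut]]
        portfolios[size_label + "N"] = [s[0] for s in ranked[cut:n - cut]]
        portfolios[size_label + "L"] = [s[0] for s in ranked[n - cut:]]
    return portfolios

# Hypothetical universe of 100 stocks with random size and BE/ME values.
random.seed(0)
stocks = [(f"STK{i:03d}", random.uniform(1.0, 500.0), random.uniform(0.1, 2.0))
          for i in range(100)]

portfolios = sort_two_by_three(stocks)
for label in sorted(portfolios):
    print(label, len(portfolios[label]))
```

With 100 stocks, each size group holds 50 stocks, so the High and Low portfolios contain 15 stocks each and the Neutral portfolios contain 20.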

By construction, the six portfolios formed are components used to calculate the

fundamental size and value premiums. The size premium SMB (small minus big)

mimics the risk factors related to size and is the average returns of small stocks (SH,

SN, and SL) minus the average returns of big stocks (BH, BN, and BL). Additionally,

the value premium HML (high minus low) mimics the risk-return factors related to

book-to-market equity. The HML premium is defined similarly, calculated as the

average returns of two high-BE/ME portfolios (SH and BH) minus the average returns

of two low-BE/ME portfolios (SL and BL). Further descriptions of Fama-French factors

can be found online at Kenneth French's data library2 and factor constructions are

summarised in Table 2.
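The factor definitions above translate directly into arithmetic on the six portfolio returns. The monthly returns in this sketch are hypothetical values chosen so that the result is easy to verify by hand.

```python
def smb(r):
    """Small minus big: average of the three small portfolios minus the three big."""
    small = (r["SH"] + r["SN"] + r["SL"]) / 3.0
    big = (r["BH"] + r["BN"] + r["BL"]) / 3.0
    return small - big

def hml(r):
    """High minus low: average of the two high-BE/ME minus the two low-BE/ME portfolios."""
    high = (r["SH"] + r["BH"]) / 2.0
    low = (r["SL"] + r["BL"]) / 2.0
    return high - low

# Hypothetical returns of the six portfolios for one month.
month = {"SH": 0.030, "SN": 0.020, "SL": 0.010,
         "BH": 0.025, "BN": 0.015, "BL": 0.005}

smb_t = smb(month)
hml_t = hml(month)
print(smb_t, hml_t)
```

Note that the neutral portfolios (SN and BN) enter the size factor but not the value factor.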

Table 2:

Construction of Size and BE/ME factors. Independent sorts are used to assign two Size groups

and three BE/ME groups. The factors are SMB (small minus big) and HML (high minus low).

Factor  Breakpoints                                  Factor Construction

SMB     Size: Median of top 100 size stocks          SMB = (SH + SN + SL)/3 - (BH + BN + BL)/3

HML     BE/ME: top 30%, middle 40%, and bottom 30%   HML = (SH + BH)/2 - (SL + BL)/2
        of the respective size group

It should be noted that SMB and HML are not themselves state variables; instead, the factors are diversified portfolios with different combinations of exposures to state variables. This allows the size and value factors and their significance in capturing returns to be observed without specifically identifying the state variables.

The market risk premium, SMB and HML estimates are analysed through regressions in which the dependent variables are 9 portfolios sorted on size and BE/ME, formed from the top 100 firms by size on the SSE. The size breakpoints are the top 30%, middle 40%, and bottom 30%. Within each size group, portfolios are then sorted according to the BE/ME breakpoints: top 30%, middle 40%, and bottom 30%. Fama and French (1993) create 25 portfolios sorted on size and BE/ME based upon three exchanges, NYSE, AMEX and NASDAQ, which are well developed and highly traded markets. However, in consideration of the smaller Shanghai Stock Exchange and the sample size of 100 firms, the number of dependent portfolios in this study is reduced to 9.

5.2. Principal Component Regression

One of the major advantages of principal component regression (PCR) is that it overcomes the problem of multicollinearity, which arises when explanatory variables are collinear, or highly correlated. As the portfolios and estimates are calculated using the same underlying data, sorted in various combinations, ordinary regression on such portfolios would be statistically unreliable due to multicollinearity. Principal components

2 Description of Fama/French Factors:

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/f-f_factors.html

of the 9 Size and BE/ME sort portfolios are used as the explanatory variables. Thus, the

orthogonal nature of principal components curbs the effects of multicollinearity within

the dataset.

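Firstly, the portfolio return matrix of nine portfolios is constructed, where each column represents the time series of returns of one portfolio and each row contains the returns of the nine portfolios in one month. From the period of June 2004 to June 2015, a 132 by 9 return matrix X is formed,

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} & \cdots & x_{1,9} \\ x_{2,1} & x_{2,2} & x_{2,3} & \cdots & x_{2,9} \\ x_{3,1} & x_{3,2} & x_{3,3} & \cdots & x_{3,9} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{132,1} & x_{132,2} & x_{132,3} & \cdots & x_{132,9} \end{bmatrix}$$

where $x_{1,1}$ denotes the first-month return of portfolio 1 of 9 portfolios. Secondly, the 9 by 9 standardised variance-covariance matrix S is formulated on the 9 portfolios through Excel,

$$S = \begin{bmatrix} 1 & \rho_{1,2} & \rho_{1,3} & \cdots & \rho_{1,9} \\ \rho_{2,1} & 1 & \rho_{2,3} & \cdots & \rho_{2,9} \\ \rho_{3,1} & \rho_{3,2} & 1 & \cdots & \rho_{3,9} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{9,1} & \rho_{9,2} & \rho_{9,3} & \cdots & 1 \end{bmatrix}$$

such that the standardised covariance between portfolio 1 and 2, for example, is $\rho_{1,2} = \sigma_{1,2} / (\sigma_1 \sigma_2)$. To obtain the eigenvalues and eigenvectors, PC1 is maximised using the Lagrangian method for constrained optimisation. The Lagrangian equation to maximise is as follows,

$$L(a_1, \lambda) = a_1^{\top} S a_1 - \lambda\left(a_1^{\top} a_1 - 1\right)$$

under constraints:

$$a_1^{\top} a_1 = 1$$

where $z_1 = a_1^{\top} x$, $a_1$ is the vector of coefficients and $x$ is the time vector of portfolio returns.

Subsequent maximisations of following principal components are subject to the additional constraint $a_j^{\top} a_i = 0$ for $i \neq j$, which represents the orthogonal relationship between $z_i$ and $z_j$.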

Taking the first derivative of the Lagrangian with respect to $a_1$ and setting it equal to zero gives,

$$(S - \lambda I)\,a_1 = 0$$

which solves for the eigenvectors, and the determinant, $|S - \lambda I| = 0$, solves for the eigenvalues.

Given the points discussed in Section 3.2 in relation to the stopping point of principal components, in addition to the motivation of using PCR as a comparative tool for the Fama and French three-factor model, the number of principal components to be analysed in PCR will be the first three components, resulting in a 9 by 3 principal component matrix denoted as A. The scree plot of eigenvalues is illustrated in Figure 2, where the horizontal red line depicts the Kaiser-Guttman rule ($\lambda > 1$).

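[Figure 2: Scree Plot (Ordered Eigenvalues); principal components 1 to 9 on the horizontal axis, eigenvalues from 0 to 8 on the vertical axis.]

Figure 2:

Scree plot of the 9 ordered eigenvalues. The horizontal red line, where $\lambda = 1$, depicts the Kaiser-Guttman rule ($\lambda > 1$).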

According to the Kaiser-Guttman rule, only the first component should be retained. However, the first principal component explains only 83.24% of the total variance, whereas the first three principal components capture 93.66% of the total variance. Thus, the analysis of the first three components provides a satisfactory model where the principal components explain almost 95% of total variance.

The principal components used as the explanatory variables, denoted as Z, a 132 by 3 matrix, are calculated through the matrix multiplication of X and A, $Z = XA$, where the columns of matrix Z are the explanatory variables PC1, PC2, and PC3. The PCR is written,

$$R_{it} - R_{ft} = a_i + B1_i\,PC1_t + B2_i\,PC2_t + B3_i\,PC3_t + e_{it}.$$

By construction, regressing the excess returns of the 9 portfolios sorted on size and BE/ME on the three chosen principal components should produce the best three-factor model, since each principal component captures the maximum remaining variance.
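The PCR steps described in this section can be sketched end to end with NumPy. The 132 by 9 return matrix below is simulated and treated as hypothetical excess returns; the study itself formed the matrices in Excel, so this is only an illustration of the same sequence of steps.

```python
import numpy as np

rng = np.random.default_rng(42)
T, p, k = 132, 9, 3   # months, portfolios, retained components

# Simulated (hypothetical) excess returns with a strong common component.
common = rng.normal(0.0, 0.08, size=(T, 1))
X = common + rng.normal(0.0, 0.02, size=(T, p))

# Step 1: standardised covariance (correlation) matrix S, 9 x 9.
S = np.corrcoef(X, rowvar=False)

# Step 2: eigenvalues and eigenvectors, ordered from largest to smallest.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 3: keep the first three eigenvectors (A is 9 x 3) and form Z = XA.
# X is used unstandardised here, following the description in the text.
A = eigvecs[:, :k]
Z = X @ A
explained = eigvals[:k].sum() / eigvals.sum()

# Step 4: regress each portfolio on an intercept plus PC1, PC2 and PC3.
design = np.column_stack([np.ones(T), Z])
coef, *_ = np.linalg.lstsq(design, X, rcond=None)   # rows: a, B1, B2, B3

print("share of variance explained by 3 components:", explained)
print("coefficient matrix shape:", coef.shape)
```

Each column of `coef` holds the intercept and the three loadings for one of the nine portfolios, matching the layout of a regression table with one panel per coefficient.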

5.3. Fama and French Five-Factor Model

The evidence of Novy-Marx (2013) and Titman, Wei, and Xie (2004) shows that the three-factor model is incomplete and overlooks much of the variation of returns related to profitability and investment. Motivated by this, Fama and French (2015) design the five-factor model, directed at capturing the size, value (BE/ME), profitability (OP) and investment (Inv) patterns of average stock returns.

Similar to the Fama and French (1993) three-factor model, the factors are constructed from various sorts based upon the fundamental variables studied. The factor constructions closely follow the approach used in Fama and French (2015) where the three-factor model is augmented with profitability and investment factors,

$$R_{it} - R_{ft} = a_i + b_i (R_{Mt} - R_{ft}) + s_i SMB_t + h_i HML_t + r_i RMW_t + c_i CMA_t + e_{it}.$$

In this augmented model, the profitability premium $RMW_t$ (robust minus weak) mimics the risk factors related to profitability and is the difference between the average returns of robust profitability portfolios and the average returns of weak profitability portfolios. The profitability measure (OP) is observed in June of each year and is revenues minus cost of goods sold, minus selling, general, and administrative expenses, minus interest expense, all divided by book equity.

Additionally, the investment premium $CMA_t$ (conservative minus aggressive) mimics the risk-return factors related to investment. $CMA_t$ is calculated as the average returns of conservative investment portfolios minus the average returns of aggressive investment portfolios. The investment measure (Inv) is likewise sorted in June of each year and is the change in total assets from the fiscal year ending $t-1$ to $t$, divided by $t-1$ total assets. The portfolios are re-formed in June of each year from 2005 to 2015.
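The two sorting variables can be computed directly from accounting items. The sketch below uses hypothetical figures and function names, and takes the base year for investment to be the earlier of the two fiscal years, in line with the definition given in the caption of Table 4.

```python
def operating_profitability(revenue, cogs, sga, interest, book_equity):
    """OP: revenues minus COGS, SG&A and interest expense, over book equity."""
    return (revenue - cogs - sga - interest) / book_equity

def investment(total_assets_prev, total_assets_curr):
    """Inv: growth in total assets relative to the earlier fiscal year."""
    return (total_assets_curr - total_assets_prev) / total_assets_prev

# Hypothetical accounting figures for one firm.
op = operating_profitability(revenue=1000.0, cogs=600.0, sga=150.0,
                             interest=50.0, book_equity=800.0)
inv = investment(total_assets_prev=2000.0, total_assets_curr=2200.0)
print(op, inv)
```

A firm with high OP would be sorted towards the robust group, and a firm with low Inv towards the conservative group.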

For the purpose of evaluation and accommodation of the additional factors, 2 x 3, 2 x 2, and 2 x 2 x 2 x 2 sorts serve as joint controls to isolate estimates of factor premiums. Following the evidence of Fama and French (2015), the 2 x 2 sorts on Size-BE/ME, Size-OP, and Size-Inv are preferred over the 2 x 3 sorts, since the 2 x 3 sort excludes the middle 40% of the sample whereas the 2 x 2 version produces better diversified portfolios by using all stocks. Estimates of the factors (SMB, HML, RMW, and CMA) are calculated differently for the 2 x 2 and 2 x 2 x 2 x 2 sorts under the breakpoints and constructions summarised in Table 3.

The 2 x 2 sort portfolios are labelled with two letters, the first letter defines the

Size group, small (S) or big (B). In the 2 x 2 sort, the second letter describes the BE/ME

group, high (H), neutral (N), or low (L), the profitability group, robust (R), neutral (N),

or weak (W), and the investment group, conservative (C), neutral (N), or aggressive (A).

For example, the portfolio denoted as BR contains stocks with big size and robust

profitability.

Notation becomes more intricate in the joint control 2 x 2 x 2 x 2 sort portfolios.

The first letter represents the size group, the second letter is the BE/ME group, the

third is the profitability group, and the fourth is the investment group. For example, the

portfolio denoted as SHRA contains stocks with small size, high BE/ME, robust

profitability and aggressive investment.

It should be noted that for the 2 x 2 sort, 12 portfolios are created containing 50 stocks each, as the sorts are carried out on two intersections, Size-BE/ME, Size-OP, or Size-Inv. However, 16 portfolios are created in the 2 x 2 x 2 x 2 sort within the top 100 size stocks. Thus, the portfolios in the 2 x 2 x 2 x 2 sort are poorly diversified, which must be considered in the inference of results.
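The diversification concern in the 2 x 2 x 2 x 2 sort follows from simple counting, sketched below. The even split across portfolios is an idealised assumption, since actual intersections will hold uneven numbers of stocks.

```python
from itertools import product

# One letter per sort: Size (S/B), BE/ME (H/L), OP (R/W), Inv (C/A).
labels = ["".join(combo) for combo in product("SB", "HL", "RW", "CA")]

sample_size = 100
average_stocks = sample_size / len(labels)   # if stocks were spread evenly

print(len(labels), labels[:4])
print(average_stocks)
```

With 16 portfolios and a sample of 100 stocks, each portfolio would hold only about six stocks on average, and some intersections may hold far fewer.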

Similar to the Fama and French (1993) three-factor model, the factors SMB,

HML, RMW, and CMA are not themselves state variables, but are instead portfolios

with different combinations of exposures to the state variables. Thus, the significance of

size, value, profitability and investment are captured in the returns without specifically

identifying the state variables.

Table 3:

Construction of Size, BE/ME, profitability, and investment factors.

Portfolios are labelled with two or four letters. The first letter describes the Size group small (S) or big (B). In the 2 x 2 sort, the second letter

describes the BE/ME group, high (H), neutral (N), or low (L), the profitability group, robust (R), neutral (N), or weak (W), and the investment group,

conservative (C), neutral (N), or aggressive (A). In the 2 x 2 x 2 x 2 sort, the second letter is the BE/ME group, the third is the profitability group, and

the fourth is the investment group. The factors are SMB (small minus big), HML (high minus low BE/ME), RMW (robust minus weak OP), and CMA

(conservative minus aggressive Inv).

Sort                  Factor  Breakpoints                                   Component Construction

2 x 2 sorts on        SMB     Size: Median of top 100 size stocks
Size and BE/ME,       HML     BE/ME: Median of top 100 size stocks
or Size and OP,       RMW     OP: Median of top 100 size stocks
or Size and Inv       CMA     Inv: Median of top 100 size stocks

2 x 2 x 2 x 2 sorts   SMB     Size: Median of respective size portfolio
on Size, BE/ME,       HML     BE/ME: Median of respective BE/ME portfolio
OP, and Inv           RMW     OP: Median of respective OP portfolio
                      CMA     Inv: Median of respective Inv portfolio

6. Results

Table 4:

Average monthly percent returns for portfolios formed on Size and BE/ME, Size and OP, Size and

Inv; June 2004 to June 2015, 132 months.

At the end of each June, stocks are allocated into three size groups, small to big (top 30%, middle 40%, and bottom 30%). Stocks are then allocated into three BE/ME groups (Low to High), again using the same percent groups (top 30%, middle 40%, and bottom 30%). The Size-OP and Size-Inv portfolios are formed in the same way, except that the second sort variable is operating profitability or investment. Operating profitability, OP, in the sort for June of year t is measured with accounting data for the fiscal year ending in year t-1 and is revenues minus cost of goods sold, minus selling, general and administrative expenses, minus interest expense, all divided by book equity. Investment, Inv, is the change in total assets from the fiscal year ending t-1 to the fiscal year ending t, divided by t-1 total assets. The table shows averages of monthly returns of the 9 portfolios formed on Size and BE/ME, Size and OP, Size and Inv.

                                  Low       Neutral   High

Panel A: Size-BE/ME portfolios
Small                             0.34%     1.93%     3.23%
Medium                           -0.07%     1.43%     3.89%
Big                               0.11%     0.90%     2.57%

Panel B: Size-OP portfolios
Small                             0.96%     1.04%     1.30%
Medium                            1.68%     1.66%     1.45%
Big                               0.95%     1.33%     1.77%

Panel C: Size-Inv portfolios
Small                             1.04%     0.95%     1.32%
Medium                            0.91%     1.93%     1.86%
Big                               0.54%     1.61%     1.93%

Table 5:

Summary statistics for average monthly factor returns; June 2004 to June 2015, 132 months. Panel A of the table shows the average monthly returns

(Mean) and the standard deviation of monthly returns (Std dev.) for the average returns. Panel B shows the correlations of the same factor from

different sorts and Panel C shows the correlations for each set of factors. The Fama and French three factor model is notated as FF3, Fama and

French 5 Factor 2 x 2 sort is notated as FF5 2x2, and the Fama and French 5 Factor 2 x 2 x 2 x 2 sort is notated as FF5 2x2x2x2.

Panel A: Averages and standard deviations for monthly returns

                                              Rm-Rf    SMB      HML      RMW      CMA
Fama-French 3 Factors
  Mean                                        0.644    0.428   -3.156
  Std dev.                                    8.708    3.847    6.078
Fama-French 5 Factor 2 x 2 Factors
  Mean                                        0.644   -0.472   -2.121    0.108    0.498
  Std dev.                                    8.708    4.216    4.109    3.386    3.443
Fama-French 5 Factor 2 x 2 x 2 x 2 Factors
  Mean                                        0.644   -0.392   -2.160   -0.277   -0.029
  Std dev.                                    8.708    4.215    4.204    3.084    2.782

Panel B: Correlation between different version of the same factor

SMB            FF3     FF5 2x2   FF5 2x2x2x2
FF3            1.00    0.32      0.35
FF5 2x2        0.32    1.00      0.98
FF5 2x2x2x2    0.35    0.98      1.00

HML            FF3     FF5 2x2   FF5 2x2x2x2
FF3            1.00    0.80      0.81
FF5 2x2        0.80    1.00      0.99
FF5 2x2x2x2    0.81    0.99      1.00

RMW            FF5 2x2   FF5 2x2x2x2
FF5 2x2        1.00      0.82
FF5 2x2x2x2    0.82      1.00

CMA            FF5 2x2   FF5 2x2x2x2
FF5 2x2        1.00      0.84
FF5 2x2x2x2    0.84      1.00

Panel C: Correlation between different factors

Fama and French 3 Factor Model
         Rm-Rf    SMB      HML
Rm-Rf    1.00    -0.14     0.07
SMB     -0.14     1.00    -0.40
HML      0.07    -0.40     1.00
RMW      -        -        -
CMA      -        -        -

Fama and French 5 Factor Model 2 x 2
         Rm-Rf    SMB      HML      RMW      CMA
Rm-Rf    1.00    -0.32    -0.22    -0.05     0.25
SMB     -0.32     1.00     0.33    -0.02    -0.23
HML     -0.22     0.33     1.00     0.12    -0.51
RMW     -0.05    -0.02     0.12     1.00    -0.33
CMA      0.25    -0.23    -0.51    -0.33     1.00

Fama and French 5 Factor Model 2 x 2 x 2 x 2
         Rm-Rf    SMB      HML      RMW      CMA
Rm-Rf    1.00    -0.30    -0.19     0.10     0.14
SMB     -0.30     1.00     0.35    -0.20    -0.17
HML     -0.19     0.35     1.00    -0.10    -0.24
RMW      0.10    -0.20    -0.10     1.00    -0.26
CMA      0.14    -0.17    -0.24    -0.26     1.00

Table 6: Fama and French 3-Factor model regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015, 132 months.

At the end of June each year, stocks are allocated to three size groups (top 30%, middle 40%, and bottom 30%). Stocks are then independently

sorted into three BE/ME groups (Low, Neutral, and High) again using the top 30%, middle 40%, and bottom 30% sort. The intersection of these two

sorts produces 9 Size-BE/ME portfolios. The RHS variables are excess market return, RM-RF, the Size factor, SMB, and the value factor, HML. Each

variable is uniquely constructed according to the model factor definition.

Dependent variable: excess returns on 9 stock portfolios formed on size and book-to-market equity

Size      Book-to-market equity (BE/ME) 30%, 40% and 30% groups

              Low  Neutral     High          Low  Neutral     High

                a                           t(a)
Small       0.006    0.004    0.010        1.952    1.198    2.510
Medium      0.004    0.005    0.011        1.109    1.476    2.579
Big         0.006    0.006    0.004        2.139    0.047    0.976

                b                           t(b)
Small       1.053    1.198    0.954       32.230   31.515   23.644
Medium      1.104    1.195    1.079       29.600   31.826   24.066
Big         0.986    0.981    1.068       35.093   29.707   26.343

                s                           t(s)
Small       0.679    1.141    0.883        8.448   12.187    8.892
Medium      0.362    0.319    0.356        3.942    3.445    3.224
Big        -0.280   -0.517   -0.574       -4.039   -6.365   -5.756

                h                           t(h)
Small       0.462   -0.009   -0.333        9.146   -0.146   -5.334
Medium      0.490    0.071   -0.542        8.491    1.226   -7.822
Big         0.378    0.113   -0.488        8.685    2.214   -7.784

               R2                           S(e)
Small       0.898    0.893    0.838        0.032    0.038    0.040
Medium      0.883    0.888    0.835        0.037    0.037    0.044
Big         0.921    0.890    0.861        0.028    0.033    0.040


Table 7:

Principal component regressions for 9 Size-BE/ME portfolios; June 2004 to June 2015, 132 months. Time series principal components are the columns

of matrix where is the 132 by 3 matrix.



              Low  Neutral     High          Low  Neutral     High

                a                           t(a)
Small       0.001   -0.003   -0.003        0.288   -1.236   -1.164
Medium     -0.001   -0.003   -0.005       -0.367   -1.207   -1.881
Big        -0.005    0.001    0.001       -2.360    0.228    0.288

               B1                          t(B1)
Small       0.280    0.353    0.332       30.027   42.609   41.366
Medium      0.328    0.374    0.356       37.198   45.690   48.521
Big         0.340    0.320    0.314       48.062   44.777   36.975

               B2                          t(B2)
Small       0.515    0.376   -0.124       15.539   12.761   -4.348
Medium      0.420   -0.025   -0.237       13.390   -0.874   -9.063
Big        -0.036   -0.387   -0.444       -1.449  -15.235  -14.672

               B3                          t(B3)
Small      -0.066   -0.406   -0.318       -1.495  -10.310   -8.338
Medium      0.372   -0.055   -0.313        8.866   -1.410   -8.965
Big         0.665    0.221   -0.053       19.766    6.500   -1.322

               R2                           S(e)
Small       0.906    0.945    0.934        0.030    0.027    0.026
Medium      0.931    0.943    0.951        0.029    0.027    0.024
Big         0.954    0.944    0.922        0.023    0.023    0.028


Table 8:

Fama and French 5-factor 2 x 2 sort regressions for 9 Size-BE/ME portfolios; June 2004 to June

2015, 132 months.

At the end of June each year, stocks are allocated to three size groups (top 30%, middle

40%, and bottom 30%). Stocks are then independently sorted into three BE/ME groups (Low,

Neutral, and High) again using the top 30%, middle 40%, and bottom 30% sort. The intersection

of these two sorts produces 9 Size-BE/ME portfolios. The RHS variables are excess market return,

RM-RF, the Size factor, SMB, the value factor, HML, the profitability factor, RMW, and the

investment factor, CMA. Each variable is uniquely constructed according to the model factor

definition.



              Low  Neutral     High          Low  Neutral     High

                a                           t(a)
Small       0.004    0.003    0.008        1.085    0.581    1.504
Medium      0.002    0.004    0.009        0.600    1.146    1.860
Big         0.004    0.006    0.005        1.167    1.694    1.037

                b                           t(b)
Small       0.958    1.140    0.932       26.025   19.583   16.461
Medium      0.980    1.100    1.008       27.606   29.833   19.356
Big         0.942    0.953    1.042       27.901   25.036   22.397

                s                           t(s)
Small       0.072    0.067   -0.152        0.924    0.546   -1.270
Medium     -0.262   -0.427   -0.572       -3.478   -5.464   -5.179
Big        -0.162   -0.265   -0.514       -2.264   -3.279   -5.209

                h                           t(h)
Small       0.252   -0.438   -0.749        2.893   -3.179   -5.590
Medium      0.509    0.014   -0.957        6.059    0.159   -7.766
Big         0.549    0.331   -0.496        6.874    3.680   -4.505

                r                           t(r)
Small      -0.279   -0.375   -0.460       -2.967   -2.526   -3.190
Medium     -0.201   -0.129    0.053       -2.220   -1.377    0.398
Big         0.232    0.211    0.221        2.695    2.175    1.865

                c                           t(c)
Small       0.530    0.360   -0.027        4.927    2.117   -0.160
Medium      0.263    0.089    0.371        2.539    0.822    2.439
Big         0.012   -0.063    0.159        0.120   -0.564    1.171

               R2                           S(e)
Small       0.888    0.784    0.723        0.034    0.054    0.052
Medium      0.908    0.907    0.807        0.033    0.034    0.048
Big         0.901    0.874    0.842        0.031    0.035    0.043


Table 9:

Fama and French 5-factor 2 x 2 x 2 x 2 sort regressions for 9 Size-BE/ME portfolios; June 2004 to

June 2015, 132 months.

At the end of June each year, stocks are allocated to three size groups (top 30%, middle

40%, and bottom 30%). Stocks are then independently sorted into three BE/ME groups (Low,

Neutral, and High) again using the top 30%, middle 40%, and bottom 30% sort. The intersection

of these two sorts produces 9 Size-BE/ME portfolios. The RHS variables are excess market return,

RM-RF, the Size factor, SMB, the value factor, HML, the profitability factor, RMW, and the

investment factor, CMA. Each variable is uniquely constructed according to the model factor

definition.



              Low  Neutral     High          Low  Neutral     High

                a                           t(a)
Small       0.002    0.003    0.007        0.631    0.560    1.316
Medium      0.001    0.003    0.009        0.157    0.866    1.854
Big         0.004    0.006    0.005        1.318    1.721    1.047

                b                           t(b)
Small       0.995    1.162    0.938       26.198   19.794   16.745
Medium      1.008    1.118    1.015       28.412   30.778   20.099
Big         0.952    0.958    1.037       28.617   25.712   23.132

                s                           t(s)
Small       0.050    0.123   -0.135        0.599    0.950   -1.091
Medium     -0.303   -0.449   -0.559       -3.872   -5.616   -5.022
Big        -0.162   -0.297   -0.540       -2.216   -3.618   -5.462

                h                           t(h)
Small       0.368   -0.315   -0.734        4.512   -2.499   -6.096
Medium      0.533    0.013   -0.907        6.992    0.161   -8.356
Big         0.537    0.298   -0.509        7.508    3.725   -5.285

                r                           t(r)
Small      -0.332   -0.168   -0.297       -3.029   -0.993   -1.841
Medium     -0.226   -0.190    0.125       -2.211   -1.820    0.861
Big         0.116    0.047    0.225        1.209    0.436    1.740

                c                           t(c)
Small       0.510    0.375    0.053        4.136    1.969    0.291
Medium      0.292    0.046    0.277        2.538    0.386    1.691
Big        -0.128   -0.253    0.059       -1.188   -2.091    0.408

               R2                           S(e)
Small       0.877    0.772    0.719        0.036    0.055    0.053
Medium      0.905    0.906    0.812        0.033    0.034    0.048
Big         0.901    0.875    0.848        0.031    0.035    0.042


7. Diagnostics

Econometric theory suggests that the information obtained from statistical models
can be improved by identifying their strengths and weaknesses through systematic
testing. Inferences from the empirical analysis must therefore be subjected to
diagnostic model testing. The empirical results are examined for multicollinearity,
normality, heteroscedasticity, autocorrelation, and stationarity. (P-values annotated
with an asterisk (*) in the tables indicate rejection of the null hypothesis.)

7.1. Multicollinearity

Multicollinearity occurs when two or more explanatory variables are highly
correlated. Regressions affected by multicollinearity exhibit a high R2 alongside
inflated standard errors, and the individual effects of the correlated variables become
difficult to distinguish. Multicollinearity can be detected from the matrix of
correlations between the explanatory variables. The correlation matrices of the Fama
and French three-factor model, the 2 x 2 five-factor model, and the 2 x 2 x 2 x 2
five-factor model are presented below in Tables 10, 11, and 12.

Table 10:

Correlation matrix of Fama and French 3-factor model factors.

RM-RF SMB HML

RM-RF 1 -0.1416 0.0677

SMB -0.1416 1 -0.3988

HML 0.0677 -0.3988 1

Table 11:

Correlation matrix of Fama and French 5-factor 2 x 2 sort model factors.

RM-RF SMB HML RMW CMA

RM-RF 1 -0.3195 -0.2242 -0.0515 0.2521

SMB -0.3195 1 0.3339 -0.0207 -0.2315

HML -0.2242 0.3339 1 0.1181 -0.5063

RMW -0.0515 -0.0207 0.1181 1 -0.3341

CMA 0.2521 -0.2315 -0.5063 -0.3341 1


Table 12:

Correlation matrix of Fama and French 5-factor 2 x 2 x 2 x 2 sort model factors.

RM-RF SMB HML RMW CMA

RM-RF 1 -0.2982 -0.1862 0.1043 0.1423

SMB -0.2982 1 0.3541 -0.1988 -0.1700

HML -0.1862 0.3541 1 -0.0979 -0.2423

RMW 0.1043 -0.1988 -0.0979 1 -0.2629

CMA 0.1423 -0.1700 -0.2423 -0.2629 1

Across all three matrices, the correlations between explanatory variables in each
model are below 0.40 in absolute value, except for the correlation between the CMA and
HML factors in the five-factor 2 x 2 sort, which is -0.5063. This indicates a moderate
negative relationship; nonetheless, the overall correlations between variables are
weak, with absolute values mostly around 0.20 to 0.30. Thus, multicollinearity is not
considered to be a concern in the data.
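The screening rule used above can be written out explicitly. The following is a minimal sketch only: it hard-codes the Table 11 correlations, and the helper name `flagged_pairs` and its default cut-off of 0.40 are illustrative assumptions rather than part of the original analysis.

```python
# Factor order follows Table 11 (five-factor 2 x 2 sort).
FACTORS = ["RM-RF", "SMB", "HML", "RMW", "CMA"]
CORR = [
    [1.0, -0.3195, -0.2242, -0.0515, 0.2521],
    [-0.3195, 1.0, 0.3339, -0.0207, -0.2315],
    [-0.2242, 0.3339, 1.0, 0.1181, -0.5063],
    [-0.0515, -0.0207, 0.1181, 1.0, -0.3341],
    [0.2521, -0.2315, -0.5063, -0.3341, 1.0],
]


def flagged_pairs(names, corr, threshold=0.40):
    """Return (name_i, name_j, rho) for factor pairs with |rho| above threshold."""
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(corr[i][j]) > threshold:
                flagged.append((names[i], names[j], corr[i][j]))
    return flagged


print(flagged_pairs(FACTORS, CORR))
```

Comparing absolute values matters here, because the only pair that exceeds the cut-off has a negative correlation.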

7.2. Normality

The Jarque-Bera test is employed to test whether the sample returns data exhibit
the skewness and kurtosis of a normal distribution. The null hypothesis of normality is
rejected if the p-value is less than 5%, indicating that the returns are not well
modelled by a normal distribution. The Jarque-Bera tests for the Fama and French
three-factor model, the 2 x 2 five-factor model, and the 2 x 2 x 2 x 2 five-factor
model are presented below in Tables 13, 14, and 15.

Table 13:

Jarque-Bera test of Fama and French 3-factor model.

L N H

Jarque-Bera S

112.8774 3.2765 9.5389

p-value 0* 0.1943 0.0085*

Jarque-Bera M

91.7681 0.5130 31.6161

p-value 0* 0.7738 0*

Jarque-Bera B

54.0863 148.4156 52.7935

p-value 0* 0* 0*


Table 14:

Jarque-Bera test of Fama and French 5-factor 2 x 2 sort model.

L N H

Jarque-Bera S

45.0937 25.0976 64.9976

p-value 0* 0* 0*

Jarque-Bera M

73.7371 2.9461 30.7317

p-value 0* 0.2292 0*

Jarque-Bera B

17.2563 55.4682 1.2040

p-value 0.0002* 0* 0.5477

Table 15:

Jarque-Bera test of Fama and French 5-factor 2 x 2 x 2 x 2 sort model.

L N H

Jarque-Bera S

127.7486 35.3914 97.4646

p-value 0* 0* 0*

Jarque-Bera M

117.5979 4.2246 20.7802

p-value 0* 0.1210 0*

Jarque-Bera B

28.2211 53.7411 1.9471

p-value 0* 0* 0.3777

The null hypothesis of residual normality is rejected for all portfolios except SN
and MN in the Fama-French three-factor model, and MN and BH in both Fama-French
five-factor models. This indicates that the returns of the portfolios constructed from
Chinese stocks mostly reject the null of a normal distribution.
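For reference, the Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 (S^2 + (K - 3)^2 / 4) and is asymptotically chi-square distributed with two degrees of freedom, whose upper-tail probability has the closed form exp(-JB/2). The sketch below is illustrative; the function names are assumptions, and the moment definitions (population moments, dividing by n) may differ slightly from those of the software used for Tables 13 to 15.

```python
import math


def jb_pvalue(jb):
    """Asymptotic p-value: the chi-square(2) survival function is exp(-x / 2)."""
    return math.exp(-jb / 2.0)


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in sample) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, jb_pvalue(jb)


# Statistic reported for the SN portfolio in Table 13.
print(round(jb_pvalue(3.2765), 4))
```

Applied to the statistics in Table 13, this closed form gives the p-values reported there for SN and SH.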

7.3. Heteroscedasticity

The test for heteroscedasticity is crucial in regression analysis as it evaluates
whether the variance of the error terms is constant. Under heteroscedasticity the OLS
coefficient estimates remain unbiased, but they are no longer the best linear unbiased
estimators (BLUE) and the conventional standard errors are unreliable. White's test for
heteroscedasticity with no cross terms is applied to the pricing models. The null
hypothesis of constant error variance is rejected if the p-value is less than 0.05. The
White's tests for the Fama and French three-factor model, the 2 x 2 five-factor model,
and the 2 x 2 x 2 x 2 five-factor model are presented below in Tables 16, 17, and 18.


Table 16:

White's Heteroscedastic test of Fama and French 3-factor model.

L N H

F-Statistic S

2.2032 2.7221 8.4241

p-value 0.0909 0.0471 0

F-Statistic M

15.8387 17.9736 26.5585

p-value 0 0 0

F-Statistic B

0.6784 7.4666 3.0464

p-value 0.5668 0.0001 0.0312

Table 17:

White's Heteroscedastic test of Fama and French 5-factor 2 x 2 sort model.

L N H

F-Statistic S

2.3793 1.7257 2.9104

p-value 0.0423 0.1334 0.0160

F-Statistic M

2.5320 1.9541 1.2234

p-value 0.0321 0.0900 0.3020

F-Statistic B

1.4565 4.8947 2.8992

p-value 0.2088 0.0040 0.0164

Table 18:

White's Heteroscedastic test of Fama and French 5-factor 2 x 2 x 2 x 2 sort model.

L N H

F-Statistic S

1.4040 2.5272 3.2895

p-value 0.2273 0.0324 0.0079

F-Statistic M

1.7699 2.3705 2.2878

p-value 0.1237 0.0430 0.0499

F-Statistic B

2.2016 7.2965 2.4047

p-value 0.0582 0.0000 0.0404

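At the 5% level, the null hypothesis of constant error variance is rejected for
most portfolios. It is not rejected only for SL and BL in the three-factor model, for
SN, MN, MH, and BL in the 2 x 2 five-factor model, and for SL, ML, and BL in the
2 x 2 x 2 x 2 five-factor model. Thus, the variance of the error terms is not constant
in the majority of the regressions.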
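For reference, White's test without cross terms can be written as an auxiliary regression of the squared OLS residuals on the squared regressors. The form below is a sketch for the three-factor model, assuming the auxiliary regression contains only a constant and the squared factors (the text does not spell out the specification); the symbols \gamma_j, v_{it}, q and R^2_{aux} are notation introduced here.

```latex
\hat{e}_{it}^{\,2} = \gamma_0 + \gamma_1 (R_{Mt} - R_{Ft})^2 + \gamma_2\, SMB_t^2 + \gamma_3\, HML_t^2 + v_{it},
\qquad H_0:\ \gamma_1 = \gamma_2 = \gamma_3 = 0,

F = \frac{R^2_{aux} / q}{(1 - R^2_{aux}) / (T - q - 1)} \sim F(q,\; T - q - 1)
```

Here q is the number of squared regressors and T the number of months. With T = 132 the statistic is compared with F(3, 128) for the three-factor model and F(5, 126) for the five-factor models, and a small p-value (conventionally below 0.05) rejects constant error variance.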

7.4. Autocorrelation

Autocorrelation is found in datasets that repeat similar patterns when exposed to
shocks, or that experience overlapping effects of shocks in a given time period. The
error terms are then correlated across observations, which causes inflated R2 values,
inefficient coefficient estimates, and misleading standard errors. The models are
tested using the Breusch-Godfrey test with 10 lagged residuals. The Breusch-Godfrey
tests for the Fama and French three-factor model, the 2 x 2 five-factor model, and the
2 x 2 x 2 x 2 five-factor model are presented below in Tables 19, 20, and 21.


Table 19:

Breusch-Godfrey test of Fama and French 3-factor model.

L N H

F-Statistic S

2.4776 1.5008 1.2654

p-value 0.0099* 0.1474 0.2579

F-Statistic M

1.9524 0.4085 1.6412

p-value 0.0447* 0.9403 0.1032

F-Statistic B

0.5617 1.6804 1.3788

p-value 0.8421 0.0932 0.1983

Table 20:

Breusch-Godfrey test of Fama and French 5-factor 2 x 2 sort model.

L N H

F-Statistic S

0.6488 2.3041 1.4276

p-value 0.7691 0.0166* 0.1767

F-Statistic M

1.4856 2.8480 2.3133

p-value 0.1533 0.0033* 0.1610

F-Statistic B

0.8848 2.4680 0.9555

p-value 0.5496 0.0103* 0.4859

Table 21:

Breusch-Godfrey test of Fama and French 5-factor 2 x 2 x 2 x 2 sort model.

L N H

F-Statistic S

0.9726 2.5082 1.6586

p-value 0.4711 0.0091* 0.0989

F-Statistic M

1.9226 2.6843 2.8557

p-value 0.0487* 0.0054* 0.0033*

F-Statistic B

1.0296 3.3373 0.8274

p-value 0.4232 0.0008* 0.6031

The null hypothesis of no autocorrelation is rejected for SL and ML in the
three-factor model; SN, MN, and BN in the 2 x 2 sort five-factor model; and SN, ML, MN,
MH, and BN in the 2 x 2 x 2 x 2 sort five-factor model. To correct for these
autocorrelated error terms, Newey-West standard errors, which allow for autocorrelated
residuals, are used.
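The Newey-West correction can be illustrated with a short script. The sketch below is not the code used for the reported results: it assumes a Bartlett kernel with no small-sample adjustment, the function name `ols_newey_west` is hypothetical, the lag length of 4 in the demonstration is an arbitrary choice (the text does not state the truncation lag), and the data are simulated.

```python
import numpy as np


def ols_newey_west(y, X, lags):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors.

    y: (T,) array of excess returns.
    X: (T, k) regressor matrix that already includes a constant column.
    """
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    Xe = X * resid[:, None]              # row t holds x_t * e_t
    S = Xe.T @ Xe                        # lag-0 (White) term
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett weight
        G = Xe[lag:].T @ Xe[:-lag]         # sum over t of (x_t e_t)(x_{t-lag} e_{t-lag})'
        S += weight * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv          # sandwich covariance of beta
    return beta, np.sqrt(np.diag(cov))


# Simulated example: one "market" factor and AR(1) errors over 132 months.
rng = np.random.default_rng(0)
T = 132
mkt = rng.normal(scale=0.08, size=T)
shocks = rng.normal(scale=0.02, size=T)
errors = np.zeros(T)
for t in range(1, T):
    errors[t] = 0.5 * errors[t - 1] + shocks[t]
y_demo = 0.005 + 1.1 * mkt + errors
X_demo = np.column_stack([np.ones(T), mkt])
beta_demo, se_demo = ols_newey_west(y_demo, X_demo, lags=4)
print("coefficients:", beta_demo.round(3))
print("Newey-West t-statistics:", (beta_demo / se_demo).round(2))
```

With `lags=0` the estimator reduces to White's heteroscedasticity-consistent standard errors, so the same adjustment also guards against non-constant error variance.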

7.5. Stationarity

A stochastic process is stationary if any subset of the time series has a
distribution function identical to that of any other subset. If a series is
non-stationary, the persistence of shocks is infinite, the mean and variance estimates
are not well defined, and inferences from regression coefficients are unreliable. The
Augmented Dickey-Fuller (ADF) test with 12 lagged differences is used to test for a
unit root. The ADF test is carried out on the return series of each portfolio, as
reported in Table 22.

Table 22:

Augmented Dickey-Fuller test of all nine portfolio log returns series (SL, SN, SH, ML,

MN, MH, BL, BN, and BH).

L N H

t-Statistic S

-10.3186 -9.9959 -10.3780

p-value 0* 0* 0*

t-Statistic M

-3.3533 -3.1407 -5.8497

p-value 0.0145* 0.0261* 0*

t-Statistic B

-11.3839 -10.3113 -9.7500

p-value 0* 0* 0*

As all p-values are less than 0.05, the null hypothesis that the series contains a

unit root is rejected and the return series for all nine portfolios are stationary.
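For reference, the ADF test with 12 lagged differences corresponds to a regression of the following form, where y_t denotes a portfolio return series. Including an intercept and excluding a time trend is an assumption made here, since the text does not state the deterministic terms used; the symbols \alpha, \gamma, \delta_j and \varepsilon_t are notation introduced for this sketch.

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{j=1}^{12} \delta_j\, \Delta y_{t-j} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \ \text{(unit root)}, \qquad H_1:\ \gamma < 0
```

The reported t-statistic is that of the estimate of \gamma, compared with Dickey-Fuller rather than Student-t critical values.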


8. Empirical Analysis

8.1. Summary Statistics for factor returns

The empirical tests investigate whether portfolios formed on Size, BE/ME,
profitability, and investment capture the variation in average returns related to
these factors. First, the pattern in average returns is examined for each factor.
Table 4 shows the average returns of nine portfolios sorted on Size-BE/ME, Size-OP, and
Size-Inv.

In the Low and Neutral BE/ME columns of Panel A, average returns fall from small
stocks to big stocks, which shows the size effect. However, Medium size stocks with
high BE/ME have a higher average return (3.89%) than comparable small stocks (3.23%).
The relation between stock returns and BE/ME, known as the value effect, is observed
more consistently, as average returns increase monotonically from the Low to the High
BE/ME column. The results are similar to those of Fama and French (1993), where the
value effect is stronger among small stocks than among large stocks. The returns of
small stocks in the first row increase from 0.34% per month to 3.23% per month with
increasing BE/ME, compared to the smaller increase for big stocks in the third row,
from 0.11% to 2.57%.

Panel B of Table 4 shows the average returns of nine portfolios sorted on Size-OP.
In contrast to the findings of Fama and French (2015), the effect of size on average
returns is less prevalent than in the Size-BE/ME sort: there is no pattern of higher
premiums for small firms. However, higher profitability is associated with higher
returns among small and large stocks; average returns increase from 0.96% to 1.30% for
small stocks, and from 0.95% to 1.77% for large stocks. The reverse is seen for medium
size stocks, where Low profitability stocks yield higher average returns (1.68%) than
High profitability stocks (1.45%).

The size effect in Panel C of Table 4 is detected only in stocks with Low
investment (conservative stocks), with average returns increasing from big to small
stocks, 0.54% to 1.04%. As in the Size-OP sort, a small stock premium is not observed
in Neutral and High investment stocks; big stocks in these groups have higher returns
than small stocks. For example, the returns of medium and big stocks are 1.93% and
1.61% respectively, in comparison to the small stock average return of 0.95%. The
average returns of High investment (aggressive) stocks exceed the average returns of
Low investment (conservative) stocks. This result diverges from the findings of Fama
and French (2015), in which average returns on portfolios with Low investment are
higher than those on portfolios with High investment. These results show that, in the
Chinese stock market, average returns are typically higher for firms that invest more
aggressively.

According to the findings of Fama and French (1995), high BE/ME stocks tend to
have low profitability and investment, and low BE/ME stocks tend to be profitable and
invest aggressively. Table 4, however, does not isolate the value, profitability and
investment effects in average returns. To investigate these effects more cleanly, the
factors in the regressions of the Fama and French three-factor and five-factor models
are examined in Section 8.2.

Table 5 shows the summary statistics for the individual factors of each model. In
Panel A, the average SMB returns are 0.428%, -0.472%, and -0.392% for the three
versions of the factor. The correlations between the FF3 version of SMB and the two
five-factor versions are quite low, 0.32 and 0.35, which is consistent with the
differences in SMB average returns and standard deviations. This is quite surprising
since the breakpoints for the size sort are defined in the same way, and all three
versions utilise all top 100 stocks in the construction.

The summary statistics for HML, RMW and CMA depend more on the factor construction,
as each model constructs these factors in a different way. The results from FF5 2x2 and
FF5 2x2x2x2 are the easiest to compare. By construction, the factors from the 2 x 2
sort control for size and one other variable, whilst the 2 x 2 x 2 x 2 sort jointly
controls for all four variables. In Panels A and B of Table 5, the joint controls have
little effect on HML, as the average returns of the FF5 2x2 and FF5 2x2x2x2 versions
are -2.121% and -2.160% respectively. Moreover, the correlation between the two HML
versions is high at 0.99, which also accounts for the similarity in standard deviations
of 4.109 and 4.204.

However, the correlations between the two five-factor versions of RMW and CMA are
lower, at 0.82 for RMW and 0.84 for CMA. The comparison between these two factors
across the models is noteworthy. Firstly, the 2 x 2 sort model produces positive
average RMW and CMA returns of 0.108% and 0.498% respectively, whereas the
2 x 2 x 2 x 2 sort model yields negative average returns of -0.277% and -0.029%
respectively. By construction, the 2 x 2 x 2 x 2 sort portfolios are poorly diversified
(as mentioned in 5.3), so it is logical to expect the standard deviations of its
factors to be higher than those of the 2 x 2 sort. However, while the standard
deviations of SMB and HML are similar across the two sorts, the 2 x 2 RMW and CMA
factors have higher standard deviations (3.386 and 3.443) than their 2 x 2 x 2 x 2
counterparts (3.084 and 2.782), despite the inherently better diversification of the
2 x 2 portfolios.

Small stocks normally exhibit higher market betas than big stocks, so that an
increase in the excess market return is reflected more strongly in the returns of
higher beta stocks. Thus, the negative correlation between SMB and Rm-Rf in all models
is counterintuitive. This indicates that in China's emerging market, big stocks
perform better than small stocks when the market performs well.

Most notably, the divergence from the finding of Fama and French (2015) that low
investment firms produce higher average returns than high investment firms (CMA) is
further supported by the correlations between RMW and CMA. The negative correlations
between RMW and CMA (-0.33 in FF5 2x2, and -0.26 in FF5 2x2x2x2) show that as firms
with high profitability outperform firms with low profitability, aggressive investment
firms yield higher returns than conservative firms. This empirical result is also
illustrated in Panel C of Table 4, where aggressive investment firms produce higher
average returns than conservative firms.

8.2. Regression Analysis

The R2 values of the Fama and French three-factor model are reported in Table 6.
The pricing model captures 83.5% to 92.1% of the variation in returns, which is
considerably lower than the 95% R2 observed in Fama and French (1993). By comparison,
the principal component regressions in Table 7 explain 90.6% to 95.4%. The empirical
results show that there is variation in returns that is not sufficiently captured by
the three-factor model. These results motivate the augmentation of the three-factor
model with profitability and investment factors, using the Fama and French five-factor
model.

Tables 6, 8, and 9 show the Fama-French intercepts for the 9 Size-BE/ME
portfolios. The "fatal" problem reported by Fama and French (2015), in which extreme
microcap growth stocks produce intercepts with large t-statistics that reject the
model, is not reflected in the results shown in Tables 8 and 9. The portfolios of small
growth stocks produce positive three-factor and five-factor intercepts. Moreover, a
recurring pattern of negative HML coefficients is observed in high BE/ME (value) stocks
and small neutral BE/ME stocks, with the most negative coefficients in the medium size
stocks (-0.542, -0.957, and -0.907) in the three-factor and the two five-factor models.

Fama and French (2015) find that the five-factor model never improves the
description of average returns relative to the four-factor model that drops the HML
factor. On the contrary, HML is not a redundant factor for SSE market returns from
2004 to 2015. The HML t-statistic is strongly significant in all models, especially at
the high and low extremes of the BE/ME sort portfolios. Furthermore, the t-statistic is
largest in magnitude in the SL portfolio of the Fama-French three-factor model (9.146),
and in the MH portfolio of both Fama-French five-factor models (-7.766 and -8.356).

Analysing the profitability coefficients in Tables 8 and 9, the returns of microcap
stocks behave like those of unprofitable firms that nonetheless grow rapidly. Small
unprofitable firms are reflected in the negative RMW coefficients of all small stocks.
The most negative profitability coefficient lies in the top right of the matrix
(-0.460) for the 2 x 2 sort, and in the top left of the matrix (-0.332) for the
2 x 2 x 2 x 2 sort. The RMW coefficient gradually becomes positive from small to big
stocks. It can be inferred that big stocks are more profitable than small stocks,
especially among high BE/ME stocks. This result is not surprising, as the majority of
large firms in China are state-owned and generally have high BE/ME characteristics.

Aside from the high BE/ME stocks in the 2 x 2 and 2 x 2 x 2 x 2 portfolios, the
majority of which are classified as state-owned firms, the investment (CMA)
coefficients are largely positive for small stocks, and decreasing or negative for big
stocks. This adds a size dimension to the previous finding that aggressive firms have
higher average returns than conservative firms. Negative CMA coefficients for
aggressive big firms confirm that higher investment firms outperform lower investment
firms.

Additionally, the extremely high t-statistics (t(b)) of the excess market return
coefficients in all Fama and French models are similar to the results found in Fama and
French (1993, Table 6). The high t-statistics reflect the high explanatory power of
these regressions, since the regressors and regressands are both constructed from the
same underlying data, the constituents of the market index. As the nine diversified
Size-BE/ME portfolios are constructed from the constituents of the SSE market index,
the coefficients on excess market returns have high t-statistics.


8.3. Principal Component Analysis

The variations in average returns can be analysed through principal components.

The components allow for interpretation of average returns of different portfolios and

the individual effects on portfolios sorted on Size-BE/ME.

Table 23:

Correlation of PCA factors and factors of the Fama-French three-factor model.

RM-RF PC1 PC2 PC3 SMB HML

RM-RF 1 0.966 -0.010 0.015 -0.142 0.068

PC1 0.966 1 0.084 -0.022 -0.039 0.040

PC2 -0.010 0.084 1 -0.002 0.791 -0.804

PC3 0.015 -0.022 -0.002 1 -0.455 -0.508

SMB -0.142 -0.039 0.791 -0.455 1 -0.399

HML 0.068 0.040 -0.804 -0.508 -0.399 1

Table 24:

Weighting coefficients of principal components of average monthly returns for nine

portfolios sorted on Size-BE/ME from June 2004 to June 2015, 132 months.

Portfolio Principal Components

PC 1 PC 2 PC 3 PC 4 PC 5 PC 6 PC 7 PC 8 PC 9

BL 0.330 -0.028 0.662 -0.133 -0.492 0.140 0.290 -0.227 0.192

BN 0.333 -0.398 0.243 0.214 -0.151 -0.239 -0.621 0.369 -0.155

BH 0.325 -0.456 -0.063 0.453 0.357 -0.123 0.560 0.059 0.133

ML 0.324 0.400 0.361 -0.281 0.633 0.061 -0.044 0.331 0.100

MN 0.354 -0.017 -0.055 -0.259 0.145 -0.303 0.035 -0.506 -0.659

MH 0.344 -0.217 -0.316 -0.233 0.128 0.152 -0.365 -0.423 0.573

SL 0.311 0.545 -0.077 0.681 -0.065 0.224 -0.167 -0.224 -0.063

SN 0.332 0.338 -0.374 -0.152 -0.359 -0.561 0.184 0.294 0.223

SH 0.344 -0.118 -0.346 -0.212 -0.190 0.652 0.137 0.357 -0.308

Examining the correlation matrix in Table 23, it can be seen that the first
principal component is strongly related to the excess market return (RM-RF), with a
correlation of 0.966. This level effect influences all portfolios with approximately
the same magnitude, as shown in the column of the first principal component in
Table 24.

The second principal component is strongly positively correlated with SMB (0.791)
and strongly negatively correlated with HML (-0.804). Moreover, the third principal
component appears to be related to both SMB and HML simultaneously, with negative
correlations of -0.455 and -0.508 respectively.
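The construction of weighting coefficients such as those in Table 24 can be sketched as follows. This is an illustration under stated assumptions: it extracts components from the sample covariance matrix (the text does not say whether the covariance or the correlation matrix was used), the function name `principal_components` is hypothetical, and the returns are simulated rather than taken from the study.

```python
import numpy as np


def principal_components(returns, n_components=3):
    """PCA of a (T, N) matrix of portfolio returns via the covariance matrix.

    Returns (loadings, scores, explained):
      loadings:  (N, n_components) unit-length eigenvectors (weighting coefficients),
      scores:    (T, n_components) time series of the principal components,
      explained: share of total variance captured by each retained component.
    """
    demeaned = returns - returns.mean(axis=0)
    cov = np.cov(demeaned, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = demeaned @ loadings
    explained = eigvals[:n_components] / eigvals.sum()
    return loadings, scores, explained


# Simulated example: nine portfolios driven mainly by one common "market" shock.
rng = np.random.default_rng(0)
market = rng.normal(scale=0.08, size=(132, 1))
returns = market + rng.normal(scale=0.02, size=(132, 9))
loadings, scores, explained = principal_components(returns)
print("share of variance explained:", np.round(explained, 3))
print("first-component weights:", np.round(np.abs(loadings[:, 0]), 2))
```

When one common shock dominates, the first-component weights are close to 1/3 in absolute value for each of the nine portfolios, which is the pattern of the PC 1 column in Table 24. The sign of an eigenvector is arbitrary, hence the absolute values.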


8.4. Model Evaluation

If the exposures to the three and five factors, bi, si, hi, ri, and ci, fully
capture all variation in expected returns, the intercept ai is zero for all portfolios.
Therefore, effective pricing models should have alphas close to zero. The Fama and
French three-factor model, 2 x 2 five-factor model, 2 x 2 x 2 x 2 five-factor model,
and principal component regressions are evaluated by comparing the alpha values and R2
from Tables 6, 7, 8, and 9, which are summarised in Table 25.
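For reference, the time-series regression behind these intercepts and exposures can be written as follows for portfolio i in month t. The coefficient symbols match those used in the tables; the subscripted return notation R_{it}, R_{Ft}, R_{Mt} and the error term e_{it} are written out here as an assumption about the notation.

```latex
R_{it} - R_{Ft} = a_i + b_i\,(R_{Mt} - R_{Ft}) + s_i\,SMB_t + h_i\,HML_t + r_i\,RMW_t + c_i\,CMA_t + e_{it},
\qquad H_0:\ a_i = 0 \ \text{for all } i
```

The three-factor model is the special case that omits the RMW_t and CMA_t terms.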

The Fama and French three-factor model has alphas that are close to zero, ranging
from 0.004 to 0.011. However, the alphas of the small and medium high BE/ME portfolios
(SH and MH) are significant, with t-statistics of 2.510 and 2.579 respectively. By
themselves, these intercepts are sufficient to reject the three-factor model at the
99% one-tail confidence level. Not surprisingly, the portfolios that reject the
three-factor model have the lowest R2, 83.8% and 83.5%. Aside from these portfolios,
the Fama and French three-factor model explains the variation in returns well, with R2
of 86.1% to 92.1%.

Table 25:

Summarised alpha and R2 of regressions on nine portfolios sorted on Size-BE/ME from

June 2004 to June 2015, 132 months.

                a                             R2
              Low  Neutral     High          Low  Neutral     High

Fama and French 3 Factor Model
Small       0.006    0.004    0.010        0.898    0.893    0.838
Medium      0.004    0.005    0.011        0.883    0.888    0.835
Big         0.006    0.006    0.004        0.921    0.890    0.861

Principal Component Regression
Small       0.001   -0.003   -0.003        0.906    0.945    0.934
Medium     -0.001   -0.003   -0.005        0.931    0.943    0.951
Big        -0.005    0.001    0.001        0.954    0.944    0.922

Fama and French 5 Factor 2x2 Model
Small       0.004    0.003    0.008        0.888    0.784    0.723
Medium      0.002    0.004    0.009        0.908    0.907    0.807
Big         0.004    0.006    0.005        0.901    0.874    0.842

Fama and French 5 Factor 2x2x2x2 Model
Small       0.002    0.003    0.007        0.877    0.772    0.719
Medium      0.001    0.003    0.009        0.905    0.906    0.812
Big         0.004    0.006    0.005        0.901    0.875    0.848


As the first three principal components are constructed to capture the maximum
variation in returns along successive orthogonal dimensions, principal component
regression inevitably produces the best three-factor model. Consequently, the alpha
intercepts are individually smaller in absolute value in all principal component
regressions than in the Fama and French three-factor model. Likewise, the average R2 of
PCR is 93.7%, which exceeds the highest R2 of the three-factor model.
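The summary figures quoted from Table 25 can be reproduced with a few lines of arithmetic. The sketch below copies the alphas and R2 values from Table 25; the dictionary layout and the helper name `summarise` are illustrative assumptions.

```python
from statistics import mean

# (alphas, R2 values) copied from Table 25, portfolio order:
# SL, SN, SH, ML, MN, MH, BL, BN, BH.
TABLE_25 = {
    "FF3": ([0.006, 0.004, 0.010, 0.004, 0.005, 0.011, 0.006, 0.006, 0.004],
            [0.898, 0.893, 0.838, 0.883, 0.888, 0.835, 0.921, 0.890, 0.861]),
    "PCR": ([0.001, -0.003, -0.003, -0.001, -0.003, -0.005, -0.005, 0.001, 0.001],
            [0.906, 0.945, 0.934, 0.931, 0.943, 0.951, 0.954, 0.944, 0.922]),
    "FF5 2x2": ([0.004, 0.003, 0.008, 0.002, 0.004, 0.009, 0.004, 0.006, 0.005],
                [0.888, 0.784, 0.723, 0.908, 0.907, 0.807, 0.901, 0.874, 0.842]),
    "FF5 2x2x2x2": ([0.002, 0.003, 0.007, 0.001, 0.003, 0.009, 0.004, 0.006, 0.005],
                    [0.877, 0.772, 0.719, 0.905, 0.906, 0.812, 0.901, 0.875, 0.848]),
}


def summarise(table):
    """Return {model: (mean absolute alpha, mean R2)}."""
    return {
        model: (mean(abs(a) for a in alphas), mean(r2))
        for model, (alphas, r2) in table.items()
    }


for model, (avg_abs_alpha, avg_r2) in summarise(TABLE_25).items():
    print(f"{model:<12} mean |alpha| = {avg_abs_alpha:.4f}   mean R2 = {avg_r2:.3f}")
```

Averaging absolute alphas, rather than raw alphas, prevents positive and negative intercepts from offsetting each other in the comparison.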

In terms of R2, neither five-factor model improves on the explanatory power of the
three-factor model, except for the ML portfolio (90.8% and 90.5%) and the MN portfolio
(90.7% and 90.6%). This may be due to the way factors are defined in the augmented
models, which may not fully reflect the political, regulatory, and complex legal
environment of a transition economy. Nonetheless, the alpha intercepts of the
five-factor models are no larger than those of the three-factor model for every
portfolio except BH. Moreover, none of the alpha t-statistics for the five-factor
models rejects the model.

In summary, the empirical results show that both the three-factor and five-factor
models are useful tools in asset pricing. The inferences drawn from the results suggest
that different factors have contrasting effects and implications in particular markets.
Such inconsistencies are found in the average returns of aggressive stocks being higher
than the average returns of conservative stocks, in addition to the non-redundant HML
factor for the Chinese stock market.


9. Conclusion

The Fama-French three- and five-factor models are studied to examine whether risks
associated with emerging markets are identified by the factors and sufficiently priced
within the models. Firstly, patterns of small size premiums are identified in the
Size-BE/ME sort portfolios, in line with the findings of Banz (1981) and Fama and
French (1993). Additionally, the empirical results clearly show a value effect similar
to that of Fama and French (1993), which is stronger among small stocks than among
large stocks.

Secondly, contrary to Fama and French (2015), small firms that invest a lot
despite low profitability are not found to be a problem: none of the portfolio alpha
intercepts in the five-factor sorts rejects the model. Furthermore, opposite investment
effects are found, in which firms that invest aggressively yield higher average returns
than firms that invest conservatively. The investment premiums are observed to be
stronger in big firms, which are characterised by low or negative CMA coefficients.

Thirdly, Fama and French (2015) suggest that HML is a redundant factor because its

average return is captured by exposures to RM-RF, SMB, RMW, and CMA. However,

the results show that the HML factor is significant in all models, especially in low and

high BE/ME portfolios. The inferences drawn from the results indicate that different

factors, and the way factors are defined, have contrasting effects and implications in an

emerging market. The inconsistencies found are the higher average returns of aggressive

stocks relative to conservative stocks and the

non-redundant HML factor for the Chinese stock market.
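For reference, the redundancy argument of Fama and French (2015) rests on a spanning regression of HML on the other four factors. A sketch of that regression is given below; the coefficient symbols are chosen here for illustration.

```latex
\[
HML_t = a + b\,(R_{M,t} - R_{F,t}) + s\,SMB_t + r\,RMW_t + c\,CMA_t + e_t
\]
```

HML is regarded as redundant when the intercept $a$ is indistinguishable from zero, so that its average return is fully explained by its exposures to the other four factors. An intercept that differs reliably from zero would instead indicate that HML carries information the other factors do not.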

By construction, the orthogonal nature of principal components that maximise

variability along each transformed dimension produces the best three-factor model.

Overall, the Fama-French five-factor model is not markedly better than the Fama-French

three-factor model. The three-factor model captures more of the average cross-sectional

returns, as indicated by its higher R². On the other hand, the alpha intercepts of

the five-factor model are all lower (except BH) than those of the three-factor model.

Thus, both the three-factor and the two sort five-factor models are useful for making

rational portfolio decisions, even with small departures from the over-simplified

assumption of normally distributed returns in the Chinese stock market.
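The orthogonality and variance-ordering properties of principal components mentioned above can be illustrated with a short sketch. It extracts the first three principal components of a simulated panel of portfolio returns through an eigendecomposition of the sample covariance matrix. The data, the function name `principal_components`, and the panel dimensions are assumptions for illustration only and do not reproduce the dataset used in this paper.

```python
import numpy as np

def principal_components(returns, k=3):
    """Return the first k principal-component scores and their variance shares."""
    R = np.asarray(returns, dtype=float)
    centred = R - R.mean(axis=0)                    # remove column means
    cov = np.cov(centred, rowvar=False)             # sample covariance of the columns
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centred @ eigvecs[:, :k]               # component time series
    explained = eigvals[:k] / eigvals.sum()         # share of total variance
    return scores, explained

# Simulated panel (hypothetical): 120 months, 6 portfolios, 3 common drivers.
rng = np.random.default_rng(1)
common = rng.normal(0.0, 0.04, size=(120, 3))
loadings = rng.normal(0.0, 1.0, size=(3, 6))
returns = common @ loadings + rng.normal(0.0, 0.01, size=(120, 6))

scores, explained = principal_components(returns, k=3)
corr = np.corrcoef(scores, rowvar=False)            # near-identity: components are uncorrelated
print("variance shares:", np.round(explained, 3))
print("largest off-diagonal correlation:", np.abs(corr - np.eye(3)).max())
```

Because the component scores are mutually uncorrelated and ordered by the variance they explain, the first three components span as much in-sample variation as any three linear combinations of the same returns can.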

Furthermore, the joint controls of the 2 x 2 and 2 x 2 x 2 x 2 sorts are attractive

for isolating and identifying the estimates of factor premiums. However, given the top

100 weight sample, it is not clear whether the poorly diversified 2 x 2 x 2 x 2 sort

effectively isolates size, value, profitability, and investment patterns.
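The diversification concern can be made concrete with simple arithmetic. The sketch below counts the portfolios produced by each sort and the average number of stocks per portfolio. It assumes that the sample contains exactly 100 stocks and that stocks spread evenly across portfolios, which a joint sort does not guarantee; the function name is hypothetical.

```python
from itertools import product

def average_stocks_per_portfolio(n_stocks, n_sort_variables, groups_per_variable=2):
    """Count the portfolios in a joint sort and the average number of stocks in each."""
    # Each portfolio is one combination of group memberships across the sort variables.
    cells = list(product(range(groups_per_variable), repeat=n_sort_variables))
    return len(cells), n_stocks / len(cells)

n2, avg2 = average_stocks_per_portfolio(100, 2)   # 2 x 2 sort
n4, avg4 = average_stocks_per_portfolio(100, 4)   # 2 x 2 x 2 x 2 sort
print(n2, avg2)   # 4 25.0
print(n4, avg4)   # 16 6.25
```

Under these assumptions, the 2 x 2 x 2 x 2 sort leaves only about six stocks per portfolio on average, and some portfolios may hold fewer.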

Finally, future studies may consider a larger sample to accommodate 16

well-diversified portfolios in the 2 x 2 x 2 x 2 joint control Fama-French five-factor

model. Profitability and investment factors in other emerging markets could also be

investigated to confirm or refute the disparities with

the findings of Fama and French (2015).

References

Banz, Rolf W., 1981. The relationship between return and market value of common stocks,

Journal of Financial Economics 9:1, 3-15.

Barber, Brad M., and John D. Lyon., 1997. Firm Size, Book-To-Market Ratio, And Security

Returns: A Holdout Sample Of Financial Firms. The Journal of Finance 52.2, 875-883.

Bartlett, M. S., 1954. A note on the multiplying factors for various chi square approximations.

Journal of the Royal Statistical Society, Series B 16, 296-298.

Basu, Sanjoy., 1983. The relationship between earnings yield, market value, and return for

NYSE common stocks: Further evidence. Journal of Financial Economics 12, 129-156.

Bhandari, Laxmi Chand., 1988. Debt/Equity ratio and expected common stock returns: empirical

evidence, Journal of Finance 43:2, 507–28.

Black, F., 1993. Beta and Return, Journal of Portfolio Management, 20.1, 8-18.

Black, Fischer, Michael C. Jensen, and Myron Scholes., 1972. The capital asset pricing model:

some empirical tests. Studies in Theory of Capital Markets, 79-121.

Chan, L.K.C., Chen, N.F., and Hsieh, D.A., 1985. An exploratory investigation of the firm size

effect. Journal of Financial Economics 14, 451–471.

Chan, Louis K.C., Yasushi Hamao and Josef Lakonishok, 1991. Fundamentals and stock returns

in Japan, Journal of Finance, 46:5, 1739-1789.

Chen, Nai-Fu, Richard Roll, and Stephen A. Ross., 1986. Economic Forces And The Stock

Market. The Journal of Business 59.3, 383.

Douglas, George W., 1968. Risk in the equity markets: an empirical appraisal of market

efficiency, Ann Arbor, Michigan, University Microfilms Inc.

Elton, Edwin J, and Martin J. Gruber., 1997. Modern Portfolio Theory, 1950 To Date. Journal of Banking & Finance 21.11-12, 1743-1759.

Fabozzi, Frank J., and Jack Clark Francis., 1978. Beta as a random coefficient. The Journal of Financial and Quantitative Analysis 13.1, 101.

Fama, Eugene F., and James D. MacBeth., 1973. Risk, return, and equilibrium: empirical tests.

The Journal of Political Economy 81:3, 607-636.

Fama, Eugene F., and Kenneth R. French., 1992. The cross-section of expected stock returns,

Journal of Finance 47:2, 427–465.

Fama, Eugene F., and Kenneth R. French., 1993. Common risk factors in the returns on stocks

and bonds, Journal of Financial Economics 33, 3-56.

Fama, Eugene F., and Kenneth R. French., 1995. Size and book-to-market factors in earnings and

returns, Journal of Finance 50, 131-155.

Fama, Eugene F., and Kenneth R. French., 1996. Multifactor explanations of asset pricing

anomalies, Journal of Finance 51.1, 55–84.

Fama, Eugene F., and Kenneth R. French., 2015. A Five-Factor Asset Pricing Model. Journal of Financial Economics 116.1, 1-22.

Graham, John R, and Campbell R. Harvey., 2001. The Theory And Practice Of Corporate

Finance: Evidence From The Field. Journal of Financial Economics 60.2-3, 187-243.

Hotelling, Harold., 1933. Analysis of a Complex of Statistical Variables into Principal

Components. Journal of Educational Psychology 24.6-7, 417–441 & 498–520.

Jackson, D., 1993. Stopping rules in principal components analysis: a comparison of heuristical

and statistical approaches. Ecology 74.8, 2204–2214.

Jagannathan, Ravi, and Zhenyu Wang., 1996. The conditional CAPM and the cross-section of

expected returns. The Journal of Finance 51.1, 3-53.

Kaiser, H.F., 1960. The application of electronic computers to factor analysis. Educational and Psychological Measurement 20, 141-151.

Kothari, S.P., Jay Shanken, and Richard G. Sloan., 1995. Another look at the cross-section of

expected returns, Journal of Finance 50, 185-224.

Kraus, Alan, and Robert H. Litzenberger., 1976. Skewness Preference And The Valuation Of

Risk Assets. The Journal of Finance 31.4, 1085.

Lakonishok, Josef, and Alan C. Shapiro., 1986. Systematic risk, total risk and size as

determinants of stock market returns. Journal of Banking & Finance 10.1, 115-132.

Lakonishok, Josef, Andrei Shleifer and Robert W. Vishny., 1994. Contrarian investment,

extrapolation and risk, Journal of Finance 49, 1541-1578.

Lintner, J., 1965. The valuation of risk assets and the selection of risky investments in stock

portfolios and capital budgets, Review of Economics and Statistics 47:1, 13–37.

Markowitz, H., 1952. Portfolio Selection. The Journal of Finance 7.1, 77.

Miller, M., Modigliani, F., 1961. Dividend policy, growth, and the valuation of shares. Journal of Business 34, 411–433.

Mossin, J., 1966. Equilibrium in a capital asset market, Econometrica 34, 768-783.

Novy-Marx, R., 2013. The other side of value: The gross profitability premium. Journal of Financial Economics 108, 1–28.

Pearson, Karl., 1901. LIII. On Lines And Planes Of Closest Fit To Systems Of Points In Space.

Philosophical Magazine Series 6 2.11, 559-572.

Peres-Neto, Pedro R., Donald A. Jackson, and Keith M. Somers., 2005. How Many Principal

Components? Stopping Rules For Determining The Number Of Non-Trivial Axes Revisited.

Computational Statistics & Data Analysis 49.4, 974-997.

Rosenberg, B., Reid, K., & Lanstein, R., 1985. Persuasive evidence of market inefficiency. The Journal of Portfolio Management, 11.3, 9-16.

Ross, Stephen A., 1976. The arbitrage theory of capital asset pricing, Journal of Economic Theory

13, 341-360.

Sharpe, William F., 1964. Capital asset prices: a theory of market equilibrium under conditions

of risk, Journal of Finance 19:3, 425–442.

Stattman, D., 1980. Book values and stock returns, The Chicago MBA: A Journal of Selected Papers, 4:25–45.

Titman, S., Wei, K., Xie, F., 2004. Capital investments and stock returns. Journal of Financial and Quantitative Analysis 39, 677–700.
