Abnormal Returns in Gold and Silver Exchange Traded...
Transcript of Abnormal Returns in Gold and Silver Exchange Traded...
Abnormal Returns in Gold and Silver
Exchange Traded Funds
Michael J. Naylor*, Udomsak Wongchoti and Chris Gianotti
School of Economics & Finance
College of Business
Massey University
Private Bag 11 222
Palmerston North
New Zealand.
* Corresponding author; [email protected]. T: +64-6-3509099 x7720
___________________________________________________
Acknowledgements
We wish to thank departmental colleagues for comments.
Version 1.0: 8th Feb 2011
~ 2 ~
Abnormal Returns in Gold and Silver Exchange Traded Funds
____________________________________________________
Abstract
Exchange Traded Funds (ETFs) are one of the fastest growing areas of
investing and have significantly changed investor behavior, yet there is limited
academic research on ETFs, with minimal on commodity based ETFs. This paper
is the first to examine whether abnormal returns are available for Gold and
Silver ETFS.
Gold and silver ETFs have attracted substantial investments, with $40.5 billion
USD in gold (GLD) is and $5.22 billion in silver (SLV). Our study shows that
while inefficiency is present in the gold and silver ETF markets for the period
2006-2009, when risk is factored the abnormal returns evaporate. However,
use of simple filter trading rules does allow abnormal returns. GLD and SLV
exhibit similar characteristics to the underlying physical assets and when
ignoring risk are able to provide investors with greater profit than if investing in
the market.
JEL classification: G10, G11, G12, G14
Keywords: Abnormal returns, exchange traded funds, commodities, filter trading,
~ 3 ~
1 Introduction
1.1 Overview
Exchange Traded Funds (ETFs) are one of the fastest growing areas of
financial markets and have significantly changed how investors at both an
institutional and retail level construct their investment portfolios investor
behavior. The rapid growth raises the question as to whether it is due to
superior characteristics or whether an abnormal return is present. However
there is relatively little academic research on ETFs (Anderson, Born &
Schnusenberg, 2008), and nearly no research on commodity ETFs, the fastest
growing area. Very little is known about the dynamics of this market. This paper
is the first to examine whether prices in commodity exchange traded funds in
two main funds GLD (Gold) and SLV (silver) accurately reflect all the relevant
information in the market, exhibit dependence and offer abnormal return on a
risk adjusted basis. We also examine four smaller funds.
The study is also unique in that it applies existing knowledge of physical gold
and silver prices/returns to examine whether the return properties and
characteristics for gold and silver ETFs perform in the same way as the physical
asset.
1.2 The Exchange Traded Fund Market
ETFs are similar to mutual funds, except that shares in an ETF can be bought
and sold throughout the day and ETFs do not sell or redeem their individual
shares at net asset value. Instead, financial institutions purchase and redeem
ETF shares directly from the ETF in large blocks, called "creation units". ETFs
generally provide the easy diversification, low expense ratios1, transparent
portfolios and tax efficiency as index funds, while still maintaining all the
features of ordinary stock, such as limit orders, short selling, and options. ETF
funds also do not have to maintain a cash reserve for redemptions or pay
brokerage expenses so do not charge a load.
EFT prices are set by markets and not based on day-end asset values. If there
is strong investor demand for an ETF, its share price will rise above its net
asset value per share, giving arbitrageurs an incentive to purchase additional
creation units from the ETF and sell the component ETF shares in the open
1 Annual management fees are 0.3-.6%, vs mutual funds at 1-2% (Kelleher, 2010).
~ 4 ~
market. The additional supply of ETF shares increases the ETF's market
capitalization and reduces the market price per share, generally eliminating the
premium over net asset value. ETFs are dependent on the efficacy of the
arbitrage mechanism in order for their share price to track net asset value.
This growth reflects the popularity of the ETF as an investment vehicle for both
retail and institutional investors as an easy to use way to diversify across a
broader range of securities. The value of total ETF assets under management at
January 2010 was US$1,032B, up from US$710B in December 2008 (Fuhr,
2010). At the current growth rate ETFs will rapidly eclipse that of mutual funds.
The fastest growing sector of ETFs has been commodities as the market is
active and offers diversification. The first commodity ETF was established in
2002 and by January 2010 the GLD fund had grown to US$ 40.5B and SLV fund
to US$5.22B (The largest is S&P500 fund SPY with US$78B). Both GLD and SLV
physically own the commodity.
The impact of the rapid growth of commodity gold and silver ETFs is uncertain;
the rapid growth in ETF assets can increase underlying demand, while supply is
unchanged. Alternatively commodity ETFs may be substitutes for existing
demand for the product. Brikner & Collins (2008) argue that growth in ETFs has
boosted speculation in markets and creating unstable price fluctuations. The
ETFs themselves are also speculators.
2 Literature on the Efficiency of the Physical and ETFs markets
Aggarwal & Soenen (1988) examines gold market efficiency vs the S&P 500.
They argue that the strong movement in gold prices offer an opportunity for
strong returns even risk adjusted, though there was significant positive
skewness and daily changes was leptokurtic. However the inefficiencies
diminished under the CAPM model and that when filter tests were applied the
excess return likewise evaporated.
Solt & Swanson (1981) note the importance of including gold and silver in a
risk return framework and note that gold and silver returns show a low
correlation to the market. They thus conclude that gold and silver have a
speculative rather than investment value.
Fama & Blume (1973) developed filter trading rules and examined the returns
of 24 filters concluding that the evidence indicated trading rules do not
consistently earn returns greater to those of a simple passive (buy and hold)
~ 5 ~
approach. The filter trading rules are x%; if daily closing price (of gold or silver)
increases x% , buy and hold the commodity, until it’s price moves down at least
x% from the subsequent high, at which time the investor should simultaneously
sell and go short. The short position is maintained until the daily close price
rises at least x% above the subsequent low at which time the investors buys.
De Jong & Rhee (2008) found that ETFs in general provide abnormal risk
adjusted returns based on Fama and French 3 factor model exceed transaction
costs. De Jong & Rhee do not however distinguish commodity ETFs from other
ETFs.
Buyuksahin, Haigh, and Robe (2008) examined the S&P Goldman Sachs
Commodities Index (S&P GCSI) weekly return data and revealed no correlation
between commodity and equity returns, with commodity returns being lower
and no sustained pricing relationship over the long-run. Nor was evidence found
of an increase in co-movement between equities and commodities during a
period of extreme returns.
Engle & Sarkar (2006) found that where an arbitrage opportunity exists
between the price of a given ETF and its Net Asset Value (NAV), this will be
minor and quickly exploited eliminate any premiums/discounts. Miffre (2007)
shows that higher than average returns can be achieved by investors taking
long or short positions for a given level of risk when investing in iShares MSCI
country funds ETF.
3 Methodology
3.1 Hypotheses
The objective of this study is to determine if by investing in ETFs GLD (gold)
and SLV (silver), investors can achieve abnormal or above market returns due
to inefficiencies. The study assumes gold and silver pricing are driven by
fundamental supply and demand forces in the marketplace.
Time series analysis is used and based on a daily close price/return series
derived from the NYSE Arca between 5th of May 2006 and 31st of December
2009. Within the study period the macro economy has experienced periods of
growth, severe contraction and return to low/mild growth while maintaining
extraordinarily high demand for gold and silver. It is thus an ideal period as for
~ 6 ~
ascertaining whether dependence actually exists in the series and returns are
abnormal.
On this basis our hypothesis is that;
a) Returns between gold backed ETF GLD and silver backed ETF SLV are
independent and inefficient, meaning that an abnormal return is possible if
investing in GLD/SLV.
b) GLD and SLV provide investors with abnormal returns once adjusted for risk.
Expressed as;
CAPM (1)
(RGLD - Rf)t = + β (Rmt − Rf)t + ųt
(RSLV - Rf)t = + β (Rmt − Rf)t + ųt
c) The application of filter trading rules to GLD and SLV will result in an
evaporation of the abnormal return.
3.2 Data
GLD and SLV are the primary funds focused on in the analysis. However four
additional funds on the NYSE Arca, the gold funds IAU, and DGL, and silver
funds DBS and AGQ, were randomly chosen from twenty-one precious metals
ETFs. The key criteria are the fund must trade gold or silver, be an ETF (not an
ETN), use a passive strategy, offer minimum net assets of $50 million and have
begun trading no later than December 2008. This criteria reduced the applicable
funds for research from twenty-one to six. We also included the S&P 500 as a
benchmark. The data has been sourced from the Centre for Research and
Securities Prices (CRSP).
Table 1 about here
The time horizon is from 5 May 2006 to 31 December 2009 for the funds GLD
and SLV. As the inception date of the four remaining ETFs IAU, DGL, DBS and
AGQ varies between January 2005 and December 2008 the dataset for these
funds is adjusted to the each funds applicable lifetime. Data will be based on
daily closing prices of units/shares in each of the funds.
~ 7 ~
3.3 Methodology – Hypothesis a) Abnormal Returns
To prove hypothesis a) we must show that a dependency exists in series
between prices. Conversely, if GLD and SLV were efficient, then our expectation
is that today’s price is not reflective of yesterday’s price, rather it is reflective of
today’s information and that there is no pattern or consistency so that pricing
patterns follow a random walk, or that they are normally distributed.
The method for completing the test of hypothesis a) is to perform econometric
tests across 923 observed price movements for GLD/SLV and between 261 to
925 observations for the remaining funds (IAU/AGL/DBS/AGY) when extending
the test. For consistency with previous tests, the price series for GLD and SLV
are converted to (natural) logarithms, or continuously compounded returns. The
data set for this test is unfiltered as the use of logarithms reduces the affects of
outliers.
Previous work in the study of efficiency in physical gold and silver pricing by
Solt & Swanson (1981) identifies the occurrence of non-normality and non-
stationarity in the means and variances of the price for distributions, concluding
that (despite an element on non-stationarity) there exists a positive
dependence in the series. Snedecor & Cochran (1967) likewise reject the
presence of normality for a price series of gold.
Establishing stationarity is important for ensuring the significance of the series
to predict future prices and will be tested using the autocorrelation function
(ACF) and supported by the augmented Dickey-Fuller test (ADF). The series is
expected to exhibit the following characteristics if stationary;
Mean: E(Y)t = ų (2)
Variance : var(Y)t = E(Yt – ų)2 = 2
Covariance : Yk = E[(Yt – ų) (Yt+k – ų)]
where Yt is the time series.
Under the non-stationary process zero mean and constant variance is
expected. A test of non-normality is then done using the Bera-Jarque method or
to determine whether t are normally distributed. The test will support the
outcome of the ADF unit root test.
~ 8 ~
3.4 Methodology – Hypothesis b) Risk Adjusted Returns
Hypothesis b) tests for risk-adjusted abnormal returns using the Sharp-Linter-
Mossin model based on linear regression, otherwise known as the capital asset
pricing model (CAPM). This follows Aggrawal & Soenen (1988) we use daily
return series rather than convert into monthly returns. The market return is
based on the S&P500 (ETF SPY) and the risk free rate is represented by the US
90 day Treasury bill rate. The beta coefficient (β) measures systematic risk. The
market return (Rm) is represented by the ETF SPY, the broad based S&P 500. An
aggregate of the market being the return on exchanges NYSE, AmEx, NASDAQ
and Arca was also tested in the model to represent (Rm), the outcomes were
comparable to that when using SPY. The CAPM is expressed as;
CAPM: (3)
(RGLD - Rf)t = + β (Rmt − Rf)t + ųt
(RSLV - Rf)t = + β (Rmt − Rf)t + ųt
A consistent risk adjusted return by the applicable gold and silver ETF that
exceeds the market will show the presence of abnormal returns for these
investment vehicles. A Whites test (1980) is performed to ensure constant
standard errors in the coefficients, or that homoskedasticity is present, thus
proving the second assumption of the classic linear regression model (CLRM)
holds;
CLRM Second Assumption: (4)
var(t) =2 and <
3.5 Methodology – Hypothesis c) Filtered Trading Rules
We then use filter trading rules to test for profitable trading strategies. Filter
trading is a strategy determining by percentage movement of the stock price at
which points in time investors should buy or sell stock, or in other words what is
the entry and exit point of the trade, using 2% price movements as buy/sell
signals. This was applied retrospectively to GLD and SLV to ascertain abnormal
returns for the period December 2006 to December 2009. Testing is not
extended to the smaller funds. The benchmark is the stock performance of a
~ 9 ~
passive buy and hold strategy rather than the market return or SPY. The
starting point for the filer will follow whichever movement comes first, either
upward or downward. Log returns are used and applied to filters at 3% and 5%
in regression analysis against the market, following Solt & Swanson (1981).
4 Results
4.1 Preliminary Analysis
We plotted real gold and silver versus GLD and SLV to provide an indication of
the relative performance of each investment over the same time horizon, using
a 100 at 19 November 2004, and an end date at 31 January 2009.
Graph 1 about here
Graph 1 indicates that the movement between GLD and gold is quite close with
very little separation between the two trend lines particularly since August
2008, and coinciding with a steady trend in appreciation of the asset.
Graph 2 about here
Graph 2 shows similar results for SLV and silver, with no significant points of
deviation between the two return series. Comparing the time series of GLD
versus the market, shows GLD has outperformed the market benchmark,
represented by SPY especially since August 2007. A comparison of SLV versus
the market (SPY) shows a smaller outperformance. SLVs performance began
from a lower base relative to the market and comparative to GLD and was not
as resilient when systematic risk increased. In fact SLV found support at a
similar time to the market and began an upward trend with much larger
oscillations.
Graph 3 about here
Graph 4 about here
4.2 Analysis and Results - Hypothesis a) Abnormal Returns, Stationary Testing
Graphical analysis provided a starting point to the analysis, offering a feel for
the nature of the data and indication whether stationarity or autocorrelation is
present before more formal tests are performed. The trend for the GLD
~ 10 ~
distribution offered no indication of a random walk with or without drift, or a
process that is non-stationary. Rather the process for the GLD distribution is
more consistent with white noise, or a stationary process that has a relatively
even distribution over time. The same inference can be drawn for the SLV
distribution.
The graphed series also indicated that negative autocorrelation may exist
given the pattern where the residuals appear to cross the axis on a frequent
basis that would be unusual for a random walk distribution. Detecting a
autocorrelation relationship suggests that there is some inefficiency in the
market and that successive price changes are not independent
The Autocorrelation Function (ACF) is the first important formal test of the
hypothesis and is in itself a relatively simple test for detecting dependence in
the series. The test is expressed as;
ACF: (5)
pk = Yk / Y0
= covariance at lag k / variance
Plotting pk against k will produce the correlogram although determining the
appropriate lag length in the equation is important when performing the test
and requires a trial and error process. Akaike and Schwarz information criterion
has been chosen as the preferred method to assess the appropriate amount of
lags.
Lags were tested from 5 to 35 at intervals of five and the model offering the
lowest value in these criteria selected. The lowest value was achieved at 5 lags
where it gave Akaike Schwarz information values at -5.49 and -5.45. 5 lags
were then applied to both the GLD and SLV ACF and correlogram.
Figure 1 about here
The results for show GLD shows all lags are close to ‘0’, so GLD exhibits a
typical stationary process. The AC range is between -.033 and .021, and the
PAC range is the same, there is very little difference from zero.
Figure 2 about here
~ 11 ~
The correlogram for the SLV distribution shows AC and PAC values are
insignificantly different from zero across all lags. SLV exhibits stationarity.
The analysis is now extended to the augmented Dickey-Fuller (ADF) test which
is also known as the unit root test. The ADF test is suitable for larger sample
consists of the following regression estimate;
ADF: (6)
m
i
ttittt iYYY1
21 1
The ADF test is one sided and offers three critical values at 1%, 5% and 10%
in comparison to a test statistic. When performing the test lags are applied to
ensure that the error terms are uncorrelated and that unbiased estimates can
be obtained preventing the test from to easily accepting whether a unit root is
present.
Figure 3 about here
The test results for the GLD distribution are shown in Figure 3. The ADF for the
GLD logarithm is performed with 5 lags applied to the dependent variable. The
ADF test-statistic does not exceed the critical values at any level and has
determined that the series is stationary; it does not have a unit root.
When extending the ADF test at intervals of five to include up to 35 lags the
result in terms of the ADF statistic relative to the critical values was consistent
which we would expect. The bulk of the coefficients produced at various lags are
also shown to be negative implying a relationship exists between price changes.
The same tests are now performed on the series for SLV. The results suggest
the null hypothesis should also be rejected. The series is stationary with zero
mean.
Figure 4 about here
However, the SLV series is differentiated from the GLD series as even though
the majority of coefficients are less than zero; the amount is fewer than those
applicable to GLD as more lags are applied.
The ADF result is consistent with that obtained by Solt & Swanson (1981) that
the rejection of a unit root in silver is slightly less obvious than that in gold, or
~ 12 ~
applied to this model that the rejection of a unit root in SLV is slightly less
obvious than that evident in GLD. The implication is that not all available
information is reflected in the prices for GLD and SLV as would the case if the
market were truly efficient instead of weak form efficient.
When expanding the ACF, Correlogram and ADF test to funds IAU, DBS, DGL
and AGQ, the tests produced comparable results to those obtained from GLD
and SLV. The logarithmic distribution was found to possess no unit root. As with
GLD and SLV the outcome did not change when increasing or decreasing lags.
The findings held across all the four smaller funds although the number of
observations included in the analysis was in most instances lower than those in
the test of GLD and SLV, with AGQ having the least number of observations at
265. The null hypothesis tested in this case that the residuals have a unit root
was rejected. This suggests that market for gold and silver ETFs is inefficient.
For the purpose of forecasting and predicating price a non-stationary process
is of little value to investors. It only holds for the period for which it was
studied.
4.3 Analysis and Results - Hypothesis a) Abnormal Returns, Normality Testing
To complete the analysis of hypothesis a) a test of normality is undertaken and
applied to each fund. The null hypothesis tested in this case is that the residuals
(ųt) are normally distributed; the alternate hypothesis is that they are non-
normally distributed. Evidence of a non-normal distribution would identify
dependence in the series, or that the series is weak form efficient. Such
information is relevant to investors because it will influence buy and sell
decision making as the investor attempts to predict when to enter or leave the
market. As the data for all funds in the analysis has been transformed into
logarithms and is based on a daily series, it has essentially had the effect of
rescaling the data and pulling in extreme observations hence filters will not be
applied to the test. It is also not necessary to filter in the cases of GLD, SLV and
IAU due to the large amount of observations which encourages a normal
distribution.
Table 2 about here
~ 13 ~
The summary statistics for each fund are available in Table 2 and include
values for mean, median, skewness, kurtosis and the Bera-Jarque statistic. An
examination of the summary statistics across all funds shows the mean and
median for GLD to be the closest, at .0005 and .0008 respectively, suggesting
there is a presence of normality in the series which is typical where the mean is
close to zero. The mean and median for IAU, DGL and AGQ are also quite close
and in some ways similar to GLD, indicating normality as there is not significant
differences between each mean and median value. The variation between the
mean and median values for IAU and DGL is both at .0004 whereas AGQ is at
.004. Conversely, the mean and median for SLV are comparatively further apart
at .0002 and .003, as is that for DBS at .0003 and 003. Both of these funds
trade silver rather than gold and have an initial indication of non-normality.
The table also shows the Bera-Jarque statistic, and the measures of skewness
and kurtosis. Aggarwal & Soenen (1988) found strong kurtosis for the gold
price. We found strong positive kurtosis above the critical threshold of ‘3’ for all
funds except AGQ., with a range between 8.310 for IAU at the lower end to
13.48 for DBS. The primary funds GLD and SLV offer kurtosis values of 7.5385
and 9.9616. The exception to the strongly positive kurtosis identified in the
funds is silver fund AGQ which is below ‘3’, though this is based on fewer
observations. These results suggest dependency exists in the series for GLD,
SLV, IAU, DGL and DBS.
The Bera-Jarque is an asymptotic test based on OLS residuals. The test first
calculates skewness then kurtosis testing whether the coefficients are jointly
zero (Brooks, 2002). The norms of skewness and kurtosis are 0 and 3. The test
statistic is given by;
Bera-Jarque Test: (7)
BJ = n
S2 + (K – 3)2
6 24
The Bera-Jarque output statistics are extremely high and very different from
zero for all funds excluding AGQ. DBS has the highest Bera-Jarque value at
3608 while the lowest excluding AGQ is DBS at 642, comparatively GLS and SLV
give values at 797 and 2026. This value of these statistics means the skewness
~ 14 ~
and kurtosis statistics is not jointly zero suggesting the null of normality should
be rejected.
The p-value of the Bera-Jarque statistics is at the same time very low which
we would expect; it is zero for all funds except AGQ. To not reject the null of
normality the p-value should be larger than .05 at the 5% test level (Brooks,
2002). As the p-value is below .05 normality is conclusively rejected based on
these outputs particularly given that the number of observations is large, at
least for funds GLD, SLV, IAU and DBS. Comparatively the p-value for AGQ is
greater than .05, meaning normality would not be rejected and fitting with the
low Bera-Jarque value of 1.36.
Reviewing the residuals provides method to understand the shape of a
distribution. The curve of the histograms under the formal Bera-Jarque test is
representative of the frequency distribution and is consistent with that of a
leptokurtic rather than a normal distribution. For GLD, SLV, IAU, DGL and DBS
the distribution is shown to be high peaked and steep, so the distributions are
not perfectly normally distributed. The curve for AGQ is conspicuous given is
has a bell shaped curve more consistent with a normal distribution, although
the number of observations in the series is somewhat fewer than those of other
funds.
The existence of non-normality in the GLD and SLV series seems apparent; it
establishes that GLD and SLV ETF prices behave like actual gold and silver
prices giving some assurance around the funds ability to offer an abnormal
return and consistency with prior studies into the distribution characteristics of
the physical assets. The null is likewise rejected for funds IAU, DGL, DBS, but
not for AGQ, although it seems logical that as the AGQ series extends from its
current short base it will experience non-normality as the other funds have
done.
4.4 Analysis and Results - Hypothesis b) Risk Adjusted Returns, CAPM
It is not enough to purely suggest abnormal returns are present in gold and
without also considering the return within a risk return framework. The Capital
Asset Pricing Model (CAPM) is the chosen tool with which to assess and explain
the relationship as to whether GLD and SLV maintain the ability to offer
investors an above market (expected) return relative to the risk of investing in
the funds. The CAPM is a multivariate linear regression using least squares and
~ 15 ~
when performed in this analysis uses the logarithmic return series for GLD and
SLV, both these series and the series of other funds used in the analysis are
represented by (Ri).
Figure 5 about here
The results for GLD are shown in Figure 5. The slope coefficient, which is the
beta coefficient at 0.6095 is less than one, suggesting that the security moves
less than the overall market and is not volatile, it is rather defensive in nature
and suggests GLD would have some risk reducing properties in a portfolio of
stocks. The intercept term at -0.0091 is not significantly different from zero.
This means there is no significant risk adjusted abnormal return for GLD for the
time period.
When applying sub-period analysis to GLD, the horizon of 1 May 2006 through
1 May 2007 are analyzed to assess whether in its relatively immature state if
GLD demonstrated greater inefficiency. The beta coefficient over horizon is
0.5721, marginally less than the beta over the three years showing that GLD
was slightly less risky over this period. The intercept is -0.0211, meaning that
GLD did not offer a risk adjusted abnormal return for the period.
The horizon 1 March 2008 to 1 March 2009 is also tested, this will identify
whether the rapidly increasing systemic risk produced an abnormal return if
investing in GLD during the period of the financial crisis. The beta coefficient for
the period showed a remarkable difference from the three years or one year
immaturity period being 0.0491, meaning that GLD was highly defensive when
market high risk was greatest.
This conclusion is similar to Aggarwal & Soenen (1988), though, it is important
to point out that the asset, GLD in our case, has appreciated significantly during
the period used in the analysis offering investors a strongly positive return on a
simple passive or buy and hold strategy.
Figure 6 about here
The results SLV are in Figure 6 and are similar to those of GLD, with a low beta
at 0.06503 and intercept at -0.0084. The same conclusions can be drawn that
the return does obviously display the characteristics of inefficiencies when risk is
~ 16 ~
considered. The same quantity of daily logarithmic observations, 922 is used as
that used in the GLD CAPM regression.
The SLV beta coefficient is less than one, evidence that the asset is not
particularly volatile and is therefore risk reducing in nature within a portfolio.
This would allow the investor to purchase other more risky assets with a higher
beta and still maintain an acceptable level of chosen risk by including SLV (or
GLD) to help balance the basket of assets. The model has a minor negative
intercept coefficient at -0.0084 signifying that investors into SLV actually earned
a rate less than the market requires. The same is true of course for GLD, and
like GLD, SLV has experienced notable appreciation in underlying asset values
during the GFC although not to the same extent.
When applying sub-period analysis to SLV, the horizon of 1 May 2006 through
1 May 2007 are analyzed to assess whether in its relatively immature state if
SLV exhibited greater inefficiency. The beta coefficient over horizon is 0.4732,
noticeably lower than the beta over the three years showing that SLV was a less
risky investment over this period. The intercept is -0.0084, meaning that SLV
did not offer a risk adjusted abnormal return for the period.
The horizon 1 March 2008 to 1 March 2009 is also tested, this will identify
whether the significant increase in systemic risk produced an abnormal return if
investing in SLV during this horizon. The beta coefficient for the period was
much lower than the three years or one year immaturity period being 0.2844
and suggesting that SLV represented a low risk when market risk was greatest.
What is unexpected when comparing the regression results across all horizons
tested is just how close the SLV intercepts and slope values are to GLD,
especially given the differences in the supply and demand characteristics of the
underlying assets. Another point of difference between the funds is the
enormous disparity between market capitalizations, with GLD $35 billion USD
larger.
As with GLD the standard errors under the SLV CAPM regression offer certainty
around the model as they are reasonably small at .0277. R2 at 0.2791 implies
the model fits the data slightly better than that for GLD and that it explains the
return variability. Nonetheless the generic outcome of the CAPM analysis is that
market inefficiencies are not evident on a risk adjusted basis.
~ 17 ~
Applying the CAPM to IAU, DGL, DBS and AGQ offers a generally consistent
result to that obtained when completing the CAPM regression for GLD and SLV,
one of low market inefficiency.
The supporting regressions to some extent offer an additional element of sub-
period analysis given the shorter listing time frames available for gold backed
DGL and silver back DBS and AGQ. The shorter timeframes for these funds can
be identified by the lesser amount of daily observations in the regressions at
389, 750 and 267 respectively. In terms of actual time horizon these
observations reflect periods starting January 2007, June 2008, December 2008
and ending at 31 December 2009. The data in these sets is likewise converted
in to a logarithm for consistency with the method applied to the GLD and SLV
CAPM.
The beta coefficient of the IAU, DGL, DBS and AGQ beginning at lower end is
in the range of 0.6053 applicable to the larger fund by capitalization IAU, and
0.6625 for the smallest fund by capitalization DBS. The intercept is in the range
of -.0092 for IAU and .0024 for AGQ. With the exception of the AGQ intercept,
the results are not overtly dissimilar from that produced in the regressions for
the GLD and SLV. What is unusual is the very low beta applicable to gold fund
DGL. At 0.1143 it is noticeably lower than that of the other funds and evidence
of extraordinarily low relative volatility, in terms of market cap the fund is much
smaller than GLD only $159.03 million USD.
The intercept is slightly positive on AGQ, whereas in all other tests the
intercept is negative. This indicates AGQ investors earned slightly more than the
risk associated with the investment would require whereas investors in the
other funds earned slightly less. The value may be impacted by the fact that the
fund is relatively immature.
In summary application of the CAPM regression to gold and silver ETF’s GLD,
SLV, IAU DGL, DBS, essentially gave the same outcome as that drawn by
Aggarwal & Soenen’s investigation into the physical asset, that the there is no
indication of market inefficiency when adjusted for risk. In real terms this is
interpreted as non existence of abnormal returns, meaning the null hypothesis
of efficiency is accepted. What is interesting is that as investors have placed
increasing value on investment in gold and silver which is reflected in the
demand, the lack of a risk adjusted return is significant, it suggests investors
~ 18 ~
are willing to accept a return at a level below what the market would expect as
they seek to access the risk reducing properties.
4.5 Analysis and Results - Hypothesis b) Risk Adjusted Returns,
Heteroskedasticity
We then tested for heteroskedasticity or whether the variance of the error
terms is constant. Solt & Swanson (1988) discovered the presence of
heteroskedasticity in physical gold and silver series.
Detecting heteroskedasticity though is usually not easy and often a matter of
speculation (Gujarati & Porter, 2003). The price movements shown in Graphs 1
and 2 show strong movements over time, particularly for GLD, that suggests
the series may actually be affected by homoskedasticity, that the errors are
constant, the antithesis of heteroskedasticity. Graphical analysis though is not a
sufficient test hence the application of Whites (1980) test as will be
demonstrated. The use of Whites test differs from the approach taken by Solt &
Swanson (1981) who employed the Goldfield-Quandt test.
As price/return variability is expected to be a factor throughout the past three
years due to the economic climate and may affect the observations and create
outliers, in order to perform the test analysis logarithms are taken of each
series. Solt & Swanson experienced similar variability with the steeped change
in gold and silver pricing throughout the 1970’s and also used logarithms.
Outlying observations are a frequent driver of heteroskedasticity.
Whites (1980) test does not rely on the normality assumption and is a
conservative test. It is expressed as the following auxiliary regression;
Whites (1980) Test: (8)
viiiU XXXXXX iiiii
326
2
5
2
2433221
2
3
Disproving heteroskedasticity improves the validity of the analysis by ensuring
the OLS estimates give unbiased and consistent coefficient estimates (Brooks,
2002). However, as prior testing of physical gold and silver series behavior by
Solt & Swanson encountered the presence of a non-constant finite variance, it is
reasonable to expect the same phenomenon may occur in the ETF series given
the equivalent underlying properties of the asset.
~ 19 ~
Figure 7 about here
Figure 8 about here
The test results derived from the application of Whites (1980) test are
illustrated in Figures 7 and 8. Both outputs have a large critical value at 98.1 for
GLD and 32.1 for SLV. The implication of this test output is that it is likely the
two series possess a non-constant finite variance, especially as the given p-
value is zero. Note that heteroskedasticity is still present despite the conversion
of the series in to logarithms. For GLD and SLV the null hypothesis of constant
variance of the error terms is rejected.
IAU also evidences a very high critical value at 98.92 and a zero p-value, this
suggests the presence of heteroskedasticity. The behavior of the series is
essentially the same as that of GLD, which is not unexpected given the
underlying properties, maturity and number of observations that are all
consistent with GLD and that similar test result between the funds has been a
regular trait in the prior analyses. DGL, DBS and AGQ offer far lower critical
values that GLS, SLV and IAU, with the highest of these lower values being AGQ
at 14.73 and lowest being DBS at 1.68. The result is interesting because DBS
has a large number of observations at 750, which although less than GLD, SLV
and IAU, would generally give an expectation that the fund would provide as
similar outcome to those just mentioned yet it strongly suggests the presence
of homoskedasticity, so the null hypothesis of homoskedasticity would be
accepted for DBS.
Meanwhile DGL offers a critical value at 12.61 which would also suggest the
null is accepted for this fund. What is obviously different of the DGL, DBS and
AGQ series compared to GLD, SLV and IAU is that the p-values are above zero.
Particularly, the p-value for DBS which at .4304 relative to the other funds DGL
and AGQ at .0022 and .0006 is quite large. The p-value is important to note
because it indicates that randomness could explain the output whereas there
was not ability for randomness to explain the output for the funds.
The null hypothesis tested in the instance is that the variables have a constant
finite variance (homoskedasticity); the alternate hypothesis is that the variance
is non-constant (heteroskedasticity). Analysis of Whites test allows the null
hypothesis to be accepted for the funds DGL, DBS and AGQ only and rejected
for GLD, SLV and IAU. The funds accepting the null have far fewer observations
~ 20 ~
than those applicable in Whites test for GLD, SLV and IAU, indicating that over
shorter periods the presence of heteroskedasticity is not existent or at least it is
less evident and more difficult to detect.
4.6 Analysis and Results – Hypothesis c) Filtered Trading Rules
Note that when interpreting the results it is important to remember the large
increase in gold prices during the study period (2006-2009) and wider
macroeconomic factors caused by the Global Financial Crisis, which would have
increased systematic risk and induced a flight to gold and to a lesser extent
silver as investors sought to diversify assets. By December 2009 gold had at the
time reached new highs after earlier exceeding the $1,000 per ounce mark that
once appeared unobtainable. The trend is common to that of the gold
appreciation during the 1970’s ‘stagflation’ identified in Solt & Swanson’s (1981)
work. The two periods are both times of increased liquidity in gold.
An investor employing a passive (buy and hold) strategy could have easily
made large returns simply due to the wider factors mentioned. However we wil
test whether there was an opportunity to generate an even greater return
relative by active filter trading or momentum strategies.
Aggarwal & Soenen (1988) found that under Fama & Blume’s (1973) filters
gold performed poorly and at no point did the return from filter trading exceed
what the investor could have obtained via a passive strategy. Solt & Swanson
(1988) found that on a gross basis applying filters to gold and silver likewise
provides a poor result and that the investor would have been better to employ
the buy and hold strategy. Solt & Swanson note that the result did not differ
when managing for transaction costs and concluded there appeared to be no
inefficiency to be exploited. Comparatively, Fama & Blume (1973) used 24
different filters finding that filters were inferior in almost all cases even when
ignoring transaction costs.
We tested market inefficiency through filter trading at 3% and 5% levels.
Given that a filter of 7%, 10% or up to 50% used by the other authors seems
extreme for a daily series these will not be tested. It would be unusual for a
stock to move 7%, 10% or 50% in one day. It has happened of course but the
infrequency means the filter is unsuitable for a practical test of a daily series.
Even over a month a movement 10% would equate to substantial market
~ 21 ~
movement, although Aggarwal & Soenen apply this filter to their monthly series
as does Solt & Swanson to their weekly series.
Undoubtedly during the period under consideration the affects GFC have
caused large downward and upward movements in values, nonetheless the
study stresses that these are less important when based on a daily series and
considers 3% and 5% more appropriate. The filters are applied to both upward
and downward movements. The series is converted to logarithms excluding
transaction costs which are ignored in the analysis. Note that as the funds are
based on capital growth a divided is not applicable in the analysis.
Table 3 about here
The results of filter trading applied to GLD and SLV are shown in Table 5. The
start point is a ‘sell’ for both funds. The results show for GLD that at the 3%
and 5% level filtered trading offers a better return than that applicable to buy
and hold or passive investing. The same result is shown when filters are applied
to SLV.
Out results differ from prior studies as filters marginally exceed the buy and
hold strategy. The filtered return implies that an investor can generate a return
greater than that available by having no trading strategy. The result is unique
as it differs from that applicable to physical gold and silver returns when filters
are implemented and is of value to investors seeking to trade the gold and
silver ETF market.
5 Conclusion
Study has examined the series of three gold-backed and three silver-backed
ETFs and tested their return performance against existing knowledge of physical
gold and silver behavior. The primary focus of the study was the largest gold
and silver funds by AUM being GLD and SLV with an extension of the tests
applied to IAU, DGL, DBS and AGQ.
The tests evidenced a remarkable consistency between the return properties of
the ETFs and the physical assets of gold and silver identified in prior studies.
The log of GLD and SLV is shown to be consistent with a stationary, white noise,
distribution, or that the market for GLD and SLV are inefficient and that prices
do not reflect all available information. Non-normality was also proven for all
funds except AGQ although some caution should be exercised given that the
~ 22 ~
funds are still maturing meaning inefficiency is not unexpected. This information
is important to investors as it shows that an abnormal return can be achieved at
today.
CAPM analysis revealed an abnormal return does not exist when considered in
a risk-return framework. For investors this means that when risk is factored
that cannot be assured to exceed the market’s risk adjusted performance.
Application of filter trading rules suggested a marginal above market return is
possibly at the 3% and 5% filter level, implying that an investor using a trading
strategy has a small opportunity to outperform a passive investor.
In summary the same fundamental behavior applicable to physical gold and
silver returns also applies to GLD and SLV exchange traded fund prices/returns.
Specifically, their price movements do not follow a random walk. However, we
show that such inefficiency which was not exploitable on physical gold and silver
in the past now provides an opportunity for abnormal returns through a simple
filter trading rule. This finding is especially important in lights of David
Swensen’s Yale model. Gold and silver EFTS may serve as the other asset class
which is particularly worthwhile in the portfolio as they are inefficient (and thus
profitable) and liquid, the best of both worlds. Our finding is also valuable to
investors as the markets interest in gold and silver ETFs has made the metals
tangible substitutes to other more traditional securities and is unlikely to abate
in the near future. As the ETF vehicle and especially interest in gold continues to
grow in an exponential manner undoubtedly more investigation into funds
performance will be of interest.
+++++++++++++++++++++
References
Aggarwal, R., and Soenen, L., (1988). “The nature and efficiency of the gold market”,
Journal of Portfolio Management, 1988, Spring, Vol 14, no.3 page 18 to 21
Anderson, S., Born, J., and Schnusenberg, O., (2010). “Closed-End Funds, Exchange
Traded Funds and Hedge Funds”. Springer: New York.
Birkner, C., and Collins, D., (2008). “Futures: News, Analysis & Strategies for Futures,
Options & Derivatives Traders”; August 2008, Vol. 37 Issue 9, p54-61, 3p
Brooks, C. (2002) “Introductory Econometrics for Finance”. Cambridge Univ. Press
~ 23 ~
Cohn, L., (2010). “It’s Not Just Gold That Glitters”. Kiplinger's Personal Finance;
Apr2010, Vol. 64 Issue 4, p49-49, 1p
De Jong Jr. J., and Rhee, S., (2008). “Abnormal returns with momentum/contrarian
strategies using exchange traded funds”. Journal of Asset Management; Dec2008, Vol.
9 Issue 4, p289-299, 11p
Delcoure, N., and Zhong, M., (2007). “On the premiums of iShares”. Journal of
Empirical Finance 14: 168-195.
Engle, R., and Sarkar, D., (2006). “Premiums-discounts and Exchange Traded Funds”,
Journal of Derivatives, 13, 26-47.
Fama, E., (1970). “Efficient Capital Markets: A Review of Theory and Empirical Work”.
Journal of Finance, May 1970, pp. 383-417
Fama, E., and Blume, M., (1973). “Filter Rules and Stock Market Trading”. Journal of
Business, April 1973, pp 226-261.
Fuhr, D., (2010). “Global ETF Assets Breakthrough $1 Trillion Milestone” Available from
http://uk.ishares.com/publish/repository/in_the_media/en/downloads/uk_press_71.pd
f accessed on April 2 2010, BlackRock Inc.
Hegde, S., and McDermott, J., (2004). “The Market Liquidity of Diamonds, Q’s and their
Underlying Stocks”. Journal of Banking and Finance, 23, 1043-1067.
Kelleher, E., (2010). “Routes to lower management charges”, Financial Times, February
26 2010.
Jarecki, H., (2007) “Commodities create the right mix”, Futures Magazine, February
2007
Miffre, J. (2007). “Country-specific ETFs: An efficient approach to global assets
allocation”. Journal of Asset Management 8(2): 112-122.
Murdock, M., Richie, N., (2008). “The United States Oil Fund as a hedging instrument”,
Journal of Asset Management, Volume 9, Number 5, December 2008 , pp. 333-
346(14) Palgrave Macmillan
Murphy, Rob and Wright, Colbrin, (2010) An Empirical Investigation of the Performance
of Commodity-Based Leveraged ETFs Available at SSRN: http://ssrn.com/abstract
=1668614
Nichols, J., (2010). “Silver underperforming”, Coin World, 8/2/10, Vol. 51 Issue 2600,
p38-38
Richie, N., and Madura, J., (2007). “Impact of the QQQ on Liquidity and Risk of the
Underlying Stocks”. Quarterly Review of Economics and Finance, 47, 411-421.
Solt, M., and Swanson, P., (1981). “On the Efficiency of the Markets for Gold and
Silver”, Journal of Business, vol. 54 no. 3 p453-478
~ 24 ~
Appendix
Tables:
Table 1: Commodity ETFs Used in the Data Set
Table 1 lists the selected funds and the benchmark SPY by; ticker, name, funds family,
investment asset (either gold or silver), total net assets (USD), inception and the exchange the
fund is traded on (note the funds are often not limited to trading on one exchange).
Summary Table: Commodity ETFs Used in the Data Set
Ticker Name Fund Family Investment
Asset
Total
Net Assets
USD
Inception Date
Exchange
GLD
SPDR
Gold Shares
SPDRs Gold $40.5bn Nov
2004 NYSE Arca
IAU Comex Gold Trust
iShares Gold $2.75bn Jan
2005
NYSE
Arca
DGL DB Gold PowerShares Gold $159.03
m Jan
2007 NYSE Arca
SLV Silver Trust
iShares Silver $5.22bn Apr
2006 NYSE Arca
DBS DB
Silver PowerShares Silver $62.31m
June 2008
NYSE Arca
AGQ Ultra
Silver ProShares Silver
$171.23m
Dec 2008
NYSE Arca
Benchmark Equity ETF
SPY SPDR
S&P 500 SPDRs Equities $77.82bn 1993
NYSE Arca
(NYSE Arca, 2010)
Graph 1: GLD Return vs. Gold Return
GLD Return vs. Gold Return
0
50
100
150
200
250
300
19/1
1/2
004
19/0
2/2
005
19/0
5/2
005
19/0
8/2
005
19/1
1/2
005
19/0
2/2
006
19/0
5/2
006
19/0
8/2
006
19/1
1/2
006
19/0
2/2
007
19/0
5/2
007
19/0
8/2
007
19/1
1/2
007
19/0
2/2
008
19/0
5/2
008
19/0
8/2
008
19/1
1/2
008
19/0
2/2
009
19/0
5/2
009
19/0
8/2
009
19/1
1/2
009
GLD Return Gold Return
~ 25 ~
Graph 2: SLV Return vs. Silver Return
Graph 3:
Graph 4:
Figure 1: GLD Correlogram
SLV Return vs. Silver Return
0
50
100
150
200
250
300
19/1
1/2
004
19/0
2/2
005
19/0
5/2
005
19/0
8/2
005
19/1
1/2
005
19/0
2/2
006
19/0
5/2
006
19/0
8/2
006
19/1
1/2
006
19/0
2/2
007
19/0
5/2
007
19/0
8/2
007
19/1
1/2
007
19/0
2/2
008
19/0
5/2
008
19/0
8/2
008
19/1
1/2
008
19/0
2/2
009
19/0
5/2
009
19/0
8/2
009
19/1
1/2
009
SLV Return Silver Return
GLD Return vs. The Market and SPY
0
50
100
150
200
250
300
19/1
1/2
004
19/0
2/2
005
19/0
5/2
005
19/0
8/2
005
19/1
1/2
005
19/0
2/2
006
19/0
5/2
006
19/0
8/2
006
19/1
1/2
006
19/0
2/2
007
19/0
5/2
007
19/0
8/2
007
19/1
1/2
007
19/0
2/2
008
19/0
5/2
008
19/0
8/2
008
19/1
1/2
008
19/0
2/2
009
19/0
5/2
009
19/0
8/2
009
19/1
1/2
009
GLD Return Market Return SPY Return
SLV Return vs. The Market and SPY
0
20
40
60
80
100
120
140
160
26/0
4/2
006
26/0
6/2
006
26/0
8/2
006
26/1
0/2
006
26/1
2/2
006
26/0
2/2
007
26/0
4/2
007
26/0
6/2
007
26/0
8/2
007
26/1
0/2
007
26/1
2/2
007
26/0
2/2
008
26/0
4/2
008
26/0
6/2
008
26/0
8/2
008
26/1
0/2
008
26/1
2/2
008
26/0
2/2
009
26/0
4/2
009
26/0
6/2
009
26/0
8/2
009
26/1
0/2
009
26/1
2/2
009
SLV Return Market Return SPY Return
~ 26 ~
Figure 2: SLV Correlogram
Figure 3: GLD ADF Test
Figure 4: SLV ADF Test
~ 27 ~
Table 2: Summary Statistics
Figure 5: GLD CAPM
~ 28 ~
Figure 6: SLV CAPM
Figure 7: IAU CAPM
Figure 8: DGL CAPM
~ 29 ~
Figure 9: DBS CAPM
Figure 10: AGQ CAPM
Figure 7: Whites Test GLD
~ 30 ~
Figure 8: Whites Test SLV
Figure 13: IAU
~ 31 ~
Figure 14: DGL
Figure 15: DBS Figure 16: AGQ
~ 32 ~
Histogram 1: RGLD Histogram 2: RSLV
Histogram 3: RIAU Histogram 4: RDGL
~ 33 ~
Histogram 5: RDBS Histogram 6: RAGQ
Table 3: Filtered Trading Results
Buy and Hold Vs. Filtered Trading Average Return
~ 34 ~
Fund Period
3% Filter 5% Filter
Buy and
Hold Filtered
Buy and
Hold Filtered
GLD
1 May 2006 –
31 Dec 20009 -0.000118 0.001642 -0.000118 0.000864
SLV 1 May 2006 –
31 Dec 20009 0.0000 0.006430 0.0000 0.005630