
A Study of Inefficiency and Arbitrage Opportunity: An empirical analysis of the fixed and demand-based betting market

Author: Felix Turton

Advisor: Indradeep Ghosh


Abstract

This paper examines the fixed and demand-based betting market for the English Premier League top scorer. The investigation uses a predictive model that aims to pin down the non-random element of odds movements across time. With this model the Efficient Market Hypothesis (EMH) is tested and rejected. Given the information in the predictive model, the paper concludes with an explication of what an optimal betting strategy would consist of.

Introduction

Betting markets are ubiquitous in the UK and exist for practically every major public event for which the outcome is uncertain. They range from the classic horse racing market to newly developed markets where one can even place a bet on the outcome of a political event. Such markets share, to varying degrees, similarities with financial markets and can be modeled as such. Our particular interest, however, concerns the betting market for top scorer (TSM) in the premier English soccer league, a major subset of the football betting market.

The analysis in this paper is divided as follows. Section one reviews the football betting market. Section two explains both the reason for and the implications of treating football markets as financial ones. Section three briefly discusses previous literature concerning football markets and accounts for relevant distinctions between TSM and other football betting markets. Section four reviews the data set. Section five develops the model and displays the results. Section six suggests a method for making systematic positive returns. Section seven concludes the paper.

I) Football Betting Markets

Background on the Classic Market: Football betting markets, a subset of the overall UK betting market, take many different forms. In aggregate, according to Kuypers [1], the yearly turnover of this entire tax-free market is £300m.

[1] Kuypers (1353)

The classic market (and, I suspect, the first) is based on the outcome of a particular game. In such a market, for any bet there are two parties: the bookmaker (BM), who offers the bet, and the punter, who takes the bet. Prior to the game, the BM sets odds whose ratios reflect the relative likelihood of each outcome. For any game there are only three possibilities: home win, home loss or tie. One is allowed to place a bet on one or more of the possible outcomes at some n to m odds. The payout, conditional upon a correct prediction, can be easily calculated as follows. Let s be the stake (amount bet) and let r be the payout:

$$r = \frac{n}{m} \cdot s + s$$

This simply means that upon winning, the punter receives his initial stake plus the stake scaled by the return ratio. Probabilities are also easily calculated. The probability that outcome i will occur, based on the odds, is:

$$p_i = \frac{m}{n + m}$$

Assuming that the odds reflect the true probability, it is easy to see that the punter’s expected return is equal to his initial stake, i.e. the punter makes no additional return:

$$ER = \frac{m}{n+m} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{m}{n+m}\right] \cdot (-s) = 0$$
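The payout, implied probability and break-even result above can be checked with a short sketch. The function names are illustrative and do not come from the paper; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def payout(n, m, stake):
    """Total returned to the punter on a winning bet at n-to-m odds."""
    return Fraction(n, m) * stake + stake


def implied_probability(n, m):
    """Probability implied by n-to-m odds when no overround is applied."""
    return Fraction(m, n + m)


def fair_expected_return(n, m, stake):
    """Expected net gain when the implied probability is the true one."""
    p = implied_probability(n, m)
    net_win = Fraction(n, m) * stake   # winnings on top of the returned stake
    net_loss = -stake                  # the stake is lost otherwise
    return p * net_win + (1 - p) * net_loss


# Hypothetical bet: a stake of 10 at 5-to-1.
print(payout(5, 1, 10))
print(implied_probability(5, 1))
print(fair_expected_return(5, 1, 10))
```

For any n-to-m odds the last function returns zero, which is the break-even result stated above.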

However, this calculation of expected return is predicated upon the following assumption:

$$\sum_{i=1}^{n} p_i = p_1 + p_2 + \dots + p_n = 1$$

That is, since there are only n possible outcomes (in this case 3), the outcome space is spanned by n probabilities that sum to 1. But if this equality holds, the bookmaker’s returns will also equal 0 and he will not be able to remain in business. To guarantee systematic positive returns, the BM factors in “Over Roundedness” (OR). In this classic market, the BM sets odds such that the probabilities derived from the odds obey this rule:

$$\sum_{i=1}^{n} p_i = p_1 + p_2 + \dots + p_n > 1$$

The only additional condition for positive profit on behalf of the BM is that equal amounts are placed on each outcome. Consider a numerical example. Suppose, for the sake of simplicity, that the BM offers the following odds for the three possible outcomes of a match and that a dollar is placed on each outcome: (1) 1/1, (2) 1/1, (3) 2/1, with respective probabilities .5, .5, .33. If (1) or (2) occurs, the BM owes $1 but makes $2 from the other two bets offered, making a profit of $1. If (3) occurs the BM owes $2 but makes $2 back from (1) and (2), losing no money. We can now compute the BM’s expected return. However, though the ratio of these probabilities is valid, since they do not sum to 1, they cannot be the true probabilities. To calculate expected return we must linearly scale them down. Let w be the scaling factor:

$$.5w + .5w + .33w = 1$$

$$w \approx .75$$

Using w, expected return can be calculated bet by bet. On each dollar bet the BM keeps $1 if the outcome fails and pays the winnings if it occurs ($1 for (1) and (2), $2 for (3)):

$$ER = \left[(1 - .5 \cdot .75) \cdot 1 + (-1) \cdot .5 \cdot .75\right] + \left[(1 - .5 \cdot .75) \cdot 1 + (-1) \cdot .5 \cdot .75\right] + \left[(1 - .33 \cdot .75) \cdot 1 + (-2) \cdot .33 \cdot .75\right] \approx .75$$

If the BM set odds in such a way, he would make about $0.75 on average on the $3 staked, which agrees with the profits of $1, $1 and $0 described above. An intuitive story tells us that the BM gives his best guess of the value of the fair probabilities [2], scales them up by some OR factor and then derives the odds. Clearly, if the BM set odds so that the probabilities summed to less than one, it would only take a half-awake punter to realize that he can bet on every outcome and make the difference. If the sum of the probabilities equals one then the BM, like the punter, breaks even over time (his returns converge to 0). Seeing that neither possibility is the case, the BM’s profit from OR, called the “Handle”, can be represented as follows:

$$H = \frac{\sum_{i=1}^{n} p_i - 1}{\sum_{i=1}^{n} p_i}$$
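As a rough cross-check of the handle, the three-outcome example (odds 1/1, 1/1 and 2/1 with a dollar on each) can be enumerated outcome by outcome. The sketch below assumes the scaled probabilities w·p are the true ones; the helper names are hypothetical. Under these assumptions the handle is 0.25 and the expected profit is roughly $0.75 on the $3 staked, that is, the total stake times H.

```python
def implied_probabilities(odds):
    """odds is a list of (n, m) pairs; returns m / (n + m) for each pair."""
    return [m / (n + m) for n, m in odds]


def handle(odds):
    """H = (sum(p) - 1) / sum(p) for the odds-implied probabilities."""
    total = sum(implied_probabilities(odds))
    return (total - 1) / total


def bookmaker_expected_profit(odds, stake_per_outcome=1.0):
    """Expected BM profit when the same stake is placed on every outcome.

    The implied probabilities are scaled to sum to one (the factor w in
    the text) and the profit is enumerated outcome by outcome.
    """
    probs = implied_probabilities(odds)
    w = 1 / sum(probs)
    takings = stake_per_outcome * len(odds)
    expected = 0.0
    for (n, m), p in zip(odds, probs):
        paid_out = stake_per_outcome * (n / m + 1)   # winnings plus returned stake
        expected += w * p * (takings - paid_out)
    return expected


example_odds = [(1, 1), (1, 1), (2, 1)]
H = handle(example_odds)
profit = bookmaker_expected_profit(example_odds)
print(H, profit)
```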

From this we now know what the expected value of a punter’s bet in such a market would be:

$$ER = \frac{m}{(1+H)(m+n)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{m}{(1+H)(m+n)}\right] \cdot (-s) < 0$$

More generally, the punter is expected to make, on average, the negative of the handle. As Kuypers notes, the BM’s handle in the classic match-outcome market is “remarkably constant” at near 11.5% with a standard deviation of .34% [3]. Based on the principle of equating marginal returns, it would not be outlandish to suppose that the OR for our TSM is not far from this figure [4].

[2] By fair I mean the odds such that the public does not favor one side of a bet over another.

[3] The Royal Commission on Gambling (1978); see Kuypers.

[4] The reason I cannot compute OR for TSM is that it requires data on all players who can be bet upon. Since my data only includes the top 10 players, OR would be close but not exact.

Top Scorer Market and Modern Betting Markets: In order to begin our analysis we need a description of TSM. TSM is the betting market for the top goal scorer in the English Premier League, the top soccer league in England. The market works in the following way: there is a group of twenty or more players for whom the BMs offer bets. Betting begins a month before the start of the season and ends almost immediately before the last game is complete. As the season moves forward, odds change based on the information given from games, i.e. goals scored by a particular player. When a player scores, odds decrease (e.g. from 5/1 to 3/1) to reflect the increased probability of that player winning the competition. If the player selected by the punter wins, the punter receives the return given by the odds ratio [5].

Many of the properties of the classic match-outcome market apply to TSM and modern markets, but there are a few distinguishing factors necessary to note. Firstly, in TSM bets are paid out at the end of the season. If the player bet on by the punter has the greatest total goals tally at the end of the season, then the punter is paid in the same fashion as above. The difference comes from the fact that one can place a bet at any point during the season, so the betting spans a much greater period than the days leading up to a particular game. The interesting part is that the odds vary after each game is played, thereby incorporating new information. In more generic terms, a change in the odds can be thought of as a change in price. Just as in bond markets, one hopes that the price of a bet increases in the time period after one buys the bet, i.e. the odds shrink in a subsequent time period.

Suppose for example that the punter takes a bet for player i at 9 to 1 odds, with probability (less OR) equal to .1. Suppose further that player i scores the next week and thus increases his chances of winning TS. The BM recognizes that his subjective assessment of the true probability that player i will ultimately win has increased. The BM then finds the odds that match this probability and factors in some OR. Forgetting OR, suppose he now sets odds at 4 to 1, probability .2. Looking at the punter’s initial bet, he still keeps the return ratio given by the odds (9 to 1) but gets the new, better informed probability. No longer is his expected value equal to his initial stake; rather, it is now greater. In this numerical example, less OR, his expected return would be .2 × 9s − .8 × s = s, i.e. 100% of his stake. We can generalize this as follows. Let θ ∈ ℝ be the change in the odds from t, when the bet was placed, to t+1, the next time period. New odds and probabilities (with OR) will look like this:

$$O_i^{t+1} = \frac{n}{m} + \theta, \qquad p_i^{t+1} = \frac{m}{(1+H)\,(m(1+\theta)+n)}$$

[5] I am describing the classic TSM market where the BM only allows bets that “back” the given player to win. As we will see, TSM has expanded to include exchange markets where the punter can bet on the opposite side as well, that the player will not win.

When we compute expected return, instead of deriving the probabilities from the odds in t, when the bet is placed, the probability used is derived from the odds in t+1. The more accurate expected return equation uses the odds bought in t and the probability in t+1:

$$ER = \frac{m}{(1+H)(m(1+\theta)+n)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{m}{(1+H)(m(1+\theta)+n)}\right] \cdot (-s)$$

Where ER > 0 if:

$$\theta < \frac{-H(n+m)}{m(H+1)}$$

Where ER > −H if:

$$\theta < 0$$

If we differentiate ER w.r.t. θ we get the marginal impact of an increase in the odds change on the expected return:

$$\frac{\partial ER}{\partial \theta} = -\frac{m(m+n)s}{(1+H)(m+n+\theta m)^2}$$

Simply put, the smaller the value of 𝜃, the greater the punter’s expected return. Conditional on betting that a player will win TSM, the more the odds shrink in the next period, the greater the expected return. The idea behind looking at expected return this way is that the probability derived from the odds is not the true probability. The true probability is derived from the odds for the same bet in the next period. It is crucial that one understands this concept as all subsequent analysis revolves around it. As will be explained more fully later, the only way the punter can systematically make above average returns is if 𝜃 has a non-random, and therefore predictable, component. If 𝜃 is entirely random then the punter’s expected return will equal the negative of the handle.
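The role of θ can be illustrated numerically. The sketch below codes the expected return, the break-even value of θ and the derivative exactly as written above; the inputs (5-to-1 odds, unit stake, H = 0.115 taken from the Kuypers figure) are assumptions chosen for illustration only. Evaluated at θ = 0, the formula gives −sH/(1+H), which is close to −H·s when H is small.

```python
def expected_return(n, m, stake, H, theta):
    """Punter's expected return on a back bet bought at n-to-m odds,
    valued with the probability implied by next-period odds n/m + theta."""
    p_next = m / ((1 + H) * (m * (1 + theta) + n))
    return p_next * (n / m) * stake + (1 - p_next) * (-stake)


def breakeven_theta(n, m, H):
    """Value of theta at which the expected return is exactly zero."""
    return -H * (n + m) / (m * (H + 1))


def marginal_effect(n, m, stake, H, theta):
    """Closed-form derivative of the expected return with respect to theta."""
    return -m * (m + n) * stake / ((1 + H) * (m + n + theta * m) ** 2)


# Illustrative inputs, not data from the paper.
n, m, stake, H = 5, 1, 1.0, 0.115
theta_star = breakeven_theta(n, m, H)

print(theta_star)
print(expected_return(n, m, stake, H, 0.0))          # no odds change
print(expected_return(n, m, stake, H, theta_star))   # break-even odds change
print(marginal_effect(n, m, stake, H, 0.0))
```

Any θ below `theta_star` makes the expected return positive, and the derivative is negative throughout, matching the statement that smaller θ means a greater expected return.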

Secondly, the above gives us a model of the classic BM who sets the odds and then does not change them until new information is given. Historically, there has been very little, if any, change in the odds set by BMs as a result of demand by the punters. BMs rely on the fact that they are good predictors of how bets will be placed and how outcomes will follow. However, this is only partially true for TSM and modern markets. As recently as 2000 a new type of betting came into effect when Betfair [6], the first online betting exchange, was developed. It spans a wide array of markets and differs in nature from a classic BM. In the classic TSM, the punter can only bet on one side at odds fixed by the BM. You could never bet that a player would not be the TS, only that he would. With the creation of Betfair and Betdaq (2001), punters could both “back” and “lay” a bet. Backing is betting, as above, that player i will be TS, while “laying” is betting on the other side (that the player will not win). If the player wins, the punter who laid the bet must pay his counterpart in the same fashion that the BM pays the person who backed the bet (i.e. s ∗ n/m). If the player does not become TS the laying punter receives s ∗ m. For example, on a 4 to 1 lay bet, if the player loses, the punter gets his initial stake times 1.

[6] Overall weekly turnover is estimated at $100m.

Additionally, in these new exchange markets, there actually is no BM. Odds prices are set purely on the basis of demand, matching backers and layers in the market. This being the case, much like bond markets, there are continual fluctuations in odds prices, even days after a game is played.

So how do the owners of these exchange markets profit? With these markets the handle takes a different form: the difference, d, between the back and lay price. Suppose the punter takes a bet at some odds:

$$\frac{n}{m}$$

In the presence of such a difference, the “true” odds and corresponding probabilities would be [7]:

Back odds:

$$O_i = \frac{n}{m} + \frac{d}{2}, \qquad p_i = \frac{2m}{2n + m(2+d)}$$

Lay odds:

$$O_{!k} = \frac{n}{m} - \frac{d}{2}, \qquad p_{!k} = 1 - \frac{2m}{2n + m(2-d)}$$

Imagine that there exists a true probability for some bet on player i. The betting exchange manager then alters the odds for a backing bet by subtracting some factor (i.e. d/2), making the odds smaller (i.e. with a lower return). For laying, the exchange manager adds the same factor to the odds so as to make the payout larger if the player wins (decreasing the ER for betting against the player). The equations above represent the true odds. To get to the odds offered on a back one simply subtracts d/2, thus yielding n/m, the odds actually offered. The same can be done with a lay by adding d/2.
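The link between offered odds, true odds and the probabilities given above can be sketched as follows. The conversion p = 1/(1 + odds) is assumed from the earlier relation p = m/(n + m); the values n = 4, m = 1 and d = 0.2 are hypothetical.

```python
def true_back_odds(n, m, d):
    """True odds behind a back bet offered at n-to-m: add half the spread."""
    return n / m + d / 2


def true_lay_odds(n, m, d):
    """True odds behind a lay bet offered at n-to-m: subtract half the spread."""
    return n / m - d / 2


def win_probability(odds):
    """Probability that the player wins, implied by decimal-ratio odds."""
    return 1 / (1 + odds)


n, m, d = 4, 1, 0.2
p_back = win_probability(true_back_odds(n, m, d))        # backer's chance of winning
p_lay = 1 - win_probability(true_lay_odds(n, m, d))      # layer's chance of winning

print(p_back, 2 * m / (2 * n + m * (2 + d)))
print(p_lay, 1 - 2 * m / (2 * n + m * (2 - d)))
```

Each printed pair should agree, which is a quick way to confirm that the closed-form probabilities follow from adding or subtracting d/2.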

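The respective expected returns without considering θ would be [8]:

$$ER_i = \frac{2m}{2n+m(2+d)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{2m}{2n+m(2+d)}\right] \cdot (-s) < 0$$

$$ER_{!k} = \frac{2m}{2n+m(2-d)} \cdot \left(-\frac{n}{m}\right) \cdot s + \left[1 - \frac{2m}{2n+m(2-d)}\right] \cdot s < 0$$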

[7] It is important to note I am not representing two sides of the same bet. These are the two sides of two completely different bets; they are not reciprocal bets. Constructing both the back and lay side for the same bet makes the equation longer and might obfuscate the point. The n to m odds are not the same. The subscripts i and k are used to indicate that these are different bets.

[8] The reason the laying bet has a negative sign in front of the odds is that the layer will lose this amount if the player wins.

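The respective expected returns considering θ would be:

$$ER_i = \frac{2m}{2n+m(2(\theta+1)+d)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{2m}{2n+m(2(\theta+1)+d)}\right] \cdot (-s)$$

$$ER_{!k} = \frac{2m}{2n+m(2(\theta+1)-d)} \cdot \left(-\frac{n}{m}\right) \cdot s + \left[1 - \frac{2m}{2n+m(2(\theta+1)-d)}\right] \cdot s$$

Where, for backing, ER > 0 and ER > −d/2 hold, respectively, if:

$$\theta < -\frac{d}{2m}; \qquad \theta < 0$$

Where, for laying, ER > 0 and ER > −d/2 hold, respectively, if:

$$\theta > \frac{d}{2m}; \qquad \theta > 0$$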

The punter, on average, is supposed to lose half the difference between the back and lay price. If we differentiate ER w.r.t. θ we get the marginal impact of an increase in the odds change on the expected return.

For backing:

$$\frac{\partial ER_i}{\partial \theta} = -\frac{4m(n+m)s}{(2n+m(d+2+2\theta))^2}$$

For laying, where an increase in the odds helps the punter, the derivative is positive:

$$\frac{\partial ER_{!i}}{\partial \theta} = \frac{4m(n+m)s}{(-2n+m(d-2-2\theta))^2}$$
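A numerical sketch of the exchange case, coded directly from the expected return equations with θ. The inputs are hypothetical; m = 1 is used so that the threshold d/(2m) equals d/2. The slopes are estimated by a central finite difference rather than taken from the closed forms, so they serve as an independent check on the direction of each effect.

```python
def back_er(n, m, stake, d, theta=0.0):
    """Expected return on backing at n-to-m with spread d and odds change theta."""
    p = 2 * m / (2 * n + m * (2 * (theta + 1) + d))
    return p * (n / m) * stake + (1 - p) * (-stake)


def lay_er(n, m, stake, d, theta=0.0):
    """Expected return on laying at n-to-m with spread d and odds change theta."""
    p = 2 * m / (2 * n + m * (2 * (theta + 1) - d))
    return p * (-(n / m) * stake) + (1 - p) * stake


def slope(f, theta, h=1e-6):
    """Central finite-difference estimate of df/dtheta."""
    return (f(theta + h) - f(theta - h)) / (2 * h)


# Illustrative inputs, not data from the paper.
n, m, s, d = 4, 1, 1.0, 0.2

print(back_er(n, m, s, d), lay_er(n, m, s, d))            # both negative at theta = 0
print(back_er(n, m, s, d, -d / 2), lay_er(n, m, s, d, d / 2))  # break-even points

back_slope = slope(lambda t: back_er(n, m, s, d, t), 0.0)
lay_slope = slope(lambda t: lay_er(n, m, s, d, t), 0.0)
print(back_slope, lay_slope)
```

In this example the back position gains when θ falls and the lay position gains when θ rises.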

We can now summarize the implications of the expected value equations derived from betting with the classic BM and the exchange market. In both cases the expected value is negative. The magnitude of the negativity is determined by the handle for the BM, and by the difference between the back and lay price in the exchange market. Both of these determinants are fixed and completely out of the punter’s control. Both types of organizations decide beforehand the level at which they will set these factors. An educated guess would be that the OR and the difference between back and lay prices are set by a combination of the target profit level and a competitive process. However, though the way this is calculated might be interesting to investigate, it is ultimately not useful for our purposes.

In comparing these two types of betting markets, we find that they are similar in one respect but different in another. The similarity regards θ. If θ is assumed to be random, its mean equals zero. This being the case, θ is not considered when calculating probabilities. That is, it is impossible to predict changes in the odds because they are just as likely to increase as decrease.

For the BM, this implies that if the ratio of the odds on the bets offered across all the players reflects the true relative probabilities, no one bet is better than another. The punter can make average returns by arbitrarily selecting a bet. More formally:

$$ER_1 = ER_2 = ER_3 = \dots = ER_n \quad \forall i$$

The same idea persists for the exchange only with the addition of another dimension: the ability to lay a bet.

$$ER_1 = ER_{!1} = ER_2 = ER_{!2} = \dots = ER_n = ER_{!n} \quad \forall i$$

This tells us that when θ is random it does not matter which player you bet on, or if you back or lay.

The major factor that separates these organizations regards their ability to set market clearing prices. The process in the exchange markets works by attempting to match backers with layers. This process continues until an equilibrium price is reached. This implies that each price at which a trade is made is a market clearing price.

Recall that for the BM, profit is made based on OR. But the condition for making this systematic profit is that equal amounts are placed on all bets. There is no guarantee that the BM will receive equal bets on all players [9] because the BM is setting odds independently of demand. The odds are simply the opinion of the relevant members of the betting organizations and are not necessarily aligned with the views of the public. This means that the BM market is usually not in equilibrium as it does not necessarily offer market clearing prices. To compensate for this potential loss the BM might incur, the handle is set notably higher than half the difference between the back and lay price. This hypothesis is confirmed by the data, as we can reject the null, at 95% confidence, that the mean of the BMs’ odds is not 15% less than the mean of the two exchanges (appendix a).

[9] We can call this an equilibrium. Suppose that there are differential amounts placed on different players. This would mean that the marginal impact (as in an increase in the number of bets) of an increase in the odds of a player with few bets will be less than the marginal impact of an increase in the odds of a player with many bets. This violates the efficient principle of equating marginal returns, i.e. one bet is better than another.

II) Treating TSM as a Financial Market

Comparing TSM to Financial Markets:

The approach of this paper will require that we treat both types of betting markets as financial markets. However, it is important that we note relevant differences and decide whether these differences matter for our purposes. Two of the most important concepts surrounding financial markets are future value and the risk to return relation. Take the US bond market as a paragon of financial markets. In such a market, one buys a bond at some interest rate. The interest rate is simply the return on the bond purchased. The value of that bond moves with the interest rates offered on similar bonds. If, after a bond is bought at some market-wide interest rate, the interest rate decreases, then the value of that bond increases. The value of the bond relative to other bonds being offered is higher. It might then be advantageous for the bond holder to sell the bond at a higher value. The concept of future value ties in because the classic bond trader, seeking to maximize return, buys a bond with the expectation that it will be worth more in the future, when he will sell it off at a premium. This notion of future value is no different in the case of a punter in the TSM. The punter purchases a bet with the hope that the bet will increase in value relative to bets offered in the future. If the odds on the same bet decrease, then as I have shown above, the value of the bet will increase because the relative return is higher. The only exception in TSM is that the punter cannot resell his bet to the bookmaker. Rather, if the odds have shrunk he can hedge his bet by selling it in a betting exchange market. The fact that he cannot sell a bet to the BM does not matter because he can always go to the exchange. We can think of the process of hedging as follows. Let O_t be the odds bought by the punter in time t. Let O_{t+1} be the current odds in the market. Let p_i be the probability derived from t+1 odds.

$$ER = p_i \left[(O_t - O_{t+1}) \cdot s\right] + (1 - p_i)\left[s + (-s)\right]$$

Where ER > 0 if:

$$O_{t+1} - O_t = \theta < 0$$

The term on the far right of the equation represents the fact that upon losing the initial bet the punter loses his initial stake but regains it with the bet he laid off. If we apply what we learned about θ and the difference between the back and lay price we get the following ER equations. For backing and then laying: let s be the stake on the back in time t and let z be the stake on the lay in time t+1.

$$ER = \frac{2m}{2n+m(2(\theta+1)+d)}\left[s\,\frac{n}{m} - z\left(\frac{n}{m}+\theta\right)\right] + \left[1 - \frac{2m}{2n+m(2(\theta+1)+d)}\right][z - s]$$

Where ER > 0 if:

$$\theta < \frac{-ds + dz}{2s}$$

If we differentiate w.r.t. θ we get the marginal impact of a change in θ on the ER of a hedge.

$$\frac{\partial ER}{\partial \theta} = -\frac{2m\,(2(m+n)s + dmz)}{(2n+m(2+d+2\theta))^2}$$

For laying and then backing: let s be the stake on the lay in time t and let z be the stake on the back in time t+1.

$$ER = \frac{2m}{2n+m(2(\theta+1)+d)}\left[z\left(\frac{n}{m}+\theta\right) - s\,\frac{n}{m}\right] + \left[1 - \frac{2m}{2n+m(2(\theta+1)+d)}\right][s - z]$$

Where ER > 0 if:

$$\theta > \frac{-ds + dz}{2s}$$

If we differentiate w.r.t. θ we get the marginal impact of a change in θ on the ER of a hedge.

$$\frac{\partial ER}{\partial \theta} = \frac{2m\,(2(m+n)s + dmz)}{(2n+m(2+d+2\theta))^2}$$
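The two hedge equations can be sketched in code to see where the sign of the expected return flips. The functions below transcribe the formulas above; the stakes, odds and spread are hypothetical values chosen so that the threshold is easy to verify by hand.

```python
def win_prob(n, m, d, theta):
    """Probability used in both hedge equations."""
    return 2 * m / (2 * n + m * (2 * (theta + 1) + d))


def back_then_lay(n, m, s, z, d, theta):
    """Back stake s at n/m in period t, lay stake z at n/m + theta in t+1."""
    p = win_prob(n, m, d, theta)
    if_player_wins = s * (n / m) - z * (n / m + theta)
    if_player_loses = z - s
    return p * if_player_wins + (1 - p) * if_player_loses


def lay_then_back(n, m, s, z, d, theta):
    """Lay stake s at n/m in period t, back stake z at n/m + theta in t+1."""
    p = win_prob(n, m, d, theta)
    if_player_wins = z * (n / m + theta) - s * (n / m)
    if_player_loses = s - z
    return p * if_player_wins + (1 - p) * if_player_loses


def hedge_threshold(s, z, d):
    """Value of theta at which both hedges have zero expected return."""
    return (-d * s + d * z) / (2 * s)


# Illustrative inputs, not data from the paper.
n, m, s, z, d = 9, 1, 1.0, 1.5, 0.2
theta_star = hedge_threshold(s, z, d)

print(theta_star)
print(back_then_lay(n, m, s, z, d, theta_star))   # zero at the threshold
print(back_then_lay(n, m, s, z, d, -1.0))         # odds shrink: back-then-lay gains
print(lay_then_back(n, m, s, z, d, 1.0))          # odds lengthen: lay-then-back gains
```

The two positions are mirror images, so for any θ their expected returns sum to zero.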

The existence of the odds exchange market allows the punter to hedge a bet. The punter can profit from a hedge if the odds in the next period change in a favorable direction: an increase for a lay and a decrease for a back. Profit, therefore, depends on the value of θ. It is immaterial whether θ is positive or negative so long as the punter knows the most likely direction. We can now state more clearly the punter’s objective: to predict the direction and magnitude of θ. Such behavior is indistinguishable from that of a bond or currency speculator.

The second important concept is the risk to return relation. As in financial markets, the riskier an investment is the higher the return on that investment. In an efficient market the risk to return ratio should be equal for all investments. The same is true for TSM. The risk is reflected in the odds (and obviously the return) by the fact that for a higher return the probability of winning is lower. This condition has already been stated in our exchange market:

$$ER_1 = ER_{!1} = ER_2 = ER_{!2} = \dots = ER_i = ER_{!i} \quad \forall i$$

The major difference between a well-behaved financial market and TSM comes from the fact that the classic BM still exists. The BMs, though having decades of practice predicting odds, still cannot ensure that they set market clearing odds. This is the major property that sets the two markets apart. However, this difference is only important if it affects our ability to test the Efficient Market Hypothesis. Another, and quite interesting, difference is that in a financial market like a securities exchange prices do not converge to a true value. Prices on traded assets exist indefinitely. There is no end to the game. Traders continue to buy and sell goods in an attempt to profit from price fluctuations. A good trader is one who can predict what the price will be in some future period, but there is no final period. Since there is no final value, in a sense there is no true price. In TSM, however, there actually is a true price of a bet. Theoretically speaking, it is the odds given by the betting exchange at a period which is infinitely close to the end of the season. It cannot be the exact end of the season because once a player has been crowned TS his probability goes to 1 while all the other players’ probabilities go to zero, which is impossible based on the way probabilities are constructed. This entails that odds are always moving towards their true value. The final difference is something that would never occur in a well-behaved financial market: there is significant odds variation, for the same bet, across organizations. The null hypothesis that the mean coefficient of variation for odds across organizations is less than .28 can be rejected at 95% confidence [10]. The significance of variation holds for the top three players, though it is much less: .18, at 95% confidence (appendix b). This small statistic is a remarkable finding as it would reflect poorly on the competitiveness of such an old and large market.
This means that for any bet one can simply shop around for the best odds, with substantial success. The only reasonable explanation is that betting in England is a tradition rather than a ruthless, calculated endeavor. Punters probably have a favorite BM that they frequent, usually the same one as their father [11]. These betting lounges are like social sanctuaries for the lower-middle class, where punters can smoke and drink with friends during a game. The BMs themselves encourage such an atmosphere by offering cheap drinks and free bets. This fact should not affect our ability to compare TSM to a financial market.

[10] The coefficient of variation is mean/variance. It was computed by taking the average of CV for each player’s odds in a given time period. I then averaged across players for each time period and then took the mean of all the time periods.

[11] I say father because consistent gambling is usually done by men. In fact, it is considered to be a well known fact that BMs purposefully cover the windows and doors of their betting lounges in order to prevent the wives of their customers from seeing in.

The Efficient Market Hypothesis (EMH): EMH is a concept developed in the 1960’s by the UChicago business professor Eugene Fama. The hypothesis asserts that prices in financial markets are “Informationally Efficient”. Prices reflect all the known information in the market and are therefore unbiased. Prices tell us the market players’ implicit aggregate beliefs of future prospects in the market. The implication is that one cannot consistently outperform the average returns of the market by using publicly available information. That is, it is not possible to use analytical tools (e.g. regressions) on present or past information to develop a strategy that beats the average returns of the market. The factors that affect price movements are conditioned by occurrences in the future (e.g. buyouts, bankruptcies, mergers, etc.) which are by definition unknowable. An attempt to predict price changes with various analytical tools will not help because such tools are part of public information. If there existed an analytical apparatus to partially calculate future price changes then it would be used by every market player and average returns would not change. Prices would reflect this knowledge and one would no longer be able to predict them.

To illustrate this idea, one can split future price movements into two categories: predictable and random. Every price change is made up of these two components. If every market player has access to tools that yield the former, then the predictable part of the price would drop out. The predictable part of the price change would be realized before one has a chance to profit from it. For example, imagine there was one trader who knows that the price of oil is likely to drop in the next period (the non-random component), based on some known qualities in the current period. He could then profit by shorting the stock [12]. If the whole market has access to this information then the price at which the stock could be sold, before its decrease, would continue to drop until there is no predictable difference to be made. The only part of the price change left would be the random element. Extending this idea tells us that in an efficient market any regression aimed at predicting future prices, based on known values, would not yield significant coefficients because the only remaining component of the price change is the random one.
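The regression argument can be illustrated with simulated data. The sketch below is not the paper's model: the predictor (`goals`), the sample size and the coefficient −0.5 are all invented for illustration. It fits a one-regressor least-squares line twice, once where the odds change is pure noise and once where it has a predictable component, and compares the t-statistics of the slope.

```python
import random


def ols_slope_and_t(x, y):
    """Slope of a one-regressor OLS fit of y on x, with its t-statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    return slope, slope / std_err


rng = random.Random(0)                               # fixed seed for repeatability
goals = [rng.randint(0, 3) for _ in range(2000)]     # hypothetical known quantity

# Case 1: the odds change is pure noise, unrelated to the predictor.
theta_random = [rng.gauss(0, 1) for _ in goals]
# Case 2: the odds change shrinks with goals scored, plus noise.
theta_predictable = [-0.5 * g + rng.gauss(0, 1) for g in goals]

slope_r, t_r = ols_slope_and_t(goals, theta_random)
slope_p, t_p = ols_slope_and_t(goals, theta_predictable)
print(f"random theta:      slope={slope_r:.3f}, t={t_r:.2f}")
print(f"predictable theta: slope={slope_p:.3f}, t={t_p:.2f}")
```

In the first case the slope should be statistically indistinguishable from zero, which is what an efficient market implies; in the second the slope is large relative to its standard error.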
[12] Shorting is where one borrows stock (to be returned in some subsequent period) in time period t and immediately sells it. When the price drops the trader buys it back at a lower price and returns the stock. The profit of the trader comes from the fact that he can buy the same quantity at a lower price than what he sold it at. His profit is this difference.

In the above section I examined similarities and differences between financial markets and TSM. The differences only matter to the extent that they affect our ability to test EMH. The first difference is that the classic BM does not set market clearing prices. Though interesting, this difference is immaterial for our purposes. The BMs, just like the punters, are attempting to predict future probabilities. The fact that they might not set odds that equate marginal returns does not mean that we cannot test the hypothesis that one can make excess returns. The only difference between the classic BM and the odds exchange is that, in the former, one can only bet against the BM at a price set by the BM. Moreover, one can always hedge a bet placed at the BM by betting on the other side in the exchange. The second difference, the idea of a final value, does not matter either because excess returns can be made in both markets by predicting the odds in the next period. In fact, the existence of an obvious objective means that it is easier to model such odds movements. There are factors which we know have a consistent and significant impact on odds movements, making prediction much easier. Now that we know we can test the EMH, we should define exactly what an excess return would be in TSM. Recall the expected return equations for the respective TS markets. For the classic BM:

$$ER = \frac{m}{(1+H)(m(1+\theta)+n)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{m}{(1+H)(m(1+\theta)+n)}\right] \cdot (-s)$$

For the exchange:

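$$ER_i = \frac{2m}{2n+m(2(\theta+1)+d)} \cdot \frac{n}{m} \cdot s + \left[1 - \frac{2m}{2n+m(2(\theta+1)+d)}\right] \cdot (-s)$$

$$ER_{!k} = \frac{2m}{2n+m(2(\theta+1)-d)} \cdot \left(-\frac{n}{m}\right) \cdot s + \left[1 - \frac{2m}{2n+m(2(\theta+1)-d)}\right] \cdot s$$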

Recall that the BM’s handle and the difference between the back and lay price are exogenously fixed. The expected returns, the conditions for positive profit and the conditions for above-expected profit are respectively given as follows. For the classic BM:

$$ER = -H, \qquad \theta < \frac{-H(n+m)}{m(H+1)}, \qquad \theta < 0$$

For the exchange:

$$ER = -\frac{d}{2}, \qquad \theta_i < -\frac{d}{2m}, \quad \theta_{!i} > \frac{d}{2m}, \qquad \theta_i < 0, \quad \theta_{!i} > 0$$

We can now define what an inefficient and strongly inefficient market would be. An inefficient one would be one where the punter systematically makes above the negative of the handle/difference between back and lay price. A strongly inefficient market would be where the punter could systematically make positive profit.

Notice that the ability to make excess returns depends upon θ. As mentioned above, it does not enter into typical ER equations because it is assumed to be random. If it is not random, i.e. predictable, then an astute punter can systematically make excess returns. This is where the thesis begins to take off. If we can construct an equation that meaningfully models θ’s movements then we can reject the EMH for TSM. That is, if we find that θ’s movements are non-random then TSM is inefficient and systematic abnormal returns can be made.
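The two definitions can be written as a simple decision rule. This is only a sketch: the function name and labels are hypothetical, and a real test would also need to establish that the mean return differs from the benchmark with statistical significance rather than compare point values.

```python
def classify_market(mean_return, expected_loss):
    """Classify a market from a strategy's mean return per unit staked.

    expected_loss is the benchmark loss per unit staked, taken as a
    positive number: H for the bookmaker, d / 2 for the exchange.
    """
    if mean_return > 0:
        return "strongly inefficient"
    if mean_return > -expected_loss:
        return "inefficient"
    return "no evidence of inefficiency"


# Illustrative values with the bookmaker benchmark H = 0.115.
print(classify_market(0.02, 0.115))
print(classify_market(-0.05, 0.115))
print(classify_market(-0.115, 0.115))
```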

III) Previous Literature

There have been a few papers written on the efficiency of the English football betting market. However, they all differ from this paper in at least two important respects: the betting market examined and the purpose of the models. With respect to the former, to date there have been no studies on the TSM. The studies have only focused on the market for game outcomes. This difference is important because the natures of the markets are distinct. As mentioned at the beginning, in the one-game market there is only one odds announcement. This announcement reflects all information prior to the game but is not updated with new information [13]. Since there is no odds variation across time it is impossible to model odds movements. These markets are much further away from financial markets than TSM. There would be no point in predicting odds movements in the next period because the next period is simply the end of the game, where there are no more bets offered. Moreover, there is also no point in predicting how the BM will set odds before the game because one cannot profit from this knowledge.

With respect to the models used in the one-game market, there is no relevant similarity. When the previous studies tested the efficiency of the fixed-odds betting market they did so by constructing models aimed at predicting game outcomes rather than the actual odds themselves. Their goal was to construct a model that could out-predict the BMs by estimating probabilities of various game outcomes. If the odds constructed from these probabilities were less than the odds offered by the BM then it would be optimal to purchase such odds. These economists then retroactively, with little or no success, tested their models on past games. The model in this paper does not aim at predicting future events that determine odds movements. Rather, it goes directly to the heart of the matter and predicts the movements themselves. The thinking behind such a method is that, rather than predicting the outcome of a future event, which by definition is random, it is more useful to predict the reaction to that event based on quantities known in the period before the odds are announced. It is more useful because, from the punter’s perspective (assuming he can bet on both sides), all that matters is the magnitude and direction of the change in the odds.

IV) The Data The data was collected from two online sources: soccerbase.com, where goals data was obtained, and bestodds.com, where odds announcements were obtained. The data ranges from the beginning of the 07/08 season to the present where each team has 5 games remaining, practically the full 07/08 season. There are 2906 complete observations.

So there are two data types: goals and odds. Goals scored is broken down into two dimensions while odds announcements are broken into three. The dimensions for goals are player and time (so it is a panel data set) while the dimensions for odds are player, time and betting organization. The goals data is structured as follows: for each time period the goals are given by a vector which contains the goals scored by each player during that time period. By time period, I simply mean the time after a game is played but before updated odds are announced14. The odds data is structured as follows: after each game in time t odds are announced in t+1 for each player and organization, bar missing data points. There are 10 players and 12 organizations. The players were chosen on the following basis: either they began the season with the shortest odds or they are currently in the set of players with the most goals.

13 This excludes exchange markets in which the odds change during the game. However, all papers written on this subject only deal with the classic BM.

14 There need not actually be any distance between the game played and the odds announcements. The two periods are kept separate for ease of interpretation.


Doing this means that the data set contains all the players with the potential, to varying degrees, of actually winning the TS title. The data set follows these 10 players throughout the season. The 10 players are Emmanuel Adebayor, Cristiano Ronaldo, Fernando Torres, Didier Drogba, Roque Santa Cruz, Benjani, Yakubu, Carlos Tevez, Robbie Keane and Nicolas Anelka. The 12 betting organizations are Bet 365, Bet Chronicle, Bet Daq, Betfair, Bet Fred, Coral, Ladbrokes, Portland, Paddy Power, Sky Bet, Totesport and Victor Chandler. As mentioned, the betting market is broken up into betting exchanges and classic BMs. Every organization except Bet Daq and Betfair is a traditional BM.

There are two important qualifications to note about the data set. Firstly, as the season neared its end, certain players were deemed by the BMs highly unlikely to win, so the BMs stopped reporting odds for them. This means that there are a few discontinuities in the data set. However, Betfair and Bet Daq are online exchanges and, like the stock market, do not have any discontinuities. For density plots of odds observations over time and odds value see appendix c.

The second qualification is that for the first month or so of the season, Bestbetting only displayed odds every other week. This meant that I had to combine two weeks of goals data for the set of odds announcements after the second week. I do not think this is at all problematic because it simply treats one game as 180 minutes instead of 90 minutes. Such a grouping would only reduce the size of the sample, not bias it. It would only serve to make my results less robust, rather than bias them in a flattering direction.

Potential Violations of OLS:

Upon running a panel regression it is reasonable to question whether the assumptions of OLS have been violated, namely through heteroskedasticity, serial correlation and multicollinearity.

To correct for heteroskedasticity I ran robust variance regressions. As heteroskedasticity yields incorrect standard errors, the formula for computing standard errors must be altered. Robust variance regressions solve this problem by employing the correct formula. Moreover, if there is no heteroskedasticity, robust variances will produce approximately the same standard errors as the conventional formula.

Serial correlation can be a problem in any panel data regression so it is necessary to check for it. However, the typical tests given in statistical packages do not work for my entire data set. This is because of the repeated time values that stem from using multiple odds announcements (given by the 12 organizations) for the same player and time period. To fix this problem, I eliminated all but one organization from the data. Once the data was in proper panel form, I ran the Durbin-Watson test for serial correlation. The statistic produced is bounded between 0 and 4, where a value of 2 indicates no apparent serial correlation. Fortunately, the Durbin-Watson statistic was 1.9 for the unrestricted model, implying that serial correlation is not a problem.

To test for multicollinearity, I computed the variance inflation factor for the full model. Its mean was 1.34 so there should be no problems with collinearity (see appendix d for the full table).
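For readers unfamiliar with the serial correlation check, the following is a minimal sketch of how the Durbin-Watson statistic is computed from a series of regression residuals. The residual values are made up for illustration and are not the residuals of the regressions in this paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: the sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals for a single player-organization series.
example_residuals = [0.05, -0.02, 0.01, -0.04, 0.03, -0.01]
dw = durbin_watson(example_residuals)
print("Durbin-Watson statistic:", round(dw, 3))
```

Values near 0 point to positive serial correlation, values near 4 to negative serial correlation, and values near 2 to none.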

V) The Model

The Purpose of the First Equation:

As stated before, the aim of this thesis is to test the EMH. Upon reflection, this meant that we had to determine if θ was predictable. Clearly, θ has a random component because it moves based on events in subsequent time periods. The question at hand is whether θ has a non-random component that can be predicted. Recall that if the market is efficient there can be no information available in the current time period that will explain odds movements in the next.

To further orient this analysis we can define two types of predictability. We can call the first type "Strongly Predictive." A strongly predictive variable is one whose known value in the current time period can be used to explain variation in odds movements in the subsequent time period. That is, simply by knowing the value of a given variable in the period before the game is played one can predict the partial change in odds. This is a clear violation of the EMH. If such knowledge is available, then one can out-perform the market by constructing a betting strategy that optimizes with respect to this variable. More specifically, recall that in an efficient market the ER on every bet is the same. The analog in a financial market is that the risk to return ratio is the same for all financial instruments. If there is a time when this condition does not hold then traders or punters will continue to shift toward the undervalued stock or bet until the risk to return ratio is equal across all stocks or bets. In light of our question, if there exist strongly predictive variables and the values of these variables vary across players, then for any group of players there exists a player whose ER is greater than that of the rest, and so greater than the market average. Such a finding would therefore violate the EMH.

The second type of predictability is "Semi-Strongly Predictive." A variable is semi-strongly predictive if its outcome in a future period is random but the magnitude of its effect on odds, conditional upon it occurring, can be estimated. Furthermore, if a semi-strongly predictive variable has a differential impact over time, then it has an even higher predictive status. Though this type of variable does not damage the EMH to the extent that the former does, it still violates the hypothesis. There are two ways that a semi-strongly predictive variable might violate the EMH: its magnitude relative to the strongly predictive variables and its differential impact over time. With respect to the former, the larger the magnitude of such a variable relative to the non-random variables, the more volatile the bet is. It serves as a gauge of the riskiness of a bet and allows the punter to make a better informed decision. For example, every individual who places a bet in TSM knows that a goal in the next game will decrease odds, but probably no one knows the extent of the decrease. Knowing it gives the punter an informational advantage. Regarding the relative magnitude over time, if the punter knows the impact of a random variable at different time periods then he can optimize with respect to time, a luxury not afforded to the intuition-based punter.

The Equation:

The purpose of this paper's main model is to predict the change in odds from t to t+1. The r.h.s is composed of variables that are both strongly predictive and semi-strongly predictive. From the above discussion, if any coefficient that measures the impact of a strongly predictive variable is significant, then we can outright reject the EMH. That is, if we can use known information to predict the change in odds in the next period then the EMH does not hold. If any of the semi-strongly predictive variables are significant, then their magnitude can help inform the punter how to optimally bet. The model is as follows:

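$$\frac{O_{(t+1)ij} - O_{tij}}{O_{tij}} = \alpha_{tij} + \beta\, G_{ti} + \delta\, G_{tk} + \lambda \left[ \max_{i} \sum_{t=0}^{k} G_{i(t-1)} - \sum_{t=0}^{k} G_{i(t-1)} \right] + \mu\,(t) + \varphi\, O_{(t-1)ij} + \varepsilon_{tij}$$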

The equation yields an OLS regression over three dimensions: time (t), player (i) and betting organization (j). The values in the equation span three time periods: t-1,t and t+1. The value of every variable in t-1 is known as it is based upon data from past matches. Period t is the period when the subsequent game is played. Any variable in t is random because the punter must bet before the match. Period t+1 is the time when the odds are announced. There is actually no distance between t and t+1 because the odds immediately adjust after the game15.

The dependent variable is the percent change in odds from t to t+1 for player i and organization j. Every coefficient on the r.h.s can be written generally as:

$$\frac{\partial \left[ \frac{\partial \ln O}{\partial t} \right]}{\partial x_{i}}$$

or, more simply, as the change in the percent odds change from t to t+1 with respect to each r.h.s variable. α_tij is the constant term. We can group the rest of the r.h.s into our two categories: semi-strongly and strongly predictive.

Regarding the former, β represents the impact of a goal scored by player i in time period t, while δ represents the impact of a goal scored in t by the leading goal scorer as of t-1. The intuition behind including goals of the leader is that before the match he is the most likely to win the award. If he scores in time period t then the chance that player i will win has diminished and therefore the odds will increase (as odds move in the opposite direction to probabilities). Both of these variables are random because they occur after a bet can be placed, so the punter cannot choose their values when betting.

The rest of the r.h.s is made up of variables whose quantities are known by the punter in the period before the game, i.e. in t-1. λ represents the effect of the distance of player i, in goals, from the leading goal scorer k in period t-1. It makes sense to include this variable because the further away player i is from leader k, the less likely he is to win the competition. μ is the coefficient on the time period. As time progresses, it becomes clearer who the most likely winners are. φ is the coefficient on a lag variable equal to the odds in period t-1. The variables attached to λ, μ and φ are strongly predictive, and their significance would entail a rejection of the EMH.

Particular Regressions:
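To make the variable definitions concrete, the sketch below assembles one observation (one player, one organization, one time period) from raw odds and goals. All names and numbers are hypothetical; in particular, taking the leader to be the first player with the highest cumulative goal count as of t-1 is an assumption about how ties would be broken.

```python
def percent_change(odds_next, odds_now):
    """Dependent variable: (O_{t+1} - O_t) / O_t."""
    return (odds_next - odds_now) / odds_now


def distance_to_leader(cum_goals_prev, player):
    """Leader's cumulative goals minus the player's, both measured at t-1."""
    return max(cum_goals_prev.values()) - cum_goals_prev[player]


def build_row(player, t, odds_prev, odds_now, odds_next, cum_goals_prev, goals_t):
    """Assemble the dependent variable and the five regressors for one observation."""
    # Leader k as of t-1; ties go to the first player listed (assumption).
    leader = max(cum_goals_prev, key=cum_goals_prev.get)
    return {
        "y": percent_change(odds_next, odds_now),   # percent change in odds
        "goals_t": goals_t[player],                 # random: player i's goals in t
        "leader_goals_t": goals_t[leader],          # random: leader k's goals in t
        "dist_to_leader": distance_to_leader(cum_goals_prev, player),  # known at t-1
        "time": t,                                  # known in advance
        "lag_odds": odds_prev,                      # known: odds in t-1
    }


# Hypothetical three-player example.
cum_goals_prev = {"A": 10, "B": 7, "C": 4}   # cumulative goals as of t-1
goals_t = {"A": 1, "B": 2, "C": 0}           # goals scored in period t

row = build_row("B", 12, odds_prev=6.0, odds_now=5.0, odds_next=4.0,
                cum_goals_prev=cum_goals_prev, goals_t=goals_t)
print(row)
```

The last three entries are the quantities the punter knows before the match; the two goals entries are only realized afterwards.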

15 In the exchange market odds actually change during the game when a goal is scored. This does not actually matter for our purposes because the only thing that matters is that the odds adjust after the random event occurs. One cannot capitalize on a lag in odds adjustments.


In order to deal with potential non-linearities across different subsections of the data, I ran several regressions within four distinct sets: (1) no odds constraints, (2) odds less than 40, (3) no odds constraints and only exchange markets, and (4) odds less than 40 and only exchange markets. The reasoning behind this is three-fold.

Firstly, I noticed that at very high odds levels, where players don't have a realistic chance of winning, classic BMs don't even bother changing the odds when they should because few will bet on these players. This could skew results because there is no variation in odds where there should be.

Secondly, this thesis has a practical as well as academic bent. The players of interest are the ones capable of winning the TS title. Most trading action revolves around them because they are the players whose odds will shrink in the future. A player is given 100 to 1 odds for a reason; such players are of secondary interest. As it turns out, the R² is higher when the odds are constrained.

Thirdly, for the reasons discussed in section 2, the future of serious gambling is most likely to revolve around the exchange markets. Moreover, since these markets produce prices that are more sensitive to demand than the BM's, price changes can be measured with less noise. Of the 2906 total observations, exchanges comprise only 487, a mere 17% of the data. Surprisingly, however, the R² values on these regressions were on average slightly greater than those on the model with no odds constraint, although the confidence bands around the coefficients are quite wide. For all tables see appendix d.

Given the space constraints of this paper, it is best that only the most general findings are presented in the body; I suggest that the reader examine the appendix for a fuller account. Below is the most generic set of regressions: unconstrained (w.r.t. odds), for both the full set of players and the top three, over three contiguous time segments of the data.

| COEFFICIENT | Unrestricted | t<13 | 12<t<20 | t>19 | Top 3 Players | t<13 and Top 3 Players | 12<t<20 and Top 3 Players | t>19 and Top 3 Players |
|---|---|---|---|---|---|---|---|---|
| Time Period (t) | 0.00759*** | -0.00962 | 0.00514 | 0.0282*** | 0.00183 | -0.0162*** | -0.0139*** | -0.0179** |
| | (0.00187) | (0.00759) | (0.00571) | (0.00947) | (0.00148) | (0.00598) | (0.00506) | (0.00751) |
| Goals (t) | -0.274*** | -0.235*** | -0.238*** | -0.397*** | -0.314*** | -0.180*** | -0.276*** | -0.485*** |
| | (0.0128) | (0.0166) | (0.00949) | (0.04) | (0.0132) | (0.015) | (0.0172) | (0.0244) |
| Goals by Leader (t) | 0.113*** | 0.0404** | 0.0784*** | 0.339*** | 0.116*** | -0.0178 | 0.116*** | 0.305*** |
| | (0.0135) | (0.0163) | (0.00835) | (0.05) | (0.0126) | (0.0134) | (0.0146) | (0.0258) |
| Dist to Leader (t-1) | 0.0339*** | 0.0563*** | 0.0256*** | 0.0402*** | 0.0574*** | 0.0208** | 0.0402*** | 0.0843*** |
| | (0.00494) | (0.0125) | (0.00465) | (0.0087) | (0.0116) | (0.00918) | (0.00797) | (0.021) |
| Lag (t-1) | -0.000650*** | -0.00270*** | -0.00298*** | -0.000894*** | -0.00523*** | -0.00901*** | -0.0360*** | -0.0115*** |
| | (0.0000838) | (0.000701) | (0.000793) | (0.000167) | (0.00147) | (0.00156) | (0.00862) | (0.00416) |
| Constant | -0.0864** | 0.142* | -0.0443 | -0.709** | 0.0238 | 0.287*** | 0.362*** | 0.505*** |
| | (0.0439) | (0.0819) | (0.0841) | (0.28) | (0.0294) | (0.0602) | (0.1) | (0.193) |
| Observations | 2622 | 793 | 788 | 1041 | 990 | 265 | 247 | 478 |
| R-squared | 0.173 | 0.186 | 0.441 | 0.179 | 0.35 | 0.484 | 0.591 | 0.408 |
| Linear Combination of Goals and Leader Goals | -0.16083 | -0.1949871 | -0.1595103 | -0.058382 | -0.1974277 | -0.1978172 | -0.1595024 | -0.1805032 |
| SE | 0.0138593 | 0.01888 | 0.0085589 | 0.0394052 | 0.0112356 | 0.0190326 | 0.0095494 | 0.0261967 |

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Note that the rows labelled "Linear Combination of Goals and Leader Goals" and "SE" report the linear combination of goals for player i and goals for leader k, together with the standard error of that combination. This provides us with a convenient way of grouping the two random variables so that we can see if they have a differential impact over time.
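One simple way to look for a differential impact over time is to build a 95% confidence interval around each linear combination and see whether the intervals for different periods overlap. The sketch below does this for the all-players columns using the estimates and standard errors reported in the table; the normal critical value of 1.96 and the use of non-overlap as the criterion are assumptions of this illustration (non-overlap is a conservative test of difference).

```python
Z_95 = 1.96  # normal critical value for a two-sided 95% interval (assumption)


def confidence_interval(estimate, se, z=Z_95):
    """Return the (lower, upper) bounds of the interval estimate +/- z * se."""
    return (estimate - z * se, estimate + z * se)


def intervals_overlap(a, b):
    """True if the two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Linear combination of Goals and Leader Goals, with its SE, for all players.
all_players = {
    "t<13": (-0.1949871, 0.01888),
    "12<t<20": (-0.1595103, 0.0085589),
    "t>19": (-0.058382, 0.0394052),
}

intervals = {period: confidence_interval(*vals) for period, vals in all_players.items()}

for first, second in [("t<13", "12<t<20"), ("t<13", "t>19"), ("12<t<20", "t>19")]:
    overlap = intervals_overlap(intervals[first], intervals[second])
    print(first, "vs", second, "-> intervals overlap:", overlap)
```

On these figures the first two periods overlap with each other but neither overlaps with the final period.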

Based solely on this regression, we can reject the efficient market hypothesis. We should first observe the significance and direction of the strongly predictive variables.

Beginning with the regression that combines all players: for t<13, the distance to the leader is positive and significant, which implies that each goal that the leader has over player i in t-1 increases odds in t+1 by around 6%. That is, simply by knowing the distance of player i from the TS, the punter can predict the partial change in odds in t+1. The lag is also significant, which means that, with the odds for player i in the previous period, we can compute the partial change in odds in the next period. The constant term's significance is another factor we can use to predict odds. In 12<t<20, we find that distance to the leader inflates odds in the next period by a statistically smaller factor, around 3% as opposed to 6%. The lag is not statistically different from its value in t<13. In the last part of the season (t>19), time period is now significant and positive, meaning that a simple increase in time inflates odds for this group. Distance to the leader rises again (to 4%) because, as the final rounds approach, odds should increase (probabilities decrease) by a greater factor the further player i is from the TS. The lag and the constant are also significant.

When we examine the strongly predictive variables for the top three players we get a slightly different story. Firstly, the R² values are much larger. This confirms my hypothesis that odds movements are more sensitive for players who are most likely to win the prize. The first point of note is that time period becomes significant and negative in t<13 when the players are restricted. Simply through an increase in time period, the odds for these players are expected to shrink by slightly less than 2%. Distance to the leader is also positive and significant at just above 2%, smaller in magnitude than in the full player sample. The lag is negative as in all other regressions but much larger in magnitude in this sample. Notice that in the middle time period all coefficients are significant, implying that one can predict a large portion of the odds movements in the next period simply from the values in the current period. The above information serves as a convincing argument that TSM is inefficient.

We can now investigate the importance of the semi-strongly predictive variables. They are the two random variables on the r.h.s: goals and goals of the leader in time period t. As a way of condensing the information of the two coefficients the above table has a variable which
is their linear combination. If we are to assume nothing about the expected probabilities of these variables in period t, this specification is fine. If goal scoring is random then we can assume that player i and leader k have the same likelihood of scoring in period t. If one has reason to believe otherwise, then it makes more sense to interpret the two coefficients separately. For heuristic purposes we can interpret them as their linear combination.

Recall that every semi-strongly predictive variable can benefit the punter. Such variables can either give a measure of the riskiness of a bet or inform the punter about time sensitivity. We will look into the former in the next section but for now we can focus on the latter. Determining whether this linear combination differs over time can be done by comparing the 95% confidence intervals across the time horizons. Beginning with the all-players regression, we see that there is a large differential impact between periods t<13 and 12<t<20 on the one hand and t>19 on the other. The first two periods have much lower (more negative) values than does the final period. This means that the impact of a goal scored by player i in t>19, relative to a goal from leader k, is much lower than in the first two time horizons.

Interestingly, we find no statistical difference in the top three samples: the relative magnitude of an additional goal for player i, with respect to leader k, is no different across time. This might make sense if we consider that there might be a bias towards the top three players. That is, a greater weight is assigned to a goal from a player in the top three who is not the leader than to a goal from a player in the full set. Conversely, at least in the punter's mind, a goal from the leader has a less damaging impact on a top three player's probability of winning.

Below is one more table containing the coefficients for the unrestricted model and the regressions that constrain the odds at less than 40. In the interest of remaining within the space constraints of this paper, the table only presents the full season and the season divided over two time horizons (t<16 and t>16). I leave it to the reader to more thoroughly examine the coefficients given in the appendix:

| COEFFICIENT | Unrestricted | Odds<40 | t<16 | t<16 and Odds<40 | t>16 | t>16 and Odds<40 |
|---|---|---|---|---|---|---|
| Time Period (t) | 0.00759*** | -0.00194 | -0.00908* | -0.0109* | 0.0350*** | 0.0100*** |
| | (0.00187) | (0.00226) | (0.00524) | (0.006) | (0.00718) | (0.00269) |
| Goals (t) | -0.274*** | -0.252*** | -0.229*** | -0.206*** | -0.332*** | -0.325*** |
| | (0.0128) | (0.00758) | (0.0108) | (0.0101) | (0.0254) | (0.0124) |
| Goals by Leader (t) | 0.113*** | 0.0795*** | 0.0435*** | 0.0454*** | 0.252*** | 0.185*** |
| | (0.0135) | (0.00813) | (0.00997) | (0.0102) | (0.0331) | (0.0142) |
| Dist to Leader (t-1) | 0.0339*** | 0.0388*** | 0.0484*** | 0.0451*** | 0.0391*** | 0.0522*** |
| | (0.00494) | (0.00493) | (0.00807) | (0.0087) | (0.00735) | (0.007) |
| Lag (t-1) | -0.000650*** | -0.00792*** | -0.00288*** | -0.00848*** | -0.000912*** | -0.0122*** |
| | (0.0000838) | (0.00148) | (0.00066) | (0.00171) | (0.000148) | (0.00201) |
| Constant | -0.0864** | 0.132*** | 0.152** | 0.227*** | -0.851*** | -0.208*** |
| | (0.0439) | (0.0499) | (0.0647) | (0.0823) | (0.206) | (0.0669) |
| Observations | 2622 | 1972 | 1124 | 1001 | 1386 | 868 |
| R-squared | 0.173 | 0.309 | 0.231 | 0.247 | 0.18 | 0.465 |
| Linear Combination of Goals and Leader Goals | -0.16083 | -0.1725492 | -0.1855336 | -0.1604207 | -0.0799694 | -0.1402971 |
| SE | 0.0138593 | 0.008405 | 0.0114457 | 0.0099447 | 0.0299345 | 0.0118751 |

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

There are a few things of note here. Firstly, for the full-season regression, time period is positive and significant for the unconstrained model but insignificant for odds less than 40; the two coefficients are also statistically different. Goals by the leader in t are also statistically different, at 95%, such that the coefficient is larger for the unconstrained model. This means that, when odds are constrained to be less than 40, odds increase by a smaller factor as a result of a goal by the leader. The lag and the constant are statistically different, at 95%, such that both coefficients are larger in absolute magnitude for odds less than 40. In the t>16 period, time period has a greater, and statistically different, effect on odds for the unconstrained model. The lags are also statistically different, such that the negative magnitude for odds less than 40 is greater than for the unconstrained model.

The most interesting point of note in these regressions is the coefficient on the linear combination of goals and leader goals. The coefficient on the less than 40 regression has a much greater negative magnitude than on the unconstrained model. This is consistent with the above findings that the same coefficient for the top three players has a larger negative magnitude than for the full sample. In this case, the high magnitude of player i's goals is the force driving the linear combination coefficient.

The Relation to Efficiency and Arbitrage:

Since most of these coefficients are intuitive, one might then claim that this information is obvious and therefore already employed by punters. However, this misses the point. The fact that these coefficients are significant means that not enough punters are betting optimally for above average systematic returns to dry up. Though the next section explains optimal betting more thoroughly, we can already see how a sophisticated punter might decide to bet. Given a set of possible bets, he would choose players who have the characteristics that, based on the above model, maximize the change in odds (in either direction). But if it is possible to maximize this change then the EMH fails. Under the EMH, the ER of all bets is equal: one bet cannot be better than the next because the change is assumed to be entirely random. Moreover, if every punter applied this strategy then the price would immediately be bid up or down by an amount equal to the predicted change and there would be no more predictable change.

To give a simple example, suppose that a punter knows that the odds on a given bet will decrease by 20% in the next period. He would then back the bet with some stake. When the price changes, depending on his risk preference, he either locks in the difference by hedging his bet (betting an amount against the player) or holds the bet thinking it will continue to
decrease. But now suppose half of the market players catch on. Demand for the bet would increase and the odds would begin to shrink because the punters laying the bet realize they can get a higher price (i.e. their payout is smaller). The profit from such information decreases as the bet begins to move in the direction of the predicted change. Further suppose that everyone in the market catches on: the price would quickly converge to the old price plus the downward change. The key is that this will occur before the game even begins. There is nothing left to predict. Therefore, in such a market, any model like the one above will not yield significant coefficients on the quantities known in the previous period. It would be impossible to maximize ER because expected returns would be equal across all bets. Now it should be clear why the above regression shows that TSM is an inefficient market.
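The hedging step in this example can be made concrete with a little arithmetic. The sketch below assumes fractional odds (n to 1), no commission and no difference between the back and lay price, all of which are simplifications; the stake of 10 and starting odds of 5 to 1 are hypothetical.

```python
def hedge_lay_stake(back_stake, back_odds, lay_odds):
    """Lay stake that equalizes profit whether or not the player wins
    (fractional odds, no commission: simplifying assumptions)."""
    return back_stake * (back_odds + 1) / (lay_odds + 1)


def hedged_profits(back_stake, back_odds, lay_odds):
    """Return (lay stake, profit if the player wins, profit if he does not)."""
    lay_stake = hedge_lay_stake(back_stake, back_odds, lay_odds)
    profit_if_win = back_stake * back_odds - lay_stake * lay_odds
    profit_if_lose = lay_stake - back_stake
    return lay_stake, profit_if_win, profit_if_lose


# Hypothetical bet: back 10 units at 5 to 1, then odds fall by 20%.
back_stake = 10.0
back_odds = 5.0
lay_odds = back_odds * (1 - 0.20)

lay_stake, win, lose = hedged_profits(back_stake, back_odds, lay_odds)
print("lay stake:", lay_stake)
print("profit if player wins:", win)
print("profit if player does not win:", lose)
```

Because the profit is the same in both outcomes, the punter has converted a correctly predicted odds movement into a return that no longer depends on who wins the title.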

VI) Optimal Betting

General Method:

The purpose of the above model is to observe and predict tendencies of the market and capitalize on them when possible. The coefficients on the regression, if significant, tell the punter the magnitude of each variable's effect on the change in odds from t to t+1. To actually compute the change in odds one simply plugs in values for each variable. However, including the appendix there are 336 coefficients moving in different directions both within and across time periods. This is a lot of information to synthesize. To simplify matters, this section is dedicated to exploring the general form of a betting strategy rather than finding the optimal bet, an exercise left to the reader.

We can begin by imagining ourselves as a punter with access to this sort of regression equation. For any time period, there will be a set of possible bets offered, either by the exchange or the BM. For each player for whom a bet is offered, there will be easily accessible information on the above variables: distance from the leader, odds in the previous period and, obviously, the time period of the next match. In addition to these values, this punter also has the linear combination of the two random variables. The punter must determine his optimal bet based on the coefficients in the relevant regression and the values of these variables for each player at a particular time. The bet placed could be in either direction. The punter must synthesize the information and decide both which player(s) to bet on and in what direction. We can call the best bet at the time a "relative optimal". Moreover, it might actually be the case that the punter ought not to bet in the given time period because the highest possible change in odds does not make up for the BM's handle or the exchange's difference between the back and lay price, something we will explore shortly.

This concern aside, we can represent the punter's choice graphically by making each betting parameter (coefficient) a line on a graph where the slope is the coefficient value. For illustrative purposes we can use the unconstrained regression for the full season:

[Figure: coefficient lines for the unconstrained full-season regression. Legend: Time (dark blue); Dist to Leader (red); Lag (yellow); Linear Comb (light blue); Constant (green)]

The x-axis represents the value of each variable. The predicted odds change for each variable is the distance from the line to the x-axis. The total predicted change from t to t+1 can be found by summing all the distances, at the relevant x-value for each variable, from the lines to the x-axis. The fact that the linear combination of player i's goals and leader k's goals is a random variable means that this element of the decision to back or lay must be left to the discretion of the punter. Below are the graphs of the significant coefficients for t<13 and t>19 for both the top three players and all players:

[Figure. Legend: Time (dark blue); Dist to Leader (red); Lag (yellow); Linear Comb (light blue); Constant (green)]

[Figure. Legend: Time (dark blue); Dist to Leader (red); Lag (yellow); Linear Comb (light blue); Constant (green)]

[Figure. Legend: Time (not significant); Dist to Leader (dark blue); Lag (red); Linear Comb (green); Constant (yellow)]

[Figure. Legend: Time (dark blue); Dist to Leader (red); Lag (yellow); Linear Comb (light blue); Constant (green)]
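The summation that the graphs depict can also be written out directly. The sketch below multiplies each known variable by its coefficient from the "Unrestricted" column of the first regression table and adds the pieces together; the player's values (period 20, three goals behind the leader, odds of 10 in t-1) are hypothetical, and the two random goal variables are deliberately left out, as they are not known when the bet is placed.

```python
# Coefficients copied from the "Unrestricted" column of the regression table.
COEF = {
    "const": -0.0864,
    "time": 0.00759,
    "dist_to_leader": 0.0339,
    "lag_odds": -0.000650,
}


def contributions(time, dist_to_leader, lag_odds):
    """Per-variable predicted change: coefficient times the variable's value."""
    return {
        "const": COEF["const"],
        "time": COEF["time"] * time,
        "dist_to_leader": COEF["dist_to_leader"] * dist_to_leader,
        "lag_odds": COEF["lag_odds"] * lag_odds,
    }


# Hypothetical player (assumed values, not taken from the data set).
parts = contributions(time=20, dist_to_leader=3, lag_odds=10.0)
total = sum(parts.values())

for name, value in parts.items():
    print(f"{name:>15}: {value:+.4f}")
print(f"{'total':>15}: {total:+.4f}")
```

Each entry corresponds to one line's distance from the x-axis, and the total is the predicted proportional change in odds before any goals are taken into account.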

A More Specific Condition for Betting:

The idea that an inefficient market is one where the punter can partially predict the change in odds from t to t+1 is implied by a more general condition: the ability to make systematic above average returns. In section 2 we defined two terms: "Inefficient" and "Strongly Inefficient." Taking the exchange market as an example, the two conditions can be represented as follows:

Inefficient:

$$\theta_{i} < 0, \qquad \theta_{\neg i} > 0$$

Strongly Inefficient:

$$\theta_{i} < -\frac{d}{2m}, \qquad \theta_{\neg i} > \frac{d}{2m}$$

Recall that it does not matter which direction the change goes in, just that the punter can predict it. Clearly, if a market is strongly inefficient it is inefficient. We can now show directly that this market is strongly inefficient. For simplicity, suppose m=1 (e.g. 4 to 1 odds). To make systematic profit the only condition is that the change is greater in absolute value than half the difference between the back and lay price. The punter can directly estimate this from the above model. Assuming randomness with regard to goals and goals of the leader, he simply computes the expected change based on the values of the strongly predictive variables of a player at a particular time. If this change is greater in absolute value than half the difference between the back and lay price then his expected profit is greater than zero. By joining the ER equations and the regression model it is easy to compute expected profit: he would just obtain the predicted change from the regression model and plug it into the ER equation. I leave it to the reader to use the section 2 and 4 expected value equations and the regression model, with realistic odds and strongly predictive variable values, to find that many bets can make well above zero profit.
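The betting condition for the exchange can be summarized as a simple decision rule: back when the predicted change is below minus d/(2m), lay when it is above d/(2m), and otherwise stay out. The sketch below is one hedged reading of that rule; the function name and the example values of the predicted change and of d are hypothetical.

```python
def exchange_decision(predicted_theta, d, m=1.0):
    """Decide whether to back, lay or abstain, given a predicted proportional
    change in odds and the exchange's back-lay difference d."""
    threshold = d / (2 * m)
    if predicted_theta < -threshold:
        return "back"    # odds expected to shrink by more than the threshold
    if predicted_theta > threshold:
        return "lay"     # odds expected to grow by more than the threshold
    return "no bet"      # predicted change does not cover the back-lay difference


# Hypothetical predicted changes with a back-lay difference of 0.2 and m = 1.
for theta in (-0.15, 0.05, 0.16):
    print(theta, "->", exchange_decision(theta, d=0.2))
```

The middle case illustrates the earlier remark that the punter sometimes ought not to bet at all in a given period.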

VII) Conclusion

To begin concluding the paper, I would like to say something about the model used. Though it produced healthy R² values, it should be clear from the variation in coefficients across time that the movement across time is not linear. If it is possible to obtain a sufficient amount of data, it would be interesting to run regressions that smooth over the time periods. Segmenting is helpful but only catches a glimpse of the true movements in the coefficients. It would also be interesting to see what results a semi-parametric approach might yield. As there has been no previous literature on this topic, investigating this issue more thoroughly might be worthwhile.

If these results are valid, we are presented with a major arbitrage opportunity. Furthermore, this analysis allows one to compute expected returns from t to t+1, that is, from one week to the next. Imagine the compound effect of multiple bets over an entire season of 38 games.

One might wonder why this market is still inefficient. The only explanation I see fit is that football gambling in England is not a calculated affair. Rather, it is partially a cultural practice of the lower-middle class. In many cases, and perhaps sadly, an individual’s identity is inextricably linked with the team he supports. This being the case, punters are more inclined to take an unfavorable bet if it supports a player on their team and to avoid an optimal bet if it supports a player on an opposing team. Though the main objective is to make money, their decisions are biased by many factors. In fact, even the punters who bet without attachment use intuition as their primary guide. It is safe to say they do not run regressions. The competition in this market is not composed of mathematics majors from MIT.

The key, and most interesting, feature of the method I have employed is that it does not require the punter to have any football knowledge. Unlike the previous literature on game-outcome betting, there is no need to predict what actually happens in the game. Profit comes from predicting the reaction to the game, where bias can be exploited. Moreover, TSM is a multi-million dollar market, so large profits can be made without significantly moving prices. The optimal punter simply observes, waits and selects. As an economics student, I find it difficult to understand exactly why excess returns in this market have not yet dried up. Perhaps it is simply a question of when, something I myself will not idly wait for.

Dedications and Acknowledgments

I would like to thank:

Indradeep Ghosh for his sagacious guidance

And

Rachel Newman for her help in data entry

And

On a more general level, Linda Bell for her time and invaluable support

Appendix: a

All Players

Time Period Mean Odds Difference

0 19.58

1 .

2 15.63726

3 .

4 25.59103

5 23.35173

6 8.106295

7 10.92629

8 3.009239

9 0.0744648

10 3.067053

11 3.572309

12 4.86599

13 2.12934

14 1.893603

15 0.4305353

16 0.2713604

17 1.841724

18 2.121897

19 3.004137

20 10.95394

21 9.461369

22 4.072369

23 10.91062

24 10.2901

25 16.48055

26 10.2901

27 30.18926

28 61.3932

29 199.2108

30 140.3409

31 140.7383

32 80.48117

33 77.95572

Appendix: b

Variation in Odds: All Players (columns 2 and 3) and Top 3 Players (columns 5 and 6)

Time Period; Coefficient of Variation (All Players); Skewness (All Players); Time Period; Coefficient of Variation (Top 3 Players); Skewness (Top 3 Players)

0 0.2192431 -0.150461778 0 0.195968567 -0.899703633

1 0.297443867 0.252651089 1 0.281332767 0.4209092

2 0.30397507 0.030302433 2 0.1861329 -0.79340185

3 0.212277578 -0.12841455 3 0.249166467 -0.508897967

4 0.34709815 -0.12484113 4 0.1972987 0.0715265

5 0.29118093 -0.16956438 5 0.1978117 -0.341365533

6 0.23389769 0.16136006 6 0.1347797 0.074514333

7 0.31941829 0.6460868 7 0.215566333 0.385027967

8 0.24139456 0.24904791 8 0.187595233 -0.033034533

9 0.22192148 -0.34973846 9 0.123313467 -1.248892833

10 0.26367213 0.19197613 10 0.168135233 -0.126470033

11 0.24840682 0.08518743 11 0.153753733 0.306497933

12 0.25035077 0.31781629 12 0.1304784 0.048318067

13 0.22045728 0.42881593 13 0.1279584 0.4450068

14 0.21660959 0.43235035 14 0.182740467 1.2165101

15 0.18338488 0.12421871 15 0.1259289 -0.278300633

16 0.20256392 0.20780432 16 0.132124633 0.117058967

17 0.22240909 0.12178414 17 0.1538747 0.2666871

18 0.24403812 0.63069816 18 0.185111433 0.6227062

19 0.29533976 0.53445646 19 0.151482833 0.276962233

20 0.3339152 0.499084313 20 0.200864 0.168641133

21 0.34871166 0.52830877 21 0.194766267 0.3497804

22 0.44645165 0.52086258 22 0.369905267 0.18194

23 0.41186568 0.5148849 23 0.286604567 0.943665633

24 0.38752855 0.34830669 24 0.1732339 0.203732933

25 0.34367969 0.27984607 25 0.166282333 0.28243

26 0.30252394 0.04347404 26 0.185652467 0.127457567

27 0.24035502 -0.16344246 27 0.104935367 0.8602084

28 0.54381011 0.93791097 28 0.2718022 1.418642433

29 0.57496767 0.03256645 29 0.199678933 -0.120448367

30 0.47931532 -0.24929883 30 0.2007715 -0.1385138

31 0.38116251 -0.27282967 31 0.224283167 0.267449233

32 0.273803883 -0.40521685 32 0.320554267 -0.053748333

33 0.90626747 1.67948667 33 0.309186733 0.5499966
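For readers who want to reproduce statistics of this kind, the sketch below shows one common way to compute a coefficient of variation and a skewness measure for a set of odds. The sample values are hypothetical, and the choice of estimators (sample standard deviation for the coefficient of variation, the moment ratio m3 / m2^1.5 for skewness) is an assumption; the exact estimators and the way odds were grouped within each time period are not stated in this appendix.

```python
import math


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean


def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical odds, not taken from the data set.
symmetric_odds = [4.0, 6.0, 8.0, 10.0, 12.0]
long_shot_odds = [2.0, 3.0, 4.0, 5.0, 30.0]

print(coefficient_of_variation(symmetric_odds), skewness(symmetric_odds))
print(coefficient_of_variation(long_shot_odds), skewness(long_shot_odds))
```

A symmetric set of odds gives zero skewness, while a single long shot pulls the measure positive, which is the pattern the late time periods in the table suggest.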

Appendix: c

[Figure: Odds Distribution Over Time Periods. Kernel density plot; x-axis: Time Period (0 to 40); y-axis: Observation Density; kernel = epanechnikov, bandwidth = 1.55]

[Figure: Odds Distribution by Value. Kernel density plot; x-axis: Odds Values (0 to 1000); y-axis: Observation Density; kernel = epanechnikov, bandwidth = 4.48]

Appendix: d

Restricted Regression: Odds<40

Columns: (1) Odds<40; (2) t<16 and Odds<40; (3) t>16 and Odds<40; (4) t<13 and Odds<40; (5) 12<t<20 and Odds<40; (6) t>19 and Odds<40; (7) Top 3 Players and Odds<40; (8) t<16 and Top 3 Players and Odds<40; (9) t>16 and Top 3 Players and Odds<40; (10) t<13 and Top 3 Players and Odds<40; (11) 12<t<20 and Top 3 Players and Odds<40; (12) t>20 and Top 3 Players and Odds<40

COEFFICIENT (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Time Period_t -0.00194 -0.0109* 0.0100*** -0.0169* 0.000198 -0.0035 0.00197 -0.00918** 0.00414 -0.0164*** -0.0139*** -0.0118***
(0.00226) (0.006) (0.00269) (0.00942) (0.00464) (0.00386) (0.00145) (0.00442) (0.00302) (0.00622) (0.00506) (0.00449)
Goals_t -0.252*** -0.206*** -0.325*** -0.202*** -0.228*** -0.415*** -0.306*** -0.160*** -0.420*** -0.172*** -0.276*** -0.469***
(0.00758) (0.0101) (0.0124) (0.0159) (0.00913) (0.0177) (0.0121) (0.0129) (0.0164) (0.015) (0.0172) (0.0213)
Goals by Leader_t 0.0795*** 0.0454*** 0.185*** 0.0497*** 0.0699*** 0.268*** 0.117*** 0.00576 0.255*** -0.0198 0.116*** 0.312***
(0.00813) (0.0102) (0.0142) (0.0174) (0.00764) (0.0218) (0.0126) (0.0101) (0.0192) (0.0132) (0.0146) (0.0256)
Dist to Leader_{t-1} 0.0388*** 0.0451*** 0.0522*** 0.0505*** 0.0242*** 0.0616*** 0.0487*** 0.0243*** 0.0554*** 0.0252*** 0.0402*** 0.0683***
(0.00493) (0.0087) (0.007) (0.0133) (0.00453) (0.00853) (0.00691) (0.00712) (0.00977) (0.00934) (0.00797) (0.0117)
Lag_{t-1} -0.00792*** -0.00848*** -0.0122*** -0.00972*** -0.00373*** -0.0182*** -0.00558*** -0.0113*** -0.00833*** -0.0114*** -0.0360*** -0.0105***
(0.00148) (0.00171) (0.00201) (0.00226) (0.00109) (0.00248) (0.00155) (0.00198) (0.00284) (0.00202) (0.00862) (0.0033)
Constant 0.132*** 0.227*** -0.208*** 0.278** 0.0364 0.16 0.0265 0.218*** -0.0648 0.299*** 0.362*** 0.348***
(0.0499) (0.0823) (0.0669) (0.115) (0.0746) (0.102) (0.0298) (0.0566) (0.074) (0.064) (0.1) (0.115)
Observations 1972 1001 868 676 704 592 979 364 580 259 247 473
R-squared 0.309 0.247 0.465 0.207 0.517 0.523 0.454 0.451 0.55 0.5 0.591 0.574
Linear Combination of Goals and Leader Goals -0.1725492 -0.1604207 -0.1402971 -0.152372 -0.1580304 -0.1471269 -0.1887157 -0.1537732 -0.1647063 -0.1914174 -0.1595024 -0.1572479
SE 0.008405 0.0099447 0.0118751 0.0164251 0.0076319 0.016702 0.0089302 0.0121473 0.0140762 0.0190665 0.0095494 0.0181857

Standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1

No Restrictions

Columns: (1) Unrestricted; (2) t<16; (3) t>16; (4) t<13; (5) 12<t<20; (6) t>19; (7) Top 3 Players; (8) t<16 and Top 3 Players; (9) t>16 and Top 3 Players; (10) t<13 and Top 3 Players; (11) 12<t<20 and Top 3 Players; (12) t>19 and Top 3 Players

COEFFICIENT (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Time Period_t 0.00759*** -0.00908* 0.0350*** -0.00962 0.00514 0.0282*** 0.00183 -0.00852** 0.00114 -0.0162*** -0.0139*** -0.0179**
(0.00187) (0.00524) (0.00718) (0.00759) (0.00571) (0.00947) (0.00148) (0.00418) (0.00417) (0.00598) (0.00506) (0.00751)
Goals_t -0.274*** -0.229*** -0.332*** -0.235*** -0.238*** -0.397*** -0.314*** -0.168*** -0.429*** -0.180*** -0.276*** -0.485***
(0.0128) (0.0108) (0.0254) (0.0166) (0.00949) (0.04) (0.0132) (0.013) (0.0172) (0.015) (0.0172) (0.0244)
Goals by Leader_t 0.113*** 0.0435*** 0.252*** 0.0404** 0.0784*** 0.339*** 0.116*** 0.0095 0.246*** -0.0178 0.116*** 0.305***
(0.0135) (0.00997) (0.0331) (0.0163) (0.00835) (0.05) (0.0126) (0.0102) (0.0205) (0.0134) (0.0146) (0.0258)
Dist to Leader_{t-1} 0.0339*** 0.0484*** 0.0391*** 0.0563*** 0.0256*** 0.0402*** 0.0574*** 0.0204*** 0.0662*** 0.0208** 0.0402*** 0.0843***
(0.00494) (0.00807) (0.00735) (0.0125) (0.00465) (0.0087) (0.0116) (0.00702) (0.016) (0.00918) (0.00797) (0.021)
Lag_{t-1} -0.000650*** -0.00288*** -0.000912*** -0.00270*** -0.00298*** -0.000894*** -0.00523*** -0.00870*** -0.00788** -0.00901*** -0.0360*** -0.0115***
(0.0000838) (0.00066) (0.000148) (0.000701) (0.000793) (0.000167) (0.00147) (0.0015) (0.00313) (0.00156) (0.00862) (0.00416)
Constant -0.0864** 0.152** -0.851*** 0.142* -0.0443 -0.709** 0.0238 0.200*** 0.00838 0.287*** 0.362*** 0.505***
(0.0439) (0.0647) (0.206) (0.0819) (0.0841) (0.28) (0.0294) (0.0525) (0.102) (0.0602) (0.1) (0.193)
Observations 2622 1124 1386 793 788 1041 990 370 585 265 247 478
R-squared 0.173 0.231 0.18 0.186 0.441 0.179 0.35 0.436 0.392 0.484 0.591 0.408
Linear Combination of Goals and Leader Goals -0.16083 -0.1855336 -0.0799694 -0.1949871 -0.1595103 -0.058382 -0.1974277 -0.1823649 -0.1584209 -0.1978172 -0.1595024 -0.1805032
SE 0.0138593 0.0114457 0.0299345 0.01888 0.0085589 0.0394052 0.0112356 0.019631 0.0121547 0.0190326 0.0095494 0.0261967

Robust standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1
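The "Linear Combination of Goals and Leader Goals" row appears to be the sum of the Goals and Goals by Leader coefficients: in the first (unrestricted) column the two rounded coefficients add up to the reported value to within rounding. The sketch below recomputes that sum and forms the ratio of the reported combination to its reported SE. The helper names are hypothetical, and reading that ratio as an approximately normal test statistic is an assumption of this sketch, not something the table states.

```python
def linear_combination(goals_coef, leader_goals_coef):
    """Sum of the two goal coefficients (assumed definition of the row)."""
    return goals_coef + leader_goals_coef


def ratio_to_se(estimate, standard_error):
    """Estimate divided by its standard error."""
    return estimate / standard_error


# Unrestricted column of the "No Restrictions" table.
goals_coef = -0.274
leader_goals_coef = 0.113
reported_combination = -0.16083
reported_se = 0.0138593

recomputed = linear_combination(goals_coef, leader_goals_coef)
ratio = ratio_to_se(reported_combination, reported_se)

print(f"recomputed sum:      {recomputed:.5f}")
print(f"reported value:      {reported_combination:.5f}")
print(f"estimate / SE ratio: {ratio:.2f}")
```

The recomputed sum differs from the reported value only in the fourth decimal place, which is what one would expect when the inputs are rounded to three decimals.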

No Odds Constraint: Betting Exchanges

Columns: (1) Unrestricted; (2) t<16; (3) t>16; (4) t<13; (5) 12<t<20; (6) t>19; (7) Top 3 Players; (8) t<16 and Top 3 Players; (9) t>16 and Top 3 Players; (10) t<13 and Top 3 Players; (11) 12<t<20 and Top 3 Players; (12) t>19 and Top 3 Players

COEFFICIENT (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Time Period_t 0.00808** 0.000235 0.0298*** 0.0125 0.00165 0.0208* 0.00106 -0.00955 0.00191 -0.0227 -0.00562 -0.0117
(0.00312) (0.00808) (0.009) (0.0149) (0.0169) (0.0117) (0.00379) (0.015) (0.00687) (0.0254) (0.0115) (0.0104)
Goals_t -0.266*** -0.215*** -0.331*** -0.213*** -0.235*** -0.393*** -0.306*** -0.139*** -0.413*** -0.151*** -0.292*** -0.451***
(0.0224) (0.0224) (0.0401) (0.0322) (0.0226) (0.0591) (0.0274) (0.0356) (0.0309) (0.0486) (0.0386) (0.0406)
Goals by Leader_t 0.103*** -0.0128 0.263*** -0.0328 0.0647*** 0.353*** 0.101*** -0.0333 0.238*** -0.0495 0.110*** 0.283***
(0.025) (0.0215) (0.0521) (0.0323) (0.0213) (0.0808) (0.0306) (0.0206) (0.044) (0.0357) (0.0315) (0.0612)
Dist to Leader_{t-1} 0.0265*** 0.0434*** 0.0299*** 0.0437** 0.0281** 0.0278** 0.0549*** 0.0303* 0.0552** 0.0305 0.0335** 0.0709**
(0.00782) (0.0131) (0.0101) (0.0194) (0.0125) (0.0112) (0.0162) (0.0175) (0.0239) (0.0268) (0.016) (0.0281)
Lag_{t-1} -0.000455*** -0.00172*** -0.000636*** -0.00121*** -0.00278 -0.000577*** -0.00459 -0.0120** -0.00376 -0.0126** -0.0299* -0.00735
(0.000107) (0.000427) (0.000162) (0.000425) (0.00172) (0.000173) (0.00301) (0.00484) (0.00667) (0.00515) (0.0149) (0.00737)
Constant -0.0824 0.0944 -0.714*** -0.00946 0.0128 -0.505 0.0418 0.256 -0.0183 0.4 0.237 0.333
(0.07) (0.0851) (0.252) (0.136) (0.27) (0.333) (0.0828) (0.196) (0.168) (0.266) (0.218) (0.263)
Observations 448 176 252 118 138 192 162 54 102 36 42 84
R-squared 0.245 0.356 0.283 0.299 0.47 0.294 0.466 0.466 0.569 0.495 0.661 0.581
Linear Combination of Goals and Leader Goals -0.1630552 -0.2278335 -0.067477 -0.2455389 -0.169952 -0.0400348 -0.2045896 -0.1721978 -0.1748612 -0.2001623 -0.1817425 -0.1683053
SE 0.024626 0.0311372 0.0451265 0.0475133 0.0181465 0.0615625 0.0226078 0.0368466 0.0341774 0.0689806 0.0205935 0.0437722

Robust standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1

Betting Exchanges: Odds<40

Columns: (1) Odds<40; (2) t<16 and Odds<40; (3) t>16 and Odds<40; (4) t<13 and Odds<40; (5) 12<t<20 and Odds<40; (6) t>19 and Odds<40; (7) Top 3 Players and Odds<40; (8) t<16 and Top 3 Players and Odds<40; (9) t>16 and Top 3 Players and Odds<40; (10) t<13 and Top 3 Players and Odds<40; (11) 12<t<20 and Top 3 Players and Odds<40; (12) t>20 and Top 3 Players and Odds<40

COEFFICIENT (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
Time Period_t 0.000973 0.00138 0.00949 -0.00526 -0.00894 0.00132 0.000827 -0.00955 0.00244 -0.0227 -0.00562 -0.0115
(0.00314) (0.00819) (0.0066) (0.0124) (0.0171) (0.00977) (0.00391) (0.015) (0.00706) (0.0254) (0.0115) (0.0109)
Goals_t -0.237*** -0.189*** -0.306*** -0.169*** -0.225*** -0.368*** -0.303*** -0.139*** -0.408*** -0.151*** -0.292*** -0.446***
(0.0162) (0.0209) (0.0257) (0.0321) (0.0191) (0.0371) (0.028) (0.0356) (0.0314) (0.0486) (0.0386) (0.0412)
Goals by Leader_t 0.0539*** -0.000479 0.160*** 0.00221 0.0517*** 0.221*** 0.101*** -0.0333 0.241*** -0.0495 0.110*** 0.286***
(0.0147) (0.0197) (0.0325) (0.0284) (0.0169) (0.0558) (0.0308) (0.0206) (0.044) (0.0357) (0.0315) (0.0616)
Dist to Leader_{t-1} 0.0302*** 0.0364*** 0.0307* 0.0247 0.0356** 0.0402* 0.0543*** 0.0303* 0.0590** 0.0305 0.0335** 0.0749***
(0.00763) (0.013) (0.0156) (0.0174) (0.0147) (0.0209) (0.0164) (0.0175) (0.0237) (0.0268) (0.016) (0.0281)
Lag_{t-1} -0.00473*** -0.00565*** -0.00507 -0.00751*** -0.00376* -0.0118 -0.00525 -0.0120** -0.00703 -0.0126** -0.0299* -0.0107
(0.00166) (0.00157) (0.00469) (0.00192) (0.00198) (0.00782) (0.00325) (0.00484) (0.00681) (0.00515) (0.0149) (0.0078)
Constant 0.0739 0.103 -0.18 0.207 0.173 0.0442 0.0485 0.256 -0.0335 0.4 0.237 0.325
(0.0598) (0.0922) (0.163) (0.142) (0.268) (0.25) (0.0859) (0.196) (0.173) (0.266) (0.218) (0.274)
Observations 321 155 147 98 124 99 160 54 100 36 42 82
R-squared 0.408 0.458 0.446 0.407 0.535 0.452 0.46 0.466 0.566 0.495 0.661 0.579
Linear Combination of Goals and Leader Goals -0.1833856 . -0.1458805 -0.1665027 -0.1729686 -0.1467636 -0.2023057 -0.1721978 -0.167358 -0.2001623 -0.1817425 -0.1596819
SE 0.0165884 . 0.0238385 0.0326331 0.0177759 0.0457883 0.0226047 0.0368466 0.0351154 0.0689806 0.0205935 0.0455986

Robust standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1

Works Cited

Kuypers, Tim. Information and Efficiency: an empirical study of a fixed odds betting market. University College London, UK: Journal of Applied Economics, 2000.