Need For Speed?
Low Latency Trading and Adverse Selection
Albert J. Menkveld and Marius A. Zoican ∗
April 22, 2013
- very preliminary and incomplete. comments and suggestions are welcome -
First version: 1 February 2013
∗ Both authors are affiliated with VU University Amsterdam, the Tinbergen Institute and Duisenberg School of Finance. Address: FEWEB, De Boelelaan 1105, 1081 HV Amsterdam, Netherlands. Albert Menkveld can be contacted at [email protected]. Marius Zoican can be contacted at [email protected]. The authors would like to extend their gratitude to EMCF for data sponsoring this project. We have greatly benefited from discussions on this research with Istvan Barra, Alejandro Bernales, Peter Hoffmann, Olga Lebedeva, Emiliano Pagnotta and Bart Yueshen Zhou. Albert Menkveld gratefully acknowledges NWO for a VIDI grant. Marius Zoican additionally thanks the participants from the Tinbergen Institute and DSF Brown Bag Seminars for insightful comments.
Abstract
This paper investigates the impact of market-wide low latency trading technologies on
informational asymmetries between traders. It develops a model with two types of high-frequency
traders: market makers (HFT-M) and ”bandits” (HFT-B), who profit by trading on stale quotes.
The HFTs endogenously decide on information acquisition. Competitive HFT market-makers
face higher adverse selection risks when latency drops, as the conditional probability of a liquidity
motivated trade decreases. In equilibrium, they will charge higher spreads to compensate for
the additional risk. Lower market latencies also provide incentives for market makers to gather
information. Informed market makers reduce expected losses on stale quotes while still attracting
liquidity demand from uninformed traders. We find empirical support for the model implications:
the adverse selection component of the bid-ask spread increases by 15% (0.35 bps) on NASDAQ
OMX after introducing low-latency technology. The effect is stronger in more volatile securities.
Keywords: market microstructure, trading speed, information asymmetry, high frequency
trading
JEL Codes: G11, G12, G14
1 Introduction
Market venues have invested considerable effort and financial resources in recent years to improve
the speed with which traders can submit limit or market orders. Pagnotta and Philippon (2011)
list the most important low-latency investments around the globe from 2008 to 2012: for example,
in 2009, NYSE reduced its latency from 150 to 5 milliseconds. Similar investments were
undertaken by stock exchanges in Tokyo, Singapore, London or Johannesburg. As of October 2012,
the fastest trading system in the world belongs to ALGO Technologies, with a round trip latency of
16 microseconds. The trading world already faces the lower bound of the speed of light as potentially
binding1.
Hasbrouck and Saar (2010) define trading latency as ”the time it takes to observe a market event
(e.g., a new bid price in the limit order book), through the time it takes to analyze this event and send
an order to the exchange that responds to the event”. This paper is primarily focused on changes in
the latter component of market latency, which depends on the trading platform’s technology rather
than the individual traders’ algorithms. Hence, in this paper, we define ”market latency” as the
time it takes for an order to reach the market and for confirmation to be received back by the trader. For the
high frequency traders, who usually co-locate their computers with the venue’s servers, the latency
is already very low (a few milliseconds), and any improvement in market latency is likely to have a
large impact on trading times. On the other hand, for lower frequency traders, such a drop would
likely have little effect on their market access times.
The main research focus of this paper is analysing the effects of low-latency trading on the
information asymmetries across traders, adverse selection risks, spreads and ultimately social welfare.
Foucault (2012) argues that ”the jury is still out” on this issue, as the ultimate effect of low-latency
trading on adverse selection depends on HFT specialisation. In low-latency environments, high-
frequency market-makers are able to quickly update the quotes before better informed speculators
can trade against them. On the other hand, high-frequency speculators symmetrically improve
their ability to reach the market faster, thus being equally able to act on information as the market
maker.
The net effect on information asymmetries is non-trivial as the HFT sector does not specialise
only in passive or speculative strategies. The Securities and Exchange Commission identifies various
strategies of high frequency traders, including passive market-making, directional trading, arbitraging,
and structural trading - taking advantage of market frictions (”vulnerabilities”) such as stale
quotes2. Hagstromer and Norden (2013) find supporting evidence of order type specialisation among
1 See How Low Can You Go?, HFT Review, April 2010.
2 See SEC Concept Release on Equity Market Structure.
high frequency traders on the Swedish markets; Baron, Brogaard, and Kirilenko (2012) additionally
show that these strategies are persistent through time.
In our model, we capture the complexity of the environment by assuming high frequency traders
engage both in market-making activities and in speculating on short-term trends. We allow
the fast market maker to endogenously decide whether to become informed. In this respect, the
paper extends the implications of Foucault, Hombert, and Rosu (2012) who investigate latency
effects when market-makers are both slow and uninformed. We ask what would happen to spreads,
information acquisition and welfare if the market suddenly offered a new technology that improves the
HFTs’ reaction time even further - regardless of their strategy.
We build a limit order market model with costly monitoring, endogenous information acquisition
and deterministic market latency. We extend Foucault, Roell, and Sandas (2003) by allowing for
trader heterogeneity (between HFT and non-HFT) in the time necessary to reach the market and
then studying the equilibrium as we widen the gap between the two categories’ market response
times.
In the competitive equilibrium, HFT market makers earn zero rents and face a larger conditional
probability of meeting an informed trader - hence a higher adverse selection risk. The additional risk
is partly compensated by larger spreads, and partly by market makers withdrawing quotes. The
unconditional probability of a liquidity trader (assumed to be a market taker) realising a trade
falls at higher market speeds, as the market is increasingly dominated by high-frequency traders.
This result is robust to changing the competition on the market between the dealers; we consider
the polar cases of Bertrand competition and a monopolist dealer.
To empirically identify the effect of lower market latency on trading outcomes and adverse
selection, we use as an instrument the introduction of the INET Core Technology on NASDAQ
OMX (the incumbent market in the Nordics) on February 8, 2010. Prior to this event, NASDAQ
OMX had a round-trip latency of 2.5 ms, lagging behind its competitors: Chi-X Europe with 0.400
ms or BATS with 0.270 ms. The new technology allowed NASDAQ OMX to reduce its round-trip
latency 10 times, to 0.250 ms, effectively making the incumbent market the fastest venue for Nordic
securities.
This paper focuses on the precise channel between market latency and adverse selection given
a trading environment with exogenous order types. A potential extension in future versions of the paper
is to endogenize the make/take decision of high frequency traders and study the interaction between
such choices and market latency, especially through competition between limit order submitters.
The structure of trading fees with limit order rebates is also an important factor influencing the
endogenous order type decision. In our current model, spreads fall when limit orders are allowed to
execute at lower latencies than market orders - which generates a positive welfare effect in periods
of little volatility.
The rest of the paper is structured as follows: Section 2 briefly reviews the literature on the
advantages and disadvantages of low-latency trading. Section 3 develops a theoretical model of the
market using a competitive dealer assumption (Bertrand-like competition). We find that the optimal
spread increases with market speed, whereas welfare is reduced. The perfect competition assumption
is relaxed in Section 4, where we consider a monopolist dealer who can freely maximise his profits,
and we obtain the same qualitative results. In Section 5 we present the dataset, formulate the
testable hypotheses of the model and the econometric specification, and describe an identification
strategy based on a natural experiment. Section 6 discusses the empirical results. Section 7 concludes.
2 Related Literature
The academic literature has brought forward a number of arguments in favour of low latency trading.
Faster access to the market is argued to lead to competitive liquidity supply - as in Hendershott,
Jones, and Menkveld (2011). Limit order submitters are able to react quicker on each other’s quotes,
and the risk of being undercut by a different passive trader before a market order arrives is larger.
This reduces the market power of a limit order submitter and provides the incentives to set tighter
spreads. However, as Biais, Foucault, and Moinas (2013) also argue, the algorithmic trading proxy
in Hendershott, Jones, and Menkveld (2011) could capture changes that go beyond fast trading - for
instance, algorithmic trading might have lowered search costs across markets. 3
A second advantage of trading speed is that low latency trading can result in better price
discovery - as argued by Hendershott and Riordan (2011). Improving the reaction times to new
information implies that quotes and trading prices are also incorporating innovations faster.
Pagnotta and Philippon (2011) claim that speed can be used as an instrument by markets to
vertically differentiate in trading speed and attract different clienteles. Hence, fast markets would
charge a premium to the traders with volatile private values, who value speed most. The other
market participants, with a low preference for speed, would be able to use the slower market’s
services for a lower fee.
On the other hand, low latency trading may also have negative effects for the markets. High
frequency traders with better information can have an unfair advantage in adversely selecting other
market participants, especially since they can process news faster - see Foucault, Hombert, and
Rosu (2012). Several papers point to a positive relationship between adverse selection and
3 A Swedish government report (see Finansinspektionen Report 2012) states that 24 companies use algorithms to trade, while only 3 use high frequency trading in their operations. Thus, there is a clear distinction in practice between algorithmic and HF trading, with the latter being a strictly smaller subset of the former.
high frequency trading. Hendershott and Riordan (2011) find a larger permanent impact of higher
frequency market orders compared with slower ones. The price impact is actually found large enough
to overcome the bid-ask spread. In the same line, Baron, Brogaard, and Kirilenko (2012) show
that high frequency traders earn short-term profits through aggressive orders, which is consistent
with adversely selecting other market participants. Brogaard, Hendershott, and Riordan (2012)
also find results consistent with low-latency traders imposing adverse selection costs on the other
market participants. Biais, Foucault, and Moinas (2013) develop a theoretical model of low-latency
trading showing that when some agents become fast, all traders incur higher adverse selection costs.
Hoffmann (2010) focuses on adverse selection differentials across market venues. He finds that
the adverse selection component is larger on the entrant venues, and positively related to market
volatility.
Furthermore, there is an ongoing debate about the possibility of low-latency trading inducing
unnecessary market volatility, leading to events such as the Flash Crash. However, evidence so far
leans against the hypothesis that high frequency traders were responsible for the Flash Crash of
May 2010 - see Kirilenko, Kyle, Samadi, and Tuzun (2011).
Compared to Jovanovic and Menkveld (2011), this paper focuses on the informational aspects
of faster markets. We endogenize the arrival times to capture the latency changes but we fix the
order type by assuming liquidity traders to be takers. We also explicitly introduce high frequency
traders who trade only on information (as in Foucault, Roell, and Sandas (2003)), rather than for a
market-making reason.
In low-latency environments, both liquidity suppliers and demanders (potentially with better
information) take less time to access the market: limit order submitters can withdraw/update quotes
faster before being adversely selected, whereas speculators can act quicker on private signals and
earn rents by adversely selecting the market makers. Our contribution is to analyse the implications
of this symmetry in latency innovations between different types of high-frequency traders engaged
in passive and short-term speculative strategies.
3 Market Latency with a Competitive Market Maker
In this section, we develop a model of costly monitoring in a limit order market with stochastic times
to market, based on Foucault, Roell, and Sandas (2003), to generate hypotheses that can be tested
empirically. We are interested in how a drop in market latency, affecting mostly high-frequency
market-makers and speculators, influences information acquisition by dealers, adverse selection
probabilities, bid-ask spreads and welfare.
3.1 Primitives
This section presents the model’s primitives. Extensive motivation for these primitives is left to
subsection 3.2.
3.1.1 Asset and market
There is a single risky asset in the economy, with a stochastic value v. Define τ as an exponential
random variable with mean 1/α and $N(t) = \mathbb{1}_{\{t > \tau\}}$. The dynamics of the asset value are given by the
following process:
$$ v(t) = v(0) + Y \times N(t) \tag{1} $$
In equation 1, Y is a random variable capturing the size and sign of news. When a jump in the
asset value occurs, it can either be interpreted as good or bad news of magnitude σ, where σ > 1.
Hence, the distribution of Y is given by: P(Y = σ) = P(Y = −σ) = 1/2. After a jump at time t, the
asset value is: $v_{t+} = v_{t-} \pm \sigma$.
One can think about the asset dynamics as a compound Poisson jump process with intensity α,
truncated after the first arrival: once the first jump occurs, there are no further changes in the asset
value.
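The process in equation (1) can be illustrated with a minimal simulation sketch. The function name `simulate_asset_value`, the observation window `horizon` and all parameter values below are illustrative assumptions, not part of the original model specification.

```python
import math
import random

def simulate_asset_value(v0, alpha, sigma, horizon, rng):
    """Simulate one path of equation (1) up to `horizon`.

    Returns (jump_time, terminal_value); jump_time is None if no news
    arrives before the horizon, in which case the value stays at v0.
    """
    tau = rng.expovariate(alpha)                  # news time, exponential with mean 1/alpha
    if tau >= horizon:
        return None, v0                           # N(t) = 0 throughout the window
    y = sigma if rng.random() < 0.5 else -sigma   # good or bad news, equally likely
    return tau, v0 + y                            # no further jumps after the first one

# Illustrative parameter values (assumptions, not taken from the paper).
rng = random.Random(0)
alpha, sigma, horizon = 2.0, 1.5, 1.0
paths = [simulate_asset_value(100.0, alpha, sigma, horizon, rng) for _ in range(20000)]
jump_share = sum(t is not None for t, _ in paths) / len(paths)
# Empirical share of paths with news vs. the theoretical P(tau < horizon).
print(jump_share, 1.0 - math.exp(-alpha * horizon))
```

Because the process is truncated after the first arrival, a single exponential draw per path is enough; no further Poisson arrivals need to be generated.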
The asset is traded on a limit order market with price priority - the largest bid and smallest ask
execute first.
3.1.2 Agents
In the market for the risky asset, there are 2 types of high-frequency risk-neutral agents: a
representative competitive market maker (HFT-M), who posts limit orders, and a representative
speculator or ”bandit” (HFT-B), who can post market orders, in the terminology of Foucault, Roell,
and Sandas (2003). There is also a representative low frequency trader (LFT) who experiences
liquidity shocks at random times. The HFTs arrive at the market with a deterministic delay ∆H,
whereas the LFT has a deterministic delay of ∆L.
The high frequency traders have no private valuation for the asset and are risk neutral. HFT-B
has no monitoring costs and perfectly observes the asset value at any time t, whereas HFT-M can
invest in a monitoring technology allowing perfect tracking of the asset value by paying a positive
cost c (the difference between the monitoring costs is motivated in subsection 3.2).
The LFT has additional private values for the asset uniformly distributed between [−θ, θ], such
that θ < σ, but no technology to track jumps in the asset before the HFTs do4. The LFT receives a
4 The liquidity trader observes the information after the HFTs (they are an order of magnitude slower).
liquidity shock according to a Poisson process with intensity µ.
Both HFTs have a reservation utility of 0 (they can decide not to participate in the market).
The competitive market-maker HFT-M earns exactly his reservation utility - his participation constraint
binds, as in Bertrand competition.
3.1.3 Timeline
Monitoring Stage (T=0) The market-maker chooses whether to acquire information (and pay
the fixed cost c) - strategy I, or stay uninformed (thus saving the fixed investment) - strategy U .
The bandit makes a similar decision, but has zero costs of monitoring. It follows immediately from
the assumptions that it is optimal for him to always monitor the news process.
Quoting Stage (T=1) The value of the asset is publicly observable (the initial condition for the
Poisson process): v0. The market-maker posts bid and ask quotes for the asset. As in Foucault,
Roell, and Sandas (2003), the bid quote is set to v0 − s and the ask quote is v0 + s, where s is the
half spread. Since HFT-M is risk neutral (has no inventory concerns) and the liquidity traders have
uniformly distributed private values and are informed only at T = 3, the ask and bid half-spreads coincide: s = s_a = s_b (see footnote 5).
Trading Stage (T=2) The trading game starts once quotes are posted in the market and runs
in continuous time until a quote is either consumed or withdrawn. Thus, three events can happen:
1. The LFT receives a liquidity shock, arrives at the market before any news, and executes a
market order if his private value θi is at least as large as the half-spread in absolute value: |θi| ≥ s.
2. There is news before the LFT arrives to the market. Then, either:
(a) The LFT executes a market order before any HFT arrives to the market (during the ∆H
interval - HFT latency)
(b) If the LFT does not trade in the ∆H interval following news, an HFT would arrive to
the market first:
i. With probability γ, HFT-M arrives before HFT-B and he will withdraw the stale
quotes to avoid a loss.
ii. With probability 1− γ, HFT-B is first and he will execute a market order to make a
profit on the stale quote.
5 As Foucault, Roell, and Sandas (2003) argue, since the problem is symmetric, allowing s_a ≠ s_b would make no difference.
For the remainder of the paper, we will take γ = 1/2: conditional on an HFT arriving first
after news, the HFT-M and HFT-B arrivals are equiprobable.
The game tree and utility functions associated with this model are presented in Figure 1:
[Figure 1 here]
3.1.4 Strategy Spaces
The market maker and the bandit have 3 possible actions, presented in the following table. In
addition, the dealer chooses the equilibrium spread corresponding to each of these pure strategies:
there is a one-to-one correspondence between the set of strategies below (excluding s) and the
set of dealer strategies including the spread. This stems from the fact that, as we prove in the
next subsection, the optimal monitoring strategy for the bandit is always to purchase information.
Hence, between T0 and T1, the information set of the dealer is unchanged, which is equivalent to
the monitoring and spread decisions being taken at the same time.
Strategy   Monitoring          Decision after asset value jump
IR         become Informed     Rush to market
IN         become Informed     Do Not Rush to market
UN         remain Uninformed   -
3.2 Discussion of the assumptions
The trading environment, asset value dynamics and trader types are largely based on Foucault, Roell,
and Sandas (2003). We change the focus from externalities to the market speed game, and thus we
do not model the interactions between dealers. Hence, we build our model around a ”representative”
HFT from each type: one bandit and one market maker. We can think of the representative high
frequency traders as the aggregate of a continuum of HFT-M and HFT-B.
The perfect competition assumption, which assumes that market makers constantly undercut
each other until there are no profit opportunities in expectation, is relaxed in the next section,
where we consider the opposite polar case: a monopolist HFT-M who can extract maximum rents
from liquidity traders. Our main results are qualitatively robust with respect to this change of the
competitive environment.
Since we abstract from externalities among dealers, which in Foucault, Roell, and Sandas (2003)
were necessary to keep the spreads from exploding, we introduce a private value distribution for
the liquidity traders, leading to a downward sloping demand function, as in Ho and Stoll (1983) or
Hendershott and Menkveld (2011). This ensures that the dealer cannot completely eliminate the
adverse selection risk through spreads, as he will then lose all profitable trading opportunities with
the liquidity traders.
Monitoring strategies are decided upon before the trading starts, as in Foucault, Roell, and Sandas
(2003), but are discrete in the effort/signal probability set: {cost, P(Information)} ∈ {{0, 0}, {c, 1}}.
We argue that investments in monitoring algorithms are made in larger update batches (a fixed cost
c) rather than being fine-tuned for each trade.
We model our game with the HFT-B having zero monitoring cost. This assumption is meant
to capture the idea that while any bandit can act on a signal and adversely select the dealer, the
market maker has to monitor the news and defend against all bandits. Our model can be thought
of as a reduced form of the following environment: a ”representative” bandit stands in for many
speculators who might observe a signal with low frequency. If there are enough of these ”bandits”,
then a representative HFT-B will obtain such a signal at high frequency. When deciding upon
monitoring, the dealer has to be able to outsmart any speculator if he is to avoid adverse selection.
Our results are robust to relaxing the assumption of γ = 1/2 to any γ ∈ [0, 1). The comparative
statics with respect to the HFT latency (∆H) are qualitatively the same for any interior probability
γ. The results are strongest for the case γ = 0. This corresponds to the situation when the HFT-M
cannot use any information to update his quotes, so he is practically an uninformed market-maker (as
in Foucault, Hombert, and Rosu (2012) for example). The other extreme, when γ = 1 corresponds
to the case where the HFT-B can never speculate on information; this is a trivial case where there
is no adverse selection and therefore the latency is irrelevant to the model.
3.3 Expected Payoffs of HFTs
3.3.1 Irrelevance of the LFT market delay ∆L
We define the private value process of the LFTs by N′(t) - a Poisson process with intensity µ. Then
the LFT market arrival process can be defined as M(t), where it holds that:
$$ M(t + \Delta_L) = N'(t) \tag{2} $$
Thus, if a liquidity trader receives his private value at time t (a jump in the N′(t) process), he
will arrive on the market at time t + ∆L (a jump in the M(t) process). It is trivial that if the
inter-arrival jump time in the N′ process is exponentially distributed, then the same property holds
for the M(t) process, as all arrival times are shifted by the same deterministic quantity.
Lemma 1. The market arrival process of the liquidity traders is equivalent in distribution to a
Poisson process of intensity µ if the private value shock process starts at least ∆L before the quotes
are posted. Hence, the deterministic delay ∆L is irrelevant for the distribution of LFT market
arrival times.
Proof. See appendix A.
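As a numerical illustration of Lemma 1 (not a substitute for the proof in appendix A), the sketch below draws the first LFT market arrival after quotes are posted at t = 0 for several values of ∆L. The function name `first_market_arrival`, the `head_start` argument and the parameter values are assumptions made for this example.

```python
import random

def first_market_arrival(mu, delta_l, rng, head_start=1.0):
    """First LFT market arrival time after quotes are posted at t = 0.

    The liquidity-shock process N'(t) starts at -(delta_l + head_start), so it
    has been running for at least delta_l before the quotes are posted.
    """
    t = -(delta_l + head_start)
    while True:
        t += rng.expovariate(mu)      # next liquidity shock
        arrival = t + delta_l         # the shock reaches the market delta_l later
        if arrival > 0.0:
            return arrival

# Illustrative values (assumptions): intensity mu and three delays delta_l.
rng = random.Random(1)
mu = 3.0
means = {}
for delta_l in (0.0, 0.5, 2.0):
    draws = [first_market_arrival(mu, delta_l, rng) for _ in range(5000)]
    means[delta_l] = sum(draws) / len(draws)
print(means)   # each sample mean should be close to 1 / mu, whatever delta_l is
```

The sample means do not depend on ∆L, in line with the memorylessness of the exponential inter-arrival times.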
3.3.2 Outcomes of the game
We consider now all the potential outcomes of the game and their respective probabilities, as well as
payoffs for the HF traders.
LFT arrival before news If a LFT arrives at τLFT before any jump in asset value, he will trade
if his private value exceeds the half-spread in absolute value. Since both news and market arrival of
liquidity traders can be modelled as Poisson processes with intensities α and µ, the probability of
this outcome is given by:
$$ P\{\tau_{LFT} < \tau_{News}\} = \frac{\mu}{\mu+\alpha} \tag{3} $$
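Equation (3) is the standard result for two competing exponential clocks. A short derivation, under the assumption (implicit in the text) that the LFT arrival time and the news time are independent exponentials with rates µ and α:

```latex
\begin{aligned}
P\{\tau_{LFT} < \tau_{News}\}
  &= \int_0^\infty \mu e^{-\mu t}\, P\{\tau_{News} > t\}\, dt
   = \int_0^\infty \mu e^{-\mu t} e^{-\alpha t}\, dt \\
  &= \frac{\mu}{\mu+\alpha}.
\end{aligned}
```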
Given the game ends with a LFT arrival, the market maker earns in expectation s (1− s) - the
half spread times the probability of trade - and the HFT-B earns 0 (no trade).
LFT arrival after news The probability of the jump in asset value arriving first is the complement
of the previous outcome:
$$ P\{\tau_{LFT} > \tau_{News}\} = \frac{\alpha}{\mu+\alpha} \tag{4} $$
Given this event, there are 3 potential outcomes:
No LFT arrives in ∆H, HFT-M withdraws quote This outcome is only possible if HFT-M
monitors the asset value. The probability of no jump in M(t) during an interval of ∆H is given by
exp (−µ∆H), from the properties of Poisson processes. Conditional on it, HFT-M has a probability
of 1/2 of arriving before the bandit and withdrawing his quote - we prove in the following section that
HFT-B has IR as a dominant strategy. The full probability of the outcome is thus given by:
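$$ P(\text{HFT-M first}) = \underbrace{\frac{\alpha}{\mu+\alpha}}_{\text{News before LFT}} \times \underbrace{\exp(-\mu\Delta_H)}_{\text{HFT first}} \times \underbrace{\frac{1}{2}}_{\text{HFT-M first}} \tag{5} $$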
In this case, both the market maker and the bandit earn zero profits.
No LFT arrives in ∆H , HFT-B executes market order In the case HFT-M monitors
the news, the outcome probability is symmetrical to HFT-M arriving first to the market:
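$$ P(\text{HFT-B first}) = \underbrace{\frac{\alpha}{\mu+\alpha}}_{\text{News before LFT}} \times \underbrace{\exp(-\mu\Delta_H)}_{\text{HFT first}} \times \underbrace{\frac{1}{2}}_{\text{HFT-B first}} \tag{6} $$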
In the case HFT-M is uninformed, the outcome probability is double, since there is no more
competition between the bandit and the market-maker to arrive on the market.
Given the game ends with a HFT-B arrival, the bandit gains σ − s (the extent to which the
quote is stale) whereas the market-maker loses the same amount.
One LFT arrives in ∆H , LFT executes market order The probability of a jump in M(t)
during an interval of ∆H is given by 1− exp (−µ∆H). The outcome probability is thus:
$$ P(\text{LFT first}) = \underbrace{\frac{\alpha}{\mu+\alpha}}_{\text{News before LFT}} \times \underbrace{\left(1-\exp(-\mu\Delta_H)\right)}_{\text{LFT first}} \tag{7} $$
The payoffs in this case are identical to the first outcome described: the market maker earns
in expectation s (1− s) - the half spread times the probability of trade - and the HFT-B earns 0 (no
trade).
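The outcome probabilities in equations (3) to (7) can be cross-checked with a small Monte Carlo sketch. The function names, the outcome labels and the parameter values are illustrative assumptions; the `gamma` argument simply generalises the 1/2 used in the text.

```python
import math
import random

def outcome_probabilities(alpha, mu, delta_h, informed, gamma=0.5):
    """Closed-form probabilities of the four terminal outcomes, equations (3)-(7)."""
    news_first = alpha / (mu + alpha)
    hft_first = math.exp(-mu * delta_h)     # no LFT arrival within delta_h of the news
    share_m = gamma if informed else 0.0    # an uninformed HFT-M never races the bandit
    return {
        "lft_before_news": mu / (mu + alpha),
        "hftm_withdraws": news_first * hft_first * share_m,
        "hftb_executes": news_first * hft_first * (1.0 - share_m),
        "lft_after_news": news_first * (1.0 - hft_first),
    }

def simulate_outcome(alpha, mu, delta_h, informed, rng, gamma=0.5):
    """Draw one play of the trading stage and return the label of its outcome."""
    t_news = rng.expovariate(alpha)
    t_lft = rng.expovariate(mu)
    if t_lft < t_news:
        return "lft_before_news"
    if t_lft - t_news < delta_h:            # LFT trades inside the HFT latency window
        return "lft_after_news"
    if informed and rng.random() < gamma:
        return "hftm_withdraws"
    return "hftb_executes"

# Illustrative parameter values (assumptions, not taken from the paper).
rng = random.Random(2)
alpha, mu, delta_h, n = 1.0, 2.0, 0.3, 40000
counts = {}
for _ in range(n):
    label = simulate_outcome(alpha, mu, delta_h, True, rng)
    counts[label] = counts.get(label, 0) + 1
exact = outcome_probabilities(alpha, mu, delta_h, informed=True)
for label, p in exact.items():
    print(label, round(p, 4), round(counts.get(label, 0) / n, 4))
```

With `informed=False` the probability of the bandit executing against the stale quote doubles relative to the informed case at γ = 1/2, as stated in the text.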
3.4 Dominated/Dominating Strategies
In this subsection, we seek to restrict the strategy spaces of the market maker and bandit, by
eliminating the strategies which are not optimal and cannot be part of an equilibrium under any
parameter values.
3.4.1 The ”Bandit” : HFT-B
We begin by analysing the HFT-B’s dominated strategies. We show that the bandit’s optimal
strategy is to always monitor and always submit a market order after news. With costless information,
this is intuitive: the bandit does not risk any losses and faces a positive probability of earning an
adverse selection profit - hence he has the incentive to post a market order if the dealer’s quotes no
longer reflect the asset’s true value.
Lemma 2. (1) The bandit will always choose to monitor and submit a market order after news
arrives at the market - strategy IR is a strictly dominant strategy, for any s < σ.
(2) The bandit will never rush to the market if s ≥ σ.
Proof. See appendix A.
The first result is natural since the HFT-B can obtain information costlessly. Remaining
uninformed and taking an action is equivalent then to monitoring and taking the same action
(practically not acting on information). Thus, HFT-B cannot be worse off if he monitors and
potentially he can become better off by switching his rush/do not rush decision.
The second result implies that the bandit has a single unique strategy: always post a market
order when the quote available allows him to make a profit - when the value of the asset changes
before the market maker gets to update his quotes. The worst case scenario is that he will not
arrive first at the market and thus miss the opportunity. When there is no news, HFT-B has no
incentive to post market orders, as he has no private value for the asset and thus no incentive to
pay the spread.
The intuition for the final part of the lemma is that the potential adverse selection profit for
HFT-B is the difference between the size of the news and the stale half-spread, thus σ − s. If this
quantity is negative, the bandit ends up paying more in trading costs than he will earn by the
change in asset value.
3.4.2 The Market-Maker: HFT-M
Lemma 3. (1) For the HFT-M, the strategy IN is strictly dominated in the monitoring-trading
subgame. That is, the market-maker will never choose to pay for information and then never rush
to withdraw his quotes. (2) It is never optimal for the dealer to set a half-spread s ≥ 1 or s = 0.
Hence, the half-spread strategy space is reduced from R+ to (0, 1).
The market-maker’s positive cost deters him from monitoring if his strategy is not to act
on the information bought. Monitoring is only useful in a separating equilibrium: HFT-M takes a
different action if there is news than if there is no new information.
In the second part, we claim that the market-maker will never set a spread so high that it will drive
out all the demand from liquidity traders - his only rationale for being in the market. Since private
values never exceed 1 in absolute value, any higher spread will deter the liquidity traders from
posting market orders. Conversely, a spread of 0 will give HFT-M no profits, while still exposing
him to adverse selection risk from the bandit.
3.5 Equilibrium results under Bertrand competition
Using Lemmas 2-3, we know that the equilibrium of the game is of the general form:
• at T = 0: the bandit always monitors the news. The market maker can either monitor or not.
• at T = 1: HFT-M sets a half-spread s ∈ (0, 1)
• at T = 2: HFT-B always rushes to the market after news, and never if there is no news. If the
market-maker monitored at T = 0, he will behave in the same manner as the bandit.
The market maker’s utility functions for the 2 remaining potentially optimal HFT-M strategies
are as follows:
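$$ IR: \quad EU_{HFT-M} = \frac{\mu}{\mu+\alpha}\, s(1-s) + \frac{\alpha}{\mu+\alpha}\left[\frac{1}{2}\exp(-\mu\Delta_H)(s-\sigma) + \left(1-\exp(-\mu\Delta_H)\right) s(1-s)\right] - c \tag{8} $$
$$ UN: \quad EU_{HFT-M} = \frac{\mu}{\mu+\alpha}\, s(1-s) + \frac{\alpha}{\mu+\alpha}\left[\exp(-\mu\Delta_H)(s-\sigma) + \left(1-\exp(-\mu\Delta_H)\right) s(1-s)\right] \tag{9} $$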
Both these functions are strictly concave in s (negative second derivatives):
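$$ \frac{\partial^2 EU(IR)}{\partial s^2} = \frac{\partial^2 EU(UN)}{\partial s^2} = 2\left(\frac{\alpha}{\alpha+\mu}\exp(-\mu\Delta_H) - 1\right) < 0 \tag{10} $$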
and thus admit a single maximum. Note that for both strategies, the expected utility for s = 1
is larger than the expected utility for s = 0 (zero profits from liquidity traders, but a smaller loss if
adverse selection occurs).
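The two utility functions and the concavity claim can be checked numerically. The sketch below is a direct transcription of equations (8) and (9); the function names and the parameter values are illustrative assumptions.

```python
import math

def expected_utility(s, strategy, alpha, mu, sigma, delta_h, c):
    """Expected HFT-M payoff: equation (8) for strategy "IR", equation (9) for "UN"."""
    p_lft = mu / (mu + alpha)
    p_news = alpha / (mu + alpha)
    e = math.exp(-mu * delta_h)
    lft_profit = s * (1.0 - s)              # half-spread times probability of an LFT trade
    if strategy == "IR":
        after_news = 0.5 * e * (s - sigma) + (1.0 - e) * lft_profit
        return p_lft * lft_profit + p_news * after_news - c
    if strategy == "UN":
        after_news = e * (s - sigma) + (1.0 - e) * lft_profit
        return p_lft * lft_profit + p_news * after_news
    raise ValueError("strategy must be 'IR' or 'UN'")

def second_difference(f, s, h=1e-3):
    """Central second difference; exact (up to rounding) for a quadratic f."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(alpha=1.0, mu=2.0, sigma=1.5, delta_h=0.3, c=0.01)
closed_form = 2.0 * (params["alpha"] / (params["alpha"] + params["mu"])
                     * math.exp(-params["mu"] * params["delta_h"]) - 1.0)
for strategy in ("IR", "UN"):
    f = lambda s, st=strategy: expected_utility(s, st, **params)
    print(strategy, second_difference(f, 0.4), closed_form)
```

Both strategies share the same curvature because they differ only in terms that are linear or constant in s.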
Equilibrium Result We will show that for high enough latencies (large ∆H) and positive
information costs (c > 0), the equilibrium spread will be set at the lowest value which makes the
UN strategy break even - the smallest solution on (0, 1) of the equation EU_{HFT−M}(s|UN, α, σ, c, ∆H) = 0:
$$ s^*_{UN(C)} = \min_{0<s<1} \left\{ s \mid EU_{HFT-M}(s \mid UN, \alpha, \sigma, c, \Delta_H) = 0 \right\} \tag{11} $$
No market-maker can set a lower spread and earn a positive profit, under any non-dominated
strategy available. As the market latency drops (∆H decreases), the adverse selection risk increases
and the IR strategy becomes profitable even for spreads lower than s∗UN(C). Hence, competitor
market-makers have an incentive to undercut s∗UN(C), which is no longer a ”competition proof”
spread. The new equilibrium spread will now be given by the value which makes the informed
strategy break even:
$$ s^*_{IR(C)} = \min_{0<s<1} \left\{ s \mid EU_{HFT-M}(s \mid IR, \alpha, \sigma, c, \Delta_H) = 0 \right\} \tag{12} $$
Further, it can be proven that if they exist, both equilibrium competitive spreads are increasing
in the HFT market speed (decreasing in the delay ∆H):
$$ \frac{\partial s^*_{UN(C)}}{\partial \Delta_H} < 0 \quad \text{and} \quad \frac{\partial s^*_{IR(C)}}{\partial \Delta_H} < 0. $$
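Since equations (8) and (9) are quadratic in s, the break-even spreads in (11) and (12) have closed forms. The sketch below collects terms as EU(s) = −a s² + b s − k; the shorthand `a`, `b`, `k`, `share`, the function names and the parameter values are assumptions introduced for this example, not notation from the paper.

```python
import math

def break_even_spread(strategy, alpha, mu, sigma, delta_h, c):
    """Smallest s in (0, 1) with EU_{HFT-M}(s) = 0, or None if no such root exists."""
    if strategy not in ("UN", "IR"):
        raise ValueError("strategy must be 'UN' or 'IR'")
    qe = alpha / (mu + alpha) * math.exp(-mu * delta_h)   # P(news first, then an HFT first)
    share = 1.0 if strategy == "UN" else 0.5   # chance the bandit hits the stale quote
    cost = 0.0 if strategy == "UN" else c      # only the informed strategy pays c
    a = 1.0 - qe                               # weight on the LFT profit s(1 - s)
    b = a + share * qe
    k = share * qe * sigma + cost
    disc = b * b - 4.0 * a * k
    if disc < 0.0:
        return None                            # utility is negative for every spread
    root = (b - math.sqrt(disc)) / (2.0 * a)
    return root if 0.0 < root < 1.0 else None

def competition_proof_spread(alpha, mu, sigma, delta_h, c):
    """Return (strategy, spread) with the lowest break-even spread, or (None, None)."""
    roots = {st: break_even_spread(st, alpha, mu, sigma, delta_h, c) for st in ("UN", "IR")}
    feasible = {st: s for st, s in roots.items() if s is not None}
    if not feasible:
        return None, None                      # market breaks down
    best = min(feasible, key=feasible.get)
    return best, feasible[best]

# Illustrative parameter values (assumptions): a slow and a fast market.
for delta_h in (2.0, 0.5):
    print(delta_h, competition_proof_spread(alpha=1.0, mu=2.0, sigma=1.2, delta_h=delta_h, c=0.01))
```

For these illustrative values the uninformed strategy sets the lowest break-even spread in the slow market and the informed strategy does so in the fast market, with a wider spread in the latter case.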
3.5.1 Optimal spreads under competition
To study which strategy is actually chosen in equilibrium, we introduce a new concept, the
competition-proof spread.
Definition 1. We define as the competition-proof spread a spread s_CP set by HFT-M such that
there is no s < s_CP that can undercut it and allow a market-maker to earn non-negative expected profits,
regardless of the strategy he is choosing: EU_{HFT−M}(s|·, ·) ≥ 0.
In other words, a market-maker cannot set a strategy (UN or IR) and a spread such that it
undercuts the competition-proof spread and still at least break even. This new concept is illustrated
for some parameter values and different k in Figure 2.
[Figure 2 here]
In equilibrium, with Bertrand competition for a single market order, the spread set on the
market will be the competition-proof spread, the smallest spread possible that makes a strategy
break even. Were a HFT-M to set a higher spread, a competitor could either make a profit by
posting a lower spread and using the same information/market rush strategy or select another
strategy and make a non-negative profit. Hence, the equilibrium spread will either be s∗UN(C) or
s∗IR(C) or the market will break down (the utility function is negative for spreads in (0, 1) regardless
of the strategy considered).
Lemma 4. For each of the UN and IR market maker strategies, either both or none of the solutions
to the equation EUHFT−M (s) = 0 are on the interval (0, 1). In the latter case, the strategies are
strictly dominated by not participating on the market.
Proof. Note that EUHFT−M (0) < 0 and EUHFT−M (1) < 0 for both strategies. Hence, if the strictly
concave utility function has a root between 0 and 1, it has to hold that the other root is also on the
same interval. Hence, either none or both solutions to the equations are on (0, 1). If there is no root
on this interval, the utility function is always negative and hence dominated by posting no quote at
all.
Corollary 1. The competitive spread for each strategy is smaller than the monopoly spread.
Proof. If there is a root of the equation EUHFT−M (s) = 0 on (0, 1) then both solutions are on this
interval. Hence, the maximum of the function is reached also on (0, 1) and the smallest root is
logically lower than the argument that maximises the function.
Corollary 1 is a natural result: the competitive spread is lower than the one a market-maker can
post when it can extract rents due to monopoly power. Having stated these results, we can turn to
the potential equilibrium spreads of the trading game.
First, we define an adverse selection risk function A (α, µ,∆H):
Definition 2. We define the function $A(\alpha, \mu, \Delta_H) = \frac{\alpha}{\mu+\alpha}\exp(-\mu\Delta_H)$. Note that A expresses the
probability of HFT-M being adversely selected by a bandit when uninformed, as well as double the
adverse selection probability if he is monitoring the news. Also, it is trivial to check that:
1. A (α, µ,∆H) is strictly decreasing in ∆H
2. A (α, µ,∆H) is strictly decreasing in µ
3. A (α, µ,∆H) is strictly increasing in α
4. In the limit: $\lim_{\Delta_H\to\infty} A(\alpha, \mu, \Delta_H) = 0$ and $\lim_{\Delta_H\to 0} A(\alpha, \mu, \Delta_H) = \frac{\alpha}{\mu+\alpha}$
5. For any combination of parameters it holds that: 0 ≤ A (α, µ,∆H) ≤ 1
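As a purely illustrative numerical check (not part of the original analysis), the properties of A listed above can be verified with a short Python sketch. The function name and the parameter values are assumptions chosen for the example.

```python
import math


def adverse_selection_risk(alpha: float, mu: float, delta_h: float) -> float:
    """A(alpha, mu, Delta_H) = alpha / (mu + alpha) * exp(-mu * Delta_H)."""
    return alpha / (mu + alpha) * math.exp(-mu * delta_h)


# Arbitrary (assumed) parameter values, used only for illustration.
alpha, mu = 0.5, 1.0

# A falls as the delay Delta_H grows and stays within [0, alpha / (mu + alpha)].
for delta_h in (0.0, 0.5, 1.0, 5.0):
    print(delta_h, adverse_selection_risk(alpha, mu, delta_h))
```

At ∆H = 0 the function returns α/(µ+α), the upper bound in property 4.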
The optimal spreads for any given strategy can be easily expressed in terms of the function A,
as detailed in Proposition 1:
Proposition 1. The strategy-contingent potential equilibrium spreads are, for the uninformed
market-maker (UN) strategy:
$$s^*_{UN(C)} = \frac{1 - \sqrt{1 - 4A(\alpha,\mu,\Delta_H)\left(1 - A(\alpha,\mu,\Delta_H)\right)\sigma}}{2\left(1 - A(\alpha,\mu,\Delta_H)\right)} \tag{13}$$
and for the informed strategy IR:
$$s^*_{IR(C)} = \frac{1 - \frac{A}{2} - \sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)}}{2(1-A)} \tag{14}$$
The actual equilibrium spread is the minimum of the two: $s^*_{(C)} = \min\left(s^*_{UN(C)}, s^*_{IR(C)}\right)$.
Proof. Solving the equations EUHFT−M (s|UN,α, σ, c,∆H) = 0 and EUHFT−M (s|IR, α, σ, c,∆H) = 0
and taking the minimum solutions.
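The closed forms in equations 13 and 14 can be checked numerically. The sketch below is an illustration under assumed parameter values, not the authors' code. It rewrites the utilities (8) and (9) in terms of A, using µ/(µ+α) + α/(µ+α)(1 − exp(−µ∆H)) = 1 − A, and confirms that each spread is a root of the corresponding utility. The function names are hypothetical, and the code does not check that the root lies in (0, 1).

```python
import math


def eu_un(s, a, sigma):
    """Equation (9) rewritten in terms of A: (1 - A) s (1 - s) + A (s - sigma)."""
    return (1.0 - a) * s * (1.0 - s) + a * (s - sigma)


def eu_ir(s, a, sigma, c):
    """Equation (8) rewritten in terms of A: (1 - A) s (1 - s) + A/2 (s - sigma) - c."""
    return (1.0 - a) * s * (1.0 - s) + a / 2.0 * (s - sigma) - c


def spread_un(a, sigma):
    """Smaller root of EU(UN) = 0, equation (13); None if there is no real root."""
    disc = 1.0 - 4.0 * a * (1.0 - a) * sigma
    if disc < 0.0:
        return None
    return (1.0 - math.sqrt(disc)) / (2.0 * (1.0 - a))


def spread_ir(a, sigma, c):
    """Smaller root of EU(IR) = 0, equation (14); None if there is no real root."""
    b = 1.0 - a / 2.0
    disc = b * b - 4.0 * (1.0 - a) * (a / 2.0 * sigma + c)
    if disc < 0.0:
        return None
    return (b - math.sqrt(disc)) / (2.0 * (1.0 - a))


# Assumed example values: alpha = 0.5, mu = 1, Delta_H = 1, sigma = 0.5, c = 0.01.
a = 0.5 / 1.5 * math.exp(-1.0)
s_un, s_ir = spread_un(a, 0.5), spread_ir(a, 0.5, 0.01)
print(s_un, s_ir, min(s_un, s_ir))
```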
We prove next that the spreads given in equations 13 and 14 are larger when the market latency
of high-frequency traders decreases, due to increased adverse selection risks. We state thus the
following lemma:
Lemma 5. The strategy-contingent potential equilibrium spreads s∗UN(C) and s∗IR(C) are increasing
in market speed, or equivalently decreasing in the delay ∆H .
Proof. See appendix A.
If the deterministic HFT time to market ∆H is large enough for some fixed cost c of information,
then the uninformed (UN) HFT-M strategy is always preferred. If the latency drops, the informed
IR strategy may become optimal.
Lower market latencies and larger news intensities α relax the condition that needs to hold for
information gathering by market makers to be an equilibrium, whereas higher information costs c
are tightening it. We state the following important lemmas:
Lemma 6. For any trading speed and monitoring costs that satisfy $c > \frac{\sigma}{2} A(\alpha, \mu, \Delta_H)$, being
uninformed and never rushing to market is the optimal strategy of the competitive HFT-M.
Proof. See appendix A
The next proposition summarises the condition under which monitoring is optimal for HFT-M
and shows there is a unique latency below which monitoring is always optimal for the market-maker.
Proposition 2. The informed strategy IR is optimal for the market-maker and the equilibrium
spread is s∗IR(C) if the following condition holds:
$$\frac{A(\alpha, \mu, \Delta_H)}{2}\left(\sigma - s^*_{IR(C)}\right) - c \ge 0 \tag{15}$$
The above restriction is relaxed if ∆H drops and tightened if c increases. Also, if there is any
∆H ∈ R+ for which this condition holds, there is a unique threshold ∆CH > 0 such that it is true for
any ∆H < ∆CH . If the condition holds with equality, then the market-maker is indifferent between
the UN and IR strategies and any mixed strategy between the two.
Proof. See appendix A
3.5.2 Welfare
The welfare benchmark we are considering is the situation where all private values are realised. That
is, each liquidity trader who arrives at the market is willing to trade and will sell the asset if he has
a negative private value θi respectively buy it conditional on a positive private value. That is, the
social planner will set the probability of a trade P (trade) = 1 and the spread s = 0 (otherwise there
will be liquidity traders unwilling to exchange the asset). Under these conditions, the maximum
welfare is given by the expectation of the absolute private value. Applying the law of total expectation, we have:
$$Welfare_{FirstBest} = E[|\theta_i|] = E[\theta_i \mid \theta_i \ge 0]\, P(\theta_i \ge 0) + E[-\theta_i \mid \theta_i < 0]\, P(\theta_i < 0) \tag{16}$$
Knowing θi is uniformly distributed on [−1, 1], we compute the first best as:
$$Welfare_{FirstBest} = E[|\theta_i|] = \frac{1}{2} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{2} = \frac{1}{2} \tag{17}$$
In our model, the welfare will be given by the probability of a trade with a liquidity agent times
the expectation of its private value.
$$Welfare^{C}_{Model} = P(\text{F on market}) \times P(\text{F trades}) \times E\left[|\theta_i| \mid \text{trade}\right] - c\, I\left(EU[IR] > EU[UN]\right) \tag{18}$$
This value is the benchmark we should use to analyse the effect market speed is having on
welfare through adverse selection rationales. In our model, welfare will be given by:
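$$Welfare^{C}_{Model} = \begin{cases} \left(\frac{\mu}{\mu+\alpha} + \frac{\alpha}{\mu+\alpha}\exp(-\mu\Delta_H)\right) \times \left(1 - s^*_{UN(C)}\right) \frac{1 + s^*_{UN(C)}}{2}, & \Delta_H > \Delta^C_H \\ \left(\frac{\mu}{\mu+\alpha} + \frac{\alpha}{\mu+\alpha}\exp(-\mu\Delta_H)\right) \times \left(1 - s^*_{IR(C)}\right) \frac{1 + s^*_{IR(C)}}{2} - c, & \Delta_H \le \Delta^C_H \end{cases} \tag{19}$$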
Generally, as both the spreads s∗UN and s∗IR are decreasing in ∆H and the probability of a
liquidity trader arriving first also has a similar behaviour, this would lead to lower welfare as market
speed rises.
[Figure 3 here]
At the threshold point ∆CH , when HFT-M starts monitoring, there are 2 effects on welfare,
summarised in the table below:
Effects of Trading Speed on Welfare (Competitive Case) at ∆H = ∆CH
No.  Agent   Effect on Welfare         Influence (+/-)   Total Effect
1    LFT     lower trade probability   Welfare ↘         Negative
2    HFT-M   monitoring costs          Welfare ↘         Negative
According to the results in Figure 3, which are robust to various other parameter specifications,
welfare drops as latency decreases. At the threshold ∆CH where the competitive dealer switches
strategy, there is a downward jump in welfare due to monitoring costs.
4 Market Latency with a Monopolist Market Maker
The analysis in section 3 focused on the case of a competitive market maker. In this section, we
show that the same qualitative results hold if we allow for HFT-M to have market power and earn
positive expected profits. We are considering the opposite polar case, that of a monopolist HFT-M,
that can set s in the quoting stage of the original game (T = 1) to maximise its profits.
4.1 Optimal Spreads with a Monopolist HFT-M
The market-maker will select the monitoring strategy as well as the spread that maximises his profit.
Conditional on choosing any of the IR or UN strategies (note that the others are still dominated
under the results in lemmas 2 to 3), he will choose the corresponding spread that maximises the
expected utility.
Note that the expected utility functions are strictly concave and, for all un-dominated strategies,
the expected utility for s = 1 is larger than the expected utility for s = 0 (zero profits from liquidity
traders, but a smaller loss if adverse selection occurs). Hence, searching for the maximum of the
utility functions on s ∈ (0, 1) it will occur either at an interior point, when the first order condition
is zero, or at s = 1. If the maximum is at s = 1, then the maximum utility is negative (only adverse
selection losses).
Solving the first order conditions of the utility functions, we find that:
$$s^*_{IR(M)} = \frac{2 - A}{4(1 - A)}; \qquad s^*_{UN(M)} = \frac{1}{2(1 - A)} \tag{20}$$
Lemma 7. All interior optimal half-spreads s∗(M) are decreasing in the HFT latency, ∆H , and
increasing in the probability of news, α.
Proof. First derivatives of s∗(M) with respect to ∆H and α are strictly negative, respectively positive.
Lemma 8. The level of the half-spread, for a given ∆H and α, is always larger for the UN strategy
than for the IR strategy. That is, s∗UN(M) > s∗IR(M), for any given ∆H and α.
Proof. Immediate calculation from equation 20.
4.2 Equilibrium
In equilibrium, HFT-M compares the maximum payoff he can get from each strategy (UN , IR or
stopping the game) and sets the optimal quote for that particular strategy, as a function of the
primitive parameters α, σ,∆H , µ and c.
For ∆H →∞ (intuitively, the probability of a liquidity trader arriving first is equal to 1), the UN
strategy yields the maximum payoff (basically there is no adverse selection risk, nor any successful
opportunity for the dealer to withdraw the quote, were he to wish so).
As speed increases, the gains from monitoring also rise to offset the adverse selection risks. For
very low latencies, the risk of being picked off is so large that the HFT-M cannot set any feasible
spread (s < 1) and the market breaks down. The result is stated in the following lemma:
Lemma 9. For $\Delta_H < \frac{1}{\mu}\ln\left(\frac{3\alpha}{2(\alpha+\mu)}\right)$, the market breaks down as both the optimal spreads for the
UN and IR strategies are larger than 1 - and attract no LFT demand.
Proof. Immediate calculation shows that for $A > \frac{2}{3}$ we have that s∗IR(M) > 1 and for $A > \frac{1}{2}$ we have
that s∗UN(M) > 1. Since $\frac{\partial A}{\partial \Delta_H} < 0$, any latency lower than the threshold $\Delta_H = A^{-1}\left(\frac{2}{3}\right)$ will
render both optimal quotes unfeasible. Computing the inverse function of A yields the mentioned
latency threshold of $\frac{1}{\mu}\ln\left(\frac{3\alpha}{2(\alpha+\mu)}\right)$.
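The monopoly spreads in equation 20 and the breakdown threshold in Lemma 9 can be tied together numerically. The sketch below is illustrative only. The function names and parameter values are assumptions, and the breakdown threshold is reported as None when 3α ≤ 2(α+µ), since A then never reaches 2/3 for any ∆H ≥ 0.

```python
import math


def adverse_selection_risk(alpha, mu, delta_h):
    return alpha / (mu + alpha) * math.exp(-mu * delta_h)


def spread_un_monopoly(a):
    """Interior optimum for the UN strategy, equation (20)."""
    return 1.0 / (2.0 * (1.0 - a))


def spread_ir_monopoly(a):
    """Interior optimum for the IR strategy, equation (20)."""
    return (2.0 - a) / (4.0 * (1.0 - a))


def breakdown_latency(alpha, mu):
    """Delay below which both monopoly spreads exceed 1 (Lemma 9)."""
    ratio = 3.0 * alpha / (2.0 * (alpha + mu))
    if ratio <= 1.0:
        return None  # A stays below 2/3 for every non-negative delay
    return math.log(ratio) / mu


# Assumed example values: alpha = 4, mu = 1.
threshold = breakdown_latency(4.0, 1.0)
a_at_threshold = adverse_selection_risk(4.0, 1.0, threshold)
print(threshold, a_at_threshold, spread_ir_monopoly(a_at_threshold))
```

At the threshold, A equals 2/3 and the IR spread equals 1, so any further drop in ∆H pushes both spreads above 1.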
4.2.1 Equilibrium Dealer Discrete Strategy Choice
Proposition 3. In the case of a monopolist market maker, the IR strategy is optimal if and only if
the following condition holds:
$$2A^2 - 16c(1 - A) + 8\sigma A(1 - A) \ge 0 \tag{21}$$
For ∆H →∞, the UN strategy is always optimal for HFT-M. The condition (21) is monotonically
relaxed as ∆H decreases. Hence, there can be at most one latency switching point ∆MH below which
monitoring becomes optimal. The existence condition of such a threshold requires that α is large
enough relative to µ:
$$\frac{4\alpha}{\alpha+\mu} + 16c + 8\sigma\,\frac{\mu-\alpha}{\alpha+\mu} > 0$$
If the condition holds with equality, then HFT-M is indifferent between the UN and IR strategies
and any mixed strategy between the two.
Proof. See appendix A
We find that in equilibrium, at low market speeds, the market-maker starts by not
monitoring and never trying to withdraw his quotes. The adverse selection risk increases with lower
latencies, which HFT-M accommodates through setting a higher spread. At some speed threshold
∆MH , the adverse selection risk becomes so large that the market-maker becomes willing to monitor
the news. At this speed, the probability he will trade with a LFT conditional on news becomes so
low that the monitoring costs are lower than the expected losses he will make from being adversely
selected.
Switching from UN to IR - Market Maker's Tradeoff
Gains for Dealer                               Losses for Dealer
LFT arrives first: (1 − s) s ↗                 monitoring costs c
News arrives first: P (trade − HFT-B) ↘        News arrives first: P (trade − F) ↘
As the equilibrium spreads are lower for IR, the market maker will earn more in expectation
conditional on meeting an LFT (s(1 − s) is decreasing in s for s ≥ 1/2, the frictionless monopoly quote).
The equilibrium results for the HFT-M's choice are presented in Figure 4.
[Figure 4 here]
In the terminology of our game, the optimal strategy of HFT-M has the following form:
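$$HFT\text{-}M \rightarrow \begin{cases} (UN, s^*_{UN}) & \Delta_H > \Delta^M_H(\alpha, c, \sigma, \mu) \\ (IR, s^*_{IR}) & \Delta_H \le \Delta^M_H(\alpha, c, \sigma, \mu) \end{cases} \tag{22}$$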
4.2.2 Comparative Statics for ∆MH
The speed threshold ∆MH is decreasing in the market volatility parameters α and σ and increasing
in the information costs c. This is intuitive: in less volatile markets, information asymmetries
manifest with a lower frequency which makes paying the information costs suboptimal unless the
loss conditional on being adversely selected increases (with market speed). In a similar fashion,
lower costs to obtain information make the dealer monitor even when the adverse selection risk is
not as high (for a higher ∆MH ).
4.2.3 Welfare
The welfare benchmark we are considering is the situation where all private values are realised, as
in the competitive model. However, we need to change the benchmark from the first best
to a frictionless monopoly, as we are increasing the market power of the dealer, which is bound to
reduce welfare.
Without the adverse selection problem, the dealer will simply maximise his monopoly profit and
set a spread of 1/2 (solve max s(1 − s) =⇒ s∗Mon = 1/2). In this case, a liquidity trader will accept
the terms with probability 1/2 and the expected value of his private value, in absolute terms, will be
(1 + s∗Mon)/2 = 3/4. Hence, the monopoly welfare is given by:
$$Welfare_{Monopoly} = \frac{1}{2} \times \frac{1 + s^*_{Mon}}{2} = 0.375 \tag{23}$$
This value is the benchmark we should use to analyse the effect market speed is having on
welfare through adverse selection rationales. In our model, welfare will be given by:
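$$Welfare^{M}_{Model} = \begin{cases} \left(\frac{\mu}{\mu+\alpha} + \frac{\alpha}{\mu+\alpha}\exp(-\mu\Delta_H)\right) \times \left(1 - s^*_{UN(M)}\right) \frac{1 + s^*_{UN(M)}}{2}, & \Delta_H > \Delta^M_H \\ \left(\frac{\mu}{\mu+\alpha} + \frac{\alpha}{\mu+\alpha}\exp(-\mu\Delta_H)\right) \times \left(1 - s^*_{IR(M)}\right) \frac{1 + s^*_{IR(M)}}{2} - c, & \Delta_H \le \Delta^M_H \end{cases} \tag{24}$$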
Again, both the spreads s∗UN(M) and s∗IR(M) are decreasing in ∆H and the probability of a
liquidity trader arriving first increases in ∆H , both of which lead to lower welfare as market speed
increases, at least as long as the dealer does not switch strategies (see Figure 5).
At the threshold point ∆MH , when HFT-M starts monitoring, there are 3 effects on welfare,
summarised in the table below:
Effects of Trading Speed on Welfare (Monopoly) at ∆H = ∆MH
No.  Agent      Effect on Welfare         Influence (+/-)   Total Effect
1    Liquidity  lower spreads             Welfare ↗
2    Liquidity  lower trade probability   Welfare ↘         Positive (1+2)
3    Dealer     monitoring costs          Welfare ↘         Negative (1+2+3)
The benefit of lower spreads is stronger than the loss from the lower trade probability with the liquidity
traders, which generates a positive welfare effect for the liquidity traders when the dealer starts monitoring.
However, after we include the dealer's costs, the total effect is negative, which leads to an even lower
welfare after monitoring becomes optimal for the dealer.
4.3 Model Predictions
For both the perfectly competitive and the monopolistic market maker, the model's equilibrium spread
is decreasing in the HFT delay parameter ∆H (that is, increasing in market speed) and increasing in
the intensity of news α. Also, the effect of market latency on the model equilibrium spread
is stronger if σ is larger. In the model, there is no other friction determining the spreads apart from
the information asymmetry. Hence, our predictions are formulated in terms of the adverse selection
component of the spread:
1. A drop in market latency will result in larger adverse selection costs on limit orders submitted
by HFT: ∆H ↓ ⟶ s ↑
2. A higher intensity of news arrival results in larger adverse selection costs: α ↑ ⟶ s ↑
3. The effect of a drop in market latency is larger for more volatile stocks: $\left|\frac{\partial s}{\partial \Delta_H}\right|$ is increasing in
σ (see proof of Lemma 5)
5 Empirical Strategy
5.1 Dataset
Trade Data The trade data for this project is provided by the European Multilateral Clearing
Facility (henceforth EMCF) and consists of detailed individual trade information on equities from
Sweden, Denmark and Finland. The period spanned by our dataset is of 1 year, from September
1st, 2009 to September 10th, 2010. Due to the fact that up to October 19th, 2009 only a small part
of the trades were cleared through EMCF (who launched mandatory CCP services from that point
onwards), we have decided to exclude from analysis the first month covered as unrepresentative,
and focus on the remaining 11 months.
As a central counterparty institution, EMCF stands ready to become a third party in all the
trades - by buying the security from the original seller and then selling it to the original buyer. This
is known as the novation process, in which 2 contracts (between EMCF and both parties) are created
instead of a single one between buyer and seller. Hence, data on all executed trades is available at
EMCF, including agency stamps: whether one of the original parties executed the trade for a client
or for its own account. More information on the EMCF central counterparty operation is included
in the Internet supplements of this paper 6.
The dataset contains information on approximately 70 million trades, including date and time
stamps 7, trader ID (anonymised), transaction price, quantity, sign of the transaction, trading
platform and whether the trade was executed on its own account or for a client (principal trade or
agent trade). Each of the trades takes place either on NASDAQ OMX (the incumbent market prior
to 2007) or on one of the new (entrant) markets, such as Chi-X, BATS Europe, NASDAQ Europe,
Burgundy or Quote MTF. For a snapshot of the dataset and the type of variables we use in this
paper, see Appendix D.
Order Book Data Information on the available quotes for all the stocks in the sample on all
markets is collected from the Thomson Reuters Tick History database, through SIRCA. At the
millisecond level, there are approximately 2.2 billion data points. This represents information on the
top of the order book for all exchanges: best bid and ask prices and the quantities demanded/supplied.
Since the trade data is available at the level of seconds, we have selected the last quote in each
second to match the trade and order book datasets.
Complementary Data Information from the main EMCF dataset is complemented by metadata
on each of the securities, obtained from Datastream - number of shares outstanding and ISIN codes,
as well as daily exchange rates between the Euro and the Swedish krona and Danish krone. All trading prices are
converted to Euros at the daily exchange rate to provide comparability across stocks. Data on
intraday market volatility (on the OMX Nordic 40 index high and low daily prices) is obtained via
the Thomson Reuters Tick History (TRTH) system, for the days in the EMCF sample.
The universe of our sample consists of 226 traders active in 242 stocks. The average trader
is present on the market in 157 out of 228 days and trades in about a third of the available stocks.
The average trade value over the full dataset was approximately 20.62 thousand Euro.
Variables We aggregate the data up to three hierarchical levels: first, we build a stock-day panel
with information on average price, daily volume, daily volatility, market capitalisation and stock
fragmentation and effective spreads on incumbent and entrant markets. Then, for each stock-day,
we look at the trader IDs which were active in that particular security (third aggregation level). We
build thus a multi-level panel with 3 dimensions: a stock-day panel extended in the cross-sectional
dimensions of traders. In the remainder of the paper, we index days by t, securities by i and traders
by j.
6 See document http://db.tt/l03D3h2V
7 Converted to standard GMT, including hours, minutes and seconds
The list of variable definitions and comments on their measurement are provided in Appendix
C. There we also present a table with variable short-names, for ease of exposition. The final
stock-day-trader-agency multilevel panel has approximately 1.7 million observations.
5.2 Measurement and model specification
Measuring the model spread Absent any inventory or order processing costs, the positive
spread in our model stems only from the adverse selection component (and dealer’s market power in
the monopoly setting). Hence, the dependent variable for testing is the adverse selection component
of the spread, computed as in Hendershott, Jones, and Menkveld (2011) and Hoffmann (2010). For
each trade, define: pt is the transaction price and mt is the prevailing midpoint at the transaction
time; the sign indicator qt takes the value qt = 1 for buys and qt = −1 for sell transactions (taking
the market taker perspective). The effective spread then is defined as the percentage deviation from
the midpoint:
$$ES = q_t\,\frac{p_t - m_t}{m_t} \tag{25}$$
Assuming the market maker will close his position on average in ∆ = 5 min, we decompose ES
in an adverse selection and a realised spread component. The adverse selection component then
measures whether and by how much the price moved against the quote submitter in the period
immediately following the trade:
$$ES = \underbrace{q_t\,\frac{m_{t+\Delta} - m_t}{m_t}}_{AS} + \underbrace{q_t\,\frac{p_t - m_{t+\Delta}}{m_t}}_{RS} \tag{26}$$
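The decomposition in equations 25 and 26 can be sketched for a single trade as follows. This is a minimal illustration with hypothetical function and argument names and made-up prices; it is not the code used for the estimation.

```python
def decompose_effective_spread(price, mid, mid_later, sign):
    """Return (ES, AS, RS) for one trade, each as a fraction of the midpoint.

    price     : transaction price p_t
    mid       : prevailing midpoint m_t at the time of the trade
    mid_later : midpoint m_{t+Delta}, here Delta = 5 minutes after the trade
    sign      : q_t, +1 for a buy and -1 for a sell (market taker perspective)
    """
    effective = sign * (price - mid) / mid
    adverse_selection = sign * (mid_later - mid) / mid
    realised = sign * (price - mid_later) / mid
    return effective, adverse_selection, realised


# Made-up example: a buy at 100.05 against a midpoint of 100.00 that moves to 100.03.
es, as_component, rs = decompose_effective_spread(100.05, 100.00, 100.03, +1)
print(es * 1e4, as_component * 1e4, rs * 1e4)  # values in basis points
```

By construction the adverse selection and realised spread components sum to the effective spread.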
Volatility Volatility is measured by 2 variables: we capture the dynamics of systemic risk by the
daily range based volatility of OMX Nordic 40 Index (σMktt ) and the cross-section of risk by the
idiosyncratic risk for each security (σIDi ). The idiosyncratic risk is computed as the estimate of
residual variance from regressing daily stock returns on daily index returns in the year following the
event (February 2010-February 2011).
Identification strategy We use the introduction of the INET technology on NASDAQ OMX on
February 8, 2010 as an instrument for an exogenous change in market speed. The latency dropped
tenfold, from 2.5 ms to 250 µs. The market speed jump is captured by a time dummy DEvent,
which takes value 1 after February 8, 2010.
To allow for heterogeneous effects between high- and low-frequency traders (for identification
details, see subsection 6.2), we define a new dummy variable: DHFT , which takes value 1 for HFT
accounts and 0 otherwise.
Econometric Model The benchmark model is a fixed-effects panel linear regression, estimated
by least squares. We regress the adverse selection component of the spread (expressed in bps)
aggregated across stocks, traders and days on event dummies, HFT dummies, their interaction,
volatility variables and stock-specific fixed effects. We choose the most conservative standard errors,
double-clustered at stock and day level, following the methodology in Petersen (2009). The equation
is given by:
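$$AS_{ijt} = \beta_0 D^{LFT}_{Event} + \beta_1 D^{HFT}_{Event} + \beta_2 D_{HFT} + \beta_3 \sigma^{Mkt}_t + \beta_4 \sigma^{ID}_i D_{Event} + \beta_5 \log Vol_{trad} + \delta_i + \varepsilon_{ijt} \tag{27}$$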
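As an illustration of how the dummy regressors in equation 27 could be built for one observation, consider the sketch below. It rests on two assumptions that the text does not state explicitly: that the event-by-type dummies are the products of DEvent with the LFT and HFT indicators, and that the event day itself is coded as 0. All names are hypothetical.

```python
from datetime import date

# INET introduction date on NASDAQ OMX, as given in the text.
INET_DATE = date(2010, 2, 8)


def regressor_dummies(trade_date, is_hft):
    """Build the event and trader-type dummies for a single observation."""
    # Assumption: "after February 8, 2010" is read as strictly after the event day.
    d_event = 1 if trade_date > INET_DATE else 0
    d_hft = 1 if is_hft else 0
    return {
        "D_event": d_event,
        "D_hft": d_hft,
        # Assumption: type-specific event dummies are simple interactions.
        "D_lft_event": d_event * (1 - d_hft),
        "D_hft_event": d_event * d_hft,
    }


print(regressor_dummies(date(2010, 3, 1), is_hft=True))
print(regressor_dummies(date(2010, 1, 15), is_hft=False))
```

Under this coding the two event-by-type dummies sum to DEvent, so β0 and β1 measure the post-event shift for low- and high-frequency traders respectively.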
Hypotheses We test the following hypotheses:
1. $H^1_0$: β1 > 0 and $H^2_0$: β0 > 0. The adverse selection component of the bid-ask spread increases
once the market latency drops, both for high and low frequency traders. This is the main
result of Lemmas 5 and 7 (under different competition assumptions, it holds that $\frac{\partial s^*}{\partial k} > 0$).
2. $H^3_0$: β3 > 0. The adverse selection component of the bid-ask spread increases in more volatile
periods (in the model, $\frac{\partial s^*}{\partial \alpha} > 0$ - Lemma 7).
3. $H^4_0$: β4 > 0. The effect of market speed on adverse selection is larger for riskier stocks. Under
both competition settings, the marginal effect of speed on spreads is increasing in α: $\frac{\partial^2 s^*}{\partial k\, \partial \alpha} > 0$.
6 Results
6.1 Summary Statistics
The volume-weighted means of adverse selection spread components for all trades are reported in
Table 1, separately for the periods before and after INET was introduced on NASDAQ OMX.
[Table 1 here]
Plotting the distribution of adverse selection averages for each stock, day and trader in the
sample on NASDAQ OMX, before and after INET was introduced, we observe that the centre of
the probability mass shifts to the right in the second part of the sample. This finding is consistent
with larger mean adverse selection costs which are not due to an increase in the number of "tail"
adverse selection events (which would be the case, for instance, if the period after INET had
included days with extraordinary volatility in at least some stocks).
[Figure 6 here]
6.2 High Frequency Traders Identification
To identify high frequency traders in our dataset, we follow Kirilenko, Kyle, Samadi, and Tuzun
(2011) and, for each stock and day in our sample, we highlight the trader accounts who simultaneously
fulfil the following 3 conditions across all markets. Then, we compute a "HFT ratio" by dividing
the number of stock × days when a particular trader behaved as a HFT by the total number of
appearances in the sample.
1. The account traded more than 10 contracts on a given stock in a given day.
2. The average of the absolute value of the end-of-day net position, expressed as a fraction of the
account’s total trading volume for the day is not more than 5%
3. The average of the square root of the sum of squared deviations of the minute-end net contract
holdings from the net contract holdings at the end of the day, expressed as a fraction of an
account's total contract trading volume during that day, is not more than 1.5%. "Contract
holdings” are defined as the net number of contracts bought or sold from the beginning of the
day until the end of the minute for which the calculation is made.
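The three criteria and the "HFT ratio" can be sketched for a single account, stock and day as below. This is an illustration under stated assumptions rather than the procedure actually used: condition 3 is implemented literally as the square root of the sum of squared deviations, the trading day is split into a fixed number of minutes, and the function names and input layout are hypothetical.

```python
import math


def is_hft_stock_day(trades, n_minutes):
    """Check the 3 conditions for one account on one stock-day.

    trades    : list of (minute_index, signed_quantity), buys positive, sells negative
    n_minutes : number of one-minute intervals in the trading day (assumed input)
    """
    volume = sum(abs(qty) for _, qty in trades)
    if volume <= 10:  # condition 1: more than 10 contracts traded
        return False

    # Net holdings at the end of each minute, starting from zero at the open.
    flow = [0] * n_minutes
    for minute, qty in trades:
        flow[minute] += qty
    holdings, running = [], 0
    for f in flow:
        running += f
        holdings.append(running)
    end_of_day = holdings[-1]

    if abs(end_of_day) / volume > 0.05:  # condition 2: small end-of-day position
        return False

    # Condition 3 (literal reading): root of summed squared deviations of
    # minute-end holdings from the end-of-day holdings, relative to volume.
    deviation = math.sqrt(sum((h - end_of_day) ** 2 for h in holdings))
    return deviation / volume <= 0.015


def hft_ratio(flags):
    """Share of stock-days on which an account satisfied all 3 conditions."""
    return sum(flags) / len(flags)


# Made-up example: an account that buys and sells 5 contracts within each of 3 minutes.
round_trips = [(m, 5) for m in range(3)] + [(m, -5) for m in range(3)]
print(is_hft_stock_day(round_trips, n_minutes=3))
```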
We find in total 7 trader accounts that have a pronounced HFT profile compared to the
mass of traders. Among those, only 4 accounts trade on NASDAQ OMX, while 3 use exclusively
alternative markets such as Chi-X or BATS Europe. The identified HFTs account for 5.43% of the
total NASDAQ OMX volume (denominated in Euro), with one particular account being strongly
dominant (4.91% of the total NASDAQ OMX volume, the third largest trader in the market by
volume, and 90% of the total HFT volume).
[Figure 7 here]
6.3 Estimation Results
Table 2 shows the results of the empirical analysis, estimating the equation:
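$$AS_{ijt} = \beta_0 D^{LFT}_{Event} + \beta_1 D^{HFT}_{Event} + \beta_2 D_{HFT} + \beta_3 \sigma^{Mkt}_t + \beta_4 \sigma^{ID}_i D_{Event} + \beta_5 \log Vol_{trad} + \delta_i + \varepsilon_{ijt} \tag{28}$$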
We consider different symmetric estimation windows around the INET implementation: 2 months
(December 8, 2009 - April 8, 2010), 3 months (November 8, 2009 - May 8, 2010) and 4 months
(October 19, 2009 - June 8, 2010).
[Table 2 here]
The empirical findings are summarised below:
1. We find that the drop in market latency has a positive significant effect on adverse selection
of approximately 0.4 bps (approximately 7%). Without distinguishing between high- and low-
frequency traders, this effect is strongly significant for all windows considered, in specifications
with or without other control variables.
2. When allowing for separate effects between high frequency and low frequency traders, we find
an increase in the adverse selection spread component for LFT of approximately the same
magnitude (0.4 bps), which is significant and persistent across all time windows considered
(β0 > 0).
3. For the HFT, the effect is also significant and positive for 2 months after the event date
(about 0.55 bps), but it declines as we move further away from the event date - while
becoming statistically insignificant 4 months after the implementation date. This could point
to monitoring technology adjustments.
4. We find a positive and significant relationship between adverse selection and the volatility of
the index. A standard deviation increase in market volatility leads to a 0.27-0.36 bps increase
in the adverse selection component of the spread. We find thus empirical support for H30 .
5. The increase in the adverse selection is larger for securities with larger idiosyncratic risk. A
standard deviation increase in idiosyncratic risk leads to a 0.44 bps additional market latency
effect on adverse selection spread components. We find thus empirical support for H40 .
6. Before the INET implementation, HFTs have much lower adverse selection costs than LFTs:
the difference is between 3 and 3.5 bps, very strongly significant and consistent across all model specifications.
Placebo Analysis We test the relationship between the adverse selection spread component and
the event date for the other markets Nordic securities are traded on (the largest of which are Chi-X
Europe and BATS). These constitute a placebo group, as there was no similar speed improvement
on any other market during the same period. The results for a symmetric 2-month event window
around the event date are presented in Table 3. Note the event coefficients are no longer statistically
significant, regardless of whether we consider HFT and LFT separately or not.
[Table 3 here]
7 Concluding Remarks
This paper studies the effects on adverse selection and information asymmetries of the lower latency
technologies implemented by trading venues. We focus on symmetric and exogenous technology
improvements across HFT traders, regardless of whether they act as market-makers trying to capture
the spread or as speculators earning profits on short-term price trends.
For empirical identification, we use a natural experiment in the Nordic markets. In February
2010, NASDAQ OMX implemented the INET technology, which reduced round-trip latencies tenfold.
We find that adverse selection increased by 15% on NASDAQ OMX following the latency drop,
after controlling for market volatility, volume and realised spreads. On the other trading venues,
adverse selection dropped in the post-event period, which might indicate the migration of informed
speculators from the slower to the faster market.
To explain our results, we develop a costly-monitoring theoretical model of the limit order market.
We find that lower market latencies lead most market interactions to take place between high
frequency traders. This phenomenon, the "crowding-out" of low-frequency (liquidity) traders, results
in larger adverse selection risks. Consequently, wider spreads are set to compensate the extra risk,
but also market-makers will improve their monitoring levels (when they do, spreads actually drop in
equilibrium). As the latency is reduced indefinitely, there are no more monitoring investments to be
undertaken by market-makers or speculators. Trading becomes increasingly more a zero-sum game
between high-frequency traders, leading to a lower trade to quotes ratio (due to quote withdrawals)
and lower welfare gains, as liquidity traders get to realise their private values less often.
An alternative policy to a symmetric latency drop between limit and market orders, stipulating
that only limit orders should benefit from lower latencies, would reduce the adverse selection risks
and result in tighter spreads, as the speculators would no longer benefit from the faster trading.
Table 1: Adverse selection spread components on NASDAQ OMX. We report volume-weighted means of AS components for all trades in Nordic markets. The interquartile range is provided in parentheses.

Window      Before INET                    After INET                     ∆(%)
Panel A: Adverse Selection Spread Component (bps)
1 month     2.59 (2.32−2.79)               2.71 (2.26−3.01)               4.63%
2 months    2.42 (2.16−2.69)               2.62 (2.16−2.95)               8.26%
4 months    2.60 (2.17−2.89)               2.73 (2.20−3.19)               5.38%
Panel B: OMX Nordic 40 Daily Volatility
1 month     1.04% (0.72−1.30)              0.94% (0.64−1.13)              −9.61%
2 months    0.88% (0.63−1.08)              0.88% (0.62−1.06)              −0.01%
4 months    0.98% (0.66−1.33)              0.98% (0.63−1.31)              −0.05%
Panel C: Average Daily Volume Traded (EUR million)
1 month     2605.63 (2314.35−2672.85)      2215.54 (1973.25−2282.24)      −14.97%
2 months    2120.95 (1632.14−2508.74)      2226.99 (1999.90−2440.97)      4.99%
4 months    1956.77 (1631.30−2283.53)      2577.56 (2075.13−2899.43)      31.72%
Table 2: Adverse selection costs in basis points are regressed on event dummies and on event dummies interacted with the trader type (HFT / LFT). In several specifications, we also allow for the INET effect to vary in the cross-section of stocks with the idiosyncratic volatility of the security. Multiple event windows are considered: from 2 months to 4 months around the INET implementation. Volatility and volume measures are standardized to have mean zero and variance one. We use double-clustered standard errors (as in Petersen (2009)) and stock-specific FE.

Panel A: NASDAQ OMX (2 months around event date)
Variable             (1)         (2)          (3)          (4)          (5)
D^LFT_Event          −           −            0.372**      0.562***     0.551***
                                              (2.28)       (3.49)       (3.47)
D^HFT_Event          −           −            0.558**      0.767***     0.754***
                                              (2.47)       (3.17)       (3.29)
D^HFT                −           −            −3.625***    −3.241***    −3.145***
                                              (−19.14)     (−16.82)     (−16.3)
D_Event              0.351**     0.525***     −            −            −
                     (2.22)      (3.43)
σ^Mkt_t              −           0.257***     −            0.274***     0.254**
                                 (2.92)                    (2.92)       (2.89)
σ^ID_i × D_Event     −           −            −            0.441***     −
                                                           (4.61)
log Vol_trad         −           −0.913***    −            −0.832***    −0.862***
                                 (−8.27)                   (−7.65)      (−7.79)
No. Obs.             288909      288909       288909       288909       288909

Panel B: Window around event - 3 months
Variable             (1)         (2)          (3)          (4)          (5)
D^LFT_Event          −           −            0.435***     0.561***     0.553***
                                              (2.99)       (4.04)       (4.05)
D^HFT_Event          −           −            0.452**      0.644**      0.637***
                                              (2.16)       (2.64)       (2.91)
D^HFT                −           −            −3.718***    −3.339***    −3.294***
                                              (−19.71)     (−17.51)     (−17.17)
D_Event              0.419***    0.532***     −            −            −
                     (2.97)      (4.05)
σ^Mkt_t              −           0.342***     −            0.341***     0.338***
                                 (4.61)                    (4.51)       (4.57)
σ^ID_i × D_Event     −           −            −            0.449***     −
                                                           (4.08)
log Vol_trad         −           −0.857***    −            −0.791***    −0.806***
                                 (−9.06)                   (−8.59)      (−8.57)
No. Obs.             446819      446819       446819       446819       446819

Panel C: Window around event - 4 months
Variable             (1)         (2)          (3)          (4)          (5)
D^LFT_Event          −           −            0.417***     0.446***     0.443***
                                              (3.11)       (3.55)       (3.56)
D^HFT_Event          −           −            −0.144       −0.023       −0.039
                                              (−0.59)      (−0.09)      (−0.16)
D^HFT                −           −            −3.442***    −3.104***    −3.002***
                                              (−17.25)     (−15.57)     (−15.02)
D_Event              0.394***    0.408***     −            −            −
                     (2.97)      (3.34)
σ^Mkt_t              −           0.366***     −            0.366        0.362***
                                 (5.74)                    (5.71)       (5.72)
σ^ID_i × D_Event     −           −            −            0.434***     −
                                                           (4.52)
log Vol_trad         −           −0.862***    −            −0.784***    −0.808***
                                 (−10.16)                  (−9.53)      (−9.61)
No. Obs.             566841      566841       566841       566841       566841
Table 3: Placebo Analysis: Adverse selection (bps) on alternative markets (other than NASDAQ OMX) is regressed on event dummies and on event dummies interacted with the trader type (HFT / LFT). In several specifications, we also allow for the INET effect to vary in the cross-section of stocks with the idiosyncratic volatility of the security. We consider a 2 months window around the INET implementation. Volatility and volume measures are standardized to have mean zero and variance one. We use double-clustered standard errors (as in Petersen (2009)) and stock-specific FE.

Variable             (1)         (2)          (3)          (4)          (5)
D^LFT_Event          −           −            0.189        0.184        0.176
                                              (1.25)       (1.13)       (1.14)
D^HFT_Event          −           −            −0.171       −0.212       −0.184
                                              (−0.87)      (−0.89)      (−0.77)
D^HFT                −           −            −0.255       −1.636***    −1.576***
                                              (−1.22)      (−6.74)      (−6.53)
D_Event              −0.007      0.167        −            −            −
                     (−0.01)     (1.11)
σ^Mkt_t              −           0.277***     −            0.328***     0.273***
                                 (2.44)                    (2.98)       (2.41)
σ^ID_i × D_Event     −           −            −            −0.032       −
                                                           (−0.13)
log Vol_trad         −           −0.487***    −            −0.468***    −0.452***
                                 (−4.6)                    (−4.23)      (−4.29)
R2                   1.06%       1.1%         1.11%        1.13%        0.67%
No. Obs.             141853      141853       141853       141853       141853
[Figure 1: Model Timing. Game-tree diagram with a Pre-Trading stage, in which HFT-M either monitors (pays c) or does not monitor (pays 0) and then quotes s, and a Trading Outcomes stage with four branches (I)-(IV). Branch labels: LFT executes market order; HFT-B executes market order; HFT-M withdraws quotes; end game; New s after ∆H.]
[Figure 2, panel (a): Dealer’s Utility Functions - ∆H = 1.5. Horizontal axis: Half Spread; vertical axis: HFT-M Utility; curves: UN Strategy, IR Strategy. Annotations: “Cannot profitably undercut UN (competition proof)”; “Strategy UN is profitable at lower spreads than IR break-even spread.”]
[Figure 2, panel (b): Dealer’s Utility Functions - ∆H = 0.5. Horizontal axis: Half Spread; vertical axis: HFT-M Utility; curves: UN Strategy, IR Strategy. Annotations: “Cannot profitably undercut IR (competition proof)”; “Strategy IR is profitable at lower spreads than UN break-even spread.”]
Figure 2: Dealer’s utility functions for the UN and IR strategies. We take α = 0.2, µ = 0.65, σ = 1.4 and c = 0.07. Note that in the first panel, with lower market speed (∆H = 1.5), the break-even spread from the informed strategy can always be profitably undercut by choosing not to acquire information. As the latency drops (second panel, ∆H = 0.5), the situation is reversed and the competition-proof spread becomes the break-even spread of the informed (IR) strategy.
[Figure 3, panel (a): Comparison of equilibrium half-spreads against market latency for competitive and monopoly market-makers (parameter calibration: α = 0.2, µ = 0.25, σ = 1.25 and c = 0.025). Horizontal axis: Market Latency (Inverted Scale); vertical axis: Optimal Half-Spread; curves: Monopolist HFT-M, Bertrand Competition. Markers: Switch to monitoring (competition); Switch to monitoring (monopoly); Market breakdown (competition).]
[Figure 3, panel (b): Comparison of welfare against market latency for competitive and monopoly market-makers (parameter calibration: α = 0.2, µ = 0.25, σ = 1.25 and c = 0.025). Horizontal axis: Market Latency (Inverted Scale); vertical axis: Welfare; curves: Monopolist HFT-M, Bertrand Competition. Markers: Switch to monitoring (competition); Switch to monitoring (monopoly); Market breakdown (competition).]
Figure 3: Equilibrium Spreads, Welfare and Market Speed in the competitive and monopoly market-makers environments
[Figure 4, panel (a): Low News Intensity (α = 0.2, µ = 0.25, σ = 1.25, c = 0.025). Horizontal axis: Market Latency (Inverted Scale); vertical axis: Dealer Utility; curves: IR Strategy, UN Strategy. Marker: Strategy Switch Threshold for HFT-M.]
[Figure 4, panel (b): High News Intensity (α = 1.0, µ = 0.25, σ = 1.25, c = 0.025). Horizontal axis: Market Latency (Inverted Scale); vertical axis: Dealer Utility; curves: IR Strategy, UN Strategy. Marker: Strategy Switch Threshold for HFT-M.]
Figure 4: HFT-M’s choice between strategies in the monopoly case: expected utilities as a function of market latency. We also include different news intensity (α) regimes to illustrate the relationship between the strategy switching point and the frequency of news.
[Figure 5, panel (a): Optimal equilibrium spreads. Horizontal axis: Market Latency (Inverted Scale); vertical axis: Half-Spread; curves: alpha=0.2, c=0.05; alpha=0.3, c=0.025; alpha=0.2, c=0.025. Annotation: “Market-maker starts monitoring: spread drops”.]
[Figure 5, panel (b): Welfare. Horizontal axis: Market Latency (Inverted Scale); vertical axis: Welfare; curves: alpha=0.2, c=0.05; alpha=0.3, c=0.025; alpha=0.2, c=0.025. Annotation: “Market-maker starts monitoring”.]
Figure 5: Equilibrium spreads and welfare under monopolist HFT-M for various news intensities and information cost parameters
Figure 6: Distribution of adverse selection averaged at stock-day-trader levels. The blue solid line corresponds to the distribution before February 8, 2010 whereas the red dotted line corresponds to the post-event distribution. Note the shift to the right of the probability mass following the INET event.
Figure 7: HFT Profiles and Trading Aggressiveness. Each dot stands for a trader account. HFT profiles (between 0 and 1) are measured as the proportion of the number of stock × days an account simultaneously passes the three Kirilenko, Kyle, Samadi, and Tuzun (2011) HFT criteria in the total number of appearances in the sample. Aggressiveness is simply measured as the proportion of market order executed volume in total traded volume, for each trader account. The size of the dots is proportional to the Euro-denominated volume traded on NASDAQ OMX by each particular agent.
A Proofs of Lemmas and Propositions
Lemma 1
Proof. It can be proven that the increments of $M(t)$ and $N(t)$ have the same distribution. From
the properties of Poisson processes, we can write for any $s < t$ and $k \in \{0, 1, 2, \dots\}$:
\[ P\{N'(t) - N'(s) = k\} = \frac{\mu^k (t-s)^k}{k!} \exp\left(-\mu (t-s)\right) \tag{29} \]
Similarly, for $M(t)$, we have:
\[ P\{M(t) - M(s) = k\} = P\{N'(t-\Delta_L) - N'(s-\Delta_L) = k\} = \frac{\mu^k (t-s)^k}{k!} \exp\left(-\mu (t-s)\right) \tag{30} \]
The previous relation holds because the interval $(s-\Delta_L, t-\Delta_L)$ has the same length as $(s, t)$:
\[ P\{N'(t-\Delta_L) - N'(s-\Delta_L) = k\} = \frac{\mu^k (t-\Delta_L-s+\Delta_L)^k}{k!} \exp\left(-\mu (t-\Delta_L-s+\Delta_L)\right) \tag{31} \]
Lemma 2:
Part 1
Proof. If the bandit does not monitor the news (strategy UN) or monitors the asset value but does
not submit a market order (strategy IN), he will earn an expected payoff of 0 (no trade and no
monitoring cost). It is then enough to show that for $s < \sigma$, the expected payoff from strategy IR is
larger than zero. If the market-maker also monitors the news, the payoff of HFT-B is given by the
following expression:
\[ \Pi_{HFT-B} = \frac{1}{2} \times \frac{\alpha}{\alpha+\mu} \exp(-\mu\Delta_H)\,(\sigma - s) > 0 \tag{32} \]
The expression $(\sigma - s)$ is the profit conditional on arriving first to the market. The probability
of the HFT-B arriving first to the market is given by 3 components: $\frac{\alpha}{\alpha+\mu}$, the probability that there is a
value jump before an LFT consumes the quote; $\exp(-\mu\Delta_H)$, the probability that no LFTs
arrive in the interval from observing the jump to market arrival; and $\frac{1}{2}$, the probability that HFT-B
arrives before HFT-M.
If the market-maker does not monitor the news (or does not rush to market after a jump), the
payoff of HFT-B is as follows:
\[ \Pi_{HFT-B} = \frac{\alpha}{\alpha+\mu} \exp(-\mu\Delta_H)\,(\sigma - s) > 0 \tag{33} \]
The components of the profit are the same as before, except that the probability that HFT-B
arrives before HFT-M is now equal to 1 rather than $\frac{1}{2}$. Hence, IR is a strictly dominant strategy for
the bandit.
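The two payoff expressions (32) and (33) can be evaluated directly. The following is a minimal Python sketch, not part of the original proof; the function name `bandit_payoff` and the numerical inputs are hypothetical, chosen only to illustrate the three probability components.

```python
import math


def bandit_payoff(alpha, mu, delta_h, sigma, s, maker_monitors):
    """Expected payoff of HFT-B from strategy IR, following (32) and (33).

    alpha, mu : news and liquidity-trader arrival intensities
    delta_h   : market latency
    sigma     : size of the value jump
    s         : half spread quoted by HFT-M
    """
    # Probability that a value jump occurs before an LFT consumes the quote.
    p_jump_first = alpha / (alpha + mu)
    # Probability that no LFT arrives while the bandit's order is in transit.
    p_no_lft = math.exp(-mu * delta_h)
    # Probability of beating HFT-M to the market: 1/2 if HFT-M also monitors,
    # 1 otherwise.
    p_win_race = 0.5 if maker_monitors else 1.0
    return p_win_race * p_jump_first * p_no_lft * (sigma - s)


if __name__ == "__main__":
    # Illustrative (hypothetical) parameter values.
    for monitors in (True, False):
        print(monitors, bandit_payoff(0.2, 0.25, 1.5, 1.25, 0.5, monitors))
```

For any s < σ both payoffs are positive, and the payoff against a non-monitoring market-maker is exactly twice the payoff against a monitoring one, which is the only difference between (32) and (33).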
Part 2
If s > σ, we see from the proof of Part 1 that the profit from the strategy IR is negative,
regardless of HFT-M’s strategy. Hence, it is optimal for the bandit to never act on information, as
the spread is larger than the potential benefit from the stale quote.
Lemma 3:
Part 1
Proof. With strictly positive monitoring costs, the strategy IN is strictly dominated by UN. This
is natural, since paying for information and not acting contingent on the value jumps yields a lower
payoff than not paying for information, with the difference being exactly the cost of obtaining the
information. Formally:
\[ EU_{HFT-M}[IN] = EU_{HFT-M}[UN] - c \]
Part 2
We consider 2 possible cases: $s \in [1, \sigma]$ and $s \ge \sigma$, making use of the fact that $\sigma > 1$. Note that
for $s \ge 1$, there are no liquidity traders willing to trade.
If $s \in [1, \sigma]$ and there is news, then the bandit will rush to the market. If the bandit reaches
the market first, the dealer has a negative payoff $s - \sigma < 0$, whereas if the market-maker is first to
the market, he gains 0. No trade with liquidity agents will occur now, and the market-maker is
(weakly) worse off than ending the game.
If $s > \sigma$, both the bandit and the liquidity traders will stay off the market and HFT-M’s payoff
is 0. If $s = 0$, the gain he makes from liquidity traders is zero, whereas the losses he can incur from
the bandit are maximised ($-\sigma$).
Hence, in order for the dealer to earn a positive payoff, he should always set the half spread in
the interval $(0, 1)$, which is the conclusion required.
Lemma 5.
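Proof. We start with the optimal spread for the uninformed strategy, $s^*_{UN(C)}$. To show that this function
is decreasing in $\Delta_H$, it is enough to show that $\frac{\partial s^*_{UN(C)}}{\partial A(\Delta_H, \cdot)} > 0$, since by the chain rule we have that:
\[ \frac{\partial s^*_{UN(C)}}{\partial \Delta_H} = \frac{\partial s^*_{UN(C)}}{\partial A(\Delta_H, \cdot)} \times \underbrace{\frac{\partial A(\Delta_H, \cdot)}{\partial \Delta_H}}_{<0} \]
We have that:
\[ \frac{\partial s^*_{UN(C)}}{\partial A} = \frac{\sqrt{1 - 4A(1-A)\sigma} - 1 + 2(1-A)\sigma}{2(1-A)^2 \sqrt{1 - 4A(1-A)\sigma}} \]
Proving $\frac{\partial s^*_{UN(C)}}{\partial A} > 0$ is equivalent to showing:
\[ \sqrt{1 - 4A(1-A)\sigma} - 1 + 2(1-A)\sigma > 0 \]
If the UN strategy yields positive utility, we have that:
\[ s^*_{UN(C)} < 1 \iff 2(1-A) > 1 - \sqrt{1 - 4A(1-A)\sigma} \]
Thus, we have that:
\[ \sqrt{1 - 4A(1-A)\sigma} - 1 + 2(1-A)\sigma > \left(1 - \sqrt{1 - 4A(1-A)\sigma}\right)(\sigma - 1) > 0 \]
which is true given our parameter restrictions.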
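Next, we turn to the IR equilibrium spread. Similarly, we only need to prove that $\frac{\partial s^*_{IR(C)}}{\partial A(\Delta_H, \cdot)} > 0$,
that is:
\[ \frac{\partial s^*_{IR(C)}}{\partial A} = \frac{A - 2 + 8c(1-A) + 4\sigma(1-A) + 2\sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)}}{8(1-A)^2 \sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)}} > 0 \]
Proving $\frac{\partial s^*_{IR(C)}}{\partial A} > 0$ is equivalent to showing:
\[ A - 2 + 8c(1-A) + 4\sigma(1-A) + 2\sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)} > 0 \]
If the IR strategy yields positive utility, we have that:
\[ s^*_{IR(C)} < 1 \iff A - 2 + 2\sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)} > 4(A - 1) \tag{34} \]
Then, after substituting (34) into the previous inequality, we only have to show that:
\[ 8c(1-A) + 4(\sigma - 1)(1-A) \ge 0 \tag{35} \]
which again is true given our parameter restrictions (we know $\sigma > 1$ and $A < 1$).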
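The monotonicity result of Lemma 5 can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: it takes A(∆H) = α/(α+µ)·exp(−µ∆H), the form suggested by the probabilities in the proof of Lemma 2; it uses the closed form for the IR spread given in (42); and it assumes the UN spread is the smallest root of s − (1−A)s² − Aσ = 0, which matches the slopes and intercepts used in the proof of Lemma 6. All function names are hypothetical.

```python
import math


def adverse_selection_prob(alpha, mu, delta_h):
    # Assumed form of A(delta_h): a value jump precedes any LFT arrival and
    # no LFT arrives during the latency interval.
    return alpha / (alpha + mu) * math.exp(-mu * delta_h)


def spread_un(a, sigma):
    """Break-even half spread of the UN strategy, or None if no real root."""
    disc = 1.0 - 4.0 * a * (1.0 - a) * sigma
    if disc < 0.0:
        return None  # no spread breaks even
    return (1.0 - math.sqrt(disc)) / (2.0 * (1.0 - a))


def spread_ir(a, sigma, c):
    """Break-even half spread of the IR strategy (closed form as in (42))."""
    disc = (1.0 - a / 2.0) ** 2 - 4.0 * (1.0 - a) * (a / 2.0 * sigma + c)
    if disc < 0.0:
        return None
    return (1.0 - a / 2.0 - math.sqrt(disc)) / (2.0 * (1.0 - a))


if __name__ == "__main__":
    alpha, mu, sigma, c = 0.2, 0.25, 1.25, 0.025  # calibration of Figure 3
    for delta_h in (15.0, 10.0, 5.0):
        a = adverse_selection_prob(alpha, mu, delta_h)
        print(delta_h, spread_un(a, sigma), spread_ir(a, sigma, c))
```

Under this calibration both break-even spreads rise as latency falls from 15 to 5. At ∆H = 10 the UN spread is the lower of the two, while at ∆H = 5 the IR spread is lower, in line with the switch to monitoring described in Proposition 2.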
Lemma 6.
Proof. We note first that the slope of the IR utility function is always smaller than the slope of UN.
Simple algebraic manipulation of the first derivatives of the expected utility functions yields:
\[ Slope_{IR} - Slope_{UN} = \left[1 - \frac{A}{2} - 2s(1-A)\right] - \left[1 - 2s(1-A)\right] = -\frac{A}{2} < 0 \tag{36} \]
If $EU_{HFT-M}(s = 0|UN) > EU_{HFT-M}(s = 0|IR)$, and since the UN utility grows faster than
the IR utility, then the smallest solution of $EU_{HFT-M}(s) = 0$ will be for the uninformed, UN
strategy. Hence, the UN strategy is optimal.
The condition $EU_{HFT-M}(s = 0|UN) > EU_{HFT-M}(s = 0|IR)$ is equivalent to $-\sigma A > -\frac{\sigma}{2}A - c$,
which can be written as $c > \frac{\sigma}{2}A$. This completes the proof.
Proposition 2.
Proof. For IR to be optimal it needs to hold that $s^*_{IR(C)} \le s^*_{UN(C)}$: the first root of the IR utility
function is smaller than the first root of the UN expected utility. Since in the proof of Lemma 6 we have
shown that the UN utility grows faster on the increasing section, the condition $s^*_{IR(C)} \le s^*_{UN(C)}$
is equivalent to the condition that the IR utility is always above the UN utility in the negative quadrant:
\[ EU_{HFT-M}(s|IR) > EU_{HFT-M}(s|UN), \quad \forall s < s^*_{IR(C)}. \tag{37} \]
This condition is equivalent to:
\[ EU_{HFT-M}(s|IR) - EU_{HFT-M}(s|UN) \ge 0 \iff \frac{A}{2}(\sigma - s) - c \ge 0, \quad \forall s \in \left(0, s^*_{IR(C)}\right) \]
The left-hand side of the inequality above is monotonically decreasing in $s$, so if it holds for the largest $s$ in the
domain, $s^*_{IR(C)}$, it will hold for all lower values of the half spread. The sufficient condition is thus:
\[ \frac{A}{2}\left(\sigma - s^*_{IR(C)}\right) - c \ge 0 \tag{38} \]
As $s^*_{IR(C)}$ is increasing in $c$ (see definition), it is easy to see that the condition is tightened for higher
values of the cost (the LHS is decreasing in the monitoring cost).
The condition states that:
\[ IR^C_{Optimal}(\alpha, \mu, \Delta_H, c, \sigma) = \frac{A(\alpha, \mu, \Delta_H)}{2}\left(\sigma - s^*_{IR(C)}\right) - c \ge 0 \tag{39} \]
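We will prove that the condition is monotonically relaxed as $\Delta_H$ decreases, under the assumption that
$s^*_{IR(C)}$ exists and is a feasible strategy ($0 \le s^*_{IR(C)} \le 1$). If that is the case and there exists a
$\Delta^C_H$ corresponding to an $s^*_{IR(C)}$ for which the condition (39) holds with equality, it will hold for all
$\Delta_H < \Delta^C_H$. The first derivative of (39) is given by:
\[ \frac{\partial IR^C_{Optimal}}{\partial \Delta_H} = \frac{1}{2}\left[-\sigma\mu A + s^*_{IR(C)}\mu A - A\frac{\partial s^*_{IR(C)}}{\partial A}\frac{\partial A}{\partial \Delta_H}\right] = \frac{1}{2}\left[\mu A\left(s^*_{IR(C)} - \sigma\right) + \mu A^2 \frac{\partial s^*_{IR(C)}}{\partial A}\right] \tag{40} \]
Since $\mu$ and $A$ are strictly positive, proving $\frac{\partial IR^C_{Optimal}}{\partial \Delta_H} < 0$ is equivalent to showing that:
\[ s^*_{IR(C)} - \sigma + A\frac{\partial s^*_{IR(C)}}{\partial A} < 0 \tag{41} \]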
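For expositional purposes, we introduce the following notation: $B = \sqrt{\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right)}$.
Remember that in the proof of Lemma 5 we found that:
\[ s^*_{IR(C)} = \frac{1 - \frac{A}{2} - B}{2(1-A)}; \qquad \frac{\partial s^*_{IR(C)}}{\partial A} = \frac{A - 2 + 8c(1-A) + 4\sigma(1-A) + 2B}{8(A-1)^2 B} \tag{42} \]
Existence of $s^*_{IR(C)} \in \mathbb{R}$ implies $\left(1 - \frac{A}{2}\right)^2 - 4(1-A)\left(\frac{A}{2}\sigma + c\right) > 0$. This imposes an upper bound
on the cost term in $\frac{\partial s^*_{IR(C)}}{\partial A}$:
\[ 8(1-A)c \le 2\left(1 - \frac{A}{2}\right)^2 - 4\sigma(1-A)A \tag{43} \]
Hence, it holds that:
\[ \frac{\partial s^*_{IR(C)}}{\partial A} \le \frac{A - 2 + 2\left(1 - \frac{A}{2}\right)^2 + 4\sigma(1-A)^2 + 2B}{8(A-1)^2 B} \tag{44} \]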
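The condition $s^*_{IR(C)} > 0$ imposes that $2B < 2 - A$, so we can redefine the upper bound from
above:
\[ \frac{\partial s^*_{IR(C)}}{\partial A} \le \frac{\frac{1}{2}(2-A)^2 + 4\sigma(1-A)^2}{8(A-1)^2 B} \le \frac{1}{2}\sigma\frac{1}{B} \tag{45} \]
The condition $s^*_{IR(C)} < 1$ imposes that $2B > 3A - 2$, equivalent to $\frac{1}{2B} \le \frac{1}{3A-2}$. Hence, putting
these results together (and $\sigma > 1$), we have that:
\[ s^*_{IR(C)} - \sigma + A\frac{\partial s^*_{IR(C)}}{\partial A} \le 1 - \sigma + \frac{A}{3A-2}\sigma = 1 - \sigma\left(\frac{2A-2}{3A-2}\right) \le \frac{A}{3A-2} \le 0, \quad \forall A \le \frac{2}{3} \tag{46} \]
Since we prove in Lemma 9 that the market breaks down for $A \ge \frac{1}{2}$, the condition $\frac{\partial IR^C_{Optimal}}{\partial \Delta_H} \le 0$
holds in any non-trivial equilibrium, that is, whenever the market maker posts any quotes at all.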
Proposition 3
Proof. The HFT-M expected utility functions for both IR and UN strategies, evaluated at the
optimal spreads $s^*_{IR(M)}$ and $s^*_{UN(M)}$, are given by:
\[ EU_{HFT-M}(IR) = \frac{4(1-A) + A^2 + 16c(A-1) + 8\sigma A(A-1)}{16(1-A)} \tag{47} \]
\[ EU_{HFT-M}(UN) = \frac{4 + A^2(16\sigma - 1) - 16A\sigma}{16(1-A)} \tag{48} \]
Hence, we have that $EU_{HFT-M}(IR) \ge EU_{HFT-M}(UN)$ if and only if the following holds:
\[ IR^M_{Optimal}(\alpha, \mu, \Delta_H, c, \sigma) = 2A^2 - 16c(1-A) + 8\sigma(1-A)A \ge 0 \tag{49} \]
We note that the derivative with respect to the cost is negative: larger monitoring costs tighten
the condition under which the informed strategy becomes optimal:
\[ \frac{\partial IR^M_{Optimal}}{\partial c} = -16(1-A) < 0 \]
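The derivative with respect to $\Delta_H$ is given by:
\[ \frac{\partial IR^M_{Optimal}}{\partial \Delta_H} = \frac{\partial IR^M_{Optimal}}{\partial A}\frac{\partial A}{\partial \Delta_H} = \left(4A + 16c + 8\sigma(1 - 2A)\right)\frac{\partial A}{\partial \Delta_H} \le 0 \tag{50} \]
The previous inequality holds for any $A \le \frac{1}{2}$, that is, when the market does not break down, since we know that
$\frac{\partial A}{\partial \Delta_H} < 0$.
For $\Delta_H \to \infty$ we have that the uninformed strategy is always optimal:
\[ \lim_{\Delta_H \to \infty} IR^M_{Optimal} = -16c < 0 \]
For $\Delta_H \to 0$ we have that:
\[ \lim_{\Delta_H \to 0} IR^M_{Optimal} = -\left(\frac{4\alpha}{\alpha+\mu} + 16c + 8\sigma\frac{\mu-\alpha}{\alpha+\mu}\right)\mu \]
If it holds that $\frac{4\alpha}{\alpha+\mu} + 16c + 8\sigma\frac{\mu-\alpha}{\alpha+\mu} > 0$ (for $\alpha$ sufficiently large relative to $\mu$), there exists a
latency threshold $\Delta^M_H$ such that for all $\Delta_H < \Delta^M_H$ the market maker chooses to be informed and for
all $\Delta_H > \Delta^M_H$ the market maker chooses to remain uninformed. The monotonicity of the optimality
condition ensures the uniqueness of this threshold.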
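Because the optimality condition (49) is monotone in ∆H when A ≤ 1/2, the threshold ∆MH can be located by bisection. The sketch below is only an illustration under stated assumptions: it takes A(∆H) = α/(α+µ)·exp(−µ∆H), the form suggested by the probabilities in the proof of Lemma 2, and uses condition (49) exactly as written above. The function names and the search interval are hypothetical.

```python
import math


def adverse_selection_prob(alpha, mu, delta_h):
    # Assumed form of A(delta_h); it is decreasing in delta_h.
    return alpha / (alpha + mu) * math.exp(-mu * delta_h)


def ir_m_optimal(a, c, sigma):
    # Left-hand side of condition (49); non-negative means IR is preferred.
    return 2.0 * a ** 2 - 16.0 * c * (1.0 - a) + 8.0 * sigma * (1.0 - a) * a


def latency_threshold(alpha, mu, c, sigma, lo=0.0, hi=100.0, tol=1e-10):
    """Latency at which the monopolist is indifferent between IR and UN.

    Returns None if condition (49) does not change sign on [lo, hi].
    """
    def f(delta_h):
        return ir_m_optimal(adverse_selection_prob(alpha, mu, delta_h), c, sigma)

    if f(lo) <= 0.0 or f(hi) >= 0.0:
        return None
    # Bisection: f is positive at low latency and negative at high latency.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Low news intensity calibration of Figure 4, panel (a).
    print(latency_threshold(alpha=0.2, mu=0.25, c=0.025, sigma=1.25))
```

With the low news intensity calibration of Figure 4, the computed threshold lies between the axis values 5 and 10 of that figure. The bisection relies on the monotonicity shown in (50), so it should only be trusted when A ≤ 1/2 on the whole search interval.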
B Econometric Methodology Details
The models are estimated by Panel Ordinary Least Squares, introducing a dummy variable
for each stock (to control for stock fixed effects) and for each trader (thus controlling for trader
intercepts). Since our sample has 242 stocks and 262 traders, this is equivalent to introducing
504 new parameters. However, because we have over 1 million observations over time,
this estimation is feasible in terms of degrees of freedom.
Compared to random effects models, fixed effects linear models have the primary advantage of
not making a strong exogeneity assumption on the random factor relative to the included regressors
(see for instance Hsiao (2003) and Baltagi (2008)). If the panel is long enough, this approach yields
consistent and efficient estimates provided the model is true, with very few assumptions.
The least squares dummy variable (LSDV) approach is not parsimonious, due to the large
number of parameters, but it is computationally easier than a maximum likelihood optimisation over
more than 500 parameters. On the other hand, for models that have a limited dependent variable
(such as the volume share sent to entrant markets), the OLS approach might yield inconsistent and
biased estimates. Hence, maximum likelihood approaches are necessary for this type of variable.
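As an illustration of the fixed-effects idea, the sketch below estimates a slope by group demeaning (the within transformation). This is a simplification and not the estimation code used in the paper: it handles a single regressor and one set of group dummies, whereas the specification above includes both stock and trader dummies. For one-way fixed effects, the within slope coincides with the LSDV slope by the Frisch-Waugh-Lovell theorem. The function name and data are hypothetical.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope of y on x with one-way group fixed effects (within estimator)."""
    # Accumulate per-group sums of x and y and the group sizes.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        acc = sums[g]
        acc[0] += xi
        acc[1] += yi
        acc[2] += 1
    # Regress demeaned y on demeaned x; the group intercepts drop out.
    num = 0.0
    den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = sums[g]
        dx = xi - sx / n
        dy = yi - sy / n
        num += dx * dy
        den += dx * dx
    return num / den


if __name__ == "__main__":
    # Two hypothetical stocks with different intercepts and a common slope of 2.
    groups = ["a", "a", "a", "b", "b", "b"]
    x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
    y = [7.0, 9.0, 11.0, -1.0, 1.0, 3.0]
    print(within_slope(groups, x, y))
```

Demeaning avoids building the full dummy matrix, which is the practical concern when the number of fixed effects runs into the hundreds.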
C Variable Definitions and Measurement
The list of variable short-names used throughout this paper is presented in the table below. This
section describes how these variables are computed and provides relevant literature to motivate the
choice for different measurements.
Variable definitions and short labels

Variable                     Short Name   Definition                                            Levels
Adverse Selection            AS           Adverse Selection component of the bid-ask spread     SD, SDT
Volatility Range             σ            intraday volatility using high/low price              SD
(Absolute) Trade Imbalance   (A)TI        (absolute) scaled difference buys - sells             SDT
Average Trade Size           AvTr         Volume divided by number of trades                    SDT
Agency Profile               AgPr         Ratio client volume / total volume                    SDT
Average Price                P            Volume-weighted trading price                         SD
Market Capitalisation        MktCap       Shares outstanding x average price                    SD
Aggressive Trading           AGG          market order volume in total volume                   SDT
Effective Spread             ES           volume-weighted half-spread per stock-day             SD
Security specific measures. To the fragmentation metrics defined before, we add average
prices and market capitalisation for a particular stock. The average price traded during the day is
computed by taking all trade prices and weighting them by transaction volume; for a certain day $t$,
with effective trading times $\tau$, we have:
\[ P_t = \frac{\sum_{\tau \in t} \left(Price_\tau \times Quantity_\tau\right)}{\sum_{\tau \in t} Quantity_\tau} \tag{51} \]
The market capitalisation of a stock (MktCap) is defined on a daily basis, by multiplying the
number of shares outstanding by the average price during that day.
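Equation (51) and the market capitalisation definition translate into a few lines of code. The following is a minimal sketch; the function names are hypothetical, and the two example trades are the NOKI rows from the dataset snapshot in Appendix D, used purely as sample input.

```python
def volume_weighted_price(trades):
    """Volume-weighted average price, as in (51).

    trades: list of (price, quantity) pairs for one stock on one day.
    """
    total_quantity = sum(quantity for _, quantity in trades)
    if total_quantity == 0:
        raise ValueError("no traded volume on this day")
    return sum(price * quantity for price, quantity in trades) / total_quantity


def market_capitalisation(shares_outstanding, average_price):
    # Daily market capitalisation: shares outstanding times the average price.
    return shares_outstanding * average_price


if __name__ == "__main__":
    trades = [(9.59, 8300), (9.57, 724)]
    print(volume_weighted_price(trades))
```

The larger trade dominates the average, so the result sits much closer to 9.59 than to 9.57.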
Measurement for daily return volatility. There are 3 main measures for the intraday return
volatility in the finance literature, as reviewed in Patton (2011): the squared daily returns, the
realised volatility measure (see Andersen, Bollerslev, Diebold, and Labys (2003)) and the intra-daily
range (first proposed by Parkinson (1980)). The squared daily return is the most naive option of
the three, since it does not take into account the variation in prices during the day: it is possible
that, after wide fluctuations, the closing price is not very different from the opening price, which
would incorrectly lead to an estimated volatility smaller than the real value.
The realised volatility measure is defined as the sum of $n$ squared returns computed from
transaction prices over the day ($RV_t = \sum_{i=1}^{n} r_{it}^2$). This is an unbiased estimator of the true volatility
if the stock price follows a geometric Brownian motion (Patton (2011)) and has a lower variance
than the squared return. If the grid we are sampling prices from is fine enough, the realised volatility
comes arbitrarily close to the true volatility. However, in the presence of microstructure
noise, the observed price is not equal to the true price process (there is a bid-ask bounce, since some
trades take place at the bid quote, others at the ask quote), and the realised volatility measure can
thus overestimate the true volatility; see Alizadeh, Brandt, and Diebold (2002).
The volatility measure we choose to use in this paper is the scaled intra-daily range, defined as:
\[ \sigma_t = \frac{1}{2\sqrt{\ln(2)}} \ln\left(\frac{\sup_i \{p_{it}\}}{\inf_i \{p_{it}\}}\right) \tag{52} \]
The intra-daily range is considerably easier to compute, since we only need 2 prices for each
day: the highest and the lowest one. This is an unbiased estimator of the true volatility, just like
the realised volatility (the scaling factor $\frac{1}{2\sqrt{\ln(2)}}$ ensures unbiasedness under a geometric Brownian
motion DGP for the prices). The efficiency is considerably higher than for the daily squared return, and
close to the efficiency of realised volatility computed with a 2 hour sampling interval (Andersen,
Bollerslev, Diebold, and Labys (2003)). Alizadeh, Brandt, and Diebold (2002) show that this
measure is more robust to the presence of microstructure noise, thus being potentially a superior
estimate in real-world situations.
This measure is computed both at individual stock level, from the EMCF dataset, and
market wide, using the OMX Nordic 40 index as a proxy for the Scandinavian markets.
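The two intraday measures discussed in this section can be sketched as follows. This is an illustrative implementation, not the code used for the paper: the function names and the price series are hypothetical, and the realised volatility uses log returns, which is an assumption since the text only refers to squared returns.

```python
import math


def range_volatility(prices):
    """Scaled intra-daily range, as in (52): needs only the high and the low."""
    high, low = max(prices), min(prices)
    return math.log(high / low) / (2.0 * math.sqrt(math.log(2.0)))


def realised_volatility(prices):
    """Sum of squared intraday returns (log returns assumed here)."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in returns)


if __name__ == "__main__":
    prices = [100.0, 102.0, 99.0, 101.0]  # hypothetical intraday trade prices
    print(range_volatility(prices))
    print(realised_volatility(prices))
```

The range measure depends only on the daily extremes, so it is unaffected by the order or number of trades in between, whereas the realised measure changes with the sampling grid.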
D Appendix: Snapshot of the dataset
Sample Data Snapshot. There are 8 possible platforms: XHEL (Nasdaq OMX Helsinki), XSTO (Nasdaq OMX Stockholm), XCSE (Nasdaq OMX Copenhagen), CHIX (Chi-X Europe), NURO (Nasdaq Europe), BATE (BATS Europe), QMTF (Quote MTF) and BURG (Burgundy). AGNT stands for client trade, whereas PRCP stands for principal (own trade). Dates are formatted as 1yymmdd, to facilitate comparison across decades (2009-2010).

Platform  Time    Date     Price  Quantity  TraderID  Origin  Buy/Sell  Currency  Symbol  Maker/Taker
XHEL      90001   1090901  9.59   8300      150002    AGNT    S         EUR       NOKI    1
...
CHIX      90014   1090901  9.57   724       140001    PRCP    B         EUR       NOKI    -1
...
BATE      101014  1101003  9.81   901       300001    PRCP    S         DKK       MAERS   1
References
Alizadeh, Sassan, Michael W. Brandt, and Francis X. Diebold, 2002, Range-based estimation of
stochastic volatility models, The Journal of Finance 57, 1047–1091.
Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys, 2003, Modeling and
forecasting realized volatility, Econometrica 71, 579–625.
Baltagi, B., 2008, Econometric Analysis of Panel Data (John Wiley & Sons).
Baron, M., J. Brogaard, and A. A. Kirilenko, 2012, The trading profits of high-frequency traders,
working paper, Un.
Biais, B., T. Foucault, and S. Moinas, 2013, Equilibrium fast trading, working paper, Toulouse
School of Economics.
Brogaard, J., T. Hendershott, and R. Riordan, 2012, High frequency trading and price discovery,
working paper, University of Washington.
Foucault, T., 2012, Market Microstructure: Confronting Many Viewpoints (Wiley Finance).
Foucault, T., J. Hombert, and I. Rosu, 2012, News trading and speed, working paper.
Foucault, Thierry, Ailsa Roell, and Patrik Sandas, 2003, Market making with costly monitoring: An
analysis of the soes controversy, Review of Financial Studies 16, 345–384.
Hagstromer, B., and L. Norden, 2013, The diversity of high-frequency traders, Available at SSRN:
http://ssrn.com/abstract=2153272.
Hasbrouck, J., and G. Saar, 2010, Low-latency trading, AFA 2012 Chicago Meetings Paper; Johnson
School Research Paper Series No. 35-2010.
Hendershott, T., C. M. Jones, and A. J. Menkveld, 2011, Does algorithmic trading improve liquidity?,
Journal of Finance 66, 1–33.
Hendershott, T., and R. Riordan, 2011, High-frequency trading and price discovery, working paper.
Hendershott, Terrence J., and Albert J. Menkveld, 2011, Price pressures, WFA 2010 paper.
Ho, Thomas S. Y., and Hans R. Stoll, 1983, The dynamics of dealer markets under competition,
The Journal of Finance 38, 1053–1074.
Hoffmann, Peter, 2010, Adverse selection, transaction fees, and multi-market trading, working paper.
Hsiao, C., 2003, Analysis of Panel Data (Cambridge University Press).
Jovanovic, Boyan, and Albert J. Menkveld, 2011, Middlemen in limit-order markets, SSRN eLibrary.
Kirilenko, Andrei A., Albert (Pete) S. Kyle, Mehrdad Samadi, and Tugkan Tuzun, 2011, The flash
crash: The impact of high frequency trading on an electronic market, SSRN eLibrary.
Pagnotta, Emiliano, and Thomas Philippon, 2011, Competing on speed, SSRN eLibrary.
Parkinson, Michael, 1980, The extreme value method for estimating the variance of the rate of
return, The Journal of Business 53, 61–65.
Patton, Andrew J., 2011, Volatility forecast comparison using imperfect volatility proxies, Journal
of Econometrics 160, 246–256.
Petersen, Mitchell A., 2009, Estimating standard errors in finance panel data sets: Comparing
approaches, Review of Financial Studies 22, 435–480.