Deep Learning for Market by Order Data

Zihao Zhang^a, Bryan Lim^a and Stefan Zohren^a

^a Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, UK.

ARTICLE HISTORY

Compiled July 28, 2021

ABSTRACT

Market by order (MBO) data – a detailed feed of individual trade instructions for a given stock on an exchange – is arguably one of the most granular sources of microstructure information. While limit order books (LOBs) are implicitly derived from it, MBO data is largely neglected by current academic literature which focuses primarily on LOB modelling. In this paper, we demonstrate the utility of MBO data for forecasting high-frequency price movements, providing an orthogonal source of information to LOB snapshots and expanding the universe of alpha discovery. We provide the first predictive analysis on MBO data by carefully introducing the data structure and presenting a specific normalisation scheme to consider level information in order books and to allow model training with multiple instruments. Through forecasting experiments using deep neural networks, we show that while MBO-driven and LOB-driven models individually provide similar performance, ensembles of the two can lead to improvements in forecasting accuracy – indicating that MBO data is additive to LOB-based features.

KEYWORDS

Market by Order Data; Limit Order Books; Deep Learning; Long Short-Term Memory; Attention.

1. Introduction

High-frequency microstructure data has received growing attention both in academia and industry with the computerisation of financial exchanges and the increased capacity of data storage. The detailed records of order flow and price dynamics provide us with a granular description of short-term supply and demand, and we can take the dynamics of order books into account during the modelling process. Propelled by the publication of the benchmark dataset (Ntakaris et al. 2018) of high-frequency limit order book (LOB) data, there has been a growing interest in research studying LOB data. Recent works by Tsantekidis et al. (2017); Sirignano and Cont (2019); Zhang, Zohren, and Roberts (2019a); Briola, Turiel, and Aste (2020) demonstrate that strong predictive performance can be obtained from modelling high-frequency LOB data, with the resulting predictions finding applications in market-making and trade execution, which have short holding periods.

In this work, we introduce Market by Order (MBO) data for predictive modelling with deep learning algorithms. MBO data provides full resolution of the underlying market microstructure – with both LOB data and trade sequences being derived from it. Despite MBO data being the original raw data source, current literature on high-frequency predictive modelling focuses predominantly on LOBs and, to the best of our knowledge, MBO data has not been used directly for predictive modelling. We show that using MBO data as an additional source of information alongside LOB data improves predictive performance, and that MBO data could inspire a range of meaningful features related to individual order positions.

CONTACT Zihao Zhang. Email: [email protected]

arXiv:2102.08811v2 [q-fin.TR] 27 Jul 2021

A LOB is a record of all outstanding limit orders (passive orders) for an instrument at a given time point, sorted into different levels based on submitted prices. At each price level, a LOB only shows the total available quantity. However, any given price level actually consists of many individual orders with different sizes. MBO data is essentially a message-based data feed that allows us to infer the queue position of each individual order by reconstructing the order book step by step. A detailed description of MBO data and how it relates to LOB data is presented in Section 3.
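To make the idea of step-by-step reconstruction concrete, the following is a minimal sketch that replays messages into first-in-first-out queues per price level and reads off an order's queue position. The function names and the tuple layout are hypothetical, the action codes (0 = update, 1 = add, 2 = cancel) follow Section 3.1, and the priority rules (a size update keeps its place, a re-priced order joins the back) are assumptions that differ between exchanges.

```python
from collections import OrderedDict, defaultdict


def rebuild_queues(messages):
    """Replay (order_id, action, side, price, size) tuples into FIFO queues per price level."""
    queues = defaultdict(OrderedDict)  # (side, price) -> {order_id: size}, oldest first
    where = {}                         # order_id -> (side, price) of the resting order
    for order_id, action, side, price, size in messages:
        if action == 2:                # cancel: only the ID is needed
            key = where.pop(order_id, None)
            if key is not None:
                queues[key].pop(order_id, None)
            continue
        new_key = (side, price)
        old_key = where.get(order_id)
        if action == 0 and old_key == new_key:
            queues[old_key][order_id] = size   # assumed: a size update keeps priority
            continue
        if old_key is not None:                # assumed: a re-priced order joins the back
            queues[old_key].pop(order_id, None)
        queues[new_key][order_id] = size
        where[order_id] = new_key
    return queues, where


def queue_position(queues, where, order_id):
    """Return (orders ahead, shares ahead) for a resting order."""
    ahead = 0
    for position, (oid, size) in enumerate(queues[where[order_id]].items()):
        if oid == order_id:
            return position, ahead
        ahead += size
    raise KeyError(order_id)


# Illustrative messages only; IDs and sizes are made up.
messages = [
    ("A", 1, 1, 68.54, 8334.0),
    ("B", 1, 1, 68.54, 1000.0),
    ("C", 1, 1, 68.54, 500.0),
    ("A", 2, None, None, None),   # cancel carries no side, price or size
    ("B", 0, 1, 68.54, 400.0),    # size update at the same price
]
queues, where = rebuild_queues(messages)
position, shares_ahead = queue_position(queues, where, "C")
```

After the cancellation of A and the size update of B, order C has one order and 400 shares ahead of it, which is information an aggregated LOB snapshot cannot provide.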

We propose a deep learning model based on MBO data and, in particular, adopt a classification framework to predict stock price movements. In doing so, we provide a complete analysis of MBO data by carefully introducing the data structure and the components of the message-based data feed. A specific data normalisation scheme is introduced to model the level information contained in LOBs and to allow model training with multiple instruments. Our dataset consists of MBO data over a period of one year for five highly liquid instruments from the London Stock Exchange. Our testing set contains millions of samples to verify the robustness and generalisation of the results.

In our proposed models, we apply deep learning architectures including LSTMs (Hochreiter and Schmidhuber 1997) and Attention mechanisms (Bahdanau, Cho, and Bengio 2014) to model the dynamics of MBO data for market predictions. Our experiments show consistent and robust results from MBO data that are comparable to models that utilise derived LOB data. We observe that predictive models based on MBO data are complementary to LOB models and we propose an ensemble approach which yields superior results. As such, we observe that MBO data adds diversification to the LOB model and improves prediction performance.
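As an illustration of how two complementary classifiers can be combined, the sketch below blends the class probabilities of an MBO-driven model and a LOB-driven model by weighted averaging. This is only one simple form of ensembling and is an assumption, not necessarily the scheme used in the experiments; the function name, the three-class layout and the numbers are hypothetical.

```python
def ensemble_average(prob_a, prob_b, weight_a=0.5):
    """Blend two lists of per-sample class probabilities; return (blended, labels)."""
    blended, labels = [], []
    for pa, pb in zip(prob_a, prob_b):
        mix = [weight_a * x + (1.0 - weight_a) * y for x, y in zip(pa, pb)]
        blended.append(mix)
        # Predicted class is the index of the largest blended probability.
        labels.append(max(range(len(mix)), key=mix.__getitem__))
    return blended, labels


# Made-up probabilities for two samples and three classes.
mbo_probs = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
lob_probs = [[0.3, 0.5, 0.2], [0.1, 0.2, 0.7]]
blended, labels = ensemble_average(mbo_probs, lob_probs)
```

In the first sample the two models disagree (class 0 versus class 1) and the blend settles on class 0 because the MBO model is more confident. Averaging only helps when the models make different errors, which is the sense in which MBO data adds diversification.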

The remainder of the paper is organised as follows. After a short literature review in Section 2, we proceed in Section 3 by introducing MBO data, including data preprocessing, normalisation and labelling. Section 4 presents deep learning architectures. We next describe our experiments and present the results of predicting market movements from MBO data in Section 5. We summarise our findings and discuss promising future extensions in Section 6.

2. Literature

Research on high-frequency microstructure data remains largely focused on modelling the limit order book (LOB); classical references include O’Hara (1995) and Harris (2003), and a review is presented in Gould et al. (2013). However, there is limited work on MBO data in the current literature. NASDAQ (OUCH 2020) and CME Group (CME 2020) provide a preliminary description of MBO data when introducing their exchange matching engines, and the works of Byrd, Hybinette, and Balch (2019) and Belcak, Calliess, and Zohren (2020) use MBO data for market simulation to model trading scenarios or to study latency effects. To the best of our knowledge, this paper is the first to use MBO data to predict market movements, filling this gap in the literature by using deep learning models.

Deep Learning (Goodfellow et al. 2016) algorithms have been heavily used for predicting high-frequency microstructure data (Tsantekidis et al. 2017; Sirignano and Cont 2019; Briola, Turiel, and Aste 2020; Wallbridge 2020). In particular, Zhang, Zohren, and Roberts (2018, 2019a,b) apply convolutional neural networks and LSTMs to model the dynamics of LOBs and demonstrate accuracy improvements over linear models. Unlike traditional time-series models (Mills and Mills 1991; Hamilton 2020) or stochastic models (Islam and Nguyen 2020) that assume a parametric process for the underlying time-series, deep learning methods are able to capture arbitrary nonlinear relationships without placing any specific assumptions on the input data. Our experiments also suggest that deep networks deliver better results than linear methods for modelling MBO data.

We investigate deep learning models, including LSTMs (Hochreiter and Schmidhuber 1997) and Attention (Bahdanau, Cho, and Bengio 2014), to model MBO data. Attention is used to address the problem of diminishing performance with long input sequences by utilising information at each hidden state of a recurrent network (Bahdanau, Cho, and Bengio 2014; Dai et al. 2019), and it can be used for constructing multi-horizon forecasting models (Lim and Zohren 2020). Our experiments suggest that recurrent networks trained on MBO data achieve predictive results comparable to state-of-the-art networks trained with LOB data, suggesting the potential benefits of using MBO data as an additional data source.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

In general, exchanges provide high-frequency microstructure data in three tiers, namely L1, L2 and L3, offering increasingly granular information and capabilities:

• Level 1 (L1): L1 shows the price and quantity of the last executed trade and displays the real-time best bid and ask of an order book, also known as quote data;

• Level 2 (L2): L2 data is more granular than L1, showing bids and asks at deeper levels of an order book, and it is commonly referred to as LOB data;

• Level 3 (L3): L3 is essentially the MBO data introduced in this work, and it provides even more information than L2 as it shows non-aggregated bids and asks placed by individual traders.
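The relationship between the three tiers can be sketched as successive aggregation: summing resting L3 orders by price gives the L2 book, and the top of that book gives the L1 quote. The code below is a minimal illustration under assumed conventions (side 1 = buy, 2 = sell; hypothetical function names and made-up orders) and it omits the last-trade part of L1.

```python
from collections import defaultdict


def aggregate_l2(orders, depth=None):
    """Sum resting (side, price, size) orders into price levels; returns (bids, asks)."""
    totals = {1: defaultdict(float), 2: defaultdict(float)}
    for side, price, size in orders:
        totals[side][price] += size
    bids = sorted(totals[1].items(), key=lambda level: -level[0])  # highest bid first
    asks = sorted(totals[2].items(), key=lambda level: level[0])   # lowest ask first
    return bids[:depth], asks[:depth]


def best_quotes(bids, asks):
    """L1 quote view: best (price, size) on each side, or None if a side is empty."""
    return (bids[0] if bids else None, asks[0] if asks else None)


# Made-up resting orders at the L3 level.
orders = [
    (1, 70.01, 300.0),
    (1, 70.01, 200.0),
    (1, 70.00, 100.0),
    (2, 70.04, 7580.0),
    (2, 70.05, 400.0),
]
bids, asks = aggregate_l2(orders)
best_bid, best_ask = best_quotes(bids, asks)
```

The two buy orders at 70.01 collapse into a single L2 level of 500 shares, so the aggregation is lossy: L2 and L1 can be derived from L3, but not the other way round.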

In this work, we focus on MBO data, which is essentially a message-based data feed that allows us to observe the individual actions of market participants. Each message is an order instruction that describes the action of a specific trader at a given time point. In what follows, we focus on the essential components of such messages, ignoring certain auxiliary information. Table 1 shows an example of a sequence of MBO messages, where:

• Time stamp records the time point when an instruction is given;

• ID shows the unique ID for order identification, which is anonymous to others;

• Type indicates the order type, here limit order (Type = 1) or market order (Type = 2);

• Side indicates whether an order is buy (1) or sell (2);

• Action represents the specific instruction, where 0 means updating the price or size of an existing order, 1 means adding a new order and 2 means cancelling an existing order. If Action = 2, the entries of Side, Price and Size are N/A, as the matching engine will be able to identify and cancel the existing order using the unique ID;

• Price shows the price level of the instruction;

• Size shows the size (i.e. number of shares) of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                     ID                  Type  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  462805645163273214  1     N/A   2       N/A    N/A
2018-01-02 09:21:18.585446702  462805645163298476  1     1     1       68.54  8334.0
2018-01-02 09:21:20.680552032  462805645163297649  1     1     0       68.56  3227.0
2018-01-02 09:21:20.944574722  462805645163297649  1     N/A   2       N/A    N/A
2018-01-02 09:21:20.945483443  462805645163298567  1     2     1       68.59  5100.0
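The message layout described above can be captured in a small typed record. The sketch below parses whitespace-separated rows in the layout of Table 1; the class and function names are hypothetical, and a real exchange feed would use its own binary or delimited format. The time stamp is kept as a string because Python's datetime type only resolves microseconds, while the feed carries nanoseconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MBOMessage:
    timestamp: str            # nanosecond precision, kept as text
    order_id: int
    order_type: int           # 1 = limit order, 2 = market order
    side: Optional[int]       # 1 = buy, 2 = sell, None when N/A
    action: int               # 0 = update, 1 = add, 2 = cancel
    price: Optional[float]
    size: Optional[float]


def _optional(text, cast):
    """Convert a field, mapping the literal N/A to None."""
    return None if text == "N/A" else cast(text)


def parse_row(line):
    """Parse one whitespace-separated row in the column order of Table 1."""
    date, clock, order_id, order_type, side, action, price, size = line.split()
    return MBOMessage(
        timestamp=f"{date} {clock}",
        order_id=int(order_id),
        order_type=int(order_type),
        side=_optional(side, int),
        action=int(action),
        price=_optional(price, float),
        size=_optional(size, float),
    )


add_msg = parse_row("2018-01-02 09:21:18.585446702 462805645163298476 1 1 1 68.54 8334.0")
cancel_msg = parse_row("2018-01-02 09:21:20.944574722 462805645163297649 1 N/A 2 N/A N/A")
```

Representing N/A as None rather than a sentinel number keeps cancellations distinguishable from orders with a genuine zero size until the preprocessing step fills them in.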

A LOB updates whenever a new message arrives from the MBO feed. This process is illustrated in Figure 1, where we show how an MBO message affects a LOB. For example, at the top of Figure 1, a new limit order (ID=46280) is added to the ask side of the order book with a price of 70.04 and a size of 7580. The order book updates its status and the new order is added to the corresponding price level. In general, a LOB only shows the total available quantity at each price level, whereas MBO data provides extra information by showing individual behaviour. Although MBO data does not directly indicate which price level an order is added to, the normalisation scheme introduced in the next section allows us to take this information into account, so that we obtain both a smaller input space and information comparable with LOB data.

In addition, the usage of MBO data increases transparency and improves the understanding of order book dynamics without disclosing customer identification. Although we have access to the unique order ID, this number is generally assigned sequentially by the exchange matching engine (CME 2020) and a private link is provided to the customer, which keeps identification confidential. Further, unlike LOB data, where we sometimes only view a limited number of price levels, MBO data allows us to observe the entire order book with full-depth information. Such granularity can improve traders’ confidence in posting large orders, as they can better evaluate the potential market impact by knowing individual queue positions.

3.2. Data Preprocessing and Normalisation

We focus on MBO data that represents limit orders because market orders only account for a tiny percentage of total order flow. Figure 2 illustrates the process of data preprocessing and normalisation. In particular, we process the MBO messages for each unique ID as follows:

• Side and Price: Missing values correspond to updates and cancellations, and we fill them with the corresponding values from the original order of that ID;

• Size: Missing values correspond to full cancellations, and we fill them with 0 to indicate that no shares are outstanding after the action;

• Action: We recode Action to take the values -1, 0 and 1, where -1 means cancelling an order, 0 means updating the price or size of an existing order and 1 means adding a new order;

• Change price and Change size: We add these two new features, calculated as the difference between entries for the price and size of a specific ID, to reflect the intention of increasing or decreasing the position for the given order.
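The preprocessing steps listed above can be sketched as a single pass over time-ordered messages. This is an illustration under stated assumptions rather than the authors' implementation: missing Side and Price are filled with the most recent known values for the ID, a newly added order gets a price change of 0 and a size change equal to its size, and the dictionary keys and function name are hypothetical. The subsequent normalisation step is not covered.

```python
ACTION_MAP = {2: -1, 0: 0, 1: 1}  # cancel -> -1, update -> 0, add -> 1


def preprocess(messages):
    """Fill missing fields per order ID and add change_price / change_size features."""
    last = {}   # order id -> (side, price, size) after its latest message
    rows = []
    for msg in messages:
        prev_side, prev_price, prev_size = last.get(msg["id"], (None, None, 0.0))
        side = prev_side if msg["side"] is None else msg["side"]
        price = prev_price if msg["price"] is None else msg["price"]
        size = 0.0 if msg["size"] is None else msg["size"]  # nothing outstanding
        both_known = price is not None and prev_price is not None
        rows.append({
            "id": msg["id"],
            "action": ACTION_MAP[msg["action"]],
            "side": side,
            "price": price,
            "size": size,
            "change_price": price - prev_price if both_known else 0.0,
            "change_size": size - prev_size,
        })
        if msg["action"] == 2:
            last.pop(msg["id"], None)          # order is gone after a cancel
        else:
            last[msg["id"]] = (side, price, size)
    return rows


# Made-up life cycle of one order: add, update, cancel.
raw = [
    {"id": 7, "side": 1, "action": 1, "price": 68.5, "size": 3000.0},
    {"id": 7, "side": 1, "action": 0, "price": 68.75, "size": 3227.0},
    {"id": 7, "side": None, "action": 2, "price": None, "size": None},
]
rows = preprocess(raw)
```

For the cancellation, the filled row carries side 1, price 68.75, size 0 and a size change of -3227, so the model sees how much liquidity was withdrawn and where, even though the raw message contained only the order ID.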





Figure 1. A slice of a LOB at time t (Top). When an MBO message arrives, the LOB is updated (Bottom) at time t+1.


Bid Ask

Price

Size

time: t

70.04 Price

Size

time: t+1

Bid Ask

70.0570.01 70.02

70.04 70.0570.01 70.02

3. Market by Order

3.1. Descriptions of Market by Order

MBO is a message-base data feed that provides individual queue position and we canview price and quantity for each individual order. Essentially, it is an order instructionthat describes the action of a specific trader at a given time point. In this work, wefocus on MBO that represents limit orders because market orders only account for atiny percentage of total order flow. Table 1 shows an example of a sequence of MBOfor a single security, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for trader identification;• Side indicates which side of the order book the order is placed, where 0 means

cancelled order, 1 means bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan old order;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 2 1 70.04 3024.0

2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 1 2 68.56 0.02018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

A LOB updates whenever there is a new message of MBO flowing in and this processis illustrated in Figure

Market by Order provides individual orders for all price levels (not restricted to top10). Each order is assigned an anonymous OrderID for identification and a PriorityIDfor sorting.

Participant queue position can be determined by using the OrderID. Queue positionand proper sorting of book is done using PriorityID. PriorityID may change if ordermodified or refreshed, OrderID is consistent for the life of the order.

3

Figure 1. A slice of a LOB at time t (Top). When a message of MBO flows in, we have a new update forLOB (Bottom) at time t+1.

Sizetime: ttime: t + 1A LOB updates whenever there is a new message of MBO flowing in and this process

is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214)is added to the ask side of the order book with price at 70.04 and size of 3024. The orderbook updates its status and the new order is added to the right price level. In general, aLOB only shows the total available quantity at each price level but MBO data providesus with extra information by showing individual behaviours. Although, MBO datadoes not directly indicate which price level the order is added to, our normalisationscheme introduced in the next section allows us to consider this information and wenot only obtain a smaller input space but also obtain relevant information comparablewith LOB data.

In addition, the usage of MBO data increases transparency and improves theunderstanding of order book dynamics without disclosing customer identification.Although, we can access to unique order ID but this number is generally assignedsequentially by the exchange match engine (CME 2020) and a private link is providedto the customer, which keeps identification confidential. Further, unlike LOB datawhere we sometimes only view limited price levels, MBO data allows us to observe theentire order book with full-depth information. Such a granularity can improve traders’confidence in posting large order size as they can better evaluate the potential market

4

Bid Ask

Price

Size

time: t

70.04 Price

Size

time: t+1

Bid Ask

70.0570.01 70.02

70.04 70.0570.01 70.02

3. Market by Order

3.1. Descriptions of Market by Order

MBO is a message-base data feed that provides individual queue position and we canview price and quantity for each individual order. Essentially, it is an order instructionthat describes the action of a specific trader at a given time point. In this work, wefocus on MBO that represents limit orders because market orders only account for atiny percentage of total order flow. Table 1 shows an example of a sequence of MBOfor a single security, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for trader identification;• Side indicates which side of the order book the order is placed, where 0 means

cancelled order, 1 means bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan old order;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 2 1 70.04 3024.0

2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 1 2 68.56 0.02018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

A LOB updates whenever there is a new message of MBO flowing in and this processis illustrated in Figure

Market by Order provides individual orders for all price levels (not restricted to top10). Each order is assigned an anonymous OrderID for identification and a PriorityIDfor sorting.

Participant queue position can be determined by using the OrderID. Queue positionand proper sorting of book is done using PriorityID. PriorityID may change if ordermodified or refreshed, OrderID is consistent for the life of the order.

3

Figure 1. A slice of a LOB at time t (Top). When a message of MBO flows in, we have a new update forLOB (Bottom) at time t+1.

Sizetime: ttime: t + 1A LOB updates whenever there is a new message of MBO flowing in and this process

is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214)is added to the ask side of the order book with price at 70.04 and size of 3024. The orderbook updates its status and the new order is added to the right price level. In general, aLOB only shows the total available quantity at each price level but MBO data providesus with extra information by showing individual behaviours. Although, MBO datadoes not directly indicate which price level the order is added to, our normalisationscheme introduced in the next section allows us to consider this information and wenot only obtain a smaller input space but also obtain relevant information comparablewith LOB data.

In addition, the usage of MBO data increases transparency and improves theunderstanding of order book dynamics without disclosing customer identification.Although, we can access to unique order ID but this number is generally assignedsequentially by the exchange match engine (CME 2020) and a private link is providedto the customer, which keeps identification confidential. Further, unlike LOB datawhere we sometimes only view limited price levels, MBO data allows us to observe theentire order book with full-depth information. Such a granularity can improve traders’confidence in posting large order size as they can better evaluate the potential market


(2018, 2019a,b) apply convolutional neural networks and LSTMs to model the dynamics of the LOB and demonstrate promising predictive results. Unlike traditional time-series models (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French 1993) that assume a parametric process for the underlying time-series, deep learning methods are able to capture arbitrary nonlinear relationships without placing specific assumptions on the input data. Our experiments also suggest that deep networks deliver better results than linear methods for modelling MBO data.

In this work, we investigate recently developed techniques including Attention (Bahdanau, Cho, and Bengio 2014) and the Transformer (Vaswani et al. 2017) to demonstrate the effectiveness of modelling MBO data. Attention addresses the problem of diminishing performance on long input sequences by utilising information from every hidden state of a recurrent network, and the Transformer is designed to allow for parallel computing to speed up the training process. The work of Wallbridge (2020) studies Transformers for LOB data, and we adopt the Temporal Fusion Transformer (Lim et al. 2019), a novel attention-based architecture designed specifically for time-series, to predict price movements from MBO data. Our experiments show good predictive results, suggesting that MBO data could potentially be modelled in place of LOB data.
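To make the idea of attending over every hidden state concrete, the following is a minimal sketch in plain Python. It uses dot-product scoring as a simplification; Bahdanau, Cho, and Bengio (2014) score each state with a small learned network instead, and the models in this work are not reproduced here. The function name `attention_pool` and the toy inputs are assumptions made for illustration.

```python
import math


def attention_pool(hidden_states, query):
    """Weight each hidden state by its softmax-normalised dot product with the query.

    Returns the attention weights and the resulting context vector.
    """
    # One relevance score per time step
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Numerically stable softmax over time steps
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of all hidden states, not only the last one
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Three toy hidden states of dimension two (made-up values)
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
print(weights, context)
```

Because the context vector draws on all time steps, information from early in a long sequence does not have to survive through the final hidden state alone.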

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-based data feed that provides individual queue positions and lets us view the price and quantity of each individual order. Essentially, each message is an order instruction that describes the action of a specific trader at a given point in time. In this work, we focus on MBO data that represents limit orders because market orders only account for a tiny percentage of total order flow. Table 1 shows an example of a sequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;
• ID shows the unique ID for order identification, which is anonymous to others;
• Side indicates which side of the order book the order is placed on, where 1 means bid side and 2 means ask side;
• Action represents the specific instruction, where 0 means updating the price or size of an existing order, 1 means adding a new order and 2 means cancelling an existing order. If Action = 2, the entries of Side, Price and Size are N/A, as the match engine will cancel the existing order using the unique ID;
• Price shows the price level of the instruction;
• Size shows the position size of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                     ID                  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  462805645163273214  N/A   2       N/A    N/A
2018-01-02 09:21:18.585446702  462805645163298476  1     1       68.54  8334.0
2018-01-02 09:21:20.680552032  462805645163297649  1     0       68.56  3227.0
2018-01-02 09:21:20.944574722  462805645163297649  N/A   2       N/A    N/A
2018-01-02 09:21:20.945483443  462805645163298567  2     1       68.59  5100.0
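The rows of Table 1 can be read into typed records as follows. This is a sketch under stated assumptions: the whitespace-separated text layout, the `MboMessage` dataclass and the `parse_row` helper are hypothetical and do not describe an exchange feed format. The time stamp is kept as a string because it carries nanosecond precision, which Python's standard `datetime` type cannot represent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MboMessage:
    """One MBO message with the six fields of Table 1 (hypothetical container)."""
    timestamp: str            # kept as text to preserve nanoseconds
    order_id: int
    side: Optional[int]       # 1 = bid, 2 = ask, None when N/A
    action: int               # 0 = update, 1 = add, 2 = cancel
    price: Optional[float]    # None when N/A
    size: Optional[float]     # None when N/A


def _optional(token, cast):
    """Convert a field, mapping the N/A placeholder to None."""
    return None if token == "N/A" else cast(token)


def parse_row(row: str) -> MboMessage:
    """Parse one whitespace-separated row laid out as in Table 1."""
    date, clock, order_id, side, action, price, size = row.split()
    return MboMessage(
        timestamp=f"{date} {clock}",
        order_id=int(order_id),
        side=_optional(side, int),
        action=int(action),
        price=_optional(price, float),
        size=_optional(size, float),
    )


rows = [
    "2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A",
    "2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.0",
    "2018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.0",
]
messages = [parse_row(r) for r in rows]
for message in messages:
    print(message)
```

Cancellations carry only the time stamp, the ID and the action, so downstream code has to look up the side, price and size of the cancelled order from the messages that came before it.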



Market by Order provides individual orders for all price levels (not restricted to top10). Each order is assigned an anonymous OrderID for identification and a PriorityIDfor sorting.

Participant queue position can be determined by using the OrderID. Queue positionand proper sorting of book is done using PriorityID. PriorityID may change if ordermodified or refreshed, OrderID is consistent for the life of the order.

3

Figure 1. A slice of a LOB at time t (Top). When a message of MBO flows in, we have a new update forLOB (Bottom) at time t+1.

Sizetime: ttime: t + 1A LOB updates whenever there is a new message of MBO flowing in and this process

is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214)is added to the ask side of the order book with price at 70.04 and size of 3024. The orderbook updates its status and the new order is added to the right price level. In general, aLOB only shows the total available quantity at each price level but MBO data providesus with extra information by showing individual behaviours. Although, MBO datadoes not directly indicate which price level the order is added to, our normalisationscheme introduced in the next section allows us to consider this information and wenot only obtain a smaller input space but also obtain relevant information comparablewith LOB data.

In addition, the usage of MBO data increases transparency and improves theunderstanding of order book dynamics without disclosing customer identification.Although, we can access to unique order ID but this number is generally assignedsequentially by the exchange match engine (CME 2020) and a private link is providedto the customer, which keeps identification confidential. Further, unlike LOB datawhere we sometimes only view limited price levels, MBO data allows us to observe theentire order book with full-depth information. Such a granularity can improve traders’confidence in posting large order size as they can better evaluate the potential market

4

(2018, 2019a,b) apply convolutional neural network and LSTM to model the dynamicsof LOB and demonstrate promising predictive results. Unlike traditional time-seriesmodels (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French1993) that assume a parametric process for the underlying time-series, deep learningmethods are able to capture arbitrary nonlinear relationships without placing specificassumption on the input data. Our experiment also suggest that deep networks deliverbetter results than linear methods for modelling MBO data.

In this work, we investigate recent developed techniques including Attention (Bah-danau, Cho, and Bengio 2014) and Transformer (Vaswani et al. 2017) to demonstratethe e↵ectiveness of modelling MBO data. Attention is used to solve the problem ofdiminishing performance with long input sequence by utilising information at eachhidden state of a recurrent network, and Transformer is designed to allow for parallelcomputing to speed up the training process. The work of Wallbridge (2020) studiesTransformers for LOB data and we adopt the Temporal Fusion Transformer (Limet al. 2019), a novel attention-based architecture designed specifically for time-series,to predict price movements from MBO data. Our experiment shows good predictiveresults, suggesting the potential ability of modelling MBO data in place of LOB.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-base data feed that provides individual queue position andwe can view price and quantity for each individual order. Essentially, it is an orderinstruction that describes the action of a specific trader at a given time point. Inthis work, we focus on MBO data that represents limit orders because market ordersonly account for a tiny percentage of total order flow. Table 1 shows an example of asequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for order identification which is anonymous to others;• Side indicates which side of the order book the order is placed, where 1 means

bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan existing order. If Action = 2, the entries of Side, Price and Size are N/A asthe match engine will cancel the existing order using the unique ID;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A2018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

Price70.01

3

(2018, 2019a,b) apply convolutional neural network and LSTM to model the dynamicsof LOB and demonstrate promising predictive results. Unlike traditional time-seriesmodels (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French1993) that assume a parametric process for the underlying time-series, deep learningmethods are able to capture arbitrary nonlinear relationships without placing specificassumption on the input data. Our experiment also suggest that deep networks deliverbetter results than linear methods for modelling MBO data.

In this work, we investigate recent developed techniques including Attention (Bah-danau, Cho, and Bengio 2014) and Transformer (Vaswani et al. 2017) to demonstratethe e↵ectiveness of modelling MBO data. Attention is used to solve the problem ofdiminishing performance with long input sequence by utilising information at eachhidden state of a recurrent network, and Transformer is designed to allow for parallelcomputing to speed up the training process. The work of Wallbridge (2020) studiesTransformers for LOB data and we adopt the Temporal Fusion Transformer (Limet al. 2019), a novel attention-based architecture designed specifically for time-series,to predict price movements from MBO data. Our experiment shows good predictiveresults, suggesting the potential ability of modelling MBO data in place of LOB.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-base data feed that provides individual queue position andwe can view price and quantity for each individual order. Essentially, it is an orderinstruction that describes the action of a specific trader at a given time point. Inthis work, we focus on MBO data that represents limit orders because market ordersonly account for a tiny percentage of total order flow. Table 1 shows an example of asequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for order identification which is anonymous to others;• Side indicates which side of the order book the order is placed, where 1 means

bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan existing order. If Action = 2, the entries of Side, Price and Size are N/A asthe match engine will cancel the existing order using the unique ID;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A2018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

70.01 70.02 70.04 70.05A LOB updates whenever there is a new message of MBO flowing in and this process

3

(2018, 2019a,b) apply convolutional neural network and LSTM to model the dynamicsof LOB and demonstrate promising predictive results. Unlike traditional time-seriesmodels (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French1993) that assume a parametric process for the underlying time-series, deep learningmethods are able to capture arbitrary nonlinear relationships without placing specificassumption on the input data. Our experiment also suggest that deep networks deliverbetter results than linear methods for modelling MBO data.

In this work, we investigate recent developed techniques including Attention (Bah-danau, Cho, and Bengio 2014) and Transformer (Vaswani et al. 2017) to demonstratethe e↵ectiveness of modelling MBO data. Attention is used to solve the problem ofdiminishing performance with long input sequence by utilising information at eachhidden state of a recurrent network, and Transformer is designed to allow for parallelcomputing to speed up the training process. The work of Wallbridge (2020) studiesTransformers for LOB data and we adopt the Temporal Fusion Transformer (Limet al. 2019), a novel attention-based architecture designed specifically for time-series,to predict price movements from MBO data. Our experiment shows good predictiveresults, suggesting the potential ability of modelling MBO data in place of LOB.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-base data feed that provides individual queue position andwe can view price and quantity for each individual order. Essentially, it is an orderinstruction that describes the action of a specific trader at a given time point. Inthis work, we focus on MBO data that represents limit orders because market ordersonly account for a tiny percentage of total order flow. Table 1 shows an example of asequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for order identification which is anonymous to others;• Side indicates which side of the order book the order is placed, where 1 means

bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan existing order. If Action = 2, the entries of Side, Price and Size are N/A asthe match engine will cancel the existing order using the unique ID;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A2018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

BidAsk

3

(2018, 2019a,b) apply convolutional neural network and LSTM to model the dynamicsof LOB and demonstrate promising predictive results. Unlike traditional time-seriesmodels (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French1993) that assume a parametric process for the underlying time-series, deep learningmethods are able to capture arbitrary nonlinear relationships without placing specificassumption on the input data. Our experiment also suggest that deep networks deliverbetter results than linear methods for modelling MBO data.

In this work, we investigate recent developed techniques including Attention (Bah-danau, Cho, and Bengio 2014) and Transformer (Vaswani et al. 2017) to demonstratethe e↵ectiveness of modelling MBO data. Attention is used to solve the problem ofdiminishing performance with long input sequence by utilising information at eachhidden state of a recurrent network, and Transformer is designed to allow for parallelcomputing to speed up the training process. The work of Wallbridge (2020) studiesTransformers for LOB data and we adopt the Temporal Fusion Transformer (Limet al. 2019), a novel attention-based architecture designed specifically for time-series,to predict price movements from MBO data. Our experiment shows good predictiveresults, suggesting the potential ability of modelling MBO data in place of LOB.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-base data feed that provides individual queue position andwe can view price and quantity for each individual order. Essentially, it is an orderinstruction that describes the action of a specific trader at a given time point. Inthis work, we focus on MBO data that represents limit orders because market ordersonly account for a tiny percentage of total order flow. Table 1 shows an example of asequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for order identification which is anonymous to others;• Side indicates which side of the order book the order is placed, where 1 means

bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan existing order. If Action = 2, the entries of Side, Price and Size are N/A asthe match engine will cancel the existing order using the unique ID;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A2018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

BidAsk

3

Bid Ask

Price

Size

time: t

70.04 Price

Size

time: t+1

Bid Ask

70.0570.01 70.02

70.04 70.0570.01 70.02

3. Market by Order

3.1. Descriptions of Market by Order

MBO is a message-base data feed that provides individual queue position and we canview price and quantity for each individual order. Essentially, it is an order instructionthat describes the action of a specific trader at a given time point. In this work, wefocus on MBO that represents limit orders because market orders only account for atiny percentage of total order flow. Table 1 shows an example of a sequence of MBOfor a single security, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for trader identification;• Side indicates which side of the order book the order is placed, where 0 means

cancelled order, 1 means bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan old order;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 2 1 70.04 3024.0

2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 1 2 68.56 0.02018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

A LOB updates whenever there is a new message of MBO flowing in and this processis illustrated in Figure

Market by Order provides individual orders for all price levels (not restricted to top10). Each order is assigned an anonymous OrderID for identification and a PriorityIDfor sorting.

Participant queue position can be determined by using the OrderID. Queue positionand proper sorting of book is done using PriorityID. PriorityID may change if ordermodified or refreshed, OrderID is consistent for the life of the order.

3

Figure 1. A slice of a LOB at time t (Top). When a message of MBO flows in, we have a new update forLOB (Bottom) at time t+1.

Sizetime: ttime: t + 1A LOB updates whenever there is a new message of MBO flowing in and this process

is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214)is added to the ask side of the order book with price at 70.04 and size of 3024. The orderbook updates its status and the new order is added to the right price level. In general, aLOB only shows the total available quantity at each price level but MBO data providesus with extra information by showing individual behaviours. Although, MBO datadoes not directly indicate which price level the order is added to, our normalisationscheme introduced in the next section allows us to consider this information and wenot only obtain a smaller input space but also obtain relevant information comparablewith LOB data.

In addition, the usage of MBO data increases transparency and improves theunderstanding of order book dynamics without disclosing customer identification.Although, we can access to unique order ID but this number is generally assignedsequentially by the exchange match engine (CME 2020) and a private link is providedto the customer, which keeps identification confidential. Further, unlike LOB datawhere we sometimes only view limited price levels, MBO data allows us to observe theentire order book with full-depth information. Such a granularity can improve traders’confidence in posting large order size as they can better evaluate the potential market

4

Bid Ask

Price

Size

time: t

70.04 Price

Size

time: t+1

Bid Ask

70.0570.01 70.02

70.04 70.0570.01 70.02

3. Market by Order

3.1. Descriptions of Market by Order

MBO is a message-base data feed that provides individual queue position and we canview price and quantity for each individual order. Essentially, it is an order instructionthat describes the action of a specific trader at a given time point. In this work, wefocus on MBO that represents limit orders because market orders only account for atiny percentage of total order flow. Table 1 shows an example of a sequence of MBOfor a single security, where:

• Time stamp records the time point when an instruction is given;
• ID shows the unique ID for trader identification;
• Side indicates which side of the order book the order is placed, where 0 means cancelled order, 1 means bid side and 2 means ask side;
• Action represents the specific instruction where 0 means updating the price or size for the existing order, 1 means adding a new order and 2 means cancelling an old order;
• Price shows the price level of the instruction;
• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                      ID                   Side   Action   Price   Size
2018-01-02 09:21:15.717500766   462805645163273214   2      1        70.04   3024.0
2018-01-02 09:21:18.585446702   462805645163298476   1      1        68.54   8334.0
2018-01-02 09:21:20.680552032   462805645163297649   1      0        68.56   3227.0
2018-01-02 09:21:20.944574722   462805645163297649   1      2        68.56   0.0
2018-01-02 09:21:20.945483443   462805645163298567   2      1        68.59   5100.0

Market by Order provides individual orders for all price levels (not restricted to the top 10). Each order is assigned an anonymous OrderID for identification and a PriorityID for sorting.

Participant queue position can be determined by using the OrderID. Queue position and proper sorting of the book are determined using the PriorityID. The PriorityID may change if an order is modified or refreshed, whereas the OrderID is consistent for the life of the order.
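As a rough illustration of the sorting described above, the following Python sketch ranks the orders resting at one price level by PriorityID and looks up one order by its OrderID. The function name `queue_position`, the tuple layout and the ID values are hypothetical, and the sketch assumes that a lower PriorityID means an earlier place in the queue, which is not stated above.

```python
def queue_position(orders, order_id):
    """Return the 1-based queue position of order_id at one price level.

    orders: list of (order_id, priority_id) tuples resting at that level.
    Assumption: a lower priority_id is closer to the front of the queue.
    """
    ranked = sorted(orders, key=lambda o: o[1])
    for position, (oid, _) in enumerate(ranked, start=1):
        if oid == order_id:
            return position
    raise KeyError(order_id)


# Hypothetical orders at a single price level: (OrderID, PriorityID).
level = [(101, 7), (102, 3), (103, 5)]
front = queue_position(level, 102)  # smallest PriorityID, so first in queue

# A modification can change the PriorityID while the OrderID stays the same.
modified = [(101, 7), (102, 9), (103, 5)]
after_modify = queue_position(modified, 102)  # now behind the other two
```

Under these assumptions, the same OrderID moves from the front to the back of the queue once its PriorityID is refreshed.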

Figure 1. A slice of a LOB at time t (Top). When a message of MBO flows in, we have a new update for the LOB (Bottom) at time t+1.

A LOB updates whenever a new MBO message flows in, and this process is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214) is added to the ask side of the order book with a price of 70.04 and a size of 3024. The order book updates its status and the new order is added to the corresponding price level. In general, a LOB only shows the total available quantity at each price level, whereas MBO data provides extra information by showing individual behaviours. Although MBO data does not directly indicate which price level an order is added to, the normalisation scheme introduced in the next section allows us to consider this information: we not only obtain a smaller input space but also obtain relevant information comparable with LOB data.
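The update described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `OrderBook` and its method names are hypothetical, the codes follow the convention of this section (Side 1 = bid, 2 = ask; Action 0 = update, 1 = add, 2 = cancel), and the resting order at 70.05 is invented purely to give the book an initial state.

```python
from collections import defaultdict


class OrderBook:
    """Keeps individual orders (MBO view) and aggregated levels (LOB view)."""

    def __init__(self):
        self.orders = {}  # order_id -> (side, price, size)
        # side -> price -> total resting size at that price level
        self.levels = {1: defaultdict(int), 2: defaultdict(int)}

    def _remove(self, order_id):
        side, price, size = self.orders.pop(order_id)
        self.levels[side][price] -= size
        if self.levels[side][price] <= 0:
            del self.levels[side][price]  # drop empty price levels

    def apply(self, order_id, side, action, price, size):
        # Updates and cancels first take the old version of the order out.
        if action in (0, 2) and order_id in self.orders:
            self._remove(order_id)
        # Adds and updates then place the (new) order on its price level.
        if action in (0, 1):
            self.orders[order_id] = (side, price, size)
            self.levels[side][price] += size


book = OrderBook()
book.apply(1, 2, 1, 70.05, 500)  # hypothetical resting ask, not from the paper
book.apply(462805645163273214, 2, 1, 70.04, 3024)  # the new order from the example
asks_after_add = dict(book.levels[2])

book.apply(462805645163273214, None, 2, None, None)  # cancel by ID only
asks_after_cancel = dict(book.levels[2])
```

The LOB view (`levels`) only holds totals per price, while the MBO view (`orders`) retains each order, which is the extra information the paragraph refers to.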



(2018, 2019a,b) apply convolutional neural networks and LSTMs to model the dynamics of LOBs and demonstrate promising predictive results. Unlike traditional time-series models (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French 1993) that assume a parametric process for the underlying time-series, deep learning methods are able to capture arbitrary nonlinear relationships without placing specific assumptions on the input data. Our experiments also suggest that deep networks deliver better results than linear methods for modelling MBO data.

In this work, we investigate recently developed techniques including Attention (Bahdanau, Cho, and Bengio 2014) and the Transformer (Vaswani et al. 2017) to demonstrate the effectiveness of modelling MBO data. Attention addresses the problem of diminishing performance with long input sequences by utilising information at each hidden state of a recurrent network, and the Transformer is designed to allow for parallel computing to speed up the training process. The work of Wallbridge (2020) studies Transformers for LOB data and we adopt the Temporal Fusion Transformer (Lim et al. 2019), a novel attention-based architecture designed specifically for time-series, to predict price movements from MBO data. Our experiments show good predictive results, suggesting the potential of modelling MBO data in place of LOB data.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-based data feed that provides individual queue positions, and we can view the price and quantity of each individual order. Essentially, each message is an order instruction that describes the action of a specific trader at a given point in time. In this work, we focus on MBO data that represents limit orders because market orders account for only a tiny percentage of total order flow. Table 1 shows an example of a sequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;
• ID shows the unique ID for order identification which is anonymous to others;
• Side indicates which side of the order book the order is placed, where 1 means bid side and 2 means ask side;
• Action represents the specific instruction where 0 means updating the price or size for the existing order, 1 means adding a new order and 2 means cancelling an existing order. If Action = 2, the entries of Side, Price and Size are N/A as the match engine will cancel the existing order using the unique ID;
• Price shows the price level of the instruction;
• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                      ID                   Side   Action   Price   Size
2018-01-02 09:21:15.717500766   462805645163273214   N/A    2        N/A     N/A
2018-01-02 09:21:18.585446702   462805645163298476   1      1        68.54   8334.0
2018-01-02 09:21:20.680552032   462805645163297649   1      0        68.56   3227.0
2018-01-02 09:21:20.944574722   462805645163297649   N/A    2        N/A     N/A
2018-01-02 09:21:20.945483443   462805645163298567   2      1        68.59   5100.0
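To make the message layout of Table 1 concrete, the sketch below shows one way a row could be represented and parsed in Python. It is an illustration only: the names `MboMessage` and `parse_mbo_line`, and the assumption that fields arrive as whitespace-separated text with the literal `N/A` for missing entries, are not part of the original data description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MboMessage:
    timestamp: str           # kept as text: nanoseconds exceed datetime's microsecond precision
    order_id: int
    side: Optional[int]      # 1 = bid, 2 = ask, None when N/A
    action: int              # 0 = update, 1 = add, 2 = cancel
    price: Optional[float]   # None when N/A
    size: Optional[float]    # None when N/A


def _optional(text, cast):
    """Convert a field, mapping the literal 'N/A' to None."""
    return None if text == "N/A" else cast(text)


def parse_mbo_line(line: str) -> MboMessage:
    # Assumed layout: date, time, ID, Side, Action, Price, Size.
    date, time, order_id, side, action, price, size = line.split()
    return MboMessage(
        timestamp=f"{date} {time}",
        order_id=int(order_id),
        side=_optional(side, int),
        action=int(action),
        price=_optional(price, float),
        size=_optional(size, float),
    )


rows = [
    "2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.0",
    "2018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A",
]
messages = [parse_mbo_line(row) for row in rows]
```

Representing N/A as `None` keeps cancel messages distinguishable from genuine zero values until the preprocessing step decides how to fill them.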


Table 2. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 2 1 70.04 7580

Table 3. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 N/A 2 N/A N/A

Table 4. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 2 0 70.05 6250

Table 5. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 2 N/A N/A 70.04 3520


In addition, the use of MBO data increases transparency and improves the understanding of order book dynamics without disclosing customer identities. Although we can access a unique order ID, this number is generally assigned sequentially by the exchange match engine (CME 2020) and a private link is provided to the customer, which keeps identification confidential. Further, unlike LOB data, where we sometimes only view a limited number of price levels, MBO data allows us to observe the entire order book with full-depth information. Such granularity can improve traders’ confidence in posting large orders, as they can better evaluate the potential market impact by knowing individual queue positions.

3.2. Data Preprocessing and Normalisation

Since the raw MBO data contains missing entries, we first preprocess the data and then normalise it. Figure 2 illustrates the complete process of data preprocessing and normalisation. For each unique ID, we process the MBO data as follows:

• Side and Price: we fill in the missing entries with the previous values;
• Size: we fill in the missing entry with 0 to indicate that the order is cancelled;
• Action: we change Action to take the values -1, 0 and 1, where -1 means cancelling an order, 0 means updating the price or size of an existing order and 1 means adding a new order;
• Change price and Change size: we add these two new features, calculated as the difference between consecutive entries of the price and size for a specific ID, to reflect the intention of increasing or decreasing the position of the given order.
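The preprocessing steps listed above can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' pipeline: the function name `preprocess`, the dictionary keys and the choice to set both change features to 0 for the first message seen for an ID are assumptions, since the text does not specify how the first entry is handled.

```python
ACTION_MAP = {2: -1, 0: 0, 1: 1}  # cancel -> -1, update -> 0, add -> 1


def preprocess(messages):
    """Fill missing entries and add change features, tracking each ID separately.

    messages: time-ordered list of dicts with keys
    id, side, action, price, size (None stands for N/A).
    """
    last = {}  # id -> (side, price, size) after that ID's previous message
    out = []
    for m in messages:
        prev = last.get(m["id"])
        # Side and Price: carry forward the previous values for this ID.
        side = m["side"] if m["side"] is not None else (prev[0] if prev else None)
        price = m["price"] if m["price"] is not None else (prev[1] if prev else None)
        # Size: a missing entry becomes 0, marking the order as cancelled.
        size = m["size"] if m["size"] is not None else 0.0
        if prev is None or prev[1] is None or price is None:
            change_price, change_size = 0.0, 0.0  # assumption for the first entry
        else:
            change_price = round(price - prev[1], 10)  # rounding hides float noise
            change_size = size - prev[2]
        out.append({
            "id": m["id"], "side": side, "action": ACTION_MAP[m["action"]],
            "price": price, "size": size,
            "change_price": change_price, "change_size": change_size,
        })
        last[m["id"]] = (side, price, size)
    return out


# Two messages for the same ID, taken from Table 1: an update, then a cancel.
raw = [
    {"id": 462805645163297649, "side": 1, "action": 0, "price": 68.56, "size": 3227.0},
    {"id": 462805645163297649, "side": None, "action": 2, "price": None, "size": None},
]
processed = preprocess(raw)
```

In this example the cancel inherits Side 1 and Price 68.56 from the earlier update, its Size becomes 0, and Change size records the removal of the full 3227 units.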

4

Table 2. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 2 1 70.04 7580

Table 3. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 N/A 2 N/A N/A

Table 4. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 2 0 70.05 6250

Table 5. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 2 N/A N/A 70.04 3520



(2018, 2019a,b) apply convolutional neural networks and LSTMs to model the dynamics of LOBs and demonstrate promising predictive results. Unlike traditional time-series models (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French 1993) that assume a parametric process for the underlying time-series, deep learning methods are able to capture arbitrary nonlinear relationships without placing specific assumptions on the input data. Our experiments also suggest that deep networks deliver better results than linear methods for modelling MBO data.

In this work, we investigate recently developed techniques including Attention (Bahdanau, Cho, and Bengio 2014) and the Transformer (Vaswani et al. 2017) to demonstrate the effectiveness of modelling MBO data. Attention is used to solve the problem of diminishing performance with long input sequences by utilising information at each hidden state of a recurrent network, and the Transformer is designed to allow for parallel computing to speed up the training process. The work of Wallbridge (2020) studies Transformers for LOB data and we adopt the Temporal Fusion Transformer (Lim et al. 2019), a novel attention-based architecture designed specifically for time-series, to predict price movements from MBO data. Our experiments show good predictive results, suggesting the potential of modelling MBO data in place of LOB data.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-based data feed that provides individual queue positions, allowing us to view the price and quantity of each individual order. Essentially, each message is an order instruction that describes the action of a specific trader at a given time point. In this work, we focus on MBO data that represents limit orders because market orders only account for a tiny percentage of total order flow. Table 1 shows an example of a sequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;
• ID shows the unique ID for order identification, which is anonymous to others;
• Side indicates which side of the order book the order is placed on, where 1 means the bid side and 2 means the ask side;
• Action represents the specific instruction, where 0 means updating the price or size of an existing order, 1 means adding a new order and 2 means cancelling an existing order. If Action = 2, the entries of Side, Price and Size are N/A as the match engine will cancel the existing order using the unique ID;
• Price shows the price level of the instruction;
• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                     ID                  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  462805645163273214  N/A   2       N/A    N/A
2018-01-02 09:21:18.585446702  462805645163298476  1     1       68.54  8334.0
2018-01-02 09:21:20.680552032  462805645163297649  1     0       68.56  3227.0
2018-01-02 09:21:20.944574722  462805645163297649  N/A   2       N/A    N/A
2018-01-02 09:21:20.945483443  462805645163298567  2     1       68.59  5100.0





Market by Order provides individual orders for all price levels (not restricted to the top 10). Each order is assigned an anonymous OrderID for identification and a PriorityID for sorting.

Participant queue position can be determined by using the OrderID. Queue position and proper sorting of the book are determined using the PriorityID. The PriorityID may change if an order is modified or refreshed, whereas the OrderID is consistent for the life of the order.
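As a rough illustration of how these two identifiers could be used together, the sketch below ranks the resting orders at a single price level by PriorityID and looks up one order by its OrderID. The `RestingOrder` type and both functions are hypothetical, and the convention that a smaller PriorityID sits closer to the front of the queue is an assumption that should be checked against the exchange documentation.

```python
from typing import List, NamedTuple


class RestingOrder(NamedTuple):
    """A resting order at one price level (hypothetical container)."""
    order_id: int
    priority_id: int
    size: float


def queue_position(level_orders: List[RestingOrder], order_id: int) -> int:
    """Return the 0-based queue position of order_id at this price level."""
    # Assumption: a smaller PriorityID means an earlier place in the queue.
    ranked = sorted(level_orders, key=lambda o: o.priority_id)
    for position, order in enumerate(ranked):
        if order.order_id == order_id:
            return position
    raise KeyError(order_id)


def size_ahead(level_orders: List[RestingOrder], order_id: int) -> float:
    """Return the total size queued in front of order_id at this price level."""
    ranked = sorted(level_orders, key=lambda o: o.priority_id)
    position = queue_position(level_orders, order_id)
    return sum(o.size for o in ranked[:position])


# Illustrative values only: three orders resting at the same price level.
level = [
    RestingOrder(order_id=101, priority_id=30, size=500.0),
    RestingOrder(order_id=102, priority_id=10, size=200.0),
    RestingOrder(order_id=103, priority_id=20, size=300.0),
]
position_of_101 = queue_position(level, 101)
ahead_of_101 = size_ahead(level, 101)
```

Because the lookup uses the OrderID while the ordering uses the PriorityID, a modification that assigns a new PriorityID moves the order within the queue without changing how it is identified.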


Figure 1. A slice of a LOB at time t (Top). When an MBO message flows in, the LOB is updated (Bottom) at time t+1.

A LOB updates whenever a new MBO message flows in, and this process is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214) is added to the ask side of the order book with a price of 70.04 and a size of 3024. The order book updates its status and the new order is added to the right price level. In general, a LOB only shows the total available quantity at each price level, but MBO data provides us with extra information by showing individual behaviours. Although MBO data does not directly indicate which price level the order is added to, our normalisation scheme introduced in the next section allows us to consider this information, so that we not only obtain a smaller input space but also obtain relevant information comparable with LOB data.
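The update described above can be sketched as a small order book that keeps individual orders by ID and aggregates their sizes per price level. This is an illustrative sketch under stated assumptions, not the exchange's matching logic: the `OrderBook` class is hypothetical, prices are stored as floats for brevity (integer ticks would be safer in practice), and the three orders resting in the book before the new message arrives have made-up IDs and sizes. Only the final add message (ask side, 70.04, size 3024) comes from the example in the text.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

BID, ASK = 1, 2                 # Side encoding used in Table 1
UPDATE, ADD, CANCEL = 0, 1, 2   # raw Action encoding used in Table 1


class OrderBook:
    """Aggregates individual MBO orders into per-level LOB quantities."""

    def __init__(self) -> None:
        # order_id -> (side, price, size) for every live order.
        self.orders: Dict[int, Tuple[int, float, float]] = {}
        # side -> {price: total size resting at that price}.
        self.levels: Dict[int, Dict[float, float]] = {
            BID: defaultdict(float),
            ASK: defaultdict(float),
        }

    def _remove(self, order_id: int) -> None:
        side, price, size = self.orders.pop(order_id)
        level = self.levels[side]
        level[price] -= size
        if level[price] <= 0:
            del level[price]  # drop empty price levels

    def apply(self, order_id: int, side: Optional[int], action: int,
              price: Optional[float], size: Optional[float]) -> None:
        # An update or cancel first removes the old state of the order.
        if order_id in self.orders:
            self._remove(order_id)
        # An add or update then inserts the new state; a cancel does not.
        if action != CANCEL:
            self.orders[order_id] = (side, price, size)
            self.levels[side][price] += size


book = OrderBook()
# Hypothetical resting orders at time t (IDs and sizes are made up).
book.apply(1, BID, ADD, 70.01, 500.0)
book.apply(2, BID, ADD, 70.02, 300.0)
book.apply(3, ASK, ADD, 70.05, 400.0)
# The new limit order from the example arrives at time t+1.
book.apply(462805645163273214, ASK, ADD, 70.04, 3024.0)
best_ask = min(book.levels[ASK])
```

Keeping the per-order dictionary alongside the aggregated levels is what lets a cancel message, which carries only the ID, be mapped back to the price level it affects.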








Bid Ask

Price

Size

time: t

70.04 Price

Size

time: t+1

Bid Ask

70.0570.01 70.02

70.04 70.0570.01 70.02


Market by Order provides individual orders for all price levels (not restricted to the top 10). Each order is assigned an anonymous OrderID for identification and a PriorityID for sorting.

A participant's queue position can be determined by using the OrderID, while queue position and proper sorting of the book are derived from the PriorityID. The PriorityID may change if an order is modified or refreshed, whereas the OrderID is consistent for the life of the order.
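The role of the two identifiers can be sketched as below. This is a toy example under stated assumptions: the function `queue_position`, the tuple layout and the convention that a smaller PriorityID means an earlier place in the queue are all hypothetical and chosen only for illustration.

```python
def queue_position(resting_orders, order_id):
    """Return the 1-based queue position of order_id at a single price level.

    resting_orders is a list of (order_id, priority_id) tuples.
    Assumption for this sketch: a smaller priority_id is served first.
    """
    ranked = sorted(resting_orders, key=lambda order: order[1])
    for position, (oid, _) in enumerate(ranked, start=1):
        if oid == order_id:
            return position
    raise KeyError(order_id)


# Three hypothetical resting orders at one price level.
level = [(101, 7), (102, 3), (103, 9)]
position_before = queue_position(level, 101)

# Order 102 is modified: its OrderID stays the same, but it receives a new
# PriorityID (12 here, a made-up value) and so moves to the back of the queue.
level = [(oid, 12 if oid == 102 else pid) for oid, pid in level]
position_after = queue_position(level, 101)
modified_position = queue_position(level, 102)
```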


Figure 1. A slice of a LOB at time t (Top). When an MBO message flows in, we have a new update for the LOB (Bottom) at time t+1.

A LOB updates whenever a new MBO message flows in, and this process is illustrated in Figure 1. In the example, a new limit order (ID=462805645163273214) is added to the ask side of the order book with a price of 70.04 and a size of 3024. The order book updates its status and the new order is added to the corresponding price level. In general, a LOB only shows the total available quantity at each price level, whereas MBO data provides extra information by showing individual behaviours. Although MBO data does not directly indicate which price level the order is added to, our normalisation scheme introduced in the next section allows us to consider this information, so that we not only obtain a smaller input space but also obtain information comparable with LOB data.
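The relationship between the two views can be sketched as follows: a LOB is the per-price aggregate of the resting orders that MBO messages add, update and cancel. The class `OrderBook` and the three resting orders with IDs 1 to 3 are hypothetical; only the order added at 70.04 with size 3024 comes from the example above. Ignoring cancellations of unknown IDs and treating updates of unknown IDs as additions are simplifying assumptions of this sketch.

```python
from collections import defaultdict

BID, ASK = 1, 2                 # Side encoding from Table 1
UPDATE, ADD, CANCEL = 0, 1, 2   # Action encoding from Table 1


class OrderBook:
    """Toy book that derives LOB depth from a stream of MBO messages."""

    def __init__(self):
        self.orders = {}  # order_id -> (side, price, size) of each resting order
        self.depth = {BID: defaultdict(float), ASK: defaultdict(float)}

    def _remove(self, order_id):
        side, price, size = self.orders.pop(order_id)
        levels = self.depth[side]
        levels[price] -= size
        if levels[price] <= 1e-9:   # drop price levels that became empty
            del levels[price]

    def apply(self, order_id, side, action, price, size):
        # Updates and cancellations first withdraw the old state of the order.
        if action in (UPDATE, CANCEL) and order_id in self.orders:
            self._remove(order_id)
        # Additions and updates then (re)insert the order at its price level.
        if action in (UPDATE, ADD):
            self.orders[order_id] = (side, price, size)
            self.depth[side][price] += size


book = OrderBook()
book.apply(1, BID, ADD, 70.01, 500.0)    # hypothetical resting orders at time t
book.apply(2, BID, ADD, 70.02, 800.0)
book.apply(3, ASK, ADD, 70.05, 1200.0)

# The message from the example: a new ask at 70.04 with size 3024 (time t+1).
book.apply(462805645163273214, ASK, ADD, 70.04, 3024.0)
asks_after_add = dict(book.depth[ASK])

# A cancellation carries only the ID; Side, Price and Size are N/A.
book.apply(462805645163273214, None, CANCEL, None, None)
asks_after_cancel = dict(book.depth[ASK])
```

The `orders` dictionary is what MBO data adds over a LOB snapshot: the `depth` totals alone cannot tell which individual order changed.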





Table 2. An example of a sequence of market by order data.

Time stamp                     ID     Type  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  46280  1     2     1       70.04  7580

Table 3. An example of a sequence of market by order data.

Time stamp                     ID     Type  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  46280  1     N/A   2       N/A    N/A

Table 4. An example of a sequence of market by order data.

Time stamp                     ID     Type  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  46280  1     2     0       70.05  6250

Table 5. An example of a sequence of market by order data.

Time stamp                     ID     Type  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  46280  2     N/A   N/A     70.04  3520


In addition, the use of MBO data increases transparency and improves the understanding of order book dynamics without disclosing customer identities. Although we can access the unique order ID, this number is generally assigned sequentially by the exchange match engine (CME 2020) and a private link is provided to the customer, which keeps identification confidential. Further, unlike LOB data, where we sometimes only view a limited number of price levels, MBO data allows us to observe the entire order book with full-depth information. Such granularity can improve traders' confidence in posting large orders, as they can better evaluate the potential market impact by knowing individual queue positions.

3.2. Data Preprocessing and Normalisation

Since the raw MBO data contains missing entries, we first preprocess the data and then normalise it. Figure 2 illustrates the complete process of data preprocessing and normalisation. We process the MBO data for each unique ID as follows:

• Side and Price: we fill in the missing entries with the previous values;
• Size: we fill in the missing entry with 0 to indicate that this order is cancelled;
• Action: we change Action to take the values -1, 0 and 1, where -1 means cancelling an order, 0 means updating the price or size of an existing order and 1 means adding a new order;
• Change price and Change size: we add these two new features, calculated as the difference between entries for the price and size of a specific ID, to reflect the intention of increasing or decreasing the position of the given order.
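The preprocessing steps listed above can be sketched in plain Python as follows. This is a minimal illustration rather than our actual pipeline: the function `preprocess`, the dictionary keys, and the choices to set both change features to 0 for the first message of an ID and to leave Side and Price empty when an ID has no earlier message are assumptions of the sketch.

```python
ACTION_MAP = {2: -1, 0: 0, 1: 1}   # cancel -> -1, update -> 0, add -> 1


def preprocess(messages):
    """Apply the per-ID preprocessing steps to a time-ordered list of messages.

    Each message is a dict with keys id, side, action, price, size,
    where None stands for an N/A entry.
    """
    last = {}   # most recent processed row for each order ID
    out = []
    for msg in messages:
        prev = last.get(msg["id"])
        # Side and Price: fill missing entries with the previous values of this ID.
        side = msg["side"] if msg["side"] is not None else (prev["side"] if prev else None)
        price = msg["price"] if msg["price"] is not None else (prev["price"] if prev else None)
        # Size: a missing entry means the order is cancelled, so use 0.
        size = msg["size"] if msg["size"] is not None else 0.0
        # Change features: difference to the previous entry of the same ID.
        if prev is not None and price is not None and prev["price"] is not None:
            change_price = price - prev["price"]
        else:
            change_price = 0.0
        change_size = size - prev["size"] if prev is not None else 0.0
        row = {
            "id": msg["id"], "side": side, "action": ACTION_MAP[msg["action"]],
            "price": price, "size": size,
            "change_price": change_price, "change_size": change_size,
        }
        last[msg["id"]] = row
        out.append(row)
    return out


# The two messages for ID 462805645163297649 from Table 1: an update, then a cancel.
raw = [
    {"id": 462805645163297649, "side": 1, "action": 0, "price": 68.56, "size": 3227.0},
    {"id": 462805645163297649, "side": None, "action": 2, "price": None, "size": None},
]
processed = preprocess(raw)
```

In this example the cancellation inherits Side 1 and Price 68.56 from the preceding update, receives Size 0 and Action -1, and its Change size equals minus the previously resting size.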





(2018, 2019a,b) apply convolutional neural network and LSTM to model the dynamicsof LOB and demonstrate promising predictive results. Unlike traditional time-seriesmodels (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French1993) that assume a parametric process for the underlying time-series, deep learningmethods are able to capture arbitrary nonlinear relationships without placing specificassumption on the input data. Our experiment also suggest that deep networks deliverbetter results than linear methods for modelling MBO data.

In this work, we investigate recent developed techniques including Attention (Bah-danau, Cho, and Bengio 2014) and Transformer (Vaswani et al. 2017) to demonstratethe e↵ectiveness of modelling MBO data. Attention is used to solve the problem ofdiminishing performance with long input sequence by utilising information at eachhidden state of a recurrent network, and Transformer is designed to allow for parallelcomputing to speed up the training process. The work of Wallbridge (2020) studiesTransformers for LOB data and we adopt the Temporal Fusion Transformer (Limet al. 2019), a novel attention-based architecture designed specifically for time-series,to predict price movements from MBO data. Our experiment shows good predictiveresults, suggesting the potential ability of modelling MBO data in place of LOB.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-base data feed that provides individual queue position andwe can view price and quantity for each individual order. Essentially, it is an orderinstruction that describes the action of a specific trader at a given time point. Inthis work, we focus on MBO data that represents limit orders because market ordersonly account for a tiny percentage of total order flow. Table 1 shows an example of asequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;• ID shows the unique ID for order identification which is anonymous to others;• Side indicates which side of the order book the order is placed, where 1 means

bid side and 2 means ask side;• Action represents the specific instruction where 0 means updating the price or

size for the existing order, 1 means adding a new order and 2 means cancellingan existing order. If Action = 2, the entries of Side, Price and Size are N/A asthe match engine will cancel the existing order using the unique ID;

• Price shows the price level of the instruction;• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp ID Side Action Price Size

2018-01-02 09:21:15.717500766 462805645163273214 N/A 2 N/A N/A2018-01-02 09:21:18.585446702 462805645163298476 1 1 68.54 8334.02018-01-02 09:21:20.680552032 462805645163297649 1 0 68.56 3227.02018-01-02 09:21:20.944574722 462805645163297649 N/A 2 N/A N/A2018-01-02 09:21:20.945483443 462805645163298567 2 1 68.59 5100.0

Price70.01

3

(2018, 2019a,b) apply convolutional neural networks and LSTMs to model the dynamics of LOBs and demonstrate promising predictive results. Unlike traditional time-series models (Mills and Mills 1991; Hamilton 2020) or econometric models (Fama and French 1993) that assume a parametric process for the underlying time-series, deep learning methods are able to capture arbitrary nonlinear relationships without placing specific assumptions on the input data. Our experiments also suggest that deep networks deliver better results than linear methods for modelling MBO data.

In this work, we investigate recently developed techniques including Attention (Bahdanau, Cho, and Bengio 2014) and the Transformer (Vaswani et al. 2017) to demonstrate the effectiveness of modelling MBO data. Attention addresses the problem of diminishing performance with long input sequences by utilising information at each hidden state of a recurrent network, and the Transformer is designed to allow for parallel computing to speed up the training process. Wallbridge (2020) studies Transformers for LOB data, and we adopt the Temporal Fusion Transformer (Lim et al. 2019), a novel attention-based architecture designed specifically for time-series, to predict price movements from MBO data. Our experiments show good predictive results, suggesting the potential of modelling MBO data in place of LOB data.

3. Market by Order Data

3.1. Descriptions of Market by Order Data

MBO data is a message-based data feed that provides individual queue positions, allowing us to view the price and quantity of each individual order. Essentially, each message is an order instruction that describes the action of a specific trader at a given time point. In this work, we focus on MBO data that represents limit orders because market orders only account for a tiny percentage of total order flow. Table 1 shows an example of a sequence of MBO data for a given instrument, where:

• Time stamp records the time point when an instruction is given;
• ID shows the unique ID for order identification, which is anonymous to others;
• Side indicates which side of the order book the order is placed on, where 1 means bid side and 2 means ask side;
• Action represents the specific instruction, where 0 means updating the price or size of an existing order, 1 means adding a new order and 2 means cancelling an existing order. If Action = 2, the entries of Side, Price and Size are N/A as the match engine will cancel the existing order using the unique ID;
• Price shows the price level of the instruction;
• Size shows the position of the instruction.

Table 1. An example of a sequence of market by order data.

Time stamp                     ID                  Side  Action  Price  Size
2018-01-02 09:21:15.717500766  462805645163273214  N/A   2       N/A    N/A
2018-01-02 09:21:18.585446702  462805645163298476  1     1       68.54  8334.0
2018-01-02 09:21:20.680552032  462805645163297649  1     0       68.56  3227.0
2018-01-02 09:21:20.944574722  462805645163297649  N/A   2       N/A    N/A
2018-01-02 09:21:20.945483443  462805645163298567  2     1       68.59  5100.0




Market by Order provides individual orders for all price levels (not restricted to the top 10). Each order is assigned an anonymous OrderID for identification and a PriorityID for sorting.

Participant queue position can be determined by using the OrderID. Queue position and proper sorting of the book are determined using the PriorityID. The PriorityID may change if the order is modified or refreshed, whereas the OrderID is consistent for the life of the order.



Table 2. An example of a sequence of market by order data.

Time stamp ID Type Side Action Price Size

2018-01-02 09:21:15.717500766 46280 1 1 1 70.04 8520

A LOB updates whenever a new message from the MBO data arrives, and this process is illustrated in Figure 1, where we show how a market order and three different actions affect a LOB. For example, at the top of Figure 1, a new limit order (ID=46280) is added to the ask side of the order book with a price of 70.04 and a size of 7580. The order book updates its status and the new order is added to the right price level. In general, a LOB only shows the total available quantity at each price level, whereas MBO data provides extra information by showing individual behaviours. Although MBO data does not directly indicate which price level an order is added to, our normalisation scheme introduced in the next section allows us to recover this information, so that we not only obtain a smaller input space but also obtain relevant information comparable with LOB data.
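The update mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an exchange's matching logic: the names `MBOMessage`, `OrderBook` and `apply` are hypothetical, the action and side codes follow Table 1, prices are kept as floats for brevity (integer ticks would be safer in practice), and the matching of marketable orders is deliberately left out. The messages fed in are the five rows of Table 1.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class MBOMessage:
    """One MBO instruction; field names are illustrative, codes follow Table 1."""
    order_id: int
    action: int                      # 0 = update, 1 = add, 2 = cancel
    side: Optional[int] = None       # 1 = bid, 2 = ask; None when Action = 2
    price: Optional[float] = None
    size: Optional[float] = None


class OrderBook:
    """Tracks individual orders (MBO view) and aggregated depth (LOB view)."""

    def __init__(self):
        self.orders = {}             # order_id -> (side, price, size)
        self.depth = {1: defaultdict(float), 2: defaultdict(float)}

    def _remove(self, order_id):
        # Orders placed before the observation window are unknown; ignore them.
        old = self.orders.pop(order_id, None)
        if old is None:
            return
        side, price, size = old
        self.depth[side][price] -= size
        if self.depth[side][price] <= 0:
            del self.depth[side][price]

    def apply(self, msg):
        # Cancel and update both start by removing the order's old contribution.
        if msg.action in (0, 2):
            self._remove(msg.order_id)
        # Add and update then (re-)insert the order at its stated price and size.
        if msg.action in (0, 1):
            self.orders[msg.order_id] = (msg.side, msg.price, msg.size)
            self.depth[msg.side][msg.price] += msg.size


book = OrderBook()
for m in [
    MBOMessage(462805645163273214, 2),
    MBOMessage(462805645163298476, 1, 1, 68.54, 8334.0),
    MBOMessage(462805645163297649, 0, 1, 68.56, 3227.0),
    MBOMessage(462805645163297649, 2),
    MBOMessage(462805645163298567, 1, 2, 68.59, 5100.0),
]:
    book.apply(m)

print(dict(book.depth[1]), dict(book.depth[2]))
```

Keeping both the per-order dictionary and the per-level totals makes the distinction in the text concrete: the LOB view is the aggregate in `depth`, while the MBO view retains each order in `orders`.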

In addition, the usage of MBO data increases transparency and improves the understanding of order book dynamics without disclosing customer identification. Although we can access the unique order ID, this number is generally assigned sequentially by the exchange match engine (CME 2020) and a private link is provided to the customer, which keeps identification confidential. Further, unlike LOB data, where we sometimes only view a limited number of price levels, MBO data allows us to observe the entire order book with full-depth information. Such granularity can improve traders’ confidence in posting large orders as they can better evaluate the potential market impact by knowing individual queue positions.

3.2. Data Preprocessing and Normalisation

Since the raw MBO data contains missing entries, we first preprocess the data and then normalise it. We focus on MBO data that represents limit orders because market orders only account for a tiny percentage of total order flow. Removing market orders also allows us to compare models trained with MBO data and LOB data respectively. Figure 2 illustrates the process of data preprocessing and normalisation. We process the MBO data for each unique ID as follows:

• Side and Price: we fill in the missing entries with previous values;
• Size: we fill in the missing entry with 0 to indicate that the order is cancelled;
• Action: we recode Action to take values -1, 0 and 1, where -1 means cancelling an order, 0 means updating the price or size of an existing order and 1 means adding a new order;
• Change price and Change size: we add these two new features, calculated as the difference between consecutive entries of the price and size of a specific ID, to reflect the intention of increasing or decreasing positions for the given order.
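The preprocessing steps listed above can be sketched for a single order ID as follows. This is a minimal illustration, not the authors' code: the function name `preprocess_order` and the dictionary keys are hypothetical, missing entries are assumed to be represented by `None`, and the convention for the first message of an order (change price 0, change size equal to the initial size) is an assumption read off Table B2.

```python
def preprocess_order(messages):
    """Preprocess the chronological messages of ONE order ID.

    Each message is a dict with keys 'side', 'action', 'price', 'size',
    where missing entries (N/A in the raw feed) are None.
    """
    action_map = {2: -1, 0: 0, 1: 1}   # raw codes -> cancel / update / add
    rows = []
    prev_side, prev_price, prev_size = None, None, 0.0
    for m in messages:
        # Side and Price: fill missing entries with the previous values.
        side = m["side"] if m["side"] is not None else prev_side
        price = m["price"] if m["price"] is not None else prev_price
        # Size: a missing entry becomes 0, marking the order as cancelled.
        size = m["size"] if m["size"] is not None else 0.0
        # Change features: difference to the previous entry of this order.
        # Rounding only suppresses floating-point noise such as -0.0100000001.
        change_price = 0.0 if prev_price is None else round(price - prev_price, 10)
        change_size = size - prev_size
        rows.append({
            "side": side,
            "action": action_map[m["action"]],
            "price": price,
            "size": size,
            "change_price": change_price,
            "change_size": change_size,
        })
        prev_side, prev_price, prev_size = side, price, size
    return rows


# The five raw messages of Table B1 (all belonging to one order ID).
raw = [
    {"side": 2, "action": 1, "price": 68.47, "size": 2049.0},
    {"side": 2, "action": 0, "price": 68.46, "size": 2110.0},
    {"side": 2, "action": 0, "price": 68.46, "size": 2082.0},
    {"side": 2, "action": 0, "price": 68.45, "size": 2082.0},
    {"side": None, "action": 2, "price": None, "size": None},
]
processed = preprocess_order(raw)
for row in processed:
    print(row)
```

Under these assumptions the output rows carry the same Side, Action, Price, Size, Change price and Change size values as Table B2; the Mid-price and Mid-size columns come from the order book and are not produced here.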


Figure 1. An illustration of how MBO data updates a LOB. Top: An addition of a new limit order; Middle top: A cancellation of an existing order; Middle bottom: An update for a partial cancellation; Bottom: A marketable buy limit order that crosses the spread.


Table B1. An example of a sequence of market by order data.

Time stamp                     ID                  Side  Action  Price  Size
2018-01-02 09:00:52.369737129  462805645163386861  2     1       68.47  2049.0
2018-01-02 09:00:57.396438193  462805645163386861  2     0       68.46  2110.0
2018-01-02 09:01:30.641547824  462805645163386861  2     0       68.46  2082.0
2018-01-02 09:01:30.642333313  462805645163386861  2     0       68.45  2082.0
2018-01-02 09:01:33.656776415  462805645163386861  N/A   2       N/A    N/A

Table B2. An example of a sequence of market by order data.

Time stamp                     Side  Action  Price  Size    Change price  Change size  Mid-price  Mid-size
2018-01-02 09:00:52.369737129  2     1       68.47  2049.0  0.00          2049.0       68.445     32900.0
2018-01-02 09:00:57.396438193  2     0       68.46  2110.0  -0.01         61.0         68.440     26750.0
2018-01-02 09:01:30.641547824  2     0       68.46  2082.0  0.00          -28.0        68.450     13250.0
2018-01-02 09:01:30.642333313  2     0       68.45  2082.0  -0.01         0.0          68.450     13250.0
2018-01-02 09:01:33.656776415  2     -1      68.45  0.0     0.00          -2082.0      68.425     7750.0

Table B3. An example of a sequence of market by order data.

Time stamp                     Side  Action  Nor. price  Nor. size  Nor. change price  Nor. change size
2018-01-02 09:00:52.369737129  2     1       2.5         0.0623     0.0                0.0623
2018-01-02 09:00:57.396438193  2     0       2.0         0.0789     -1.0               0.0023
2018-01-02 09:01:30.641547824  2     0       1.0         0.1571     0.0                -0.0021
2018-01-02 09:01:30.642333313  2     0       0.0         0.1571     -1.0               0.0000
2018-01-02 09:01:33.656776415  2     -1      2.5         0.0000     0.0                -0.2686



Figure 2. An example of preprocessing and normalising the MBO data.

Data preprocessing is applied to every unique order ID and we then normalise the data as:

• Normalised price: (price - mid-price) / (minimum tick size × 100).¹ This calculation transforms price to tick change, representing how many ticks the price is away from the mid-price. Dividing by the minimum tick size is needed when we train models with multiple instruments as it maps prices to a similar scale;
• Normalised size: size / mid-size. Mid-size is the mean of the current best ask and bid sizes, analogous to the mid-price;
• Normalised change price: change price / minimum tick size;
• Normalised change size: change size / mid-size;²
• Side and Action: remain unchanged.

At the end, we remove “Time stamp” and “ID”, leading to 6 features in our feature space. Note that the normalised price essentially represents the price level at which the current order sits in the order book, taking the level information into account.
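The normalisation above amounts to four divisions per message, which the following sketch makes explicit. The function name `normalise` and its parameters are hypothetical. The two divisors are passed in explicitly because their numerical values depend on the units in which prices are quoted: the text writes the price divisor as minimum tick size × 100, and the value 0.01 used for both divisors in the usage example is an assumption chosen so that the result reproduces the second row of Table B3 from the second row of Table B2.

```python
def normalise(row, mid_price, mid_size, price_scale, tick_size):
    """Normalise one preprocessed MBO message into the 6 model features.

    price_scale : divisor for (price - mid-price); written in the text as
                  minimum tick size x 100 (units are an assumption here).
    tick_size   : divisor for the change in price.
    """
    return {
        "side": row["side"],                       # unchanged
        "action": row["action"],                   # unchanged
        "norm_price": (row["price"] - mid_price) / price_scale,
        "norm_size": row["size"] / mid_size,
        "norm_change_price": row["change_price"] / tick_size,
        "norm_change_size": row["change_size"] / mid_size,
    }


# Second row of Table B2; both divisors set to 0.01 (an assumption, see above).
row = {"side": 2, "action": 0, "price": 68.46, "size": 2110.0,
       "change_price": -0.01, "change_size": 61.0}
features = normalise(row, mid_price=68.440, mid_size=26750.0,
                     price_scale=0.01, tick_size=0.01)
print({k: round(v, 4) for k, v in features.items()})
```

Because sizes are divided by the prevailing mid-size and prices by a tick-based scale, the resulting features are dimensionless, which is what allows messages from different instruments to be pooled in one training set.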

¹ Mid-price is the mean of the best ask and bid prices. These reference prices and their sizes, which we use below, can be obtained on the fly from the MBO data.
² By mid-size we denote the mean of the best bid and ask sizes, in analogy to the mid-price. These values are again obtained on the fly from the MBO data.

3.3. Data Labelling

In this work, we study a classification framework where we want to classify future market movements into three classes: the market going up, staying stationary or going down. We use mid-prices to create labels and adopt the labelling method in Zhang, Zohren, and Roberts (2019a) to classify movements. In particular, we define

\[
l_t = \frac{m_{+}(t) - m_{-}(t)}{m_{-}(t)}, \qquad
m_{-}(t) = \frac{1}{k}\sum_{i=0}^{k-1} p_{t-i}, \qquad
m_{+}(t) = \frac{1}{k}\sum_{i=1}^{k} p_{t+i}, \tag{1}
\]

where $p_t$ is the mid-price at time $t$. We denote the prediction horizon by $k$; it represents the number of arrivals of MBO messages, meaning that we are working with tick time instead of clock time. To decide on the label we compare $l_t$ with a threshold $\alpha$, labelling it as up if $l_t > \alpha$, down if $l_t < -\alpha$ and stationary otherwise. The choice of $\alpha$ is related to the prediction horizon $k$ and we set $\alpha$ for each instrument to obtain a balanced training set. Our choices of $k$ and $\alpha$ are listed in Section 5 and we show that the datasets are balanced under our choice.
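As a rough illustration of Equation (1) and the thresholding rule, the following sketch computes labels from a list of mid-prices. The function name `smoothed_labels`, the toy prices and the value of α are assumptions made for this example only.

```python
def smoothed_labels(mid_prices, k, alpha):
    """Return {t: label} using the smoothed relative change of Equation (1)."""
    labels = {}
    # Tick t needs k past prices (t-k+1 .. t) and k future prices (t+1 .. t+k).
    for t in range(k - 1, len(mid_prices) - k):
        m_minus = sum(mid_prices[t - k + 1 : t + 1]) / k   # mean of p_{t-i}, i = 0..k-1
        m_plus = sum(mid_prices[t + 1 : t + k + 1]) / k    # mean of p_{t+i}, i = 1..k
        l_t = (m_plus - m_minus) / m_minus
        if l_t > alpha:
            labels[t] = "up"
        elif l_t < -alpha:
            labels[t] = "down"
        else:
            labels[t] = "stationary"
    return labels


# Toy mid-price path in tick time (hypothetical values).
prices = [100.0, 100.0, 100.0, 100.1, 100.2, 100.2, 100.0, 99.8, 99.8]
labels = smoothed_labels(prices, k=2, alpha=1e-4)
flat_labels = smoothed_labels([50.0] * 6, k=2, alpha=1e-4)
```

Ticks without a full window of k past or k future prices are simply left unlabelled in this sketch; how the boundaries are handled in the original experiments is not stated.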

Note that Equation (1) introduces a smooth labelling that leads to consistent labels that are better for designing trading signals. The work of Zhang, Zohren, and Roberts (2019a) includes a more detailed discussion demonstrating the effects of different labelling methods, and interested readers are referred to it for a detailed explanation.

4. Methodology

In this section, we introduce the different deep learning algorithms studied in our work. For a single input of any time-series, we write $x_{1:T}$, where $x_t$ represents the features at time $t$ and $T$ is the length of the sequence, which will later correspond to the length of the lookback window of the input.

4.1. Multilayer perceptrons (MLPs)

MLPs are canonical neural network models where a typical network is organised into a series of layers in a chain structure, with each layer being a function of the layer that precedes it. We can define the hidden layer of an MLP as:

$$
h^{(l)} = g^{(l)}\left(W^{(l)} h^{(l-1)} + b^{(l)}\right), \tag{2}
$$

where $h^{(l)} \in \mathbb{R}^{N_l}$ represents the $l$-th hidden layer with weights $W^{(l)} \in \mathbb{R}^{N_l \times N_{l-1}}$ and biases $b^{(l)} \in \mathbb{R}^{N_l}$. Here $g^{(l)}(\cdot)$ is the activation function that allows networks to model nonlinearities. The final output is a function of the last hidden layer and we compute objective functions to minimise errors between target outputs and estimates.

However, for MLPs, we first need to flatten $x_{1:T}$ and feed it to subsequent hidden layers. Doing this breaks the time dependences and treats features at different time stamps independently. We generally observe inferior results using MLPs and find that recurrent neural networks (RNNs) often deliver better performance. This is because an RNN acts as a memory buffer by summarising past information and recursively updating the hidden state with new observations at each time step of the input (Zhang, Zohren, and Roberts 2020a).

4.2. Long Short-Term Memory (LSTMs)

Standard RNNs suffer from vanishing or exploding gradient problems (Bengio, Simard, and Frasconi 1994) and Long Short-Term Memory networks (LSTMs) were proposed to solve this problem. This is done by operating a gating mechanism that efficiently controls the propagation of past information (Hochreiter and Schmidhuber 1997). An LSTM updates its hidden state recursively and has a cell state $c_t$ coupled with a series of gates at each hidden state. In mathematical terms, we can write

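$$
\begin{aligned}
\text{Input gate:} \quad & i_t = \sigma(W_{i,h} h_{t-1} + W_{i,x} x_t + b_i), \\
& \text{with } W_{i,h} \in \mathbb{R}^{N_h \times N_h},\ W_{i,x} \in \mathbb{R}^{N_h \times N_x} \text{ and } b_i \in \mathbb{R}^{N_h}, \\
\text{Output gate:} \quad & o_t = \sigma(W_{o,h} h_{t-1} + W_{o,x} x_t + b_o), \\
& \text{with } W_{o,h} \in \mathbb{R}^{N_h \times N_h},\ W_{o,x} \in \mathbb{R}^{N_h \times N_x} \text{ and } b_o \in \mathbb{R}^{N_h}, \\
\text{Forget gate:} \quad & f_t = \sigma(W_{f,h} h_{t-1} + W_{f,x} x_t + b_f), \\
& \text{with } W_{f,h} \in \mathbb{R}^{N_h \times N_h},\ W_{f,x} \in \mathbb{R}^{N_h \times N_x} \text{ and } b_f \in \mathbb{R}^{N_h},
\end{aligned} \tag{3}
$$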

where $h_{t-1}$ is the hidden state of an LSTM at time $t-1$ and $\sigma(\cdot)$ represents the sigmoid activation function. We use $W$ and $b$ to represent weights and biases at different gate operations. Subsequently, the current cell state and hidden state can be written as:

$$
\begin{aligned}
\text{Cell state:} \quad & c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{c,h} h_{t-1} + W_{c,x} x_t + b_c), \\
\text{Hidden state:} \quad & h_t = o_t \odot \tanh(c_t),
\end{aligned} \tag{4}
$$

where $W_{c,h} \in \mathbb{R}^{N_h \times N_h}$, $W_{c,x} \in \mathbb{R}^{N_h \times N_x}$, $b_c \in \mathbb{R}^{N_h}$, $\odot$ is the element-wise product and $\tanh(\cdot)$ is the hyperbolic tangent activation function. The hidden state $h_t$ summarises the information from past states and current observations, and the gating mechanism efficiently addresses the vanishing gradient problem.
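To make the recursion concrete, the sketch below performs a single LSTM update following Equations (3) and (4) in plain Python. It is illustrative only: the helper names and the parameter layout (a dictionary mapping each gate to a tuple of recurrent weights, input weights and bias) are assumptions, and a practical implementation would use a deep learning library rather than nested lists.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate(W_h, W_x, b, h_prev, x, act):
    """Compute act(W_h h_{t-1} + W_x x_t + b) element-wise."""
    pre = [a + c + d for a, c, d in zip(matvec(W_h, h_prev), matvec(W_x, x), b)]
    return [act(p) for p in pre]


def lstm_step(params, h_prev, c_prev, x):
    """One LSTM update; params maps "i", "o", "f", "c" to (W_h, W_x, b)."""
    i = gate(*params["i"], h_prev, x, sigmoid)     # input gate
    o = gate(*params["o"], h_prev, x, sigmoid)     # output gate
    f = gate(*params["f"], h_prev, x, sigmoid)     # forget gate
    g = gate(*params["c"], h_prev, x, math.tanh)   # candidate cell update
    c = [f_j * c_j + i_j * g_j for f_j, c_j, i_j, g_j in zip(f, c_prev, i, g)]
    h = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c)]
    return h, c


# Toy check with N_h = 2, N_x = 1 and all-zero parameters:
# every gate equals 0.5 and the candidate update is 0, so c_t = 0.5 * c_{t-1}.
zeros = ([[0.0, 0.0], [0.0, 0.0]], [[0.0], [0.0]], [0.0, 0.0])
params = {name: zeros for name in ("i", "o", "f", "c")}
h_new, c_new = lstm_step(params, h_prev=[0.0, 0.0], c_prev=[1.0, -2.0], x=[3.0])
```

With zero weights the forget gate halves the previous cell state, which shows how the gates scale the information carried forward from one tick to the next.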

4.3. Attention Mechanism

The Attention Mechanism (Bahdanau, Cho, and Bengio 2014) is heavily used in machine translation and was proposed to solve the problem of diminishing performance for long input sequences. On the one hand, an LSTM calculates the final output as a function of only the last hidden state. An attention model, on the other hand, with an additional component called the context vector, assigns trainable weights to all the hidden states of an input. We can write an attention mechanism for modelling a many-to-one problem as:

$$
h_t = f_t(h_{t-1}, x_t), \tag{5}
$$

where $h_t$ can be the hidden state from an LSTM at time $t$ for an input $x_{1:T}$, and we define the context vector $c_T$ as:

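$$
\begin{aligned}
\text{Context vector:} \quad & c_T = \sum_{t=1}^{T} \alpha(t, T)\, h_t, \\
\text{Attention weights:} \quad & \alpha(t, T) = \frac{\exp(e(t, T))}{\sum_{t'=1}^{T} \exp(e(t', T))}, \\
\text{Score:} \quad & e(t, T) = v^{\top} \tanh(W_h h_t),
\end{aligned} \tag{6}
$$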

where $v \in \mathbb{R}^{N_h}$ and $W_h \in \mathbb{R}^{N_h \times N_h}$ are the trainable weights. We can then obtain the attention vector:

$$
a_T = f(c_T, h_T) = \tanh(W_c [c_T; h_T]), \tag{7}
$$

where the final output is a function of $c_T$, taking information at every hidden state into account.
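The context vector of Equation (6) can be sketched as follows. This is a minimal pure-Python illustration with hypothetical names and toy weights; it covers the score, softmax weights and weighted sum only, and leaves out the final projection of Equation (7).

```python
import math


def attention_context(hidden_states, v, W_h):
    """Return (weights, context) for a list of hidden states h_1, ..., h_T."""
    scores = []
    for h in hidden_states:
        proj = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W_h]
        scores.append(sum(v_j * p_j for v_j, p_j in zip(v, proj)))  # e(t, T)
    top = max(scores)  # subtracting the maximum keeps the softmax numerically stable
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # alpha(t, T), summing to one
    context = [
        sum(a * h[j] for a, h in zip(weights, hidden_states))
        for j in range(len(v))
    ]
    return weights, context


# Toy example: three hidden states of size 2, identity W_h, and v selecting the first unit.
states = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights, context = attention_context(states, v=[1.0, 0.0], W_h=[[1.0, 0.0], [0.0, 1.0]])
```

In this toy case the second hidden state receives the largest weight because it is the only one with a positive score, whereas a plain LSTM readout would use the last hidden state alone.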

5. Experiments

5.1. Descriptions of Datasets

Our datasets consist of MBO data for five highly liquid stocks, Lloyds (LLOY), Barclays (BARC), Tesco (TSCO), BT and Vodafone (VOD), for the entire year of 2018 from the London Stock Exchange. From the MBO data one can derive LOB data, which we use for our benchmarks and for reference prices. Our LOB dataset contains ask and bid information for an order book up to ten levels. For our modelling we remove messages outside ten levels from the MBO data to align the timestamps of the two datasets, allowing for fair comparisons in the performance analysis. Afterwards, we train two sets of models by separately using the MBO and LOB data with the same targets. A direct comparison can then be made of the predictive performance obtained using the MBO and LOB data respectively.

For each trading day, we take the data between 08:30:00 and 16:00:00, restricting ourselves to liquid continuous trading hours, excluding any auctions. Overall, we have more than 169 million samples in our dataset and we take the first 6 months as training data, the next 3 months as validation data and the last 3 months as testing data. In the context of high-frequency microstructure data, we have more than 46 million observations in our testing set, providing sufficient scope for verifying the robustness and generalisability of model performance.

Table 2. Label parameters (α) for different prediction horizons and instruments (units of $10^{-4}$).

           LLOY   BARC   TSCO   BT     VOD
k = 20     0.25   0.35   0.10   0.40   0.22
k = 50     0.50   0.65   0.70   0.70   0.45
k = 100    0.75   0.95   1.20   1.00   0.70

We test our models at three prediction horizons (k = 20, 50, 100) and list the choices of label parameter (α) in Table 2. We choose α for each instrument to have a balanced training set and the proportion of different classes is presented in Figure B1 in Appendix B. Overall, the labels are roughly balanced for the testing set as well (noting that those were fixed on the training set). In terms of the lookback window (T) of the input, we take the 50 most recent updates of MBO data to form a single input and feed it to our model. Note that we are working with tick time instead of physical clock time. In other words, the notion of a time step refers to the arrival of MBO updates. One advantage of working with tick time is that it deals with uneven trading volumes throughout a day. When a market opens with great volatility, we obtain more ticks and the model naturally makes faster predictions.
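The tick-time lookback window can be sketched as below: each labelled tick is paired with the most recent `lookback` normalised messages ending at that tick. The function name `make_windows`, the dictionary-of-labels input and the toy data are assumptions for illustration; the experiments use a lookback of 50.

```python
def make_windows(features, labels, lookback=50):
    """Pair each labelled tick t with the `lookback` latest feature rows ending at t."""
    samples = []
    for t in sorted(labels):
        if t + 1 >= lookback:  # skip ticks without a full history
            samples.append((features[t + 1 - lookback : t + 1], labels[t]))
    return samples


# Toy data: one feature per tick and a lookback of 3 instead of 50.
features = [[float(i)] for i in range(6)]
samples = make_windows(features, {1: "up", 2: "down", 5: "up"}, lookback=3)
```

Because indices count message arrivals rather than seconds, a burst of activity produces more windows per unit of clock time without any change to the code.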

5.2. Training Procedure

For the MBO data, we study the deep learning models (MBO-MLP, MBO-LSTM and MBO-Attention) introduced in Section 4 along with a simple linear model (MBO-LM). We list the values of hyperparameters for different algorithms in Table 3, and gradient descent with the Adam optimiser (Kingma and Ba 2015) is used for training all models. The complete search space of hyperparameters is included in Appendix A and we use a grid-search method to select the best hyperparameters.

For the LOB data, we include the 10 levels of a limit order book and the past 50 observations as a single input. We follow the normalisation scheme in Zhang, Zohren, and Roberts (2019a) and both the MBO and LOB datasets share the same predictive targets, allowing a direct comparison between different models. We choose state-of-the-art network architectures as comparison models, including the LOB-LSTM (Sirignano and Cont 2019), LOB-CNN (Tsantekidis et al. 2017) and LOB-DeepLOB (Zhang, Zohren, and Roberts 2019a). The details of the network architectures and choices of hyperparameters can be found in their papers. We also include a linear model (LOB-LM) and a multilayer perceptron (LOB-MLP) as benchmark models.

We use categorical cross-entropy loss as our objective function and the learning is stopped when the validation loss does not decrease for more than 10 epochs. In general, it takes about 30 epochs to finish model training. TensorFlow and Keras (Girija 2016) are used to build all models and four NVIDIA GeForce RTX 2080 GPUs are used in our experiment.
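The loss and stopping rule can be illustrated with a short sketch. Both function names are hypothetical, and the stopping rule here halts once `patience` consecutive epochs pass without a new best validation loss; whether the original criterion triggers after exactly 10 or 11 such epochs is not specified, so treat the boundary as an assumption.

```python
import math


def categorical_cross_entropy(probs, true_class):
    """Cross-entropy for one sample, given predicted class probabilities."""
    return -math.log(probs[true_class])


def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation curve: improves for three epochs, then plateaus.
stop_at = early_stopping_epoch([1.0, 0.9, 0.8] + [0.85] * 12, patience=10)
loss = categorical_cross_entropy([0.2, 0.5, 0.3], true_class=1)
```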

Table 3. Choices of hyperparameters.

Model       # of layers   # of units   Learning rate   Batch size   # of parameters
LM          -             -            0.0001          128          903
MLP         1             64           0.0001          128          19459
LSTM        2             64           0.0001          128          51907
Attention   2             64           0.0001          128          72067
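As a quick sanity check on the parameter counts, the sketch below reproduces the LM and MLP figures under the assumption that the input is the flattened 50 × 6 lookback window and that each model ends in a three-unit output layer. This layer layout is inferred, not stated in the text, and the recurrent models are omitted because their counts depend on implementation details that are not given.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


lookback, n_features, n_classes, n_units = 50, 6, 3, 64
flat_input = lookback * n_features  # 300 inputs after flattening

lm_params = dense_params(flat_input, n_classes)
mlp_params = dense_params(flat_input, n_units) + dense_params(n_units, n_classes)
```

Under these assumptions `lm_params` is 903 and `mlp_params` is 19459, matching the corresponding rows of Table 3.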

5.3. Experimental Results

Table 4 summarises the results for all models studied (different rows), with one block for each prediction horizon. We use four evaluation metrics (different columns) to make comparisons: Accuracy, Precision, Recall and F1-score. Kolmogorov-Smirnov (Massey Jr 1951) tests are used to check the statistical significance of the results, and all differences in evaluation metrics are significant.
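For reference, the four metrics can be computed as in the sketch below. The averaging scheme is an assumption: the text does not say how per-class precision, recall and F1 are aggregated, and support-weighted averaging is used here because it makes recall equal to accuracy, which would be consistent with those two columns coinciding in most rows of Table 4.

```python
def weighted_metrics(y_true, y_pred, classes=(0, 1, 2)):
    """Return accuracy and support-weighted precision, recall and F1."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        support = sum(t == c for t in y_true)     # true samples of class c
        predicted = sum(p == c for p in y_pred)   # samples predicted as class c
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support if support else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1


# Hypothetical labels: 0 = down, 1 = stationary, 2 = up.
acc, prec, rec, f1 = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```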

Table 4. Experimental results for different prediction horizons (k).

Model               Accuracy %   Precision %   Recall %   F1 %

Prediction horizon k = 20
MBO-LM              41.81        41.16         41.81      34.97
MBO-MLP             47.12        46.17         47.12      46.46
MBO-LSTM            61.94        61.60         61.94      61.75
MBO-Attention       61.19        62.83         61.19      61.73
LOB-LM              45.71        43.44         45.71      42.38
LOB-MLP             50.06        50.04         50.06      46.89
LOB-LSTM            66.09        67.53         66.09      66.68
LOB-CNN             63.39        67.31         63.39      64.64
LOB-DeepLOB         68.73        68.16         68.73      68.40
Ensemble-MBO        62.35        62.92         62.35      62.56
Ensemble-LOB        67.97        68.74         67.97      68.31
Ensemble-MBO-LOB    68.95        69.10         68.95      69.02

Prediction horizon k = 50
MBO-LM              41.88        38.42         41.88      36.57
MBO-MLP             46.39        43.07         46.39      42.33
MBO-LSTM            58.84        59.65         58.84      59.18
MBO-Attention       59.31        56.10         59.31      56.88
LOB-LM              46.97        44.34         46.97      41.13
LOB-MLP             50.56        48.46         50.56      47.25
LOB-LSTM            64.49        64.88         64.49      64.65
LOB-CNN             64.77        62.55         64.77      63.26
LOB-DeepLOB         66.12        64.37         65.38      64.79
Ensemble-MBO        60.03        58.45         60.03      59.05
Ensemble-LOB        65.95        64.72         65.95      65.23
Ensemble-MBO-LOB    66.17        64.78         66.17      65.34

Prediction horizon k = 100
MBO-LM              41.27        34.23         41.27      35.05
MBO-MLP             44.19        42.70         44.19      40.29
MBO-LSTM            57.96        54.10         57.96      54.79
MBO-Attention       56.36        53.66         56.36      53.75
LOB-LM              46.19        43.29         46.19      41.80
LOB-MLP             48.36        47.39         48.36      43.66
LOB-LSTM            61.27        58.47         62.82      57.97
LOB-CNN             61.78        56.91         61.78      55.40
LOB-DeepLOB         62.82        60.94         61.27      61.10
Ensemble-MBO        56.62        54.85         56.62      55.48
Ensemble-LOB        63.25        59.41         63.25      60.56
Ensemble-MBO-LOB    63.75        60.01         63.75      61.82

We observe that the models trained with LOB data are comparable to, but slightly outperform, the ones using MBO data. While, a priori, MBO data contains more information (contents of levels and trades), it is harder to model the raw messages than LOB snapshots, which can be seen as derived or handcrafted features from the MBO data. What is encouraging is that we are able to obtain comparable performance by modelling the raw messages directly. Furthermore, if we look at the Pearson correlation between predictive signals in Figure 3, we can see that predictive signals from the MBO data are less correlated with the LOB signals. This means that we were indeed able to extract different information from the MBO data. It also suggests that a combination of the two signals, from MBO and LOB data, can benefit from diversification that reduces signal variance given the low correlation.

[Figure 3: correlation heatmaps. Rows and columns are indexed as (1) MBO-LM, (2) MBO-MLP, (3) MBO-LSTM, (4) MBO-Attention, (5) LOB-LM, (6) LOB-MLP, (7) LOB-LSTM, (8) LOB-CNN, (9) LOB-DeepLOB.]

Top (k = 20):
                     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
(1) MBO-LM           1       0.26    0.14    0.14    0.017   0.04    0.099   0.097   0.1
(2) MBO-MLP          0.26    1       0.33    0.35    0.092   0.14    0.29    0.28    0.29
(3) MBO-LSTM         0.14    0.33    1       0.8     0.21    0.29    0.64    0.62    0.66
(4) MBO-Attention    0.14    0.35    0.8     1       0.21    0.3     0.65    0.63    0.67
(5) LOB-LM           0.017   0.092   0.21    0.21    1       0.49    0.26    0.27    0.24
(6) LOB-MLP          0.04    0.14    0.29    0.3     0.49    1       0.37    0.36    0.36
(7) LOB-LSTM         0.099   0.29    0.64    0.65    0.26    0.37    1       0.86    0.83
(8) LOB-CNN          0.097   0.28    0.62    0.63    0.27    0.36    0.86    1       0.79
(9) LOB-DeepLOB      0.1     0.29    0.66    0.67    0.24    0.36    0.83    0.79    1

Middle (k = 50):
                     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
(1) MBO-LM           1       0.22    0.1     0.11    0.033   0.051   0.11    0.11    0.11
(2) MBO-MLP          0.22    1       0.26    0.25    0.085   0.12    0.25    0.24    0.25
(3) MBO-LSTM         0.1     0.26    1       0.76    0.23    0.32    0.7     0.68    0.69
(4) MBO-Attention    0.11    0.25    0.76    1       0.22    0.31    0.65    0.64    0.65
(5) LOB-LM           0.033   0.085   0.23    0.22    1       0.6     0.28    0.27    0.27
(6) LOB-MLP          0.051   0.12    0.32    0.31    0.6     1       0.41    0.39    0.4
(7) LOB-LSTM         0.11    0.25    0.7     0.65    0.28    0.41    1       0.87    0.85
(8) LOB-CNN          0.11    0.24    0.68    0.64    0.27    0.39    0.87    1       0.83
(9) LOB-DeepLOB      0.11    0.25    0.69    0.65    0.27    0.4     0.85    0.83    1

Bottom (k = 100):
                     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
(1) MBO-LM           1       0.29    0.11    0.077   0.041   0.05    0.11    0.098   0.11
(2) MBO-MLP          0.29    1       0.28    0.21    0.094   0.12    0.24    0.23    0.23
(3) MBO-LSTM         0.11    0.28    1       0.67    0.22    0.3     0.64    0.59    0.62
(4) MBO-Attention    0.077   0.21    0.67    1       0.2     0.27    0.58    0.54    0.56
(5) LOB-LM           0.041   0.094   0.22    0.2     1       0.62    0.31    0.29    0.29
(6) LOB-MLP          0.05    0.12    0.3     0.27    0.62    1       0.41    0.38    0.39
(7) LOB-LSTM         0.11    0.24    0.64    0.58    0.31    0.41    1       0.79    0.81
(8) LOB-CNN          0.098   0.23    0.59    0.54    0.29    0.38    0.79    1       0.8
(9) LOB-DeepLOB      0.11    0.23    0.62    0.56    0.29    0.39    0.81    0.8     1

Figure 3. Pearson correlation between different predictive signals for different prediction horizons (k). Top: k = 20; Middle: k = 50; Bottom: k = 100.

To verify this statement, we include three ensemble models in our experiment, where Ensemble-MBO is obtained from MBO-LSTM and MBO-Attention; Ensemble-LOB is from LOB-LSTM, LOB-CNN, and LOB-DeepLOB; and Ensemble-MBO-LOB combines Ensemble-MBO and Ensemble-LOB. An equal weighting scheme is used to construct ensemble models and we can observe that ensemble approaches, in general, improve predictive performance. In particular, Ensemble-MBO-LOB delivers the best performance, indicating the potential benefits of combining the MBO and LOB data.
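The equal-weight ensembling described above can be sketched as follows. The sketch assumes that the quantities being averaged are the models' predicted class probabilities, which the text does not state explicitly; the function names and probability values are hypothetical.

```python
def equal_weight_ensemble(prob_vectors):
    """Average class-probability vectors with equal weights."""
    n = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n for c in range(n_classes)]


def predict(probs, classes=("down", "stationary", "up")):
    """Return the class with the highest averaged probability."""
    return classes[max(range(len(probs)), key=probs.__getitem__)]


# Hypothetical per-model outputs for a single input.
ens_mbo = equal_weight_ensemble([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])   # LSTM, Attention
ens_lob = equal_weight_ensemble([[0.1, 0.2, 0.7], [0.2, 0.2, 0.6], [0.3, 0.2, 0.5]])
ens_mbo_lob = equal_weight_ensemble([ens_mbo, ens_lob])  # combine the two ensembles
decision = predict(ens_mbo_lob)
```

Here the two MBO models disagree and their average is undecided between down and up, while the combined ensemble leans clearly towards up once the LOB signal is added.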

Since this work aims to study MBO data, we focus on analysing results from the models trained using the MBO data. We can see that the deep learning models outperform the simple linear model, suggesting the existence of nonlinear features in financial time-series, and that networks are capable of extracting such features from the raw messages in MBO data. We observe that MBO-MLP delivers inferior results compared to other networks. This is most likely due to the structure of the MLP, which has full connectivity between input and hidden units, leading MLPs to often underperform when compared to other networks in financial applications with low signal-to-noise ratio. MBO-LSTM and MBO-Attention both have a recurrent structure with parameter sharing that enables hidden states to summarise past information and update their status with current observations. Such a process filters unnecessary input components and naturally models the propagation of order flow. This observation has also been reported by Lim, Zohren, and Roberts (2019); Zhang, Zohren, and Roberts (2020b,a), where they find that networks with a recurrent nature deliver better results than MLPs when modelling financial time-series.

Figure 4 shows the normalised confusion matrices, which help to understand how models perform at predicting each label class. We calculate the accuracy score for every instrument and for each testing day to understand the consistency of our results. This is summarised in the whisker plots in Figure 5. Each point in the whisker plot represents the accuracy score for one testing day, and the box represents the median and interquartile range of these scores. We can see that the MBO-LM and MBO-MLP have large interquartile ranges, suggesting high variances in results, while MBO-LSTM and MBO-Attention show consistent and robust results across the entire testing period. These whisker plots allow us to understand the model performance on a daily basis to ensure the generalisability of our methods. In particular, we see that performance is consistent across the entire testing period and not focused on a few days, which could be due to noise.

[Figure 4: confusion-matrix heatmaps. Each block below lists the matrix rows (vertical axis: Down, Stationary, Up) against the columns (horizontal axis: Down, Stationary, Up); each row sums to one.]

Model           Row          Down       Stationary   Up

Top (k = 20)
MBO-LM          Down         0.741808   0.000005     0.258187
MBO-LM          Stationary   0.676790   0.000007     0.323203
MBO-LM          Up           0.635519   0.000003     0.364479
MBO-MLP         Down         0.513840   0.175325     0.310834
MBO-MLP         Stationary   0.346369   0.249395     0.404236
MBO-MLP         Up           0.256477   0.172192     0.571331
MBO-LSTM        Down         0.684286   0.176342     0.139372
MBO-LSTM        Stationary   0.295008   0.400596     0.304397
MBO-LSTM        Up           0.130519   0.173944     0.695537
MBO-Attention   Down         0.612744   0.242170     0.145086
MBO-Attention   Stationary   0.216956   0.487972     0.295072
MBO-Attention   Up           0.089993   0.219272     0.690734

Middle (k = 50)
MBO-LM          Down         0.641287   0.012958     0.345755
MBO-LM          Stationary   0.591262   0.015203     0.393535
MBO-LM          Up           0.533721   0.014162     0.452117
MBO-MLP         Down         0.592282   0.046523     0.361195
MBO-MLP         Stationary   0.481301   0.061096     0.457603
MBO-MLP         Up           0.365356   0.044239     0.590405
MBO-LSTM        Down         0.638096   0.233093     0.128810
MBO-LSTM        Stationary   0.291836   0.364000     0.344164
MBO-LSTM        Up           0.093662   0.225520     0.680818
MBO-Attention   Down         0.708677   0.114317     0.177005
MBO-Attention   Stationary   0.384309   0.182643     0.433048
MBO-Attention   Up           0.152502   0.110238     0.737261

Bottom (k = 100)
MBO-LM          Down         0.441086   0.000019     0.558895
MBO-LM          Stationary   0.387759   0.000005     0.612237
MBO-LM          Up           0.340291   0.000002     0.659707
MBO-MLP         Down         0.754643   0.076291     0.169066
MBO-MLP         Stationary   0.653474   0.104437     0.242089
MBO-MLP         Up           0.540255   0.104975     0.354770
MBO-LSTM        Down         0.712187   0.104438     0.183375
MBO-LSTM        Stationary   0.410967   0.148166     0.440867
MBO-LSTM        Up           0.170257   0.094844     0.734899
MBO-Attention   Down         0.524920   0.308129     0.166952
MBO-Attention   Stationary   0.284459   0.327665     0.387876
MBO-Attention   Up           0.112656   0.234458     0.652887

Figure 4. Normalised confusion matrices for different prediction horizons (k). Top: k = 20; Middle: k = 50; Bottom: k = 100. From left to right, we display MBO-LM, MBO-MLP, MBO-LSTM and MBO-Attention.

6. Conclusion

In this work we introduce deep learning models for Market by Order (MBO) data. To the best of our knowledge, this is the first study of predictive modelling of MBO data using data-driven techniques in the academic literature. Current academic research in this direction is primarily focused on LOB data and we hope that this work helps to popularise the usage of MBO data, which we see as the next frontier in microstructure modelling in financial data science.

We carefully introduce the structure of MBO data and demonstrate a specific normalisation scheme that allows model training with multiple instruments using deep learning. We consider a wide range of deep learning architectures including MLP, LSTM and attention layers. Our dataset consists of millions of samples for highly liquid instruments from the London Stock Exchange, ensuring the consistency and generalisability of our methods.

We compare models trained using MBO and LOB data respectively. We show that we can obtain similar, but slightly inferior, performance by modelling raw MBO messages, when compared to modelling LOB data. While MBO data a priori contains more information, it is harder to model the raw messages than LOBs, which can be seen as derived features of the data. Importantly, we show that our models can extract additional information from the MBO data which is not captured by models trained on LOB data. This means that they can add value, as we demonstrate in an ensemble approach that combines signals from the MBO and LOB data and delivers the best performance.

In subsequent work, we can apply MBO data to various financial applications including market-making or trade execution. Further, the work of Briola et al. (2021) applies Reinforcement Learning (RL) algorithms to high-frequency trading, and it would be interesting to test the effectiveness of using MBO data within an RL framework.

[Figure 5: whisker plots, one panel per prediction horizon. Horizontal axis: Stock (LLOY, BARC, TSCO, BT, VOD); vertical axis: Accuracy; series: MBO-LM, MBO-MLP, MBO-LSTM, MBO-Attention.]

Figure 5. Whisker plots of daily accuracy for different prediction horizons (k). Top: k = 20; Middle: k = 50; Bottom: k = 100.

Acknowledgement(s)

The authors would like to thank members of the Machine Learning Research Group at the University of Oxford for their useful comments. We are most grateful to the Oxford-Man Institute of Quantitative Finance for computing support and data access.

References

Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2014. “Neural machine translation by jointly learning to align and translate.” arXiv:1409.0473.

Belcak, Peter, Jan-Peter Calliess, and Stefan Zohren. 2020. “Fast Agent-Based Simulation Framework of Limit Order Books with Applications to Pro-Rata Markets and the Study of Latency Effects.” arXiv:2008.07871.

Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. 1994. “Learning long-term dependencies with gradient descent is difficult.” IEEE Transactions on Neural Networks 5 (2): 157–166.

Briola, Antonio, Jeremy Turiel, and Tomaso Aste. 2020. “Deep Learning Modeling of the Limit Order Book: A Comparative Perspective.” Available at SSRN 3714230.

Briola, Antonio, Jeremy Turiel, Riccardo Marcaccioli, and Tomaso Aste. 2021. “Deep Reinforcement Learning for Active High Frequency Trading.” arXiv:2101.07107.

Byrd, David, Maria Hybinette, and Tucker Hybinette Balch. 2019. “Abides: Towards high-fidelity market simulation for AI research.” arXiv:1904.12066.

CME. 2020. “Market by Order (MBO).” https://www.cmegroup.com/education/market-by-order-mbo.html.

Dai, Zihang, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V Le, and Ruslan Salakhutdinov. 2019. “Transformer-xl: Attentive language models beyond a fixed-length context.” arXiv:1901.02860.

Girija, Sanjay Surendranath. 2016. “TensorFlow: Large-scale machine learning on heterogeneous distributed systems.” Software available from tensorflow.org 39 (9).

Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep learning. Vol. 1. MIT Press, Cambridge.

Gould, Martin D, Mason A Porter, Stacy Williams, Mark McDonald, Daniel J Fenn, and Sam D Howison. 2013. “Limit order books.” Quantitative Finance 13 (11): 1709–1742.

Hamilton, James Douglas. 2020. Time series analysis. Princeton University Press.

Harris, Larry. 2003. Trading and exchanges: Market microstructure for practitioners. OUP USA.

Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long short-term memory.” Neural Computation 9 (8): 1735–1780.

Islam, Mohammad Rafiqul, and Nguyet Nguyen. 2020. “Comparison of Financial Models for Stock Price Prediction.” Journal of Risk and Financial Management 13 (8): 181.

Kingma, Diederik P, and Jimmy Ba. 2015. “Adam: A method for stochastic optimization.” Proceedings of the International Conference on Learning Representations.

Lim, Bryan, and Stefan Zohren. 2020. “Time series forecasting with deep learning: A survey.” arXiv:2004.13408.

Lim, Bryan, Stefan Zohren, and Stephen Roberts. 2019. “Enhancing Time-Series Momentum Strategies Using Deep Neural Networks.” The Journal of Financial Data Science. https://jfds.pm-research.com/content/early/2019/09/09/jfds.2019.1.015.

Massey Jr, Frank J. 1951. “The Kolmogorov-Smirnov test for goodness of fit.” Journal of the American Statistical Association 46 (253): 68–78.

Mills, Terence C. 1991. Time series techniques for economists. Cambridge University Press.

Ntakaris, Adamantios, Martin Magris, Juho Kanniainen, Moncef Gabbouj, and Alexandros Iosifidis. 2018. “Benchmark dataset for mid-price forecasting of limit order book data with machine learning methods.” Journal of Forecasting 37 (8): 852–866.

O’Hara, Maureen. 1995. Market microstructure theory. Vol. 108. Blackwell Publishers, Cambridge, MA.

OUCH. 2020. “NASDAQ, OUCH.” http://www.nasdaqtrader.com/content/technicalsupport/specifications/TradingProducts/OUCH4.2.pdf.

Sirignano, Justin, and Rama Cont. 2019. “Universal features of price formation in financial markets: Perspectives from deep learning.” Quantitative Finance 19 (9): 1449–1459.

Tsantekidis, Avraam, Nikolaos Passalis, Anastasios Tefas, Juho Kanniainen, Moncef Gabbouj, and Alexandros Iosifidis. 2017. “Forecasting stock prices from the limit order book using convolutional neural networks.” In 2017 IEEE 19th Conference on Business Informatics (CBI), Vol. 1, 7–12. IEEE.

Wallbridge, James. 2020. “Transformers for limit order books.” arXiv:2003.00130.

Zhang, Zihao, Stefan Zohren, and Stephen Roberts. 2018. “BDLOB: Bayesian deep convolutional neural networks for limit order books.” Third workshop on Bayesian Deep Learning (NeurIPS 2018), Montréal, Canada.

Zhang, Zihao, Stefan Zohren, and Stephen Roberts. 2019a. “DeepLOB: Deep convolutional neural networks for limit order books.” IEEE Transactions on Signal Processing 67 (11): 3001–3012.

Zhang, Zihao, Stefan Zohren, and Stephen Roberts. 2019b. “Extending Deep Learning Models for Limit Order Books to Quantile Regression.” Proceedings of Time Series Workshop of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019.

Zhang, Zihao, Stefan Zohren, and Stephen Roberts. 2020a. “Deep learning for portfolio optimization.” The Journal of Financial Data Science 2 (4): 8–20.

Zhang, Zihao, Stefan Zohren, and Stephen Roberts. 2020b. “Deep reinforcement learning for trading.” The Journal of Financial Data Science 2 (2): 25–40.

Appendix A. Complete Search Space for Hyperparameters

Linear Model (MBO-LM):

• Learning rate: [0.0001, 0.0005, 0.001]
• Minibatch size: [64, 128, 256]

Multi-layer Perceptron (MBO-MLP):

• Number of hidden layers: [1, 2, 3]
• Number of neurons at each layer: [32, 64, 128]
• Learning rate: [0.0001, 0.0005, 0.001]
• Minibatch size: [64, 128, 256]

Long Short-Term Memory (MBO-LSTM):

• Number of hidden layers: [1, 2, 3]
• Number of neurons at each layer: [32, 64, 128]
• Learning rate: [0.0001, 0.0005, 0.001]
• Minibatch size: [64, 128, 256]

MBO-Attention:

• Number of hidden layers: [1, 2, 3]
• Number of neurons at each layer: [32, 64, 128]
• Learning rate: [0.0001, 0.0005, 0.001]
• Minibatch size: [64, 128, 256]
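A grid search over the space listed above can be enumerated with the standard library, as in this sketch. The dictionary keys and the `grid` helper are hypothetical names; training and scoring each configuration is left out, since the selection criterion is not described beyond the use of a grid search.

```python
from itertools import product

# Search space for the MLP, LSTM and attention models, as listed above.
search_space = {
    "n_layers": [1, 2, 3],
    "n_units": [32, 64, 128],
    "learning_rate": [0.0001, 0.0005, 0.001],
    "batch_size": [64, 128, 256],
}


def grid(space):
    """Enumerate every combination of hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


configs = grid(search_space)

# The linear model only has a learning rate and a minibatch size to tune.
lm_configs = grid({k: search_space[k] for k in ("learning_rate", "batch_size")})
```

This gives 81 candidate configurations for each network and 9 for the linear model, each of which would be trained and compared on held-out data.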


Appendix B. Additional Results

[Figure B1: bar charts of the proportion of classes (in %) for the train, validation and test sets; values as annotated on the bars.]

                  Down   Stationary   Up
Top (k = 20)
  Train           33.1   34.0         32.9
  Validation      32.5   35.2         32.3
  Test            37.7   24.4         37.9
Middle (k = 50)
  Train           33.6   33.1         33.3
  Validation      33.6   33.2         33.2
  Test            37.9   24.0         38.1
Bottom (k = 100)
  Train           33.2   33.9         32.9
  Validation      33.3   34.1         32.6
  Test            37.5   25.0         37.5

Figure B1. Label class balancing for train, validation and test sets for different prediction horizons (k). Top: k = 20; Middle: k = 50; Bottom: k = 100.