Modeling Trades-Through in a Limit Order Book
The paper by Ioane Muni Toke and Fabrizio Pomponio, titled "Modeling Trades-Through in a Limit Order Book Using Hawkes Processes", uses Hawkes processes to examine market microstructure behavior.
This paper uses a multivariate Hawkes process to model trades-through. The best thing about this paper is that the authors have made the dataset available so that readers can work through the numbers and get a feel for model inference. The dataset is available at dataverse. I have used the dataset from the repository, crunched the numbers, and managed to replicate most of the results in the paper. I hope this practice of “Reproducible Research” becomes more widespread and that authors start disseminating their datasets along with their papers. In this blog post, I will summarize the main points of the paper.
**Introduction**
The authors model trades-through, i.e. transactions that reach at least the second level of limit orders in an order book. Trades-through are very important in price formation and microstructure. Large orders are usually split into smaller chunks before execution, so a trade that still goes past the best limit may carry information. The paper has three sections. In the first section, basic summary statistics of the dataset are given. There are 1,296,707 timestamps in the dataset.
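To make the definition concrete, here is a minimal sketch of how a trade could be flagged as a trade-through from a snapshot of one side of the book. The function names and the book layout (a list of resting quantities, best limit first) are my own assumptions for illustration, not the format of the paper's dataset.

```python
from typing import Sequence


def limits_reached(trade_size: int, sizes: Sequence[int]) -> int:
    """Return the deepest limit (1-based) from which the trade takes a share.

    `sizes` holds the resting quantity at each limit on the side being hit,
    best limit first (hypothetical layout).
    """
    remaining = trade_size
    for level, available in enumerate(sizes, start=1):
        if remaining <= available:
            return level
        remaining -= available
    return len(sizes)  # the trade exhausted every visible limit


def is_trade_through(trade_size: int, sizes: Sequence[int], n: int = 2) -> bool:
    """True if the trade consumes at least one share at the n-th limit."""
    return limits_reached(trade_size, sizes) >= n


book = [100, 50, 70]                 # shares resting at limits 1, 2 and 3
print(is_trade_through(100, book))   # False: only the first limit is consumed
print(is_trade_through(101, book))   # True: one share is taken at the second limit
```

A trade that exactly empties the first limit is not counted here, since no share is taken from the second limit.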
**Trades-Through Summary Statistics**
If you spend some time watching the order book, it becomes abundantly clear that trades-through stand outside the usual trading pattern. What is a trade-through? An nth-limit trade-through is any trade that consumes at least one share at the nth limit available in the order book. The paper describes the trade-through statistics of BNP Paribas stock for 109 trading days (June 2010 to Oct 2010). The empirical findings lead one to infer the following:

- Trades-through are clustered both in physical time and in trade time.
- The average waiting time between a trade and the next trade-through is longer than the average waiting time between two consecutive trades-through.
- Trades-through at the ask and at the bid are both more closely followed in time by trades-through (whatever their sign) than ordinary trades at the bid and at the ask are.
- There seems to be a cross-side clustering effect: a trade-through on one side of the book is more closely followed in time by a trade-through on the other side of the book than an ordinary trade is.
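The waiting-time comparisons above can be checked on any trade tape with a few lines of code. The sketch below assumes the tape has already been reduced to sorted timestamps plus a trade-through flag per trade; that input format and the function name are my own choices, not the layout of the published dataset.

```python
from bisect import bisect_right


def mean_wait_to_next_trade_through(times, is_tt):
    """Average time until the next trade-through, split by the starting event.

    times : sorted trade timestamps (e.g. seconds since the open)
    is_tt : booleans, True where the trade is a trade-through
    Returns (mean wait after a trade-through, mean wait after an ordinary trade).
    """
    tt_times = [t for t, flag in zip(times, is_tt) if flag]
    after_tt, after_other = [], []
    for t, flag in zip(times, is_tt):
        i = bisect_right(tt_times, t)   # first trade-through strictly after t
        if i == len(tt_times):
            continue                    # no later trade-through on this day
        (after_tt if flag else after_other).append(tt_times[i] - t)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(after_tt), mean(after_other)
```

If trades-through cluster, the first number should come out smaller than the second on real data. Splitting the flags by side (bid or ask) gives the cross-side version of the same comparison.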

**Modeling and Calibration**
The authors fit a bivariate Hawkes process to the trades-through on the ask side and the bid side. Four variants of the model are tested in the paper:

1. Full model (self- and cross-excitation) with a constant baseline intensity for ask and bid trades-through
2. Model without cross-excitation terms, with a constant baseline intensity for ask and bid trades-through
3. Full model (self- and cross-excitation) with a piecewise-linear baseline intensity for ask and bid trades-through
4. Model without cross-excitation terms, with a piecewise-linear baseline intensity for ask and bid trades-through

The parameters of each of the four models are estimated for each trading day and aggregated across the 109 days. The major finding is that there is no cross-excitation effect, despite the cross-side clustering suggested by the summary statistics.
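To give a feel for what is being calibrated, here is a rough sketch of the log-likelihood of one component (say, ask trades-through) of a bivariate Hawkes process. It assumes exponential kernels and a constant baseline, so the ask intensity is `mu + sum(a_self * exp(-b_self * (t - s)))` over past ask events plus `sum(a_cross * exp(-b_cross * (t - s)))` over past bid events. This summary does not spell out the paper's kernel, so the functional form and every parameter name below are assumptions for illustration.

```python
import math


def excitation(t, events, alpha, beta):
    """Sum of alpha * exp(-beta * (t - s)) over events s strictly before t."""
    return sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)


def log_likelihood(own, other, horizon, mu, a_self, b_self, a_cross, b_cross):
    """Log-likelihood of one component observed on [0, horizon].

    own   : event times of the component being modeled (e.g. ask side)
    other : event times of the other component (e.g. bid side)
    Setting a_cross = 0 gives the variant without cross-excitation.
    """
    # Sum of log-intensities at the component's own event times.
    log_sum = sum(
        math.log(mu
                 + excitation(t, own, a_self, b_self)
                 + excitation(t, other, a_cross, b_cross))
        for t in own
    )

    # Integral of the intensity over [0, horizon] (the compensator).
    def integrated(events, alpha, beta):
        return sum(alpha / beta * (1.0 - math.exp(-beta * (horizon - s)))
                   for s in events if s < horizon)

    compensator = (mu * horizon
                   + integrated(own, a_self, b_self)
                   + integrated(other, a_cross, b_cross))
    return log_sum - compensator
```

Maximizing this function over the parameters, once per side and per day, is one way to calibrate the constant-baseline variants. The piecewise-linear baseline is not covered by this sketch. The double loop costs quadratic time in the number of events, which is fine for a toy example but slow on a full trading day.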
**Goodness-of-Fit**
The authors perform the following two goodness-of-fit tests for each of the trades-through processes (ask and bid) for each day:

- Testing whether the interarrival times of the time-changed process fit an exponential distribution, via the standard Kolmogorov-Smirnov test
- Testing whether the interarrival times of the time-changed process are independent, via the Ljung-Box test

The authors conclude that the univariate Hawkes process (that is, the model without cross-excitation) with a piecewise-linear baseline intensity fits the trades-through on the bid and ask sides better than the other models considered in the paper.
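Both diagnostics rest on the time-change idea: if the fitted intensity is right, the integrated intensity between consecutive events should behave like independent draws from a unit-rate exponential distribution. The sketch below computes those residuals and the two test statistics for a univariate Hawkes process, assuming an exponential kernel `alpha * exp(-beta * u)` and a constant baseline `mu`. The kernel form and all names are assumptions on my part, and the piecewise-linear baseline would need its own integral.

```python
import math


def compensator_increments(events, mu, alpha, beta):
    """Integrated intensity between consecutive (sorted) event times."""
    increments = []
    history = 0.0  # decayed sum of kernel contributions from past events
    for prev, cur in zip(events, events[1:]):
        history += 1.0                   # the event at `prev` joins the history
        dt = cur - prev
        decay = math.exp(-beta * dt)
        increments.append(mu * dt + alpha / beta * history * (1.0 - decay))
        history *= decay                 # age the history to time `cur`
    return increments


def ks_statistic_exp1(residuals):
    """Kolmogorov-Smirnov distance between the residuals and Exp(1)."""
    xs = sorted(residuals)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def ljung_box_q(residuals, lags):
    """Ljung-Box Q statistic using autocorrelations up to lag `lags`."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [x - mean for x in residuals]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, lags + 1):
        rho = sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        q += rho * rho / (n - k)
    return n * (n + 2) * q
```

Under the null hypothesis, Q is compared with a chi-square distribution with `lags` degrees of freedom, and for large samples the 5% critical value of the Kolmogorov-Smirnov statistic is roughly 1.36 divided by the square root of the sample size. If SciPy is available, `scipy.stats.kstest(residuals, "expon")` returns the Kolmogorov-Smirnov p-value directly.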
The paper models the trades-through for BNP Paribas stock over a period of 109 trading days. An empirical analysis of self-excitation and cross-excitation motivates the authors to test a multivariate Hawkes model for the trades-through on the bid and ask sides. Four variants of the Hawkes model are fitted to the data. For each of the four models, for each day, two diagnostic tests are applied to the ask trades-through process and to the bid trades-through process, giving 4 tests per day per model. These diagnostic tests are aggregated across the 109 days. The authors find that, out of the four models, the univariate Hawkes process with a piecewise-linear baseline intensity seems to fit the data better than the others.