Trade Classification : A Bayesian Approach
The paper “Discerning Information from Trade Data” by David Easley, Marcos Lopez de Prado and Maureen O’Hara gives a Bayesian framework for trade classification. The most popular method for classifying a trade as a buy or a sell is the “tick test”. The authors introduce Bulk Volume Classification (BVC) and empirically test its performance vis-à-vis the tick test. In this post, I will briefly summarize the paper.
**Introduction**
With the advent of HFT, the markets have changed completely. Order cancellations and modifications have shot up compared to yesteryear. Execution algos chop orders into pieces and send them to the exchange, and hence it is order flow rather than individual orders that relates to trade motivation. Also, with strategies such as “persistent bidder”, where the aggressive trader uses limit orders to trade, the vital link between informed traders and aggressive traders is lost. All these developments severely undermine algos that infer trade direction from a single trade.
If one thinks about trading intention, it is clearly unobservable, and hence a Bayesian approach seems very logical. Have a prior on the unobservable, look at the data, and then formulate a posterior on the unobservable: it sounds good on paper, but it is inherently difficult in a practitioner’s world. Why?

Unless the observable data is perfectly informative, Bayesian calculations yield probabilities rather than point estimates.

Computing closed-form solutions for the conditional probabilities would be complex.

It is likely that the observations are not independent given the underlying unobservable data, and the correlation structure may be complex.
Even with these problems in mind, avoiding distributional assumptions altogether, which is what the tick test does, might not be a reasonable idea either. The authors reason that in a noisy data world, a Bayesian approach is likely to yield a better solution, despite its crude assumptions. The paper tests the performance of BVC and the tick test on high-frequency datasets. Testing a classification method is obviously problematic, since no one knows the true intention behind a trade. Hence the authors use three proxies:

Aggressor side of trading, given by the buy/sell indicator flags in the data

Estimate of spreads

Permanent price effects of trades
**What’s the idea behind BVC?**
BVC is a coarser estimate, in the sense that it works on a time interval or volume interval, computes the standardized price change over the interval, and uses the Student’s t distribution to split the total volume in the time bar or volume bar into buy volume and sell volume. Why the t distribution? The authors reason that it is far more flexible and parsimonious than other distribution functions. The BVC procedure splits the volume in a bar equally between buy and sell volume if there is no price change from the beginning to the end of the bar. If the price increases, the volume is weighted more towards buys than sells, depending on how large the price change is in absolute terms relative to the distribution of price changes.
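The splitting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function names `student_t_cdf` and `bulk_volume_classify`, the value `df=3.0`, and the use of the standard deviation of the bar price changes as the scale are all choices made for this example. The t CDF is obtained by numerically integrating the density, so that only the standard library is needed.

```python
import math
import statistics


def student_t_cdf(x, df, steps=2000):
    """Student's t CDF, via Simpson's rule on the density over [0, |x|]."""
    if x == 0:
        return 0.5
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    upper = abs(x)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |x|
    return 0.5 + math.copysign(area, x)


def bulk_volume_classify(bars, df=3.0):
    """Split each bar's volume into (buy, sell) from its price change.

    bars: list of (price_change, volume) tuples, one per time or volume bar.
    df:   degrees of freedom of the t distribution (an assumed value).
    """
    changes = [dp for dp, _ in bars]
    sigma = statistics.pstdev(changes)  # scale used to standardize price changes
    result = []
    for dp, volume in bars:
        z = dp / sigma if sigma > 0 else 0.0
        buy = volume * student_t_cdf(z, df)
        result.append((buy, volume - buy))
    return result


# Hypothetical bars: (price change over the bar, total volume in the bar)
bars = [(0.0, 100), (0.5, 100), (-0.5, 100), (2.0, 50)]
for (dp, volume), (buy, sell) in zip(bars, bulk_volume_classify(bars)):
    print(f"dP={dp:+.1f} volume={volume} buy={buy:.1f} sell={sell:.1f}")
```

A bar with no price change is split 50/50, and the buy share grows with the size of the standardized price increase, which is the behaviour the paragraph above describes.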
**How does one test a coarse estimate vs. a fine estimate?**
BVC is based on a time interval or volume interval. The tick test is based on a single trade. How does one compare the performance of one with the other? The authors employ two strategies here:

Use BVC to classify a bar that contains a single trade. The probability of a buy or a sell is then assigned to a single trade and can be compared with the buy or sell indicator from the tick rule.

Aggregate the tick test over an interval and then compare its performance with the BVC indicator for the same interval.
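The second strategy can be sketched as follows. The post does not spell out the tick test, so this assumes its usual textbook form: an uptick is labelled a buy, a downtick a sell, and a trade at an unchanged price inherits the previous label. The function names and the `initial_side` default for the first trade are hypothetical choices for this example.

```python
def tick_rule(prices, initial_side=1):
    """Label each trade +1 (buy) or -1 (sell) from the sign of the price change.

    A trade at an unchanged price inherits the previous label; the label for
    the first trade (initial_side) is an arbitrary assumption.
    """
    sides = []
    last_side = initial_side
    prev_price = None
    for price in prices:
        if prev_price is not None:
            if price > prev_price:
                last_side = 1
            elif price < prev_price:
                last_side = -1
        sides.append(last_side)
        prev_price = price
    return sides


def aggregate_tick_rule(prices, volumes, initial_side=1):
    """Sum tick-rule labels over one bar into (buy_volume, sell_volume)."""
    sides = tick_rule(prices, initial_side)
    buy = sum(v for s, v in zip(sides, volumes) if s > 0)
    sell = sum(v for s, v in zip(sides, volumes) if s < 0)
    return buy, sell


# Hypothetical trades making up a single bar
prices = [100.00, 100.25, 100.25, 100.00, 100.00, 100.50]
volumes = [5, 3, 2, 4, 1, 6]
print(tick_rule(prices))                     # per-trade labels
print(aggregate_tick_rule(prices, volumes))  # bar-level buy and sell volume
```

The bar-level pair returned by `aggregate_tick_rule` is directly comparable with the buy and sell volumes that BVC assigns to the same bar.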
Using simple distributional assumptions, the authors make the following remarks :

The noisier the data, the more problematic the tick test becomes.

If the data is not too noisy, the tick test performs better than BVC on a trade-by-trade basis.

If all trades yield price changes of the same absolute size, varying between negative for sells and positive for buys, then the aggregate tick rule and bulk volume statistic both reveal the actual numbers of buys and sells in a bar.

The aggregate tick rule underestimates the probability of buys when buys are more likely than sells, and overestimates it when buys are less likely than sells.
**Data:**
The authors use E-mini S&P 500 futures, gold futures and WTI crude oil futures to test the performance of BVC. The dataset for the E-mini S&P 500 futures contract runs from November 7th 2010 to November 6th 2011 and contains 128 million trades. The WTI crude oil futures dataset has ~78 million trades, and the gold futures dataset has ~27 million trades. All these trades are characterized by small lot sizes. These are big datasets, and the authors do not shy away from sharing their experience of dealing with data at this scale.
In the authors’ words: “In the context of this paper, we will always refer to version 2.19, dated 12/09/11. This level 3 data was purchased directly from the CME, and was delivered as 357 zip files containing 2272 flat files. This represents about 21.6GB of compressed data, and about 220GB uncompressed. We mention these numbers to signal the difficulty of working with this data using standard commercial packages.”
**Problems arising from using the aggressor proxy**
Using 2013 data from VWAP algorithmic orders, O’Hara (2014) shows that 87% of the executed child orders were passive, meaning that a parent order to buy would largely turn into trades classified by the aggressor flag as sales. Thus, trading intentions may not be well captured by the aggressor flag.
**Test results:**

Tick test performance degrades as one moves from equity futures to gold futures to oil futures.

Accuracy ratios, as expected, are higher for the aggregate tick rule than for the tick rule on a trade-by-trade basis. This is due to the offsetting of errors that occurs within bars: the larger the bar size, the greater the accuracy rate.

Accuracy ratios are higher for volume bars than for time bars.

BVC accuracy rates are generally higher than the trade-by-trade accuracy of the tick rule.

When noise is high, BVC can have smaller errors than a tick rule; when noise is low tick rules can be more accurate

BVC can produce reasonably accurate trade classifications relative to the aggressor flag, and it is most useful when there is substantial noise in the data.

BVC also requires much less data, and so may be particularly useful for actively traded securities.

The tick rule is successful in creating a measure of the aggressor side of trading, but it is not successful in creating a measure of informed trading because the aggressor side of trading itself is not a good measure of informed trading.

BVC imbalances have greater correlation with daily price changes than tick rule imbalances have with daily price changes.

The tick rule is a reasonably good classifier of the aggressor side of trading, both for individual trades and in aggregate. Bulk volume is also shown to be reasonably accurate for classifying buy and sell trades, but unlike the tick-based approaches, it can also provide insight into other proxies for underlying information.

BVC is useful in the presence of noisy data

A major advantage of BVC is that one need not analyze every tick. One can group the ticks into volume bars and assign a buy or sell probability to each bar. This cuts down the computational effort massively.

Informed trading is not well captured by the aggressor side of the trade due to advances in algorithmic and high frequency trading strategies
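The offsetting effect noted in the test results, where misclassifications inside a bar cancel each other out, can be illustrated with a toy example. The accuracy measure used here (the share of volume assigned to the same side as the aggressor flag) is an assumed, simplified definition chosen for this sketch; the function names and the data are hypothetical and are not results from the paper.

```python
def trade_accuracy(true_sides, est_sides, volumes):
    """Share of volume whose estimated side matches the flagged side, trade by trade."""
    matched = sum(v for t, e, v in zip(true_sides, est_sides, volumes) if t == e)
    return matched / sum(volumes)


def bar_accuracy(true_sides, est_sides, volumes):
    """Share of bar volume correctly split once errors inside the bar offset."""
    total = sum(volumes)
    true_buy = sum(v for t, v in zip(true_sides, volumes) if t > 0)
    est_buy = sum(v for e, v in zip(est_sides, volumes) if e > 0)
    # Overlap between the true and the estimated buy/sell split of the bar
    overlap = min(true_buy, est_buy) + min(total - true_buy, total - est_buy)
    return overlap / total


# Hypothetical bar: the first two trades are misclassified in opposite directions
true_sides = [1, -1, 1, -1]   # aggressor flags (+1 buy, -1 sell)
est_sides = [-1, 1, 1, -1]    # labels from some classifier, e.g. a tick rule
volumes = [10, 10, 10, 10]

print(trade_accuracy(true_sides, est_sides, volumes))  # half the volume is mislabelled
print(bar_accuracy(true_sides, est_sides, volumes))    # the two errors cancel in the bar
```

Two opposite errors leave the bar’s buy and sell totals unchanged, so the bar-level figure is higher than the trade-level one even though nothing about the classifier has improved.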