Choosing the Best Volatility Models

The paper, “Determining the best forecasting models”, is about testing 55 models that belong to the GARCH family. If we have just one model and a straw model, it is easy to show, via some statistic on the test sample, that the hypothesized model is superior. How do we go about testing a set of competing models? There are many wonderful techniques in the Bayesian world. However, this paper is more frequentist in nature. It uses a method called “Model Confidence Set” for deciding the best forecasting models. This is akin to forming a confidence interval for a parameter rather than a point estimate. What’s the advantage of the Model Confidence Set?

Liquidity considerations in estimating implied volatility

The paper titled, “Liquidity considerations in estimating implied volatility”, by Susan Thomas and Rohini Grover, is about a new way of constructing a volatility index by weighting the implied volatilities of options according to the relative spreads at various strikes. The key idea behind the paper is that there is considerable liquidity asymmetry across various strikes for the near month and mid month contracts on NIFTY options. This leads the authors to hypothesize a measure that is based on weighting implied volatilities. The other indices discussed in the paper are VXO, VVIX and EVIX. The obvious question is, “How should these indices be evaluated, given that the volatility is unobserved?”

Efficient Estimation of Volatility using High Frequency Data

The paper titled, Efficient Estimation of Volatility using High Frequency Data, is about testing a set of volatility estimators using high frequency data.  I will attempt to briefly summarize the paper. 

For a person working in the financial markets, there is not a day that goes by without hearing the word “volatility”. Yet, it is something that is not observed. If you assume that stocks follow some random process like a GBM, then the relevant question to ask is, “How does one estimate the diffusion parameter/process in the model?” One of the principles from classical statistics, minimal sufficient statistics, says that, for estimating the volatility, every increment in the price process is needed. This means that any discrete sampling implies loss of information.
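A minimal sketch of this point, assuming a GBM with constant drift and volatility. The function names, the one-minute step and the parameter values are illustrative choices and do not come from the paper. The diffusion parameter is estimated from the log-price increments, first using every increment and then using a coarser sampling of the same path.

```python
import math
import random
import statistics

def simulate_gbm_log_returns(mu, sigma, dt, n_steps, seed=0):
    """Simulate log-price increments of a GBM over steps of length dt (in years)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    return [drift + vol * rng.gauss(0.0, 1.0) for _ in range(n_steps)]

def estimate_sigma(log_returns, dt):
    """Annualized volatility estimate: sample std of increments scaled by sqrt(dt)."""
    return statistics.stdev(log_returns) / math.sqrt(dt)

true_sigma = 0.2
dt = 1.0 / (252 * 390)  # assumption: one-minute steps, 390 minutes per trading day

returns = simulate_gbm_log_returns(mu=0.05, sigma=true_sigma, dt=dt, n_steps=20000)

# Estimate from every increment.
fine_estimate = estimate_sigma(returns, dt)

# Estimate from a coarser sampling: aggregate 20 increments into one return.
coarse_returns = [sum(returns[i:i + 20]) for i in range(0, len(returns), 20)]
coarse_estimate = estimate_sigma(coarse_returns, 20 * dt)

print(fine_estimate, coarse_estimate)
```

Both estimates target the same sigma, but the coarse one is built from 1,000 observations instead of 20,000, so it is the noisier of the two over repeated simulations.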

Bootstrap method for robust inference

The paper titled, Regression analysis with many specifications, uses the stationary bootstrap method to evaluate a large set of models.

In a typical data mining set up, the problem of choosing the number of covariates can be handled in many ways such as

  • Best subset selection

  • Forward and/or backward regression

  • Forward stage wise regression

  • Lasso

  • Ridge regression

  • Combination of Lasso and Ridge regression

In each of the above cases, the assumption is that covariates are independent. In a time series setting where regressors are lagged time series, the assumption of independence is obviously weak. Hence these methods might have limited applicability. Having said that, I guess that there are researchers working in ML/DM areas who are trying to refine methods so that the broad range of techniques available in the data mining field could be applied in econometrics. Ok, now coming back to the paper,

Consistent High-Precision Volatility from High Frequency Data

The paper titled, “Consistent High-Precision Volatility from High Frequency Data”, talks about the trade off that one needs to make while considering sampling frequency. If you increase sampling frequency, the measurement error goes down but microstructure noise increases. If you decrease sampling frequency, the microstructure noise decreases but the measurement error goes up.

Researchers in the past have suggested the usage of 10min/20min/xmin intervals based on some visual tools that have fancy names such as volatility signature plots. The authors of the paper argue that there is a flaw in using such tools that work on homogenized time series. If such tools are used in tick time, the sampling frequency becomes so low that it discards most of the HFD.
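The trade-off between sampling frequency and microstructure noise can be sketched with a small simulation. This is not the paper's method. It assumes an efficient log price that follows a random walk, observed with i.i.d. additive noise, and all names and parameter values are made up for illustration. Computing realized variance at several sampling intervals gives the numbers that a volatility signature plot would display.

```python
import math
import random

def simulate_noisy_log_prices(n_ticks, daily_vol, noise_std, seed=0):
    """Efficient log price (random walk) observed with i.i.d. microstructure noise."""
    rng = random.Random(seed)
    step_std = daily_vol / math.sqrt(n_ticks)
    efficient = 0.0
    observed = []
    for _ in range(n_ticks + 1):
        observed.append(efficient + rng.gauss(0.0, noise_std))
        efficient += rng.gauss(0.0, step_std)
    return observed

def realized_variance(log_prices, every):
    """Sum of squared returns when one price is sampled every `every` ticks."""
    sampled = log_prices[::every]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))

# Assumptions: one tick per second over a 6.5 hour day, 1% daily volatility,
# and noise with a standard deviation of 5 basis points.
prices = simulate_noisy_log_prices(n_ticks=23400, daily_vol=0.01, noise_std=0.0005)
true_variance = 0.01 ** 2

for every in (1, 5, 30, 300, 1800):
    ratio = realized_variance(prices, every) / true_variance
    print(every, ratio)
```

At the highest frequency the noise dominates and realized variance is far above the true daily variance. At the lowest frequencies the noise contribution is small but the estimate rests on only a handful of returns.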

Mathematical Techniques in Finance : Review

Books on derivative pricing come in all shades and colors. Some books give a brief introduction of derivatives at a leisurely pace and then suddenly the content becomes very mathematical. There are some books that have theorems and lemmas all through. There are some books that talk about risk-neutral pricing giving very little intuition about the concept. In the gamut of books available, I think this book stands out for a couple of reasons. The fact that we all live in incomplete markets is addressed right from the beginning. This immediacy has the reader’s attention right away. If the markets are incomplete, i.e., there are more state variables than instruments, how does one hedge an exposure? Can there be a perfect hedge? If not, how does one compare two or more hedging options? These are very practical questions for an options trader. An options trader intuitively knows that the perfect option hedge taught in grad school is an idealized scenario that holds good only under a ton of assumptions. The real world is messy. Pick up any book where Black Scholes is derived; in 9 out of 10 books, you will see measure theory as a prerequisite to understanding the content. This book, though, does not have math-heavy prerequisites, as most of it can be read with a knowledge of linear algebra, elementary calculus and probability. So, in a way, this book can be read by a wider audience. Even though this book uses many numerical simulations, the author also believes that

Quants: The new risk takers of finance

Via efincareers :

Quant traders working in investment banking are not happy. Squeezed by regulations that curb investment banks’ prop-trading activities and by cost-cutting that means that pre-crisis compensation packages have been consigned to history, job dissatisfaction is at an all-time low, according to industry observers.

Quantitative PhDs who would have usually gravitated towards high-paying roles in the financial sector are looking for alternative career paths, while those already working in banking are seeking to move on.

Data Dredging

Stumbled on to an interesting comment on Cross Validated which I think is a nice way to warn against using techniques such as best subset regression, forward step regression, backward step regression etc.

Wanting to know the best model given some information about a large number of variables is quite understandable. Moreover, it is a situation in which people seem to find themselves regularly. In addition, many textbooks (and courses) on regression cover stepwise selection methods, which implies that they must be legitimate. Unfortunately however, they are not, and the pairing of this situation and goal are quite difficult to successfully navigate. The following is a list of problems with automated stepwise model selection procedures (attributed to Frank Harrell, and copied from here):

Goldman’s desperate attempt

Via NYMag:

In news that is sure to stir hearts across the country, Goldman Sachs has decided to give a $15,000 annual raise to next year’s class of analysts (junior bankers typically just out of college, usually about 22 to 24 years old) as part of an effort to retain and attract top-level talent.

It’s an overdue move, and not all that surprising. Life as a young banker on Wall Street is a fairly miserable experience, and many young bankers are understandably fleeing the cramped bullpens of Manhattan for the free snacks and treadmill desks of San Francisco. Paying bankers more is one way to fight that attrition. But it’s not going to solve the biggest problem Goldman has.

High Frequency Manipulation at Futures Expiry

Here is a paper by an IIMA working group that argues for automatic detection of market manipulation near expiry.

The essence of the paper is that SEBI has to detect fraud and punish the manipulators, rather than putting in place measures to prevent fraud (which have proven inadequate in general). This paper is inspired by an episode in the Indian market where a group of entities resorted to manipulative trading and were later barred by SEBI from trading in the capital markets.

An Introduction to Information Theory

Link to my book review

Takeaway:

We see/hear/talk about “Information” in many contexts. In the last two decades or so, one can also go and make a career in the field of “Information” technology. But what is “Information”? If someone talks about a certain subject for 10 minutes in English and 10 minutes in French, is the “Information” the same in both instances? Can we quantify the two instances in some way? This book explains Claude Shannon’s remarkable achievement of measuring “Information” in terms of probabilities. Almost 50 years ago, Shannon laid out a mathematical framework and it was an open challenge for engineers to develop devices and technologies that Shannon proved as a “mathematical certainty”. This book distils the main ideas that go into quantifying information with very little math and hence makes it accessible to a wider audience. A must read if you are curious about knowing a bit about “Information”, which has become a part of everyday vocabulary.

Why econometricians need to learn new tricks

Here is a write-up by Hal R. Varian that talks about the deluge of data that makes it imperative for econometricians to equip themselves with  “big data” skills. Some of the main points of the write-up are :

  • Machine learning techniques such as decision trees, support vector machines, neural nets, deep learning and so on may allow for more effective ways to model complex relationships

  • Data cleaning tools such as OpenRefine and Data Wrangler can be used to assist in data cleansing

A Mind for Numbers

This book is mainly targeted at high school / college kids who feel their learning efforts are not paying off, teachers who are on the lookout for effective instruction techniques, and parents who are concerned with their child’s academic results and want to do something about it.

The author of the book, Dr. Barbara Oakley, has an interesting background. She served in the US Army as a language translator before transitioning to academia. She is now a professor of engineering at Oakland University in Rochester, Michigan. In her book, she admits that she had to completely retool her mind. A person who was basically into artsy kind of work had to read hard sciences to get a PhD and do research. Needless to say, the transition was a frustrating experience. One of her research areas is neuroscience where she explores effective human learning techniques. The author claims that her book is essentially meant to demystify some of the common notions that we all have about learning.

Matrix Algebra : Theory, Computations, and Applications in Statistics

We often come across mathematical expressions represented via matrices and assume that numerical calculations happen exactly the way the expressions appear. Let’s take for example

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

These are the well known “normal equations” to compute regression coefficients. One might look at this expression and conclude that the code that computes beta inverts the Gramian matrix X^T X and then multiplies the inverse with X^T y. Totally false. Why? The condition number of the Gramian matrix X^T X equals the square of the condition number of X. The higher the condition number of the matrix, the more numerically unstable the solution. This is the recurrent theme of the book by James E. Gentle.
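A small sketch of why forming the Gramian is risky, using a classic textbook-style example. The matrix X and the value of eps below are illustrative assumptions and are not taken from Gentle's book. The columns of X are clearly linearly independent, yet in double precision the Gramian comes out exactly singular, so inverting it is impossible.

```python
# X has two linearly independent columns, but they are nearly parallel.
eps = 1e-8

X = [
    [1.0, 1.0],
    [eps, 0.0],
    [0.0, eps],
]

def gram(matrix):
    """Return the Gramian (matrix^T matrix) as a nested list."""
    n_cols = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(n_cols)]
        for i in range(n_cols)
    ]

G = gram(X)

# Exact arithmetic gives diagonal entries 1 + eps**2, but eps**2 = 1e-16 is
# below the spacing of doubles near 1, so it is rounded away.
det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]

print(G)
print(det_G)
```

Squaring the condition number pushes the small singular value of X below machine precision. Methods that work on X directly (QR or SVD based least squares) avoid this loss.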

Axler revisited

I was looking for something in my old stack of books when I stumbled on to Sheldon Axler’s fantastic book, “Linear Algebra Done Right”. I have fond memories about the book. I think the last time I referred to this book was more than 3.5 years ago. Took a few hours to go over the book again. Like wine that tastes better when aged, I think some books also give the same kind of effect, at least to me. A legitimate understanding of ANY discipline needs one to have a good grip on Linear Algebra. The more randomness in the field you choose to work in, the more you will use linear algebra, as you are forever looking for an approximate solution.

Curse of Dimensionality

Our intuition does not serve well in high dimensional spaces. Hence there are a few issues with using nearest neighbor methods on high dimensional data. Firstly, methods that capture a fixed neighborhood around the points give a high variance for the fit. Secondly, if you relax the fixed neighborhood criterion and try to capture a specific number of neighbors, the methods are no longer local. Hence it pays to think through these issues on whatever dataset you are working on. You might expect a low variance fit but the curse of dimensionality shows up and you get a high variance fit.
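The "no longer local" point can be made concrete with a standard back-of-the-envelope calculation. The setup is an assumption for illustration: data uniformly distributed in a unit hypercube, and a sub-cube neighborhood that must capture a given fraction of the points. The function name is hypothetical.

```python
def edge_length(fraction, dim):
    """Edge of a sub-cube that holds `fraction` of uniform data in a unit cube."""
    return fraction ** (1.0 / dim)

# Edge length needed to capture 1% of the data as the dimension grows.
for dim in (1, 2, 10, 100):
    print(dim, round(edge_length(0.01, dim), 3))
```

In 10 dimensions, capturing just 1% of the data already requires covering about 63% of the range of each coordinate, so the "neighborhood" is hardly local.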

Mumbai street lamps

via humansofmumbai :

Studying here in Poddar galli (Abhyaas Gali Path) under a street lamp is my choice and its not only mine but for many students who come here in the night to study, the main reason being the environment and the peace which this lane gives me. This place is silent and one can study whatever one want, one don’t see the time, whenever one wants to come, one comes and there is undisturbed studying here.

Optimal Liquidation

For any sell side trader, optimal liquidation is his bread and butter. Given any client order, the trader has to compete against two forces, market impact and volatility risk.

  • Market impact : If a large trade is executed too rapidly, costs will be incurred as the trades move the market in an adverse direction

  • Volatility risk : On the other hand, if the trade is executed too slowly, the position is subject to risk during the time that the shares remain in the portfolio.

Google trends : Proxy in State Space modeling

Till date, I must confess that I have never read a “marketing” paper. So, when one of my friends wanted my comments on a paper published in JMR, June 2014, titled “Decomposing the Impact of Advertising: Augmenting Sales with Online Search Data”, I thought I might encounter a lot of marketing jargon and might be put off. Thankfully there is a lot less of it in the paper.

The paper is very interesting as it uses Google trends as a proxy for consumer pre-purchase interest. Historically most of the models that have been built have never decomposed sales data into a pre-purchase component and a conversion rate component. The reason being that the data relating to pre-purchase activity was tough to obtain and had all sorts of problems. Thanks to Google trends, one can get all the real time data that one wants. So, the authors of the paper use Google trends data relating to automobile purchase queries and use that as a proxy for modeling the latent state variable, “consumer interest in prepurchase information”. The authors analyze 21 cars in 4 segments and conclude some cool things which would not have been possible without decomposing sales into various components.

Is Deliberate Practice hyped

Stumbled on to a paper titled, “Deliberate practice: Is that all it takes to become an expert?” that takes a hard look at the much talked about and written about principle of deliberate practice. Indeed there has been a cottage industry of books/blogs/articles that seem to propagate that “only practice matters and nothing else”. The paper does some number crunching and concludes that the whole idea might be a fad that just caught on. Yes, the authors who wrote about it and talked about it have made fortunes. But does it really deliver? Reading the paper does give one a dose of reality check.

Temporal Aggregation of GARCH Processes

The paper titled, Temporal Aggregation of GARCH processes, by Drost and Nijman is a classic paper that introduces three forms of GARCH processes: Strong form of GARCH, Semi-strong form of GARCH and Weak form of GARCH. Only the Weak form of GARCH is appropriate for connecting volatility estimates and parameters of models built at various frequencies. In most of the literature on volatility estimation from high frequency data, the authors assume the Weak form of GARCH.

Spectral Analysis of Time Series Data : Summary

Link to book summary

Takeaway:

Most of the phenomena in our world are periodic in nature. Yet, econometric courses at undergraduate / graduate level inevitably start from the time domain instead of the frequency domain. This book is a great introduction for someone looking to get an overview of frequency domain analysis. All the principles are explained from a regression standpoint in the initial chapters. Discrete Fourier Transforms are gradually introduced to connect various ideas. Both univariate and multivariate time series are dealt with at length with just enough mathematical rigor.

Intraday periodicity and volatility persistence in financial markets

The paper titled, “Intraday periodicity and volatility persistence in financial markets”, by Andersen and Bollerslev is a 44 page analysis on volatility modeling and has close to 75 references. This paper is one of the widely quoted papers on intraday volatility modeling. In this post, I will give a brief summary of the main sections of the paper.

Introduction

Return volatility varies systematically over the trading day and this pattern is highly correlated with the intraday variation of trading volume and bid-ask patterns. The authors of the paper conjecture that intraday return dynamics is neglected primarily because the standard time series models of volatility have proven inadequate when applied to high frequency returns data. The paper demonstrates that the difficulties encountered by standard volatility models arise largely from the systematic patterns of average volatility across the trading day and explains a method to estimate and extract the intraday periodic component of return volatility. The datasets used by the authors pertain to intraday data and interdaily data for two assets, one from the forex OTC market and the other an equity index futures contract. Most of the models used for intraday volatility modeling appear puzzling and in stark contrast to the aggregation studies. One does not find any relationship amongst the parameters of models built at different scales. The authors remark that

Stylized Facts

The paper titled, “Empirical properties of asset returns: stylized facts and statistical issues”, by Rama Cont presents 11 stylized facts applicable to a wide set of assets that should always be in a quant’s working memory. These stylized facts should shape one’s thinking in building financial models.

Stylized facts are the statistical properties of asset prices that are common across a wide range of instruments, markets and time periods. They are usually formulated in terms of qualitative properties of asset returns and may not be precise enough to distinguish among different parametric models. Nonetheless, these stylized facts are so constraining that it is not easy to exhibit even an ad hoc stochastic process which possesses the same set of properties and one has to go to great lengths to reproduce them with a model.

Security Bid/Ask Dynamics with Discreteness and Clustering

Joel Hasbrouck, in his paper “Security Bid/Ask Dynamics with Discreteness and Clustering”, uses Gibbs sampling for estimating the parameters of a stylized market microstructure model. For any model, there are many ways to estimate parameters. One of the common methods is the likelihood approach. Even though this approach makes sense intuitively, the computational complexity explodes as the number of parameters increases. The curse of dimensionality kicks in and hence parameter estimates become notoriously unstable. On the other hand, estimation methods based on MCMC scale linearly with the number of parameters. Thus MCMC becomes an important technique for dealing with the curse of dimensionality.

Understanding the Kalman Filter

When I first encountered the Kalman Filter technique, I was overwhelmed by the ton of approaches taken by various authors to explain it. It can be explained using engineering vocabulary but I wanted to understand it from a stats point of view. One typically reads either the Frequentist approach (where the multivariate normal distribution is used to derive all the formulae) or the Bayesian approach, where the usual prior-posterior stuff is used to derive the Kalman Filter. Irrespective of whatever approach one comes across, unless one derives the Kalman Filter using pen and paper, it is tough to understand what’s going on.
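Alongside the pen and paper derivation, it can help to see the recursion in its simplest form. Below is a minimal sketch of a scalar Kalman filter for a local level model. The model, function name and parameter values are assumptions chosen for illustration and are not tied to any particular derivation. Read in the Bayesian way, the predict step forms the prior for the next state and the update step turns it into the posterior once an observation arrives.

```python
import random

def kalman_filter_local_level(observations, q, r, x0, p0):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, y_t = x_t + v_t.

    q and r are the variances of w_t and v_t; x0 and p0 are the prior
    mean and variance of the state.
    """
    x, p = x0, p0
    means, variances = [], []
    for y in observations:
        # Predict: the state is a random walk, so only the uncertainty grows.
        p = p + q
        # Update: blend prediction and observation using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p
        means.append(x)
        variances.append(p)
    return means, variances

# Noisy observations of a level that is (almost) constant at 5.0.
rng = random.Random(42)
true_level = 5.0
ys = [true_level + rng.gauss(0.0, 1.0) for _ in range(200)]

means, variances = kalman_filter_local_level(ys, q=1e-5, r=1.0, x0=0.0, p0=100.0)
print(means[-1], variances[-1])
```

With a vague prior (large p0), the first gain is close to 1, so the filter jumps to the first observation and then gradually trusts its own prediction more as the posterior variance shrinks.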

Make it Stick : Summary

In today’s world, parents are extremely observant about how their children are learning. Be it academics or music or sport or any other field that the child has developed a semblance of liking for, the parent gives and seeks all the guidance available to make his/her kid’s learning process effective. Given the hyperconnected instant gratification world that we are all living in, kids left to their own devices become just that, in the literal sense. Their lives are surrounded by a world of devices (cell phones, gaming consoles, ipod, ipad, etc.) and naturally they develop an affinity towards them. One doesn’t need some academic research to infer that attention spans are going down across all age groups, more so in children. In such an environment, can parents or teachers be confident that the children develop thinking and meta-thinking (thinking about how they are thinking) skills to become effective learners?

Standard Volatility models do work!

The paper titled, “Answering the Skeptics : Yes, Standard Volatility models do provide accurate forecasts” is a classic paper on volatility modeling by Andersen and Bollerslev.

What’s this paper about? If you build a volatility model, how do you go about testing it? This is the key question answered in the paper.

At a daily frequency or intraday frequency, returns do not show serial correlation. However, there is serial dependency amongst them. This needs to get reflected in one way or the other. Among the most popular models in finance is the ARCH family of models. These models are characterized by two equations, a mean return equation and a conditional variance equation. I guess before this paper came out, many researchers had shown excellent in-sample parameter estimates for their ARCH model but failed to show the model’s forecasting power. For validating the ARCH forecasts, the common proxy for unobserved volatility is the squared return.

The Misbehavior of Markets : Summary

Crisis hits financial markets at regular intervals but the market participants keep assuming that they “understand the behavior” of markets and are in “total control” of the situation until the day things crash. There is an army of portfolio managers, equity research analysts, macro analysts, low frequency quants, derivative modeling quants, high frequency quants etc., all trying to understand the markets and trying to make money out of it. Do their gut/intuitive/quant models come close to how the market behaves?

Variance Ratio plots are not enough!

This paper is just 8 pages long but conveys an important point about random walk tests. The paper analyzes the use of variance and absolute variation as measures of volatility while testing a series for random walk. The paper suggests the following plot :

[plot from the paper]

for different values of zeta. For zeta=1, one ends up using absolute variation and for zeta=2, one ends up using variance. If the time series has fat tails, it might happen that variance ratio plots do not show anything fishy. However, in such cases, absolute variation plots have a higher probability of highlighting fat tails.

NonSynchronous trading

This paper by Lo and MacKinlay analyzes the effects of non synchronous trading on stochastic properties. The transaction data of any asset traded in an exchange is irregularly spaced. Homogeneous time series is an artifact. Non Homogeneous time series is the reality. For example, the daily prices of securities quoted in the newspapers as “closing prices” are not the prices that are exactly traded at the very last second of the market close. Some exchanges aggregate the price data based on time/volume/type of asset etc. and report these prices.

Returns standardized by Realized Volatility

This paper by Andersen, Bollerslev, Diebold and Labys documents an empirical finding about standardized returns.

If one needs to obtain standardized returns, the usual way is to divide the returns by volatility estimated by ARCH, GARCH type of models. This does not eliminate fat tails though. This paper studies 10 years of high frequency returns for USD-Yen. It begins by showing that the unstandardized returns are fat tailed (obvious to everyone in today’s world). Subsequently the returns are standardized via realized volatility. The analysis shows that returns standardized by realized volatility appear slightly thin-tailed. Subsequently the authors perform a multivariate standardization of returns and find that returns standardized by modeling the joint behavior of realized volatility are close to i.i.d.

Overlapping vs. Non Overlapping

Let’s say you want to compute the annualized monthly volatility of your portfolio. There are two ways to go about doing  it :

  1. Compute the monthly volatility of each month for your portfolio, average it and multiply by sqrt(12)
  2. Create a moving window to capture monthly volatility, average it, and then multiply by sqrt(12). In this case, there will be many more data points that give you an estimate of monthly volatility as compared to the first case.

In the second method, one can see that overlapping intervals are used to compute volatility. This means that the volatility estimate is not independent across time slots and this dependency obviously smooths the volatility. Intuitively one knows that the volatility computed from case 2 < volatility computed from case 1. By how much does the volatility reduce by taking overlapping returns? The answer is not that obvious. This paper by Ulrich Muller says that the volatility of overlapping returns is approximately two-thirds of the volatility of non overlapping returns in the case of Gaussian iid.
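The two procedures can be set up side by side as below. This is only a sketch under stated assumptions: a month is taken to be 21 trading days, monthly volatility is computed as the standard deviation of the daily returns in a window scaled by sqrt(21), and the returns are simulated Gaussian iid. The function name and parameters are hypothetical, and the sketch does not reproduce the analysis in the paper. It is meant to make the difference in the number of windows explicit and to give a starting point for experimenting with other return processes.

```python
import math
import random
import statistics

DAYS_PER_MONTH = 21  # assumption: 21 trading days per month, 252 per year

def annualized_vol(daily_returns, step):
    """Average the monthly volatility over windows starting every `step` days,
    then annualize with sqrt(12). Returns the estimate and the window count."""
    monthly_vols = []
    last_start = len(daily_returns) - DAYS_PER_MONTH
    for start in range(0, last_start + 1, step):
        window = daily_returns[start:start + DAYS_PER_MONTH]
        monthly_vols.append(statistics.stdev(window) * math.sqrt(DAYS_PER_MONTH))
    return statistics.mean(monthly_vols) * math.sqrt(12), len(monthly_vols)

# Ten years of simulated Gaussian iid daily returns with 20% annual volatility.
rng = random.Random(7)
daily_sigma = 0.20 / math.sqrt(252)
returns = [rng.gauss(0.0, daily_sigma) for _ in range(2520)]

# Method 1: non overlapping months (a new window every 21 days).
non_overlapping, n_blocks = annualized_vol(returns, step=DAYS_PER_MONTH)

# Method 2: moving window (a new window every day).
overlapping, n_windows = annualized_vol(returns, step=1)

print(n_blocks, non_overlapping)
print(n_windows, overlapping)
```

The moving window yields about twenty times as many monthly estimates, but consecutive windows share 20 of their 21 days, so these estimates are far from independent.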

Street-Fighting Mathematics : Summary

The title is meant to convey the message that many problems in mathematics can be solved using elementary tools, more of a street fighting kind than some heavyweight combat type tools. There have been many other books in this genre that highlight the importance of smart guessing and approximations but this book is exceptional in one way: it shows that math problems involving differentiation, integration, differential equations, etc., which are typically not dealt with in pop science books, can also be solved by street fighting tools. There are many non linear differential equations too, that can be solved using the tools mentioned in the book. I guess this book is particularly targeted towards those who need to deal with some kind of math on a daily basis. What does the book contain? There are six tools mentioned in the book:

VIX computation

CBOE introduced VIX to measure the market’s expectation of 30-day volatility implied by at-the-money S&P 100 Index option prices. This was in 1993. Ten years later in 2003, CBOE with Goldman Sachs updated the VIX to reflect a new way to measure expected volatility, one that continues to be widely used by financial theorists, risk managers and volatility traders. The new VIX is based on the S&P 500 Index and is estimated by averaging the weighted prices of SPX puts and calls over a wide range of strike prices. In 2004, CBOE introduced VIX futures and two years later, in 2006, CBOE launched VIX options. VIX options are touted to be the most successful new product in CBOE history. What’s the reason for their success? Well, the most obvious one is that VIX futures and options act as a hedge against volatility exposure.
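For reference, the "weighted prices of SPX puts and calls" in the new VIX are usually written as the formula below, as it appears in CBOE's published methodology (worth checking against the current version of that document). Here T is the time to expiration, R the risk-free rate, F the forward index level implied by option prices, K_0 the first strike below F, K_i the strike of the i-th out-of-the-money option, ΔK_i the spacing between adjacent strikes and Q(K_i) the bid-ask midpoint of the option with strike K_i.

```latex
\sigma^2 = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2} \, e^{RT} \, Q(K_i)
           - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2
```

The variance is computed for the near and next term expirations, interpolated to a constant 30-day horizon, and the index is quoted as 100 times the square root of the result.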

Bootstrapping–flip side

Via Eran Raviv:

The big plus of non-parametric bootstrap is that it is strictly data-based, without any distributional assumption, the big minus is the same, it is strictly data-based.

Possible “futures” for the rate series in green and the red line is the actual realization

Defending HFT

For the past month, high-frequency trading has been under attack. The first volley came on a Sunday night in late March, when author Michael Lewis, introducing his new book Flash Boys on the news magazine program 60 Minutes, delivered the most perfectly succinct of all headline-grabbing comments. “The markets are rigged,” he told correspondent Steve Kroft, implying that high-frequency traders front-run the market and are cheating ordinary investors.

Since then, the imagery used to battle HFT has only grown more fanciful and over the top. Charles Schwab, founder of the brokerage firm that bears his name, called high-frequency traders a “cancer,” and Jim Cramer told his CNBC audience that “defending high-frequency trading is no different than defending the mosquito.”

Reproducible Research @ Coursera

I am a big fan of Literate programming. Seeing a course being offered on Coursera piqued my interest. The whole lecture series comprises 4 lectures, each spanning an hour. So, spending 4 to 5 hours on something that I had already learnt felt like a waste of time. However, I realized that the mind plays tricks on us and always gives us an illusion of mastery over something just because we are familiar with the topic. True learning happens only when we recall things from memory at regular intervals and take some tests along the way. Hence, I immersed myself for about 4 hours listening to the videos and attempting the tests. At the end of it, I am glad that I spent my time relearning.

Forget What You Know About Good Study Habits

Via NYTimes :

Every September, millions of parents try a kind of psychological witchcraft, to transform their summer-glazed campers into fall students, their video-bugs into bookworms. Advice is cheap and all too familiar: Clear a quiet work space. Stick to a homework schedule. Set goals. Set boundaries. Do not bribe (except in emergencies).

And check out the classroom. Does Junior’s learning style match the new teacher’s approach? Or the school’s philosophy? Maybe the child isn’t “a good fit” for the school.