Markov Chains : Summary

This book by J. R. Norris is a classic reference on Markov chains. Few recent papers on Markov chains fail to mention it somewhere. In this post, I will try to summarize the main chapters of the book.

The book starts off by saying that Markov processes are random processes that retain no memory of the past. When the state space is finite or countable, these processes are called Markov chains. The introduction gives the reader a clear list of the kinds of chains that the book deals with. The following gives a basic flavor of the Markov chains covered in the introductory chapters:

Mareechika (Mirage)

Via the Delhi-IBSEN festival 2010:

This play is an adaptation of Henrik Ibsen’s “The Lady from the Sea” and is directed by Ila Arun.

The story is told in folk form, narrated by the balladeers Bhopa and Bhopi, traditional storytellers from Rajasthan who use the Phad, a painted scroll, to tell their stories. The play begins with the Bhopi telling her husband that she has discovered a new phad, called ‘Ibsenji ka phad’, in her late father-in-law’s trunk. She persuades her husband to use it in their next performance, instead of the old, well-worn tales of gods and rajas. He reluctantly agrees and they start their story:

PageRank

I am hooked on “Markov chains” these days. They fascinate me because their applications turn up in almost every field I look at. I happened to go over the PageRank algorithm. From a Markov chain perspective, the algorithm can be summarized in 2 steps

Step 1 : Represent a random surfer’s movement as a Markov chain

clip_image002

Here N is the total number of pages indexed by Google, Q is a transition matrix (irreducible, aperiodic) that captures the probability of moving from one page to another, p is the PageRank of the pages on the internet, and alpha is an adjustment factor that takes into consideration the inherent importance of a page.
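The equation image for this step did not survive, so the following is only a minimal sketch. It assumes the commonly cited damped form p = alpha * Q^T p + (1 - alpha) / N, which may differ from the exact formula in the original image. The function name `pagerank`, the default alpha of 0.85, and the 3-page web are all hypothetical and used purely for illustration.

```python
def pagerank(Q, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for the assumed form p = alpha * Q^T p + (1 - alpha) / N.

    Q is a row-stochastic transition matrix given as a list of lists:
    Q[i][j] is the probability of moving from page i to page j.
    """
    n = len(Q)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new_p = [
            alpha * sum(p[i] * Q[i][j] for i in range(n)) + (1.0 - alpha) / n
            for j in range(n)
        ]
        # stop once successive iterates agree to within tol
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


# Hypothetical 3-page web: page 0 links to pages 1 and 2,
# page 1 links to page 2, and page 2 links back to page 0.
Q = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
ranks = pagerank(Q)
```

In this toy web, page 2 collects links from both other pages and ends up with the highest rank, while page 1, reachable only from page 0, ends up with the lowest.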

Quote for the day

Real Statistics is not primarily about the Mathematics which underlies it:
common sense and scientific judgment are more important.
But, there is no excuse for not using the right Mathematics when it is available.

- David Williams

Doing Bayesian Data Analysis : Summary

Firstly, something about the puppies on the cover pages :). The happy puppies are named Prior, Likelihood, and Posterior. Notice that the Posterior puppy has half-up ears, a compromise between the perky ears of the Prior puppy and the floppy ears of the Likelihood puppy. The puppy on the back cover is named Evidence. MCMC methods make it unnecessary to explicitly compute the evidence, so that puppy gets sleepy with nothing much to do. These puppy images basically summarize the essence of Bayesian and MCMC methods.
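The roles of the four puppies can be made concrete with a tiny grid approximation. This is only an illustrative sketch, not an example from the book: the coin-bias setting, the three-point grid, the prior weights, and the function name `grid_posterior` are all assumptions.

```python
def grid_posterior(thetas, prior, heads, tails):
    """Return (posterior, evidence) for a coin-bias parameter on a grid."""
    # likelihood of the observed flips at each candidate bias
    likelihood = [t ** heads * (1.0 - t) ** tails for t in thetas]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # p(data), the normalizing constant
    posterior = [u / evidence for u in unnormalized]
    return posterior, evidence


thetas = [0.25, 0.5, 0.75]   # candidate values for the coin's bias
prior = [0.25, 0.5, 0.25]    # prior belief centred on a fair coin
posterior, evidence = grid_posterior(thetas, prior, heads=3, tails=1)

prior_mean = sum(t * p for t, p in zip(thetas, prior))
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
```

With 3 heads in 4 flips, the data alone point at a bias of 0.75 and the prior mean is 0.5. The posterior mean lands between the two, which is the "half-up ears" compromise. On a three-point grid the evidence is a trivial sum. MCMC matters when that sum becomes an intractable integral.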

Tech Adoption

A prediction for tech adoption in Indian markets via Celent:

Years ago, I read in John Naisbitt’s book that technology does not change as fast as you want or predict. In fact, things that we expect to happen usually happen more slowly. The author gives a ton of examples of people being gung-ho about something that eventually took years to pan out. In the case of the HFT prediction too, I think the above timeline is an overly optimistic estimate.

Why cling

In most universities in India, the statistics curriculum followed at the undergraduate and graduate level is completely outdated. The content taught is relevant to times when there were:

  • No computational capabilities - All computations had to be performed with paper and pencil.

  • No graphing capabilities - All graphs had to be generated with pencil, paper, and a ruler. (And complicated graphs, such as those requiring prior transformations or calculations using the
    data, were especially cumbersome.)

Moneyball : Summary

This book by Michael Lewis delves into the reasons behind the mysterious success of the Oakland Athletics, one of the poorest teams in US baseball. In a game where players are bought at unbelievable prices, and where winning or losing is a matter of who has the bigger financial muscle, the Oakland A’s go on to make baseball history with rejected players and rookies.

“Is their winning streak a result of random luck, or is there a secret behind it?” is the question that Michael Lewis tries to answer. Like a sleuth, the author investigates the system and the person behind the system, the Oakland A’s general manager Billy Beane.

How an Economy Grows & Why it Crashes : Summary

These books were written by the father-son duo of Irwin and Peter Schiff, at different points in time. The first book, titled “How an Economy Grows and Why It Doesn’t”, was written by Irwin A. Schiff in 1979, and his son followed it up in 2010 with the second book, titled “How an Economy Grows and Why It Crashes”. Both books give one a chance to ponder the situation the US is in: the financial crisis of 2008, which is still on with full force in 2011. I have summarized the first book here. In this post, I will try to summarize the second book, which its author, Peter Schiff, calls “a riff of the original one”.

How an Economy Grows & Why it Doesn’t : Summary

This is a graphic novel explaining the growth of an economy from its bare-bones structure to a full-fledged economic system. Through an allegory, the author shows the deep malaise in the functioning of the US economy. It starts off with three men on an island, Able, Baker and Charlie, who merely catch fish, eat and sleep. There is no credit, no investment and no savings to begin with. Able gets a brainwave: make a fishing net that will save him some time to pursue other activities. He underconsumes for a day, takes a risk and comes up with a fishing net, thus managing to catch 2 fish per day instead of 1. This is the first time the island has savings and a piece of capital equipment (the net). It is also the first time one of the islanders can do something other than merely take care of survival.

The theory that would not die : Summary

This book by Sharon Bertsch McGrayne is about Bayes’ theorem, stripped of the math associated with it. In today’s world, statistics even at a rudimentary level of analysis (not research, just preliminary analysis) involves forming a prior and improving it based on the data one gets to see. In one sense, modern statistics takes for granted that one starts off with a set of beliefs and improves those beliefs based on the data. When this way of thinking was first introduced, it was considered equivalent to pseudo-science, or maybe voodoo science. In the 1700s, when Bayes’ theorem came to everybody’s notice, science was considered extremely objective, rational and all the words that go with it. Bayes, on the other hand, was talking about beliefs and improving those beliefs based on data. So how did the world come to accept this perspective? Today there is hardly a domain untouched by Bayesian statistics. In finance and technology specifically, Google’s search engine algorithms and Gmail spam filtering, Netflix’s recommendation service, Amazon’s book recommendations, the Black-Litterman model in finance, and arbitrage models based on Bayesian econometrics are some of the innumerable areas where Bayes’ philosophy is applied. Carol Alexander, in her book on risk management, says that the world needs Bayesian risk managers and remarks, “Sadly most of risk management that is done is frequentist in nature”.

UNTITLED : Summary

This book is a poor imitation of Steven Pressfield’s book, “Do the Work”. The author is a professional actor who seems to have settled into a creative director gig at a church. The book is called “Untitled”, and the title refers metaphorically to the blank page that faces anyone at the beginning of a project, be it writing an article or a book, painting, or starting a business. Everyone has to start with a blank page. In a writer’s life, though, every novel and every little story within the novel begins with a blank page, and he or she must fill in the characters and narrative along the way. In that sense, the “blank page” is a demon that a writer needs to fight every day.

Dawkins Theory

I never had a chance to read Dawkins’ work, and the following 5 lectures have motivated me immensely to read at least a couple of his famous books, like “The Selfish Gene” and “The Blind Watchmaker”.

Ep1: Waking Up in the Universe - Growing Up in the Universe - Richard Dawkins
Ep2: Designed and Designoid Objects - Growing Up in the Universe - Richard Dawkins
Ep3: Climbing Mount Improbable - Growing Up in the Universe - Richard Dawkins
Ep4: The Ultraviolet Garden - Growing Up in the Universe - Richard Dawkins
Ep5: The Genesis of Purpose - Growing Up in the Universe - Richard Dawkins

Measuring the World : Summary

This book is more a historical account of two scientists (Alexander von Humboldt and Carl Friedrich Gauss) than a work of fiction. Daniel Kehlmann weaves interesting fiction around these two brilliant scientists, exploring their lives, their viewpoints, and their personalities, which seem to be completely opposite. Alexander von Humboldt believed that knowledge comes from exploring the world, while Gauss developed pretty much everything sitting in an observatory.

Alexander von Humboldt
1769-1859

Pomodoro Technique Illustrated : Summary

I tried GTD a looooong time ago and gave up almost immediately. It surprises me, though, that there is actually an Outlook plugin for the GTD system and a ton of software based on GTD. That means there is some section of people who find it useful.

Anyway, this book is markedly different in its premise. It is based on the chunking principle: if you want to do anything, focus on it for a specific period of time, take a break, and repeat!

R Graphics : Summary

To visualize data efficiently, one must make a transition from using point-and-click interfaces (a user’s view) to coding one’s own graphics (a developer’s view). At the outset, the “user’s view” is very appealing, as you can use a cutesy GUI to draw some graphs. However, if you have to discover something in the data, the “user’s view” is of little use, and one needs to acquire developer skills to build graphics. The book is about the package “grid”, written by Prof. Paul Murrell. One might think that base graphics is good enough, so why go in for grid? The power of grid lies in the fact that it gives complete control over the various graphic elements, their positions and their characteristics. Let’s say you have a scatterplot produced by base R. Using the grid package, you can rebuild the entire graphic piece by piece. This in itself is not the purpose of the grid package, but it shows that any graphic you have seen in base graphics or lattice can be created, updated and modified to your heart’s content. This book gives a very detailed description of how to use the grid package to create and explore various graphics.

Accidental Genius : Summary

One of my favorite quotes about writing/composition is,

“We write to know about that which we know”

- Grace Paley

By writing down the stuff that you have in your mind, you bring some structure in to your thought process and this structure in turn helps you think better.

Most of the time there is an internal editor in our mind that censors our thoughts as they form connections. As everyone knows, the best ideas usually arise by combining themes from seemingly diverse fields. I guess this internal editor hinders us from connecting stuff unless we overwhelm it with some kind of constraint or create a situation where its power is weakened. There are many ways to escape from this internal editor. Is it any wonder that some of the best ideas come to us when we are away from work, be it traveling, running, playing some sport, relaxing and thinking freely, or debating something with a friend? The internal editor is damn good when we are planning to execute stuff. But to ideate, I guess one must seek activities that are far removed from the work context.

What does this book say? Well, firstly, something about the title. The author is a positioning consultant, so no wonder the title smells like a marketing ploy. The book is about writing, and about a particular form of writing called FREEWRITING. What is freewriting? Freewriting is a style of writing that we can use to get all our random, chaotic, semi-structured, exploratory thoughts on paper without our internal editor getting in the way.

Writing to Learn Math

I was introduced to writing math at a very late stage of my education. Somehow math meant solving a bunch of problems, understanding or proving theorems, or translating math problems into an algorithm that could be solved numerically on a computer. What has “writing”, as in writing in plain English, got to do with learning mathematics?

My first brush with this activity happened accidentally while I was teaching as an adjunct. One of the senior lecturers was preparing an assignment for the class with no problems or equations to solve and nothing to calculate. I was curious about the assignment and asked her the reason for giving it. That’s when I learnt that there is a ton of stuff that can be learnt about math by writing about it in one’s own words. What are the kinds of writing that can be done?

Writing Tips

Stumbled onto this piece via 13 Writing Tips - Chuck Palahniuk:


Twenty years ago, a friend and I walked around downtown Portland at Christmas. The big department stores: Meier and Frank… Fredrick and Nelson… Nordstroms… their big display windows each held a simple, pretty scene: a mannequin wearing clothes or a perfume bottle sitting in fake snow. But the windows at the J.J. Newberry’s store, damn, they were crammed with dolls and tinsel and spatulas and screwdriver sets and pillows, vacuum cleaners, plastic hangers, gerbils, silk flowers, candy - you get the point. Each of the hundreds of different objects was priced with a faded circle of red cardboard. And walking past, my friend, Laurie, took a long look and said, “Their window-dressing philosophy must be: ‘If the window doesn’t look quite right - put more in’.”

2009 KDD Cup entry – Model Description

http://www.vcasmo.com/swf/vcasmo.swf

Key Steps :

  1. Did not use R for data import operation - Used SPSS to read the data
  2. Feature Selection - Used R in this step
  3. Data Cleaning - Treatment of Categorical variables was a problem

Software used : SAS + R

Techniques used : Gradient Boosting Machine (gbm package)

Rationale :

  • Handling of missing values
  • Robustness against extreme values
  • Handling categorical and continuous variables
  • Models interaction between predictors
  • Can model nonlinear dependencies

Fitting Time :  Couple of hours on a desktop
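To make the idea behind gradient boosting concrete, here is a from-scratch toy sketch for squared-error regression with one-split stumps. It is not the gbm package and not the model from this entry. The function names, the 200 rounds, the learning rate of 0.1 and the parabola data are all assumptions for illustration, and the sketch handles only one numeric predictor with at least two distinct values (no missing values or categorical variables).

```python
def fit_stump(xs, residuals):
    """Find the single-split regression stump with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # dropping the max keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=200, learning_rate=0.1):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy nonlinear target: y = x^2 on a grid of 41 points in [-2, 2].
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x for x in xs]
model = fit_boosted(xs, ys)

mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((predict_boosted(model, x) - y) ** 2
                  for x, y in zip(xs, ys)) / len(ys)
```

Even though every stump is a crude step function, the sum of many shrunken stumps tracks the curve, which is one way to see the "can model nonlinear dependencies" point in the rationale above.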

Quote for the day

“ When that time comes, I try to be alone and silent for several hours; I need a lot of time to rid my mind of the noise outside and to cleanse my memory of life’s confusion. I light candles to summon the muses and guardian spirits. I place flowers on my desk to intimidate tedium and the complete works of Pablo Neruda beneath the computer with the hope they will inspire me by osmosis. If computers can be infected with a virus there’s no reason why they shouldn’t be refreshed by a breath of poetry. In a secret ceremony I prepare my mind and soul to receive the first sentence in a trance, so the door may open slightly and allow me to peer through and perceive the hazy outlines of the story waiting for me.”

R Cookbook : Summary

Books on R are tricky to read, especially since the sheer number of things that R can do is mind-boggling. There are books that range from very specialized to very generic, and there is no choice but to refer to this gigantic collection based on one’s needs. The flip side of this vast amount of material is that a first-timer is likely to fail to see the forest for the trees.

Fifty Days of Solitude : Summary

It has started raining in Mumbai, and the pleasant climate after three months of scorching heat enlivens the spirit. I will attempt to write a few words about this book.

Since I have been staying alone for the past few years in Mumbai, I have gone back to the “Sitar”, which I could not practice in NY for a couple of reasons. Firstly, I lacked the space needed for practicing any instrument. Sharing a flat with two other guys was not particularly conducive to playing an instrument without distractions. You have to actively seek out “No Distraction time” so that you don’t cause any distraction to others -:) . Besides, work + academics + programming left me with little energy to pursue the “Sitar”. Secondly, I could not afford a teacher. Self-study or self-training in any field requires you to be skillful up to a certain level, or to be an apprentice for some time, after which you can explore in cruise-control mode. I wasn’t anywhere close to that stage, and a Sitar teacher was imperative to my practice. Some quick calls to a few craigslisters revealed that they were too costly for me to even think of regular classes. Thanks to my decision to head back to India, I found the two things needed to play any instrument: a certain level of solitude and a teacher. The former I deliberately opted for (I don’t know how long I can be in this state); the latter, “finding a teacher”, happened out of a chance conversation with someone.

Probability Theory and its Applications ( Volume I ) - Feller

A classic is a book that has never finished saying what it has to say –  Italo Calvino.

These words definitely apply to Feller’s books. Both of Feller’s volumes on probability can be considered classics. The first volume deals with discrete variables, while the second is far more advanced, as it deals with measure theory and continuous variables. Markov processes are one of the main sources of martingales. I had to go back to Feller to work on Markov processes, and I ended up going over the entire book instead of only looking up the material on them. Let me attempt to summarize the various probability topics that are covered in Volume I. This post is going to serve mainly as my reference to the ideas and thoughts that the book brings out. It is one of the longest posts that I have written to date, a ~5500 word summary, and so it lists some of the most powerful ideas in probability.

The Housekeeper and the Professor : Summary

The story is about an old professor and the special bond that develops between him and his housekeeper. The professor has a peculiar problem: he cannot remember anything beyond 80 minutes. His memory loss, caused by an accident in his middle age, leads him to live a strange life. Since he cannot remember anything beyond 80 minutes, he is forced to write short notes and pin them to his suit so that he can glance at them every day and remember things. His sister-in-law generously takes care of him and provides him with basic amenities like food, shelter and a study room. Despite his old age, the professor continues doing math. To take care of the cooking, cleaning and other household activities, his sister-in-law appoints a housekeeper.

Project Euler

Colin Hughes shares his story on why he built Project Euler:

  • It should be Playful, bottom-up learning
  • Programming is addictive and you can get feedback by customizing others’ code.

So, the code for becoming good at a language is –:)

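        N <- 10   # number of failed attempts (illustrative value, not from the talk)
        # fail N times ...
        for (i in 1:N) print("You will fail")
        # ... and crack it on the next attempt
        i <- N + 1
        if (i == N + 1) print("You will crack it")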

Quote for the day

          
“ Parenthood is the opiate of the masses ”
       
- Chuck Palahniuk (author of “Choke”)

Spot on , as far as India goes  –:) .

Finite Markov Chains : Summary

This book is an awesome resource for understanding finite Markov chains. It was written in the pre-LaTeX era, and hence one has to struggle to follow the notation. Is the struggle worth it? IT IS, if you are looking to develop a matrix perspective on “Discrete Stochastic Processes”. IT IS NOT, if you are looking for quick and dirty formulae. The first version of the book appeared in 1960 and the second version in 1976. I wasn’t even born when these books came out –:) . However, like many things in life, things that age have their own charm. The book serves as a precursor to understanding the Markov chain Monte Carlo method, which is an essential tool for any quant.
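As a small taste of the matrix perspective, the n-step transition probabilities of a finite chain are just the entries of the n-th power of its transition matrix. The sketch below is an illustration rather than an example from the book: the two-state matrix `P` and the helper names `mat_mul` and `mat_pow` are made up.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    inner = len(B)
    cols = len(B[0])
    return [[sum(A[i][r] * B[r][j] for r in range(inner)) for j in range(cols)]
            for i in range(len(A))]


def mat_pow(P, n):
    """Return P raised to the n-th power by repeated multiplication."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


# Hypothetical two-state chain: row i holds the probabilities of
# moving from state i to each state in one step.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
P50 = mat_pow(P, 50)  # entry [i][j] is the 50-step probability of going from i to j
```

For this chain, every row of `P50` is close to (5/6, 1/6), the stationary distribution obtained by solving pi = pi P. After many steps the chain forgets where it started.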