How Not To Be Wrong

Every year at least a dozen pop math/stat books get published. Most of them try to illustrate a variety of mathematical/statistical principles using analogies, anecdotes and stories that are easy to understand. It is safe to assume that the authors of these books spend a considerable amount of time thinking about apt analogies to use, ones that are not too taxing on the reader but at the same time put across the key idea. I tend to read at least one pop math/stat book a year to whet my “analogy appetite”. It is one thing to write an equation about some principle and a completely different thing to be able to explain a math concept to somebody. Books such as these help in building one’s “analogy” database so that one can start seeing far more things from a math perspective. The author of this book, Jordan Ellenberg, is a math professor at the University of Wisconsin-Madison and writes a math column for “Slate”. The book is about 450 pages long and gives a ton of analogies. In this post, I will try to list the analogies and some points made in the context of several mathematical principles illustrated in the book.

Survivorship bias
- Abraham Wald’s logic of placing armor on engines that had no bullet holes
- Mutual funds performance over a long period
- Baltimore stockbroker parable
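
The Baltimore stockbroker parable comes down to repeated halving: each week the broker mails one prediction to half of the remaining recipients and the opposite prediction to the other half, then keeps only the half that saw a correct call. A minimal sketch follows; the function name and the figures of 10,240 recipients and 10 weeks are illustrative assumptions, not something this post specifies.

```python
def recipients_with_perfect_record(initial_recipients: int, weeks: int) -> int:
    """Count recipients who have seen only correct predictions so far."""
    remaining = initial_recipients
    for _ in range(weeks):
        # Half the remaining recipients received the prediction that came true.
        remaining //= 2
    return remaining


# Assumed illustrative numbers: 10,240 initial recipients, 10 weekly mailings.
print(recipients_with_perfect_record(10_240, 10))  # prints 10
```

The handful of recipients left at the end have seen a flawless track record, even though no forecasting skill was involved at any point.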
Linearity Vs. Nonlinear behavior
- Laffer curve
Notion of limits in Calculus
- Zeno’s Paradox
- Augustin-Louis Cauchy and his work on summing infinite series
Regression
- Will all Americans become obese? The dangers of extrapolation
- Galton Vs. Secrist – “Regression towards mediocrity” was observed in the data by both, but they had different explanations. Secrist remained in the dark and attributed mediocrity to whatever he felt like. Secrist thought the regression he painstakingly documented was a new law of business physics, something that would bring more certainty and rigor to the scientific study of commerce. But it was just the opposite. Galton, on the other hand, was a mathematician and hence rightly showed that in the presence of a random effect, regression towards the mean is a necessary fact. Wherever there is random fluctuation, one observes regression towards the mean, be it mutual funds, the performance of sportsmen, mood swings etc.
- Correlation is non-transitive. Karl Pearson’s idea of using geometry makes this easy to prove.
- Berkson’s fallacy – Why are handsome men jerks? Why are popular novels terrible?
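
The point that regression towards the mean must appear whenever a random effect is present can be checked with a small simulation. This is only a sketch under assumed conditions: each performer has a fixed skill plus independent luck in each period, both drawn from a standard normal distribution, and the function name is hypothetical.

```python
import random


def top_decile_means(n: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Mean score of the first-period top 10% in period one and in period two."""
    rng = random.Random(seed)
    # Assumption: observed score = fixed skill + independent luck each period.
    skill = [rng.gauss(0, 1) for _ in range(n)]
    first = [s + rng.gauss(0, 1) for s in skill]
    second = [s + rng.gauss(0, 1) for s in skill]

    # Select the performers who ranked in the top 10% in the first period.
    cutoff = sorted(first, reverse=True)[n // 10]
    top = [i for i in range(n) if first[i] > cutoff]

    mean_first = sum(first[i] for i in top) / len(top)
    mean_second = sum(second[i] for i in top) / len(top)
    return mean_first, mean_second


first_mean, second_mean = top_decile_means()
print(f"top decile, period 1: {first_mean:.2f}; same group, period 2: {second_mean:.2f}")
```

The selected group stays above average in the second period, because skill persists, but it falls back towards the overall mean, because the luck that helped put it on top does not repeat.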

Law of Large numbers
- Small school vs. Large school performance comparison
Partially ordered sets
- Comparing disasters in human history
Hypothesis testing + “P value” + Type I error (seeing a pattern where there is none) + Type II error (missing a pattern when there is one)
- Experimental data from a dead fish fMRI measurement: the dead fish appeared able to correctly assess the emotions displayed by the people in the pictures. An insane conclusion that passes statistical tests
- Torah dataset (a document of 304,805 letters) used by a group of researchers to find hidden meanings beneath the stories, genealogies and admonitions. Dangers of data mining.
- Underpowered test : Using binoculars to detect moons around Mars
- Overpowered test: If you study a large enough sample, you are bound to reject the null, as your dataset will enable you to see ever-smaller effects. Just because you can detect them doesn’t mean they matter.
- “Hot hand” in basketball: If you ask the right question, it is difficult to detect the effect statistically. The right question isn’t “Do basketball players sometimes temporarily get better or worse at making shots?”, the kind of yes/no question a significance test addresses. { Null: no “hot hand”, Alternate: “hot hand” } is an underpowered test. The right question is “How much does their ability vary with time, and to what extent can observers detect in real time whether a player is hot?” This is a tough question.
- Skinner rejected the hypothesis that Shakespeare did not alliterate!
- Null Hypothesis Significance Testing (NHST) is a fuzzy version of “Proof by contradiction”
- Testing whether a set of stars in one corner of a constellation (Taurus) is grouped together by chance?
- Parable by Cosma Shalizi: Examining the livers of sheep to predict future events. A very funny way to describe what’s going on with the published papers in many journals
- John Ioannidis’s research paper “Why Most Published Research Findings Are False”
- Tests of genetic association with disease – awash with false positives
- Example of a low powered study: A paper in Psychological Science (a premier journal) concluded that “Married women were more likely to support Mitt Romney when they were in the fertile portion of their ovulatory cycle”!
- A low powered study is only going to be able to see a pretty big effect. But sometimes you know that the effect, if it exists, is small. In other words, a study that accurately measures the effect of a gene is likely to be rejected as statistically insignificant, while any result that passes the p-value test is either a false positive or a true positive that massively overstates the effect
- Uri Simonsohn, a professor at Penn, brilliantly summarizes the problem of replicability as “p-hacking” (somehow getting the p-value to the 0.05 level that enables one to publish papers)

In 2013, the Association for Psychological Science announced that it would start publishing a new genre of articles, called Registered Replication Reports. These reports, aimed at reproducing the effects reported in widely cited studies, are treated differently from usual papers in a crucial way: the proposed experiment is accepted for publication before the study is carried out. If the outcomes support the initial finding, great news, but if not, they are published anyway so that the whole community can know the full state of the evidence.
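
The data mining and false positive worries in this section can be made concrete with one line of arithmetic. The sketch below assumes the tests are independent and that every null hypothesis is true; the function name is hypothetical, and the 0.05 threshold is the one mentioned above.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent null tests passes."""
    # Each test on a true null avoids a false positive with probability 1 - alpha.
    return 1 - (1 - alpha) ** num_tests


for k in (1, 20, 100):
    print(k, round(prob_at_least_one_false_positive(k), 3))
```

Under these assumptions, running 20 tests on pure noise gives roughly a 64% chance that at least one of them crosses the 0.05 line, which is why a single significant result pulled from many attempts carries little weight.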
Utility of Randomness in math
- “Bounded gaps” conjecture: Is there a fixed bound such that infinitely many pairs of consecutive primes are closer together than that bound? Primes get rarer and rarer as we chug along the integer axis. Then what causes the gap to be bounded?
- How many twin primes are there in the first N numbers (Among first N numbers, about N/log N are prime)?
- Mysteries of prime numbers need new mathematical ideas that structure the concept of structurelessness itself
How to explain “Logarithm” to a kid? The logarithm of a positive integer can be thought of, roughly, as the number of digits in that integer.
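
The digits analogy can be checked directly: for a positive integer n, the number of decimal digits equals floor(log10(n)) + 1. This is a minimal sketch; both helper names are made up for illustration.

```python
import math


def digit_count(n: int) -> int:
    """Number of decimal digits in a positive integer, counted directly."""
    return len(str(n))


def digits_from_log(n: int) -> int:
    """Number of decimal digits obtained from the base-10 logarithm."""
    return math.floor(math.log10(n)) + 1


for n in (7, 42, 999, 123_456):
    print(n, digit_count(n), digits_from_log(n))
```

For very large integers, floating-point rounding in `math.log10` can make the second function off by one near powers of ten, so counting characters is the safer choice in real code.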
Forecast performance
- Short term weather forecasts have become a possibility, given the explosion of computing power and big data. However, any forecast beyond 2 weeks is dicey. On the other hand, for some problems, such as predicting the course of an asteroid, more data and computing power might yield highly accurate forecasts. Whatever domain you work in, you need to consider where it lies between these two examples, i.e. one where big data + computing power helps, and one where big data + computing power + whatever else is needed does not get you any meaningful forecast beyond the short term.
Recommendation Algorithms
- After decades of being fed with browsing data, recommendations for almost all the popular sites suck
- Netflix prize, an example that is used by many modern Machine learning 101 courses. It took 3 years of community hacking to improve the recommendation algo. Sadly the algo was not put to use by Netflix. The world had moved on in those three years and Netflix was streaming movies online, which makes dud recommendations less of a big deal.
Bayes theorem
- Which Facebook users are likely to be involved in terrorist activities? Facebook assigns a probability that each of its users is associated with terrorist activities. The following two questions have vastly different answers. You need to be careful about what you are asking.
  1. What is the chance that a person gets put on Facebook’s list, given that they are not a terrorist?
  2. What’s the chance that a person is not a terrorist, given that they are on Facebook’s list?
- Why must one go Bayes? P(Data | Null) is what a frequentist answers, P(Null | Data) is what a Bayesian answers
- Are Roulette wheels biased? Use priors and experimental data to verify the same
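
The gap between the two Facebook questions can be seen by plugging numbers into Bayes' theorem. Every number here is a made-up assumption for illustration (a base rate of 1 in 100,000, a 0.05% false positive rate, a 90% detection rate), and the function name is hypothetical.

```python
def prob_innocent_given_flagged(
    base_rate: float, false_positive_rate: float, true_positive_rate: float
) -> float:
    """P(not a terrorist | on the list), computed with Bayes' theorem."""
    flagged_and_guilty = base_rate * true_positive_rate
    flagged_and_innocent = (1 - base_rate) * false_positive_rate
    return flagged_and_innocent / (flagged_and_guilty + flagged_and_innocent)


# Assumed inputs: question 1 has the small answer 0.0005 by construction.
false_positive_rate = 0.0005
answer = prob_innocent_given_flagged(
    base_rate=1 / 100_000,
    false_positive_rate=false_positive_rate,
    true_positive_rate=0.9,
)
print(f"P(on list | innocent) = {false_positive_rate}")
print(f"P(innocent | on list) = {answer:.3f}")
```

With these assumed inputs, an innocent person is very unlikely to be listed, yet roughly 98% of the people on the list are innocent, because innocent users vastly outnumber the rest.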
Expected Value
- Lottery ticket pricing
- Cash WinFall: How did a few groups hijack the Massachusetts State Lottery? A Boston Globe report explains why it turned out to be a private lottery.
- Use the additivity law of expectation to solve Buffon’s Needle problem
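
For Buffon's needle, the additivity argument gives a crossing probability of 2L/(πd) for a needle of length L dropped on lines spaced d apart, provided L ≤ d. The Monte Carlo sketch below checks that value; the function name, trial count and seed are arbitrary assumptions.

```python
import math
import random


def buffon_crossing_rate(
    length: float = 1.0, spacing: float = 1.0, trials: int = 200_000, seed: int = 0
) -> float:
    """Estimate the chance that a dropped needle crosses a line (length <= spacing)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0, spacing / 2)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0, math.pi / 2)
        # The needle crosses when its half-length projection reaches the line.
        if centre <= (length / 2) * math.sin(angle):
            crossings += 1
    return crossings / trials


estimate = buffon_crossing_rate()
print(f"simulated: {estimate:.4f}, formula 2L/(pi*d): {2 / math.pi:.4f}")
```

The simulation samples the angle using π itself, so it only confirms the formula and is not a way to discover π from scratch.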
Utility curve
- If you miss your flight, how to quantify your annoyance level?
- The utility of dollars earned for a guy who is moonlighting is different from that for a tenured professor
- St Petersburg paradox
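
The St Petersburg paradox ties expected value and utility together. The sketch below assumes the common payoff convention, which this post does not spell out: a fair coin is flipped until the first head, and a head on flip k pays 2^k dollars. The function names are hypothetical, and the log utility is Daniel Bernoulli's classic resolution.

```python
from fractions import Fraction


def truncated_expected_payoff(max_flips: int) -> Fraction:
    """Expected payoff when the game is cut off after max_flips flips."""
    # Each term is (1/2^k) * 2^k = 1, so the sum grows without bound.
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, max_flips + 1))


def truncated_expected_log2_utility(max_flips: int) -> float:
    """Expected log2 utility of the payoff, which converges instead."""
    # The log2 utility of a 2^k payoff is simply k.
    return sum(k / 2**k for k in range(1, max_flips + 1))


for n in (10, 20, 40):
    print(n, truncated_expected_payoff(n), round(truncated_expected_log2_utility(n), 6))
```

The expected payoff equals the number of flips allowed and so has no finite limit, while the expected log utility settles at 2, which matches the intuition that nobody would pay an unlimited price to play.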
Error correction coding, Hamming code, Hamming distance, Shannon’s work:
- Reducing variance of loss in the Cash WinFall lottery: Choosing the random numbers with less variance is a computationally expensive problem if brute force is used. Information theory and projective geometry could be the basis on which the successful MIT group generated random numbers that had less variance while betting.
- Bertillon’s card system to identify criminals and Galton’s idea that redundancy in the card can be quantified were formalized by Shannon, who showed that correlation between variables reduces the informativeness of a card
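
Hamming distance, the notion underlying the error correction ideas in this section, counts the positions at which two equal-length strings differ. The sketch below pairs it with nearest-codeword decoding over a three-bit repetition code; the code book and helper names are assumptions chosen for illustration, not the scheme the lottery players used.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_codeword(received: str, codewords: list[str]) -> str:
    """Decode by picking the codeword closest to the received string."""
    return min(codewords, key=lambda word: hamming_distance(received, word))


# Assumed toy code: each bit is sent three times.
REPETITION_CODE = ["000", "111"]

print(hamming_distance("1011101", "1001001"))    # prints 2
print(nearest_codeword("010", REPETITION_CODE))  # prints 000
```

The two codewords are at distance 3 from each other, so any single flipped bit still leaves the received string closer to the word that was sent.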
Condorcet Paradox
- Deciding a three-way election is riddled with many issues. There is no such thing as the public response. The electoral process defines the public response and makes peace with the many paradoxes that are inherent in deciding it.

Quotes from the book:

Knowing mathematics is like wearing a pair of X-ray specs that reveal hidden structures underneath the messy and chaotic surface of the world
Mathematics is the extension of common sense. Without the rigorous structure that math provides, common sense can lead you astray. Formal mathematics without common sense would turn math computations into a sterile exercise.
It is pretty hard to understand mathematics without doing mathematics. There is no royal road to any field of math. Getting your hands dirty is a prerequisite
People who go into mathematics for fame and glory don’t stay in mathematics for long
Just because we can assign whatever meaning we like to a string of mathematical symbols doesn’t mean we should. In math, as in life, there are good choices and there are bad ones. In the mathematical context, the good choices are the ones that settle unnecessary perplexities without creating new ones
We have to teach math that values precise answers but also intelligent approximation, that demands the ability to deploy existing algorithms fluently but also the horse sense to work things out on the fly, that mixes rigidity with a sense of play. If we don’t teach it that way, we are not teaching mathematics at all.
Fields Medalist David Mumford: Dispense with plane geometry entirely in the syllabus and replace it with a first course in programming.
“Statistically noticeable” / “Statistically detectable” is a better term than “Statistically significant”. This should be the first statement drilled into any newbie taking a stats 101 course.
If gambling is exciting, you are doing it wrong – A powerful maxim applicable for people looking for investment opportunities too. Hot stocks provide excitement and most of the time that is all they do.
It is tempting to think of “very improbable” as meaning “essentially impossible”. Sadly NHST makes us infer based on “very improbable observation”. One good reason why Bayes is priceless in this aspect
One of the most painful aspects of teaching mathematics is seeing my students damaged by the cult of the genius. That cult tells students that it’s not worth doing math unless you’re the best at math—because those special few are the only ones whose contributions really count. We don’t treat any other subject that way. I’ve never heard a student say, “I like ‘Hamlet,’ but I don’t really belong in AP English—that child who sits in the front row knows half the plays by heart, and he started reading Shakespeare when he was 7!” Basketball players don’t quit just because one of their teammates outshines them. But I see promising young mathematicians quit every year because someone in their range of vision is “ahead” of them. And losing mathematicians isn’t the only problem. We need more math majors who don’t become mathematicians—more math-major doctors, more math-major high-school teachers, more math-major CEOs, more math-major senators. But we won’t get there until we dump the stereotype that math is worthwhile only for child geniuses

The book ends with a quote from Samuel Beckett