The Cauchy-Schwarz Master Class: Summary

You see inequalities in every discipline. In the math-fin area specifically, most inequalities are derived from a basic set such as Cauchy’s, AM-GM, Jensen’s, Hölder’s and Minkowski’s inequalities. One typically comes across them in some math course where one is asked to prove a specific inequality, and one solves them much as one learns the grammar and syntax of a language. Sadly, the principles behind them are conveniently forgotten, as one does not actually use them in any practical context. So the gap between working on basic inequalities in coursework and needing to apply them to a practical problem sometimes runs into years, even decades. By then, all one can do is go back and refer to books on inequalities that, more often than not, read like a laundry list of inequalities. You start to wonder: “Are there any principles behind solving inequalities?”, “Are there any common strategies for working with inequalities?”, “Is there a connection between a specific strategy or inequality and problems in various scientific disciplines?”. The book answers these questions in a delightful way. Math books written in a conversational style are few, and this is one of them.

The phrase “Master Class” in the title attracted me to this book, as I had come across the term and its context in “Shop Class as Soulcraft”. The author says at the very beginning of the book:

In the fine arts, a master class is a small class where students and coaches work together to support a high level of technical and creative excellence. This book tries to capture the spirit of a master class while providing coaching for readers who want to refine their skills as solvers of problems, especially those problems dealing with mathematical inequalities.

The book starts off with the famous Cauchy’s inequality. The beautiful thing about inequalities is that they give you a chance to view them through different lenses: a linear algebra lens, an inner product space lens, a simple algebraic lens, a functional analysis lens, and so on. Whatever your area of work, you will have come across some variant of Cauchy’s inequality. When I first saw the inequality, I drew a parallel to two processes with finite variance, with the inequality as the link between the covariance of the processes and their individual variances. So, by studying inequalities you get to view things from various perspectives, and you never know which perspective will help you solve a problem in your area of work. Coming to the chapter on Cauchy’s inequality, the author proves it by induction and then talks about a very important principle that one must keep in mind.

Mathematical progress depends on the existence of a continuous stream of new problems, yet the processes that generate such problems may seem mysterious. To be sure, there is genuine mystery in any deeply original problem, but most new problems evolve quite simply from well established principles. One of the most productive of these principles calls on us to expand our understanding of a quantitative result by first focusing on its qualitative inferences.

The above statement appears VAGUE until you see the connection between Cauchy inequality and

and

The latter statement is an inference from Cauchy’s inequality, but when applied systematically it gives rise to an additive inequality
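For reference, here is a sketch of the standard statements involved, written in generic notation (the real sequences a_k, b_k are an assumption made here for illustration, not a quotation from the book): Cauchy’s inequality, the qualitative inference it suggests, and the additive bound that the inference leads to.

```latex
% Cauchy's inequality for real numbers a_1,...,a_n and b_1,...,b_n
\left( \sum_{k=1}^{n} a_k b_k \right)^{2}
  \le \left( \sum_{k=1}^{n} a_k^{2} \right) \left( \sum_{k=1}^{n} b_k^{2} \right)

% Qualitative inference: square summability of both sequences
% forces summability of the products
\sum_{k=1}^{\infty} a_k^{2} < \infty
\ \text{and}\
\sum_{k=1}^{\infty} b_k^{2} < \infty
\quad \Longrightarrow \quad
\sum_{k=1}^{\infty} |a_k b_k| < \infty

% Additive bound, obtained by summing the elementary inequality
% xy \le \tfrac{1}{2} x^{2} + \tfrac{1}{2} y^{2}
\sum_{k=1}^{\infty} a_k b_k
  \le \frac{1}{2} \sum_{k=1}^{\infty} a_k^{2}
    + \frac{1}{2} \sum_{k=1}^{\infty} b_k^{2}
```

The elementary inequality in the last comment is just a rearrangement of (x - y)^2 >= 0, which is why the additive bound is so cheap to obtain.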

The chapter then introduces the concept of an inner product space to look at the same inequality. It shows that the inequality is nothing but the statement that the cosine of the angle between two vectors in an inner product space is at most 1 in absolute value.

Moving from the discrete version of the inequality to the continuous form took almost four decades: Cauchy’s version dates from 1821, while the integral form appeared in 1859 in a memoir written by Victor Yacovlevich Bunyakovsky. No proof of the continuous form was given there.

The continuous version of the inequality was proved brilliantly by Hermann Amandus Schwarz (1843–1921) while he was working on the theory of minimal surfaces. He also came up with the inner product version of the inequality. One brilliant point of this chapter is the derivation of the Cramér-Rao bound for the MLE problem using Cauchy’s inequality. Anyone who uses stats at work will appreciate the derivation of this bound from a simple-looking inequality. This introductory chapter also introduces the 1-Trick and the Splitting Trick, which are often used in proving variants of Cauchy’s inequality.

My biggest takeaway from this chapter has nothing to do with inequalities; it is about learning a method to crank out inner product spaces. Whenever I came across an inner product definition, I used to wonder about its source: “How do people come up with an inner product definition that satisfies all the properties of an inner product?”. From this chapter, I have learnt that the source is any positive definite matrix. If you take any symmetric positive definite matrix A, then for any two vectors x and y in the vector space you can define the inner product as y transpose times A times x, and that gets you a valid inner product.
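A quick numerical sketch of this recipe, assuming NumPy is available. The matrix A is built as M^T M plus a small multiple of the identity so that it is symmetric positive definite by construction; the helper name `inner` and the random test vectors are illustrative choices, not anything from the book.

```python
import numpy as np


def inner(x, y, A):
    """Candidate inner product <x, y>_A = y^T A x for a symmetric positive definite A."""
    return float(y @ A @ x)


rng = np.random.default_rng(0)

# Build a symmetric positive definite matrix: M^T M is positive semidefinite,
# and adding a small multiple of the identity makes it strictly positive definite.
M = rng.normal(size=(4, 4))
A = M.T @ M + 0.1 * np.eye(4)

# Three random test vectors in R^4 (one per row).
x, y, z = rng.normal(size=(3, 4))

checks = {
    # <x, y> = <y, x>, which relies on A being symmetric
    "symmetry": bool(np.isclose(inner(x, y, A), inner(y, x, A))),
    # linearity in the first argument
    "linearity": bool(np.isclose(inner(2 * x + 3 * z, y, A),
                                 2 * inner(x, y, A) + 3 * inner(z, y, A))),
    # <x, x> > 0 for a nonzero x, which relies on positive definiteness
    "positivity": inner(x, x, A) > 0,
    # Cauchy-Schwarz holds automatically for any inner product
    "cauchy_schwarz": inner(x, y, A) ** 2 <= inner(x, x, A) * inner(y, y, A),
}
print(checks)
```

Every choice of such a matrix A gives a different geometry on the same vector space, and Cauchy-Schwarz comes along for free in each of them.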

< The AM-GM Inequality >

The next inequality the book focuses on is the Arithmetic Mean-Geometric Mean (AM-GM) inequality.

This is one of the most popular inequalities. It is learnt in some form or other in high school or at the undergrad level. The proof is usually given by mathematical induction. However, the book proves it using the “Leap-Forward Fall-Back” technique, which works in lots of cases where plain mathematical induction would be tedious.

The self-generalizing quality of the above inequality is used to explain the generalized AM-GM inequality
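For reference, the two statements in their usual textbook form (a sketch in generic notation; the symbols a_k for the nonnegative numbers and p_k for the weights are assumptions made here):

```latex
% AM-GM for nonnegative real numbers a_1,...,a_n
\left( a_1 a_2 \cdots a_n \right)^{1/n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}

% Generalized (weighted) AM-GM for weights p_k \ge 0 with p_1 + ... + p_n = 1
a_1^{p_1} a_2^{p_2} \cdots a_n^{p_n} \le p_1 a_1 + p_2 a_2 + \cdots + p_n a_n
```

Taking all weights equal to 1/n in the second line recovers the first, and repeating terms in the first line yields the second for rational weights, which is the self-generalizing step.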

One of the challenging questions before proving any inequality is “Which elementary inequalities should be used?”. Just because something looks like an AM-GM inequality does not mean you can apply the raw version of AM-GM and proceed. An example of this is shown using Carleman’s inequality, where the usual AM-GM fails to go anywhere. The example illustrates the principle of maximal effectiveness, whereby we conspire to use our tools under precisely those circumstances when they are at their best. The chapter ends by stating three questions from George Polya that are extremely relevant for solving inequalities:

Can you solve your problem in a special case?
Can you relate your problem to a similar one where the answer is already known?
Can you compute anything at all that is related to what you would really like to compute?

< Lagrange’s Identity and Minkowski’s Conjecture >

The author then sheds light on Lagrange’s Identity and Minkowski’s conjecture. The connection between Cauchy’s inequality and Lagrange’s Identity is shown: the identity gives an exact expression for the defect in Cauchy’s inequality, i.e. the gap between its two sides.
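To make the “defect” concrete, here is a small numerical check of Lagrange’s identity in its standard form, assuming NumPy: the gap between the two sides of Cauchy’s inequality equals half the sum of the squares (a_i b_j - a_j b_i)^2 over all index pairs. The random vectors and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=6)
b = rng.normal(size=6)

# Defect in Cauchy's inequality: (sum a^2)(sum b^2) - (sum a*b)^2
defect = (a @ a) * (b @ b) - (a @ b) ** 2

# Lagrange's identity: the defect is half the sum over all (i, j) of
# (a_i b_j - a_j b_i)^2, so it is visibly nonnegative.
cross = np.outer(a, b) - np.outer(b, a)  # cross[i, j] = a_i b_j - a_j b_i
lagrange_sum = 0.5 * np.sum(cross ** 2)

print(defect, lagrange_sum)
```

Since the right-hand side is a sum of squares, Cauchy’s inequality follows at once, with equality exactly when every a_i b_j - a_j b_i vanishes, i.e. when the two vectors are proportional.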

The chapter then talks about the problem of writing a nonnegative polynomial as a sum of squares and derives interesting stuff. This is always possible in the one-variable case but not, in general, in two or more variables. Minkowski’s conjecture is discussed, which basically states that there are nonnegative polynomials in more than one variable that cannot be written as a sum of squares of polynomials. This conjecture was later proved by Hilbert.

< On Geometry and Sums of Squares >

I loved this chapter as it deals with projection operators, Householder reflectors and their connection with the Cauchy-Schwarz inequality. By illustrating with a simple example how our intuition sometimes fails when we move to a higher dimensional world, the author reinforces the point that we need to have mathematical objects and properties well defined so that we can operate correctly in the slippery world of n dimensions. The chapter uses the projection operator to prove Cauchy-Schwarz. It also derives a tighter bound for the product of two linear forms using projection and reflection operators. For a person well versed in linear algebra, this chapter will be an absolute charm! The takeaway from using these operators is that plain Euclidean geometry helps one deepen the understanding of inequalities. The chapter ends by deriving Cauchy-Schwarz using a different geometric model, the “space-time geometry of Einstein and Minkowski”. I found this part difficult to understand, despite the clear exposition.

The highlight of this chapter is the connection between Gram-Schmidt and inequalities. It turns out that one can use Gram-Schmidt to derive the Cauchy-Schwarz inequality and a bunch of other inequalities such as Bessel’s inequality and the bound on products of linear forms. If you use linear algebra at work, you are going to love this chapter.
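A minimal sketch of the Gram-Schmidt connection, assuming NumPy and the ordinary dot product on R^5. The function `gram_schmidt` is a plain classical implementation written for this illustration, not code from the book. Once you have an orthonormal family e_1, ..., e_m, Bessel’s inequality says the squared coefficients of x along the family cannot add up to more than the squared length of x, and the case m = 1 is Cauchy-Schwarz.

```python
import numpy as np


def gram_schmidt(vectors):
    """Classical Gram-Schmidt: return an orthonormal list spanning the same space."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the directions found so far.
        w = v - sum(np.dot(v, e) * e for e in basis)
        norm = np.linalg.norm(w)
        if norm > 1e-12:  # skip vectors that are (numerically) dependent
            basis.append(w / norm)
    return basis


rng = np.random.default_rng(2)
x = rng.normal(size=5)
vs = list(rng.normal(size=(3, 5)))
es = gram_schmidt(vs)

# Bessel's inequality: sum_k <x, e_k>^2 <= <x, x>
bessel_lhs = sum(np.dot(x, e) ** 2 for e in es)
bessel_rhs = np.dot(x, x)

# With a single vector y, Bessel's inequality for e = y / |y| is
# <x, y>^2 <= <x, x> <y, y>, which is Cauchy-Schwarz.
y = vs[0]
cs_lhs = np.dot(x, y) ** 2
cs_rhs = np.dot(x, x) * np.dot(y, y)

print(bessel_lhs <= bessel_rhs, cs_lhs <= cs_rhs)
```

The gap in Bessel’s inequality is the squared length of the part of x that is orthogonal to the span of the family, which is the projection picture again.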

< Consequences of Order >

Chebyshev’s order inequality and Chebyshev’s popular inequality relating to tail probabilities are covered in this chapter. I have skipped the rearrangement inequality and will refer to it whenever I need it; as such, my naive mind could not relate it to any problem in the math-fin context.
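Both Chebyshev inequalities are easy to sanity-check numerically. The sketch below assumes NumPy and uses the standard statements: for two sequences sorted the same way, the product of the averages is at most the average of the products, and for any distribution with finite variance, the probability of being at least k standard deviations from the mean is at most 1/k^2. The sample sizes, the distributions and k = 2 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Chebyshev's order inequality for two similarly ordered sequences:
# mean(a) * mean(b) <= mean(a * b)
a = np.sort(rng.normal(size=50))
b = np.sort(rng.exponential(size=50))
order_lhs = a.mean() * b.mean()
order_rhs = (a * b).mean()

# Chebyshev's tail inequality, applied to the empirical distribution of a
# heavy-tailed sample: P(|X - mu| >= k * sigma) <= 1 / k^2
x = rng.standard_t(df=3, size=10_000)
mu, sigma = x.mean(), x.std()  # population-style std (ddof=0)
k = 2.0
tail_fraction = np.mean(np.abs(x - mu) >= k * sigma)
tail_bound = 1.0 / k ** 2

print(order_lhs <= order_rhs, tail_fraction <= tail_bound)
```

The tail bound is usually far from tight, but it needs nothing beyond a finite variance, which is exactly why it shows up so often.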

< Convexity – The Third Pillar >

The chapter starts off by saying that

There are three great pillars of the theory of inequalities: positivity, monotonicity, and convexity. The notions of positivity and monotonicity are so intrinsic to the subject that they serve us steadily without ever calling attention to themselves, but convexity is different. Convexity expresses a second order effect.

What is Jensen’s inequality?
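In its usual finite form (a sketch in generic notation; the names f, x_k and p_k are assumptions made here), it reads:

```latex
% Jensen's inequality: f convex on an interval I, points x_1,...,x_n in I,
% weights p_k \ge 0 with p_1 + ... + p_n = 1
f\!\left( \sum_{k=1}^{n} p_k x_k \right) \le \sum_{k=1}^{n} p_k \, f(x_k)
```

For a concave f the inequality is reversed.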

Jensen’s inequality is something any math-fin student comes across in option valuation fundamentals. This chapter talks about Jensen’s inequality and a ton of inequalities that are variants of it. The basic approach is to find a convex function whose properties help you prove the desired inequality. So you have to play around and try various convex functions. However, it is not as difficult as it sounds: basic convex or concave functions like the exponential, log and trigonometric functions, more often than not, get you home.

< Integral Intermezzo >

Inequalities with integral signs are explored. Two themes are explained using specific inequalities. The first is to dissect the range of integration into intervals and prove the necessary inequality piece by piece. The second approach is more involved and unique. It goes by the name “Transform-Schwarz-Invert”, which basically means that the expression is first transformed (maybe by integration by parts), then Cauchy-Schwarz is applied, and then the transformation is inverted (say, by integrating by parts again) to prove the inequality. The highlight of this chapter is proving the continuous form of Jensen’s inequality and showing the linkage between Jensen’s inequality and the AM-GM inequality.
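The linkage between Jensen and AM-GM can be seen in a few lines. This is a sketch assuming NumPy, shown in the discrete weighted form rather than the integral form: applying Jensen’s inequality to the concave function log gives sum p_k log(a_k) <= log(sum p_k a_k), and exponentiating both sides is the weighted AM-GM inequality. The numbers a_k and weights p_k are random illustrative values.

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.uniform(0.1, 10.0, size=8)  # positive numbers
p = rng.uniform(size=8)
p /= p.sum()                        # weights summing to 1

# Jensen for the concave log: sum p_k log(a_k) <= log(sum p_k a_k)
jensen_lhs = np.sum(p * np.log(a))
jensen_rhs = np.log(np.sum(p * a))

# Exponentiating both sides gives weighted AM-GM:
# prod a_k^{p_k} <= sum p_k a_k
weighted_gm = np.exp(jensen_lhs)
weighted_am = np.sum(p * a)

print(weighted_gm, weighted_am)
```

Replacing the finite weights by a probability density and the sums by integrals gives the continuous version of the same argument.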

The continuous form of Jensen’s inequality shows up in a lot of areas in probability, where the expectation or the moments of a random variable are expressed in integral form. In all those places, the continuous Jensen inequality is priceless. This chapter also shows an amazing connection between correlation and the centered version of Schwarz’s inequality.
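Here is the correlation connection as a small sketch, assuming NumPy and using sample averages in place of integrals. Applying Cauchy-Schwarz to the centered variables says the squared covariance is at most the product of the variances, which is the same as saying the correlation lies in [-1, 1]. The simulated data and the 0.6 coefficient are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1000)
y = 0.6 * x + rng.normal(size=1000)

# Center both variables, then apply Cauchy-Schwarz to the centered versions:
# mean(xc * yc)^2 <= mean(xc^2) * mean(yc^2)
xc, yc = x - x.mean(), y - y.mean()
cov = np.mean(xc * yc)
var_x, var_y = np.mean(xc ** 2), np.mean(yc ** 2)

# Dividing through by the variances bounds the correlation by 1 in absolute value.
corr = cov / np.sqrt(var_x * var_y)
print(corr)
```

Equality holds only when the centered variables are proportional, i.e. when one variable is an exact linear function of the other.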

< The Ladder of Power Means >

The upper bound in Cauchy’s inequality can be generalized, and this is precisely what is done in this chapter.

Generalized power means are introduced and the power mean curve is explored. The curve can be an effective tool for proving quite a number of inequalities, from the simple HM <= GM <= AM to more intricate ones.

Once you understand generalized power means, you can spot a power mean inequality lurking in a problem and exploit its properties.
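A minimal sketch of the ladder, assuming NumPy and the standard definition of the power mean, M_t = (sum p_k x_k^t)^(1/t) for t != 0, with the geometric mean as the t = 0 case. The function name `power_mean` and the sample data are illustrative. The point is that M_t is nondecreasing in t, so HM (t = -1) <= GM (t = 0) <= AM (t = 1) are just three rungs.

```python
import numpy as np


def power_mean(x, t, p=None):
    """Weighted power mean M_t of positive numbers x; t = 0 gives the geometric mean."""
    x = np.asarray(x, dtype=float)
    if p is None:
        p = np.full(x.shape, 1.0 / x.size)  # equal weights by default
    if t == 0:
        return float(np.exp(np.sum(p * np.log(x))))
    return float(np.sum(p * x ** t) ** (1.0 / t))


x = [1.0, 2.0, 4.0, 8.0]
ts = [-3, -1, 0, 1, 2, 5]
means = [power_mean(x, t) for t in ts]

for t, m in zip(ts, means):
    print(f"t = {t:>2}: M_t = {m:.4f}")
```

As t goes to minus or plus infinity the power mean approaches the minimum and maximum of the data, which pins down the two ends of the curve.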

< Hölder’s Inequality >

From a math-fin point of view, this chapter is probably the most important of all, as it talks about Hölder’s and Minkowski’s inequalities. Both are very useful in dealing with Lebesgue spaces. Hölder’s inequality states that:
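In the usual finite-sum form, followed by the same statement written with norms (a sketch in standard notation; the sequences a_k, b_k and the conjugate exponents p, q are assumptions made here for illustration):

```latex
% Hoelder's inequality for p, q > 1 with 1/p + 1/q = 1
\sum_{k=1}^{n} |a_k b_k|
  \le \left( \sum_{k=1}^{n} |a_k|^{p} \right)^{1/p}
      \left( \sum_{k=1}^{n} |b_k|^{q} \right)^{1/q}

% The same statement in norm notation
\| a b \|_{1} \le \| a \|_{p} \, \| b \|_{q}
```

The case p = q = 2 is Cauchy’s inequality.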

In the norm format, it is easier to recognize. Whenever Hölder’s inequality is used, there is inevitably talk of Minkowski’s inequality. It arises mainly to prove that the p-norm used in Hölder’s inequality makes sense: the triangle inequality that is needed to justify the p-norm is precisely Minkowski’s inequality.
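Both inequalities can be checked numerically for finite sequences. This is a sketch assuming NumPy; the helper `p_norm`, the exponent p = 3 and the random vectors are illustrative choices. Hölder bounds the 1-norm of a product by a product of p- and q-norms, and Minkowski is the triangle inequality for the p-norm.

```python
import numpy as np


def p_norm(v, p):
    """The p-norm (sum |v_k|^p)^(1/p) of a finite sequence."""
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))


rng = np.random.default_rng(6)
a = rng.normal(size=20)
b = rng.normal(size=20)

p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

# Hoelder: sum |a_k b_k| <= ||a||_p * ||b||_q
holder_lhs = float(np.sum(np.abs(a * b)))
holder_rhs = p_norm(a, p) * p_norm(b, q)

# Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
minkowski_lhs = p_norm(a + b, p)
minkowski_rhs = p_norm(a, p) + p_norm(b, p)

print(holder_lhs <= holder_rhs, minkowski_lhs <= minkowski_rhs)
```

Trying an exponent below 1 in `p_norm` is a quick way to see the triangle inequality fail, which is why the p-norm is only a norm for p >= 1.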

for the given context. No wonder the book is subtitled “An Introduction to the Art of Mathematical Inequalities”.

The last part of the book covers Hilbert’s inequality, Hardy’s inequality, symmetric sums and Schur convexity. I found the book to be one of those that is overwhelming on the first read, and I do not mean that in a negative way. Your mind is bombarded with so many wonderful ways of looking at inequalities that you have to PAUSE and actually take stock of them. For example, an inequality like

etc. Basically, this book will change the way you think about inequalities. You no longer stare at an inequality wondering what the hell it is supposed to imply; instead you get into pattern-oriented thinking. You start making connections between inequalities and real-life stuff that you come across: the centered version of Schwarz’s inequality has a direct connection with the concept of correlation, Chebyshev’s inequality with tail probabilities, Hölder’s inequality with norms in Lebesgue spaces, etc.

I am hoping to use this book as a reference on inequalities as and when required. But more than a reference, I will definitely go over it a couple more times in my working life, as it equips you to connect seemingly unrelated stuff. Also, the inequality tricks mentioned in the book are priceless.

Takeaway:

Inequalities, from a very practical point of view, give you a handle on the bounds of a problem. Most problems in the math-fin area are considered cracked if you are able to form effective bounds, be it for option pricing, hedge ratios or no-arb bounds on implied vols.

This book will give you a working knowledge of inequalities by providing a thorough explanation of 4 core principles, i.e. the Cauchy-Schwarz inequality, the AM-GM inequality, Jensen’s inequality and Hölder’s inequality. Once you think through these 4 principles and understand the tricks behind extending them, you will start noticing their variants in any number of situations.