This blog post is about the book "Grokking Algorithms" by Aditya Bhargava.

Introduction to algorithms

  • Binary search is a lot faster than simple search as your array gets bigger (a minimal sketch follows this list)
  • O(log n) is faster than O(n), and the gap widens as the list of items you are searching through grows
  • Algorithm speed isn’t measured in seconds
  • Algorithm times are measured in terms of growth of an algorithm
  • Algorithm times are written in big O notation
  • Common big O run times
    • log time: O(log n)
    • linear time: O(n)
    • log-linear time: O(n log n)
    • quadratic time: O(n²)
    • factorial time: O(n!)
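
To make the gap between O(n) simple search and O(log n) binary search concrete, here is a minimal binary search sketch in Python; the function and variable names are mine, not necessarily the book's:

```python
def binary_search(sorted_list, target):
    """Return the index of target in sorted_list, or None if absent."""
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2      # check the middle element
        guess = sorted_list[mid]
        if guess == target:
            return mid
        if guess < target:
            low = mid + 1            # target must be in the upper half
        else:
            high = mid - 1           # target must be in the lower half
    return None

print(binary_search([1, 3, 5, 7, 9], 7))   # 3
print(binary_search([1, 3, 5, 7, 9], 4))   # None
```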

Selection sort

  • Your computer’s memory is like a giant set of drawers
  • When you want to store multiple elements, use an array or a linked list
  • With an array, all your elements are stored right next to each other
  • Arrays allow fast reads
  • Linked lists allow fast inserts and deletes
  • With a linked list, elements are strewn all over, and one element stores the address of the next one.
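
The chapter's namesake algorithm is easy to sketch as well. This is a rough Python version of selection sort, repeatedly pulling the smallest remaining element into a new list; the book's exact code may differ:

```python
def find_smallest(arr):
    """Return the index of the smallest element."""
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < arr[smallest_index]:
            smallest_index = i
    return smallest_index

def selection_sort(arr):
    """Build a sorted list by repeatedly removing the smallest element: O(n^2)."""
    arr = list(arr)   # work on a copy
    result = []
    while arr:
        result.append(arr.pop(find_smallest(arr)))
    return result

print(selection_sort([5, 3, 6, 2, 10]))   # [2, 3, 5, 6, 10]
```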

Recursion

  • Recursion is when a function calls itself
  • Every recursive function has two cases: the base case and the recursive case
  • A stack has two operations: push and pop
  • All function calls go onto the call stack
  • The call stack can get very large, which takes up a lot of memory
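
A tiny factorial example makes the base case / recursive case split concrete (my own toy example, not necessarily the one used in the book):

```python
def factorial(n):
    if n <= 1:                        # base case: stops the recursion
        return 1
    return n * factorial(n - 1)       # recursive case: the function calls itself

print(factorial(5))   # 120 — five frames go onto the call stack, then unwind
```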

Quicksort

  • Divide and Conquer is a programming technique that can be used for a large set of problems
  • Quicksort: pick a pivot, partition the array around it, and recurse on the partitions (sketch after this list)
  • Divide and Conquer works by breaking a problem down into smaller and smaller pieces
  • Given two algorithms with the same big O running time, one can be consistently faster than the other.
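
A minimal quicksort sketch in Python, assuming the simple take-the-first-element-as-pivot strategy; the book's version may differ in details:

```python
def quicksort(arr):
    """Divide and conquer: pick a pivot, partition, recurse on each side."""
    if len(arr) < 2:
        return arr                    # base case: 0 or 1 elements are already sorted
    pivot = arr[0]
    less = [x for x in arr[1:] if x <= pivot]
    greater = [x for x in arr[1:] if x > pivot]
    return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([10, 5, 2, 3]))   # [2, 3, 5, 10]
```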

Hash tables

  • A hash function needs to be consistent, and should map different inputs to different numbers
  • Use cases
    • hash tables for lookups
    • preventing duplicate entries
    • use hash tables as a cache
  • Any hash function you choose will have collisions, so you need a strategy to handle them
  • A hash function combined with an array is called a hash table. Keep the load factor low (resize when it grows too high) to maintain performance
  • Hash tables are good for modeling relationships from one item to another
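
Python dictionaries are hash tables, so the three use cases are easy to sketch. The names and data below are made up for illustration:

```python
# Lookups: map a name directly to a value
phone_book = {"jenny": "867-5309", "emergency": "911"}
print(phone_book["jenny"])

# Preventing duplicates: has this person already voted?
voted = {}
def check_voter(name):
    if name in voted:
        print(f"{name}: kick them out!")
    else:
        voted[name] = True
        print(f"{name}: let them vote!")

check_voter("tom")
check_voter("tom")   # second attempt is caught

# Caching: do the slow work only once per key
cache = {}
def get_page(url):
    if url not in cache:
        cache[url] = f"<html>data for {url}</html>"   # stand-in for a slow fetch
    return cache[url]

print(get_page("example.com"))
print(get_page("example.com"))   # second call hits the cache
```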

Breadth-first search

  • A graph models a set of connections. Each graph is made up of nodes and edges. Nodes are called in-neighbors or out-neighbors when the graph is directed
  • Operations on a queue are called enqueue and dequeue; in Python lingo, a deque exposes these as append and popleft
  • One can use the Python dictionary to store adjacent nodes
  • Topological sort is a linear ordering of vertices in a DAG such that for every directed edge (u, v), vertex u appears before v in the ordering
  • Queues are FIFO
  • Stacks are LIFO
  • Undirected graphs don’t have arrows and the relationship goes both ways
  • Directed graph has arrows, and the relationship follows the direction of the arrow
  • Breadth-first search tells you if there’s a path from A to B
  • If there’s a path, breadth-first search will find the shortest path

Trees

  • Trees are a type of graph
  • One can implement searching files in a directory using breadth-first or depth-first search (a depth-first sketch follows this list)
  • A tree is a connected, acyclic graph
  • Binary tree is a special type of tree where nodes can have at most two children.
  • Use cases of Binary trees: Huffman coding
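
For the directory example, a depth-first version is a short recursive function. The "pics" folder below is hypothetical; point it at any directory you have:

```python
from pathlib import Path

def print_names_dfs(directory):
    """Depth-first: recurse into each subdirectory as soon as we see it."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_dir():
            print_names_dfs(path)
        else:
            print(path.name)

# print_names_dfs("pics")   # assumes a local "pics" folder exists
```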

Balanced Trees

  • Balanced trees achieve better performance than arrays for inserts and better performance than linked lists for searches
  • The height of a tree affects its performance
  • AVL trees are a popular type of self-balancing BST; like most balanced trees, they rebalance themselves through rotations
  • Splay trees are a different take on balanced BSTs. The cool thing about splay trees is if you have recently looked up an item, the next time you look it up, the lookup will be faster
  • B-trees are a generalized form of binary tree. They are often used for building databases
  • Seek time is like travel time to a grocery store. B-trees try to minimize seek time by reading more data in one go
  • B-trees have nodes that can have multiple keys and multiple child nodes

Dijkstra’s algorithm

  • Breadth-first search is used to calculate the shortest path for an unweighted graph
  • Dijkstra’s algorithm is used to calculate the shortest path for a weighted graph
  • Dijkstra’s algorithm works when all the weights are nonnegative
  • If you have negative weights, use the Bellman-Ford algorithm
  • I think it is better to implement it yourself in whatever language you are comfortable with before peeking at the implementation code given by the author
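
For what it's worth, here is a rough sketch of how one might write it in Python using a heap via heapq; this is my own take, not the author's implementation:

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from start in a graph with nonnegative edge weights.

    graph: dict mapping node -> dict of neighbor -> weight
    """
    dist = {start: 0}
    heap = [(0, start)]                      # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                         # stale entry; a shorter path was already found
        for neighbor, weight in graph[node].items():
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist

graph = {
    "start": {"a": 6, "b": 2},
    "a": {"fin": 1},
    "b": {"a": 3, "fin": 5},
    "fin": {},
}
print(dijkstra(graph, "start"))   # {'start': 0, 'a': 5, 'b': 2, 'fin': 6}
```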

Greedy Algorithms

  • At each step, you pick the locally optimal choice; for some problems this leaves you with the globally optimal solution, for others only a good approximation
  • The knapsack problem, and an attempt to solve it via a greedy algorithm
  • Set covering problem - There is no known algorithm that solves it fast enough
  • Greedy algorithms are easy to write and fast to run, so they make good approximation algorithms
  • If you have an NP-hard problem, your best bet is to use an approximation algorithm (a set-cover sketch follows this list)
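
A sketch of the greedy approximation for set cover, along the lines of the book's radio-station example; the station names and state sets below are illustrative:

```python
# Which radio stations cover all the states we need?
states_needed = {"mt", "wa", "or", "id", "nv", "ut", "ca", "az"}

stations = {
    "kone": {"id", "nv", "ut"},
    "ktwo": {"wa", "id", "mt"},
    "kthree": {"or", "nv", "ca"},
    "kfour": {"nv", "ut"},
    "kfive": {"ca", "az"},
}

final_stations = set()
while states_needed:
    best_station = None
    states_covered = set()
    # Greedy step: pick the station that covers the most still-uncovered states.
    for station, states in stations.items():
        covered = states_needed & states
        if len(covered) > len(states_covered):
            best_station = station
            states_covered = covered
    states_needed -= states_covered
    final_stations.add(best_station)

print(final_stations)   # e.g. {'kone', 'ktwo', 'kthree', 'kfive'}
```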

Dynamic Programming

This chapter starts with the poster child of DP problems, the knapsack problem, and goes on to explain visually how it can be solved with DP. While solving the problem with the help of a grid, the author shows that

  • changing the order of rows in DP problem does not change the solution
  • filling in grid column-wise instead of row-wise does not change the solution
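
A rough Python version of that grid-filling, assuming integer weights; the items echo the book's guitar/stereo/laptop example, but the code itself is my own sketch:

```python
def knapsack(items, capacity):
    """Best achievable value for a knapsack of the given integer capacity.

    items: list of (name, weight, value) tuples.
    """
    n = len(items)
    # grid[i][c] = best value using the first i items with capacity c
    grid = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        name, weight, value = items[i - 1]
        for c in range(capacity + 1):
            best_without = grid[i - 1][c]
            if weight <= c:
                best_with = value + grid[i - 1][c - weight]
                grid[i][c] = max(best_without, best_with)
            else:
                grid[i][c] = best_without
    return grid[n][capacity]

items = [("guitar", 1, 1500), ("stereo", 4, 3000), ("laptop", 3, 2000)]
print(knapsack(items, 4))   # 3500 (guitar + laptop)
```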

There is also a section on the longest common substring. Frankly, I think I have forgotten most aspects of DP, and maybe I will revisit this chapter later. DP, at its heart, has recursion in it, and I think many problems can be solved once you identify certain patterns. It has been a while since I last did DP, and those skills feel very rusty.

k-nearest neighbors

  • KNN algorithm for regression and classification
  • Quick intro to ML
  • Feature extraction means converting an item into a list of numbers that can be compared
  • Picking good features is an important part of a successful KNN algorithm (toy sketch below)
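
A toy KNN classifier sketch using Euclidean distance and majority vote; the feature vectors and labels below are made up for illustration:

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(query, labeled_points, k=3):
    """Classify query by majority vote among its k nearest neighbors."""
    neighbors = sorted(labeled_points, key=lambda p: distance(query, p[0]))[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)

# Toy data: (features, label) pairs; features are made-up measurements
points = [((5, 1), "orange"), ((4, 2), "orange"),
          ((1, 5), "grapefruit"), ((2, 4), "grapefruit")]
print(knn_classify((5, 2), points, k=3))   # orange
```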

Where to go next

A bunch of algorithms are covered at a high level, urging the reader to follow their curiosity:

  • Linear regression
  • Inverted indexes
  • Fourier Transform
  • Parallel algorithms
    • MapReduce
  • Bloom filters and HyperLogLog
  • Cryptography algo
  • Locality-sensitive hashing
  • Min heaps and priority queues
  • Linear Programming

Takeaway

The first edition of this book came out a decade ago, and I read it back then. The second edition expands on trees and other topics, but the spirit of the book remains the same: explaining beginner algorithms with visuals. One would not be an expert after reading this book, but one would certainly get an intuitive understanding of some of the common algorithms used in Computer Science. With this primer, one can then explore one's curiosities and go deep into any of the algorithm topics mentioned in the book. Having read it a second time after a decade, I still feel that I walked away with fresh eyes. Definitely worth a read for anyone who is curious about algorithms or who wants to understand how visuals can be used to explain them.